@appthreat/atom

atom is a novel intermediate representation for applications and a standalone tool that is powered by chen.

https://github.com/appthreat/atom

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    1 of 5 committers (20.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.2%) to scientific vocabulary

Keywords

application-analytics code-analysis exploit-prediction intermediate-representation reachability-analysis supply-chain variant-analysis vulnerability-analysis
Last synced: 6 months ago · JSON representation

Repository


Basic Info
Statistics
  • Stars: 71
  • Watchers: 3
  • Forks: 3
  • Open Issues: 39
  • Releases: 142
Topics
application-analytics code-analysis exploit-prediction intermediate-representation reachability-analysis supply-chain variant-analysis vulnerability-analysis
Created over 2 years ago · Last pushed 6 months ago
Metadata Files
Readme License Codemeta

README.md

atom (⚛)

atom is a novel intermediate representation for applications and a standalone tool powered by the chen library. The intermediate representation (a network with nodes and links) is optimized for operations typically used for application analytics and machine learning, including slicing and vectoring.

Our vision is to make atom useful for many use cases such as:

  • Supply-chain analysis: Generate evidence of external library usage including the flow of data from sources to sinks. atom is used by OWASP cdxgen to improve the precision and comprehensiveness of the generated CycloneDX document.
  • Vulnerability analysis: Describe vulnerabilities with evidence of affected symbols, call paths, and data-flows. Enable variant and reachability analysis at scale.
  • Exploit prediction: Predict exploits using precise representations of vulnerabilities, libraries, and applications.
  • Threat-model and attack vectors generation: Generate precise threat models and attack vectors for applications at scale.
  • Application context detection: Generate context useful for summarization and risk-profile generation (e.g. services, endpoints, and data attributes).
  • Mind-maps for applications: Automate summarization of large and complex applications as a developer tool.

and more.


Languages supported

  • C/C++
  • H (C/C++ Header and pre-processed .i files alone)
  • Java (Requires compilation)
  • Jar
  • Android APK (Requires Android SDK. Set the environment variable ANDROID_HOME or use the container image.)
  • JavaScript
  • TypeScript
  • Python (Supports 3.x to 3.13)
  • PHP (Requires PHP >= 7.4. Supports PHP 7.0 to 8.4 with limited support for PHP 5.x)
  • Ruby (Requires Ruby 3.4.5. Supports Ruby 1.8 - 3.4.x syntax)
  • Scala (WIP)

Installation

atom comprises a Scala core with a Node.js wrapper module. It is currently distributed as an npm package.

```shell
npm install -g @appthreat/atom
atom --help
```

Install the cdxgen npm package to generate a Software Bill-of-Materials (SBOM), which is required for reachables slicing.

```shell
npm install -g @cyclonedx/cdxgen --omit=optional
```

container usage

```shell
docker run --rm -v /tmp:/tmp -v $HOME:$HOME -v $(pwd):/app:rw -t ghcr.io/appthreat/atom atom --help

podman run --rm -v /tmp:/tmp -v $HOME:$HOME -v $(pwd):/app:rw -t ghcr.io/appthreat/atom atom --help
```

Example for a Java project:

```shell
docker run --rm -v /tmp:/tmp -v $HOME:$HOME -v $(pwd):/app:rw -t ghcr.io/appthreat/atom atom -l java -o /app/app.atom /app

podman run --rm -v /tmp:/tmp -v $HOME:$HOME -v $(pwd):/app:rw -t ghcr.io/appthreat/atom atom -l java -o /app/app.atom /app
```

atom native-image (Advanced users only)

atom is available as a native image built using GraalVM Community Edition.

```shell
curl -LO https://github.com/AppThreat/atom/releases/latest/download/atom-amd64
chmod +x atom-amd64
./atom-amd64 --help
```

On Windows

```pwsh
curl -LO https://github.com/AppThreat/atom/releases/latest/download/atom.exe
.\atom.exe --help
```

NOTE: Commands such as astgen, rbastgen, phpastgen, etc. are not bundled into this native image. Install the npm package @appthreat/atom-parsetools to get these commands.

```shell
npm install -g @appthreat/atom-parsetools
which astgen
which phpastgen
```

CLI Usage

```
Usage: atom [parsedeps|data-flow|usages|reachables] [options] [input]

  input                        source file or directory
  -o, --output                 output filename. Default app.⚛ or app.atom in windows
  -s, --slice-outfile          export intra-procedural slices as json
  -l, --language               source language
  --with-data-deps             generate the atom with data-dependencies - defaults to false
  --remove-atom                do not persist the atom file - defaults to false
  --reuse-atom                 reuse existing atom file - defaults to false
  -x, --export-atom            export the atom file with data-dependencies to graphml - defaults to false
  --export-dir                 export directory. Default: atom-exports
  --file-filter                the name of the source file to generate slices from. Uses regex.
  --method-name-filter         filters in slices that go through specific methods by names. Uses regex.
  --method-parameter-filter    filters in slices that go through methods with specific types on the method parameters. Uses regex.
  --method-annotation-filter   filters in slices that go through methods with specific annotations on the methods. Uses regex.
  --max-num-def                maximum number of definitions in per-method data flow calculation - defaults to 2000
Command: parsedeps
Extract dependencies from the build file and imports
Command: data-flow [options]
Extract backward data-flow slices
  --slice-depth                the max depth to traverse the DDG for the data-flow slice - defaults to 7.
  --sink-filter                filters on the sink's code property. Uses regex.
Command: usages [options]
Extract local variable and parameter usages
  --min-num-calls              the minimum number of calls required for a usage slice - defaults to 1.
  --include-source             includes method source code in the slices - defaults to false.
  --extract-endpoints          extract http endpoints and convert to openapi format using atom-tools - defaults to false.
Command: reachables [options]
Extract reachable data-flow slices based on automated framework tags
  --source-tag                 source tag - defaults to framework-input. Comma-separated values allowed.
  --sink-tag                   sink tag - defaults to framework-output. Comma-separated values allowed.
  --include-crypto             includes crypto library flows - defaults to false.
  --help                       display this help message
```

Sample Invocations

Generate an atom

```shell
# Compile java project
atom -o app.atom -l java .
```

```shell
atom -o app.atom -l jar <jar file>
```

```shell
export ANDROID_HOME=<path to android sdk>
atom -o app.atom -l apk <apk file>
```

Create reachables slice for a java project.

```shell
cd <path to repo>
cdxgen -t java --deep -o bom.json .
atom reachables -o app.atom -s reachables.json -l java .
```

Pass the argument --reuse-atom to slice based on an existing atom file.

```shell
atom reachables --reuse-atom -o app.atom -s reachables.json -l java .
```
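The two-step reachables workflow above (SBOM first, then slicing) can also be scripted. The following is a minimal Python sketch that only assembles the command lines already shown in this README. The helper names `cdxgen_command`, `reachables_command`, and `run_reachables` are hypothetical and are not part of atom or cdxgen. Whether the SBOM must be regenerated when reusing an atom file is not stated here, so the sketch always regenerates it.

```python
import subprocess
from typing import List


def cdxgen_command(project_type: str = "java", bom_file: str = "bom.json") -> List[str]:
    """Assemble the cdxgen invocation shown above for creating the SBOM."""
    return ["cdxgen", "-t", project_type, "--deep", "-o", bom_file, "."]


def reachables_command(
    language: str = "java",
    atom_file: str = "app.atom",
    slice_file: str = "reachables.json",
    reuse_atom: bool = False,
) -> List[str]:
    """Assemble the `atom reachables` invocation, optionally with --reuse-atom."""
    cmd = ["atom", "reachables"]
    if reuse_atom:
        cmd.append("--reuse-atom")
    cmd += ["-o", atom_file, "-s", slice_file, "-l", language, "."]
    return cmd


def run_reachables(repo_dir: str, reuse_atom: bool = False) -> None:
    """Run both steps inside repo_dir. Requires cdxgen and atom on PATH."""
    subprocess.run(cdxgen_command(), cwd=repo_dir, check=True)
    subprocess.run(reachables_command(reuse_atom=reuse_atom), cwd=repo_dir, check=True)
```

Calling `run_reachables("<path to repo>")` would execute the same commands as the shell example; the two builder functions can be inspected or logged without running anything.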

Example with container-based invocation.

```shell
docker run --rm -v /tmp:/tmp -v $HOME:$HOME -v $(pwd):/app:rw -t ghcr.io/appthreat/atom atom reachables -l java -o /app/app.atom -s /app/reachables.slices.json /app

podman run --rm -v /tmp:/tmp -v $HOME:$HOME -v $(pwd):/app:rw -t ghcr.io/appthreat/atom atom reachables -l java -o /app/app.atom -s /app/reachables.slices.json /app
```

Create usages slice for a java project.

```shell
atom usages -o app.atom --slice-outfile usages.json -l java .
```

Example for a Ruby project with container-based invocation.

```shell
docker run --rm -v /tmp:/tmp -v $(pwd):/app:rw -t ghcr.io/appthreat/atom atom usages -l ruby -o /app/app.atom -s /app/usages.slices.json /app

podman run --rm -v /tmp:/tmp -v $(pwd):/app:rw -t ghcr.io/appthreat/atom atom usages -l ruby -o /app/app.atom -s /app/usages.slices.json /app
```

Pass the argument --platform=linux/amd64 if you face issues with Java or Ruby builds on the arm64 architecture.

```shell
docker run --rm --platform=linux/amd64 -v /tmp:/tmp -v $(pwd):/app:rw -t ghcr.io/appthreat/atom atom usages -l ruby -o /app/app.atom -s /app/usages.slices.json /app

podman run --rm --platform=linux/amd64 -v /tmp:/tmp -v $(pwd):/app:rw -t ghcr.io/appthreat/atom atom usages -l ruby -o /app/app.atom -s /app/usages.slices.json /app
```

For Ruby, there is an alpine-based version available.

```shell
docker run --rm -v /tmp:/tmp -v $(pwd):/app:rw -t ghcr.io/appthreat/atom-alpine-ruby atom usages --extract-endpoints -l ruby -o /app/app.atom -s /app/usages.slices.json /app

podman run --rm -v /tmp:/tmp -v $(pwd):/app:rw -t ghcr.io/appthreat/atom-alpine-ruby atom usages --extract-endpoints -l ruby -o /app/app.atom -s /app/usages.slices.json /app
```

Create data-flow slice for a java project.

```shell
atom data-flow -o app.atom --slice-outfile df.json -l java .
```

Learn more about slices or view some samples

Extract HTTP endpoints in openapi format using atom-tools

atom can automatically invoke the atom-tools convert command to extract HTTP endpoints from the usages slices. Pass the argument --extract-endpoints to enable this feature.

```shell
pip install atom-tools
atom usages --extract-endpoints -o app.atom --slice-outfile usages.json -l java .
```

A file called openapi.json is created with the endpoint information. Use the environment variable ATOM_TOOLS_OPENAPI_FILENAME to customize the filename.

```shell
ATOM_TOOLS_OPENAPI_FILENAME=openapi.json atom usages --extract-endpoints -o app.atom --slice-outfile usages.json -l ruby .
```

Container-based invocation:

```shell
docker run --rm -v /tmp:/tmp -e ATOM_TOOLS_OPENAPI_FILENAME=openapi.json -v $(pwd):/app:rw -t ghcr.io/appthreat/atom atom usages --extract-endpoints -l ruby -o /app/app.atom -s /app/usages.slices.json /app

podman run --rm -v /tmp:/tmp -e ATOM_TOOLS_OPENAPI_FILENAME=openapi.json -v $(pwd):/app:rw -t ghcr.io/appthreat/atom atom usages --extract-endpoints -l ruby -o /app/app.atom -s /app/usages.slices.json /app
```
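Once the endpoints file exists, it can be inspected with a few lines of Python. This is a minimal sketch that relies only on the general OpenAPI layout (a top-level `paths` object whose items are keyed by HTTP method); it makes no assumption about any atom-specific fields, and the function name `list_endpoints` is hypothetical.

```python
import json
from pathlib import Path
from typing import List, Tuple

# HTTP methods that may appear as keys of an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_endpoints(openapi_file: str = "openapi.json") -> List[Tuple[str, str]]:
    """Return sorted (METHOD, path) pairs found under the top-level "paths" object."""
    document = json.loads(Path(openapi_file).read_text(encoding="utf-8"))
    endpoints = []
    for path, item in document.get("paths", {}).items():
        if not isinstance(item, dict):
            continue
        for method in item:
            # Skip non-method keys such as "parameters" or "summary".
            if method.lower() in HTTP_METHODS:
                endpoints.append((method.upper(), path))
    return sorted(endpoints)
```

Printing the result of `list_endpoints("openapi.json")` gives a quick overview of what was extracted before feeding the file to other tooling.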

Export atom to graphml or dot format

It is possible to export each method along with data dependencies in an atom to graphml or dot format. Pass --export-atom to enable this feature.

```shell
atom -o app.atom -l java --export-atom --export-dir <export dir> <path to application>
```

The resulting graphml files can be imported into Neo4j or NetworkX for further analysis. Use the argument --export-format for dot format.

```shell
atom -o app.atom -l java --export-atom --export-format dot --export-dir <export dir> <path to application>
```

In dot format, individual representations such as ast, cdg, and cfg are also exported.

To also compute and include data-dependency graph (DDG) information in the exported files, pass --with-data-deps.

```shell
atom -o app.atom -l java --export-atom --export-dir <export dir> --with-data-deps <path to application>
```
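As a quick sanity check on an export, the graphml files can be summarized with the Python standard library before loading them into a graph tool. This is a minimal sketch that assumes the exported files use the standard GraphML XML namespace; the function name `graphml_summary` is hypothetical, and the attributes atom attaches to nodes and edges are not inspected.

```python
import xml.etree.ElementTree as ET
from typing import Tuple

# Namespace defined by the GraphML standard (assumed to be used by the export).
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def graphml_summary(graphml_file: str) -> Tuple[int, int]:
    """Return (node_count, edge_count) for a single GraphML file."""
    root = ET.parse(graphml_file).getroot()
    nodes = root.findall(".//g:node", GRAPHML_NS)
    edges = root.findall(".//g:edge", GRAPHML_NS)
    return len(nodes), len(edges)
```

Running this over every file in the export directory gives a rough per-method size profile, which can help decide which graphs are worth importing into Neo4j or NetworkX.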

Environment variables

| Variable | Description |
| --- | --- |
| CHEN_IGNORE_TEST_DIRS | Set to true to ignore test directories. Only supported for Python for now. |
| CHEN_PYTHON_IGNORE_DIRS | Comma-separated list of directories to ignore for Python. |
| CHEN_DELOMBOK_MODE | Delombok mode for the Java frontend (no-delombok, default, types-only, run-delombok). |
| CHEN_INCLUDE_PATH | Include directories for the C frontend. Separate paths with : or ;. |
| ATOM_TOOLS_OPENAPI_FORMAT | OpenAPI format for atom-tools. Default: openapi3.1.0; alternative: openapi3.0.1. |
| ATOM_TOOLS_WORK_DIR | Working directory for atom-tools. Defaults to atom input path. |
| ATOM_SCALASEM_WORK_DIR | Working directory for scalasem. Defaults to atom input path. |
| ATOM_SCALASEM_SLICES_FILE | Slices file name. Defaults to semantics.slices.json. |
| ATOM_JVM_ARGS | Overrides the JVM arguments, including heap memory values, constructed by the atom Node.js wrapper. |
| ATOM_JAVA_HOME | Java 21 or above to be used by atom. |
| PHP_CMD | Overrides the PHP command used by the PHP frontend. |
| PHP_PARSER_BIN | Overrides the php-parse command used by the PHP frontend. |
| SCALA_CMD | Overrides the scala command. |
| SCALAC_CMD | Overrides the scalac command used by the scala frontend. |
| ASTGEN_IGNORE_DIRS | Comma-separated list of directories to ignore by the JavaScript astgen pre-processor command. |
| ASTGEN_IGNORE_FILE_PATTERN | File pattern to ignore by the JavaScript astgen pre-processor command. |
| JAVA_CMD | Overrides the java command. |
| RUBY_CMD | Overrides the Ruby command. |

atom Specification

The intermediate representation used by atom is available under the same open-source license (MIT). The specification is available in protobuf, markdown, and html formats.

The current specification version is 1.0.0

Generating atom files

atom files (app.⚛ or app.atom) are zip files with serialized protobuf data. atom cli is the preferred approach to generate these files. It is possible to author a generator tool from scratch using the proto specification. We offer samples in Python and Deno for interested users. We also offer proto bindings in additional languages which can be found here.

Example code snippet for generating an atom in python.

```python
from zipfile import ZipFile

# `atom` refers to the generated proto bindings module used by the Python sample;
# its import is not shown in this snippet.

# Create a method fullname property
methodFullName = atom.CpgStructNodeProperty(
    name=atom.NodePropertyName.FULL_NAME, value=atom.PropertyValue("main")
)

# Create a method node with the fullname property
method = atom.CpgStructNode(
    key=1, type=atom.NodeType.METHOD, property=[methodFullName]
)

# Create an atom with a single node
atom_struct = atom.CpgStruct(node=[method])

# Create an atom (app.atom) by serializing this data into a zip file
with ZipFile("app.atom", "w") as zipfile:
    zipfile.writestr("cpg.proto", bytes(atom_struct))
```
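Because atom files are zip archives, their contents can be listed without any protobuf tooling. The following is a minimal sketch using only the standard library; the function name `atom_entries` is hypothetical. The entry name `cpg.proto` comes from the snippet above, and files produced by the atom CLI may contain different or additional entries.

```python
from typing import Dict
from zipfile import ZipFile


def atom_entries(atom_file: str) -> Dict[str, int]:
    """Map each entry name in the zip archive to its uncompressed size in bytes."""
    with ZipFile(atom_file) as archive:
        return {info.filename: info.file_size for info in archive.infolist()}
```

Decoding the serialized data itself still requires the proto bindings; this helper only confirms that the archive is well-formed and shows what it holds.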

License

MIT

Developing / Contributing

Install Java 21 and Node.js > 21.

```shell
sbt clean stage scalafmt test createDistribution
cd wrapper/nodejs
bash build.sh && sudo npm install -g .
```

Using atom with chennai

chennai is the recommended query interface for working with atom.

```shell
chennai> importatom("/home/almalinux/work/sandbox/apollo/app.atom")
```

atom tools

Check out atom-tools for some project ideas involving atom slices.

devenv setup

Install devenv by following the official instructions.

```shell
devenv shell
```

Language-specific profile:

```shell
# Ruby environment
devenv --option config.profile:string ruby shell

# php environment
devenv --option config.profile:string php shell
```

Enterprise support

Enterprise support including custom language development and integration services is available via AppThreat Ltd.

Sponsors

YourKit supports open source projects with innovative and intelligent tools for monitoring and profiling Java and .NET applications. YourKit is the creator of YourKit Java Profiler, YourKit .NET Profiler, and YourKit YouMonitor.


Owner

  • Name: AppThreat
  • Login: AppThreat
  • Kind: organization
  • Email: hello@appthreat.com
  • Location: United Kingdom

Empower your devs.

CodeMeta (codemeta.json)

{
  "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
  "@type": "SoftwareSourceCode",
  "license": "https://spdx.org/licenses/MIT",
  "codeRepository": "git+https://github.com/AppThreat/atom.git",
  "contIntegration": "https://github.com/AppThreat/atom/actions",
  "downloadUrl": "https://github.com/AppThreat/atom",
  "issueTracker": "https://github.com/AppThreat/atom/issues",
  "name": "atom",
  "version": "2.4.3",
  "description": "Atom is a novel intermediate representation for next-generation code analysis.",
  "applicationCategory": "code-analysis",
  "keywords": [
    "static-analysis",
    "code-analysis",
    "dependency-analysis",
    "code-hierarchy-analysis",
    "static-slicer",
    "reachability-analysis"
  ],
  "programmingLanguage": [
    "Scala 3",
    "Node.js"
  ],
  "runtimePlatform": [
    "JVM"
  ],
  "operatingSystem": [
    "Linux",
    "Windows",
    "MacOS"
  ],
  "softwareRequirements": [
    "Java >= 21",
    "Node.js >= 20",
    "Ruby >= 3.4.0",
    "PHP >= 8.3"
  ],
  "author": [
    {
      "@type": "Person",
      "givenName": "Team",
      "familyName": "AppThreat",
      "email": "cloud@appthreat.com"
    }
  ]
}

GitHub Events

Total
  • Create event: 65
  • Release event: 31
  • Issues event: 18
  • Watch event: 19
  • Delete event: 33
  • Member event: 1
  • Issue comment event: 19
  • Push event: 209
  • Pull request review comment event: 7
  • Pull request review event: 12
  • Pull request event: 75
  • Fork event: 2
Last Year
  • Create event: 65
  • Release event: 31
  • Issues event: 18
  • Watch event: 19
  • Delete event: 33
  • Member event: 1
  • Issue comment event: 19
  • Push event: 209
  • Pull request review comment event: 7
  • Pull request review event: 12
  • Pull request event: 75
  • Fork event: 2

Committers

Last synced: 8 months ago

All Time
  • Total Commits: 342
  • Total Committers: 5
  • Avg Commits per committer: 68.4
  • Development Distribution Score (DDS): 0.044
Past Year
  • Commits: 141
  • Committers: 4
  • Avg Commits per committer: 35.25
  • Development Distribution Score (DDS): 0.021
Top Committers
Name Email Commits
Prabhu Subramanian p****u@a****m 327
David Baker Effendi d****e@s****a 11
Caroline Russell c****e@a****v 2
malice00 m****0 1
Aryan Rajoria 5****a 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 60
  • Total pull requests: 131
  • Average time to close issues: 3 months
  • Average time to close pull requests: about 10 hours
  • Total issue authors: 9
  • Total pull request authors: 5
  • Average comments per issue: 0.57
  • Average comments per pull request: 0.18
  • Merged pull requests: 124
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 14
  • Pull requests: 72
  • Average time to close issues: about 1 month
  • Average time to close pull requests: about 5 hours
  • Issue authors: 6
  • Pull request authors: 4
  • Average comments per issue: 0.71
  • Average comments per pull request: 0.22
  • Merged pull requests: 67
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • prabhu (50)
  • Sohit1212 (3)
  • almaz045 (3)
  • Samarjeet93 (2)
  • jsdelfino (2)
  • cerrussell (2)
  • aryan-rajoria (1)
  • DavidBakerEffendi (1)
  • BigBlueHat (1)
  • blablacar12345 (1)
Pull Request Authors
  • prabhu (129)
  • DavidBakerEffendi (9)
  • aryan-rajoria (4)
  • cerrussell (3)
  • malice00 (2)
Top Labels
Issue Labels
enhancement (5) good first issue (5) bug (3) help wanted (2) container (1) macOS M (1)
Pull Request Labels
ruby (18) sponsored (18) scala (4) enhancement (3) security (2) android (2) java (2) ready for qa (2) python (1) good first issue (1) help wanted (1)

Packages

  • Total packages: 3
  • Total downloads:
    • npm 566,653 last-month
  • Total docker downloads: 57
  • Total dependent packages: 3
    (may contain duplicates)
  • Total dependent repositories: 6
    (may contain duplicates)
  • Total versions: 142
  • Total maintainers: 1
npmjs.org: @appthreat/atom

Create atom (⚛) representation for your application, packages and libraries

  • Versions: 127
  • Dependent Packages: 3
  • Dependent Repositories: 6
  • Downloads: 275,719 Last month
  • Docker Downloads: 57
Rankings
Downloads: 0.3%
Dependent repos count: 4.8%
Forks count: 12.0%
Stargazers count: 12.6%
Average: 16.6%
Dependent packages count: 53.3%
Maintainers (1)
Last synced: 6 months ago
npmjs.org: @appthreat/atom-common

Common library for the @appthreat/atom project.

  • Versions: 7
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 145,523 Last month
Rankings
Dependent repos count: 24.7%
Average: 30.1%
Dependent packages count: 35.6%
Maintainers (1)
Last synced: 6 months ago
npmjs.org: @appthreat/atom-parsetools

Parsing tools that complement the @appthreat/atom project.

  • Versions: 8
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 145,411 Last month
Rankings
Dependent repos count: 24.7%
Average: 30.1%
Dependent packages count: 35.6%
Maintainers (1)
Last synced: 6 months ago

Dependencies

.github/workflows/nodejstests.yml actions
  • actions/checkout v3 composite
  • actions/setup-java v3 composite
  • actions/setup-node v3 composite
  • coursier/cache-action v6 composite
.github/workflows/npm-release.yml actions
  • actions/checkout v3 composite
  • actions/setup-java v3 composite
  • actions/setup-node v3 composite
  • coursier/cache-action v6 composite
.github/workflows/pr.yml actions
  • actions/checkout v3 composite
  • actions/setup-java v3 composite
  • coursier/cache-action v6 composite
.github/workflows/release.yml actions
  • actions/checkout v3 composite
  • actions/setup-java v3 composite
  • coursier/cache-action v6 composite
  • softprops/action-gh-release v1 composite
.github/workflows/repotests.yml actions
  • actions/checkout v3 composite
  • actions/setup-java v3 composite
  • coursier/cache-action v6 composite
wrapper/nodejs/package-lock.json npm
  • 111 dependencies
wrapper/nodejs/package.json npm
  • eslint ^8.50.0 development
  • @babel/parser ^7.23.0
  • typescript ^5.2.2
  • yargs ^17.7.2
specification/samples/python-atomgen/poetry.lock pypi
  • betterproto 1.2.5
  • colorama 0.4.6
  • exceptiongroup 1.1.2
  • grpclib 0.4.5
  • h2 4.1.0
  • hpack 4.0.0
  • hyperframe 6.0.1
  • iniconfig 2.0.0
  • multidict 6.0.4
  • packaging 23.1
  • pluggy 1.2.0
  • pytest 7.4.0
  • stringcase 1.2.0
  • tomli 2.0.1
specification/samples/python-atomgen/pyproject.toml pypi
  • betterproto ^1.2.5
  • python >=3.8.1,<3.12