group34_program_analysis
Science Score: 44.0%
This score indicates how likely this project is to be science-related, based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (8.2%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: mondk
- License: bsd-3-clause
- Language: Python
- Default Branch: main
- Size: 2.24 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
JPAMB: Java Program Analysis Micro Benchmarks
The goal of this benchmark suite is to make a collection of interesting micro-benchmarks to be solved by either dynamic or static analysis.
Rules of the Game
The goal is to build a program analysis that takes a method ID as an argument and
returns a list of lines, each consisting of a query and a prediction separated by a semicolon (`;`).
A method ID is the fully qualified name of the class, the method name, ":", and
then the method descriptor, for example:

    jpamb.cases.Simple.assertPositive:(I)V
    jpamb.cases.Simple.divideByZero:()I
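For illustration, a method ID in this format can be split into its parts with a few lines of Python (a hypothetical helper, not part of the benchmark suite):

```python
def parse_method_id(method_id: str) -> tuple[str, str, str]:
    """Split 'pkg.Class.method:(Desc)Ret' into (class, method, descriptor)."""
    # The descriptor follows the first ":", the method name follows the last ".".
    qualified_name, descriptor = method_id.split(":", 1)
    class_name, method_name = qualified_name.rsplit(".", 1)
    return class_name, method_name, descriptor

print(parse_method_id("jpamb.cases.Simple.assertPositive:(I)V"))
```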
And the query is one of:
| query | description |
| :----- | :----- |
| assertion error | an execution throws an assertion error |
| ok | an execution runs to completion |
| * | an execution runs forever |
| divide by zero | an execution divides by zero |
| out of bounds | an execution indexes an array out of bounds |
| null pointer | an execution throws a null pointer exception |
And the prediction is either a wager (e.g. `-3`, `inf`), the number of points you
want to bet on being right, or a probability (e.g. `30%`, `72%`).
Your analysis should look like this:
```shell
$> ./analysis "jpamb.cases.Simple.assertPositive:(I)V"
divide by zero;5
ok;25%
```
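As a sketch of this interface (not a real analysis), a minimal script could look like the following; the fixed 50% predictions are purely illustrative:

```python
#!/usr/bin/env python3
# Hypothetical skeleton of an analysis: it ignores the method body and
# emits the same prediction for every query.
import sys

QUERIES = [
    "assertion error", "ok", "*",
    "divide by zero", "out of bounds", "null pointer",
]

def predict(method_id: str) -> list[str]:
    # A real analysis would inspect the method here; we just answer 50%.
    return [f"{query};50%" for query in QUERIES]

if __name__ == "__main__" and len(sys.argv) > 1:
    for line in predict(sys.argv[1]):
        print(line)
```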
A wager is the number of points wagered [-inf, inf] on your prediction. A negative wager bets against the query, and
a positive one bets for it. A failed wager is subtracted from your points, while
a successful wager is converted into points like so:
$$\mathtt{points} = 1 - \frac{1}{\mathtt{wager} + 1}$$
If you are sure that the method being analyzed does not contain an "assertion error", you can wager -200 points. If you are wrong, and the program does exhibit an assertion error, you lose 200 points, but if you are correct, you gain $1 - 1 / 201 = 0.995$ points.
Below are some example values. Note that smaller wagers carry smaller risk.

| wager | points |
| ---: | ---: |
| 0.00 | 0.00 |
| 0.25 | 0.20 |
| 0.50 | 0.33 |
| 1.00 | 0.50 |
| 3.00 | 0.75 |
| 9.00 | 0.90 |
| 99.00 | 0.99 |
| inf | 1.00 |
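The payoff formula and the table above can be checked with a couple of lines of Python:

```python
def points(wager: float) -> float:
    """Points gained by a successful wager: 1 - 1/(wager + 1)."""
    return 1.0 - 1.0 / (wager + 1.0)

# Reproduce the example values from the table.
for w in [0.0, 0.25, 0.5, 1.0, 3.0, 9.0, 99.0]:
    print(f"{w:6.2f} -> {points(w):.2f}")
```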
Examples of such scripts can be seen in solutions/.
You can also respond with a probability [0%, 100%], which is automatically converted into
the optimal wager. An example of this is in solutions/apriori.py, which uses the distribution
of errors from stats/distribution.csv to gain an advantage (which is cheating :D).
If you are curious, the optimal wager is found by balancing the expected loss against the expected gain, where $p$ is the probability:
$$(1 - p) \cdot \mathtt{wager} = p \cdot \mathtt{points} = p \cdot \left(1 - \frac{1}{\mathtt{wager} + 1}\right)$$
Solving for the wager gives:
$$\mathtt{wager} = \frac{2p - 1}{1 - p}$$
| prob | wager |
| ---: | ---: |
| 0% | -inf |
| 10% | -8 |
| 25% | -2 |
| 50% | 0 |
| 75% | 2 |
| 90% | 8 |
| 100% | inf |
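A sketch of this conversion in Python: one formula that reproduces the table above is $\mathtt{wager} = (2p-1)/(1-p)$ for $p \geq 50\%$, mirrored against the query for lower probabilities (this is an illustration, not the suite's actual converter):

```python
def optimal_wager(p: float) -> float:
    """Convert a probability in [0, 1] into a wager."""
    if p < 0.5:
        # Betting against the query mirrors betting for its complement.
        return -optimal_wager(1.0 - p)
    if p == 1.0:
        return float("inf")
    return (2.0 * p - 1.0) / (1.0 - p)
```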
Evaluating
To get started evaluating your tool, run the bin/evaluate.py script. It requires
the click and loguru libraries and Python 3.10 or above. You can install these dependencies using pip
in your favorite environment.
```shell
$> python -m venv .venv
# on unix systems
$> source .venv/bin/activate
# or on windows
PS> .venv\Scripts\activate
# now install stuff
$> python -m pip install -r requirements.txt -r requirements-treesitter.txt
```
Furthermore, to report timings accurately, it uses a C compiler to compile the program timer/sieve.c and
executes it alongside the analyses to calibrate the results.
Essentially, this computes a relative time (relative to calculating the first 100,000 primes), as well as
an absolute time. Make sure the environment variable CC is set to the name of your compiler, or
that gcc is on your PATH.
First, create a YAML file describing your experiment; see the sample.yaml file for an example.
Then, to evaluate your analysis, you should be able to run:
```shell
$> python bin/evaluate.py experiment.yaml -o experiment.json
```
If you have problems getting started, please file an issue.
Windows
The instructions above should also work on Windows, but it is less straightforward. The easy way out is to install Linux as a subsystem (WSL) on your Windows machine, which is supported directly by Windows. This will, however, require you to do all of your development in that environment.
If you prefer staying in Windows land, here are some tips and pointers:
- Sometimes paths need to be inverted in the examples: `/` to `\`.
- It is extra important to use virtual environments when using Windows; that way you can keep different versions of Python separate.
- To support compiling with `gcc`, and to make your life easier, you should install MSYS2 with mingw-w64 GCC. You can do this by following the guide in the link above (steps 6-9).
Debug
You can debug your code by running some of the methods or some of the tools, like this:
```shell
$> ./evaluate your-experiment.yaml --filter-methods=Simple --filter-tools=syntaxer -o experiment.json
```
Also, if you want more debug information, you can add multiple `-v` flags (e.g. `-vvv`) to get more output.
Source code
The source code is located under src/main/java.
A simple solution that analyzes the source code directly using the tree-sitter
library is located at solutions/syntaxer.py.
Byte code
To write more advanced analyses, it makes sense to make use of the bytecode. To
lower the bar to entry, the bytecode of the benchmarks has already been decompiled by the
jvm2json tool.
The codec for the output is described here.
Some sample code for how to get started can be seen in solutions/bytecoder.py.
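As a hedged sketch, the decompiled JSON classes can be loaded with the standard library alone; the path in the commented usage example is an assumption, so check your checkout for the actual location of the decompiled files:

```python
import json
from pathlib import Path

def load_class(path: Path) -> dict:
    """Load one decompiled class file (jvm2json output) as parsed JSON."""
    with path.open() as fp:
        return json.load(fp)

# Hypothetical usage; inspect the top-level keys without assuming a schema:
# cls = load_class(Path("decompiled/jpamb/cases/Simple.json"))
# print(sorted(cls))
```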
Developing
Before making a pull request, please run ./bin/build.py first.
The easiest way to do that is to use the nix tool to download all dependencies.
```shell
nix develop -c ./bin/build.py
```
Citation
To cite this work, please use the cite button on the right.
Owner
- Login: mondk
- Kind: user
- Repositories: 2
- Profile: https://github.com/mondk
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Kalhauge"
    given-names: "Christian Gram"
    orcid: "https://orcid.org/0000-0003-1947-7928"
title: "JPAMB: Java Program Analysis Micro Benchmarks"
version: 0.0.2
date-released: 2024-09-10
url: "https://github.com/kalhauge/jpamb"
Dependencies
- junit:junit 4.11 test
- numpy ==2.1.1
- pandas ==2.2.2
- plotly ==5.24.0
- tree-sitter ==0.23.0
- tree-sitter-java ==0.23.1
- PyYAML ==6.0.2
- click ==8.1.7
- loguru ==0.7.2