group34_program_analysis
Science Score: 44.0%
This score indicates how likely this project is to be science-related, based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (8.2%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: mondk
- License: bsd-3-clause
- Language: Python
- Default Branch: main
- Size: 2.24 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
JPAMB: Java Program Analysis Micro Benchmarks
The goal of this benchmark suite is to make a collection of interesting micro-benchmarks to be solved by either dynamic or static analysis.
Rules of the Game
The goal is to build a program analysis that takes a method ID as an argument and
returns a list of lines, each consisting of a query and a prediction separated by a semicolon (`;`).
A method ID is the fully qualified name of the class, the method name, ":", and
then the method descriptor, for example:

    jpamb.cases.Simple.assertPositive:(I)V
    jpamb.cases.Simple.divideByZero:()I
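For illustration, a method ID in this format can be split into its parts with a few lines of Python (a hypothetical helper, not part of the benchmark suite):

```python
def parse_method_id(method_id: str) -> tuple[str, str, str]:
    """Split 'pkg.Class.method:(Desc)Ret' into (class, method, descriptor)."""
    # The descriptor follows the first ":", the method name follows the last ".".
    qualified_name, descriptor = method_id.split(":", 1)
    class_name, method_name = qualified_name.rsplit(".", 1)
    return class_name, method_name, descriptor

print(parse_method_id("jpamb.cases.Simple.assertPositive:(I)V"))
```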
And the query is one of:
| query | description |
| :----- | :----- |
| assertion error | an execution throws an assertion error |
| ok | an execution runs to completion |
| * | an execution runs forever |
| divide by zero | an execution divides by zero |
| out of bounds | an execution indexes an array out of bounds |
| null pointer | an execution throws a null pointer exception |
And the prediction is either a wager (e.g. `-3`, `inf`), the number of points you
want to bet on being right, or a probability (e.g. `30%`, `72%`).
Your analysis should look like this:
```shell
$> ./analysis "jpamb.cases.Simple.assertPositive:(I)V"
divide by zero;5
ok;25%
```
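As a sketch of this interface (not a real analysis), a minimal script could look like the following; the fixed 50% predictions are purely illustrative:

```python
#!/usr/bin/env python3
# Hypothetical skeleton of an analysis: it ignores the method body and
# emits the same prediction for every query.
import sys

QUERIES = [
    "assertion error", "ok", "*",
    "divide by zero", "out of bounds", "null pointer",
]

def predict(method_id: str) -> list[str]:
    # A real analysis would inspect the method here; we just answer 50%.
    return [f"{query};50%" for query in QUERIES]

if __name__ == "__main__" and len(sys.argv) > 1:
    for line in predict(sys.argv[1]):
        print(line)
```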
A wager is the number of points wagered [-inf, inf] on your prediction. A negative wager bets against the query, and
a positive one bets for it. A failed wager is subtracted from your points, while
a successful wager is converted into points like so:
$$\mathtt{points} = 1 - \frac{1}{\mathtt{wager} + 1}$$
If you are sure that the method being analyzed does not contain an "assertion error", you can wager -200 points. If you are wrong, and the program does exhibit an assertion error, you lose 200 points, but if you are correct, you gain $1 - 1 / 201 = 0.995$ points.
Below are some example values. Note that smaller wagers carry smaller risk.

| wager | points |
| ---: | ---: |
| 0.00 | 0.00 |
| 0.25 | 0.20 |
| 0.50 | 0.33 |
| 1.00 | 0.50 |
| 3.00 | 0.75 |
| 9.00 | 0.90 |
| 99.00 | 0.99 |
| inf | 1.00 |
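The payoff formula and the table above can be checked with a couple of lines of Python:

```python
def points(wager: float) -> float:
    """Points gained by a successful wager: 1 - 1/(wager + 1)."""
    return 1.0 - 1.0 / (wager + 1.0)

# Reproduce the example values from the table.
for w in [0.0, 0.25, 0.5, 1.0, 3.0, 9.0, 99.0]:
    print(f"{w:6.2f} -> {points(w):.2f}")
```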
Examples of such scripts can be seen in solutions/.
You can also respond with a probability [0%, 100%], which is automatically converted into
the optimal wager. An example of this is in solutions/apriori.py, which uses the distribution
of errors from stats/distribution.csv to gain an advantage (which is cheating :D).
If you are curious, the optimal wager is found by balancing the expected loss against the expected gain, where $p$ is the probability:
$$(1 - p) \cdot \mathtt{wager} = p \cdot \mathtt{points} = p \cdot \left(1 - \frac{1}{\mathtt{wager} + 1}\right)$$
Solving for the wager gives:
$$\mathtt{wager} = \frac{2p - 1}{1 - p}$$
| prob | wager |
| ---: | ---: |
| 0% | -inf |
| 10% | -8 |
| 25% | -2 |
| 50% | 0 |
| 75% | 2 |
| 90% | 8 |
| 100% | inf |
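A sketch of this conversion in Python: one formula that reproduces the table above is $\mathtt{wager} = (2p-1)/(1-p)$ for $p \geq 50\%$, mirrored against the query for lower probabilities (this is an illustration, not the suite's actual converter):

```python
def optimal_wager(p: float) -> float:
    """Convert a probability in [0, 1] into a wager."""
    if p < 0.5:
        # Betting against the query mirrors betting for its complement.
        return -optimal_wager(1.0 - p)
    if p == 1.0:
        return float("inf")
    return (2.0 * p - 1.0) / (1.0 - p)
```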
Evaluating
To get started evaluating your tool, run the bin/evaluate.py script. It requires
the click and loguru libraries and Python 3.10 or above. You can install these dependencies using pip
in your favorite environment.
```shell
$> python -m venv .venv
# on unix systems
$> source .venv/bin/activate
# or on windows
PS> .venv\Scripts\activate
# now install stuff
$> python -m pip install -r requirements.txt -r requirements-treesitter.txt
```
Furthermore, to report timings accurately, it uses a C compiler to compile the program timer/sieve.c and
executes it alongside the analyses to calibrate the results.
Essentially, this computes a relative time (relative to calculating the first 100,000 primes), as well as
an absolute time. Make sure the environment variable CC is set to the name of your compiler, or
that gcc is on your PATH.
First, create a YAML file describing your experiment; see the sample.yaml file for an example.
Then, to evaluate your analysis, you should be able to run:
```shell
$> python bin/evaluate.py experiment.yaml -o experiment.json
```
If you have problems getting started, please file an issue.
Windows
The instructions above should also work on Windows, but it is less straightforward. The easy way out is to install Linux as a subsystem (WSL) on your Windows machine, which is supported directly by Windows. This will, however, require you to do all of your development in that environment.
If you prefer staying in Windows land, here are some tips and pointers:
- Sometimes paths need to be inverted in the examples: `/` to `\`.
- It is extra important to use virtual environments when using Windows; that way you can keep different versions of Python separate.
- To support compiling with `gcc`, and to make your life easier, you should install MSYS2 with mingw-w64 GCC. You can do this by following the guide in the link above (steps 6-9).
Debug
You can debug your code by running some of the methods or some of the tools, like this:
```shell
$> ./evaluate your-experiment.yaml --filter-methods=Simple --filter-tools=syntaxer -o experiment.json
```
Also, if you want more debug information, you can add multiple `-v` flags (e.g. `-vvv`) to get more output.
Source code
The source code is located under src/main/java.
A simple solution that analyzes the source code directly using the tree-sitter
library is located at solutions/syntaxer.py.
Byte code
To write more advanced analyses, it makes sense to make use of the bytecode. To
lower the bar to entry, the bytecode of the benchmarks has already been decompiled by the
jvm2json tool.
The codec for the output is described here.
Some sample code for how to get started can be seen in solutions/bytecoder.py.
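As a hedged sketch, the decompiled JSON classes can be loaded with the standard library alone; the path in the commented usage example is an assumption, so check your checkout for the actual location of the decompiled files:

```python
import json
from pathlib import Path

def load_class(path: Path) -> dict:
    """Load one decompiled class file (jvm2json output) as parsed JSON."""
    with path.open() as fp:
        return json.load(fp)

# Hypothetical usage; inspect the top-level keys without assuming a schema:
# cls = load_class(Path("decompiled/jpamb/cases/Simple.json"))
# print(sorted(cls))
```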
Developing
Before making a pull request, please run ./bin/build.py first.
The easiest way to do that is to use the nix tool to download all dependencies.
```shell
nix develop -c ./bin/build.py
```
Citation
To cite this work, please use the cite button on the right.
Owner
- Login: mondk
- Kind: user
- Repositories: 2
- Profile: https://github.com/mondk
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Kalhauge"
    given-names: "Christian Gram"
    orcid: "https://orcid.org/0000-0003-1947-7928"
title: "JPAMB: Java Program Analysis Micro Benchmarks"
version: 0.0.2
date-released: 2024-09-10
url: "https://github.com/kalhauge/jpamb"
Dependencies
- junit:junit 4.11 test
- numpy ==2.1.1
- pandas ==2.2.2
- plotly ==5.24.0
- tree-sitter ==0.23.0
- tree-sitter-java ==0.23.1
- PyYAML ==6.0.2
- click ==8.1.7
- loguru ==0.7.2