osptrack
labelled dataset for simulated package execution with package-analysis
Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 2 DOI reference(s) in README
- ✓ Academic publication links: Links to zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (12.2%) to scientific vocabulary
Repository
Statistics
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
OSPTrack
labelled dataset for simulated package execution with package-analysis
This work has been accepted at the MSR 2025 Data and Tool Showcase Track and will be presented on 28 April 2025.
Structure (core)
ana:
- statistical analysis for the BKC dataset and malicious-packages
- the code to extract metrics.csv and iocs.csv files
- label distribution analysis for labeled dataset
data:
- collections from BKC and malicious-packages
- location for saving bkcmal.csv and pkgmal.csv
- location for saving the extracted data and the final labeled dataset
data_create:
- code to query BigQuery
- code to run simulation
ext:
- code to parse reports (json and csv)
- code to extract features and generate final dataset
run_analysis.sh:
- custom shell script that runs package-analysis, saves the results locally, and avoids repeated runs
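The idea of avoiding repeated runs can be illustrated with a minimal Python sketch. It assumes one saved report per package version under a results directory; the function name `pending_packages` and the `<name>-<version>.json` naming scheme are hypothetical and are not taken from this repository.

```python
from pathlib import Path


def pending_packages(packages, results_dir):
    """Return the (name, version) pairs that have no saved report yet."""
    results_dir = Path(results_dir)
    todo = []
    for name, version in packages:
        # Hypothetical naming scheme: one JSON report per package version.
        report = results_dir / f"{name}-{version}.json"
        if not report.exists():
            todo.append((name, version))
    return todo
```

Filtering the work list before launching any analysis keeps an interrupted batch resumable, since packages that already have a report are skipped on the next run.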
Preparation (Environment Setup)
- For BigQuery:
```
# download the BigQuery key from Google Cloud
# activate the key
export GOOGLE_APPLICATION_CREDENTIALS="path/to/your/service-account-file.json"
# the key needs to be loaded when querying BigQuery
```
- For running Package-Analysis (only feasible on Ubuntu)
```
# install git
sudo apt-get install git
# install docker
sudo apt-get install -y docker.io
# start the docker service
sudo systemctl start docker
# install golang
sudo apt-get install golang
# run package-analysis directly to check that the tool works locally
# local instance (analyze a local package file)
scripts/run_analysis.sh -ecosystem pypi -package test -local /path/to/test.whl
# live instance (analyze a published package version)
scripts/run_analysis.sh -ecosystem pypi -package Django -version 4.1.3
# after one instance runs successfully, replace run_analysis.sh with the one
# provided in this repo and give it 755 permissions
```
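Before querying BigQuery or starting a simulation, it can help to confirm that the environment described above is in place. The following is a minimal sketch, not part of this repository: the `preflight` function is hypothetical, and it assumes the installed tools are exposed on `PATH` as `git`, `docker`, and `go`.

```python
import os
import shutil
from pathlib import Path

# Assumption: binary names provided by the apt packages listed above.
REQUIRED_TOOLS = ("git", "docker", "go")


def preflight(env=None, tools=REQUIRED_TOOLS, which=shutil.which):
    """Return a list of problems found; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []

    # The BigQuery client reads the key path from this variable.
    key = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key:
        problems.append("GOOGLE_APPLICATION_CREDENTIALS is not set")
    elif not Path(key).is_file():
        problems.append(f"key file not found: {key}")

    # Each required tool must be resolvable on PATH.
    for tool in tools:
        if which(tool) is None:
            problems.append(f"{tool} is not on PATH")
    return problems
```

Calling `preflight()` with no arguments checks the current process environment; the `env` and `which` parameters exist only so the check can be exercised without touching the real system.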
Running Instructions
```
# set up the virtual environment
eval "$(pyenv init -)"
eval "$(pyenv virtualenv-init -)"
# query data from BigQuery
python3 data_bigquery.py
# run the simulation by calling package-analysis
sudo python3 simu_run.py
```
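The simulation step calls package-analysis once per package. As a rough illustration of how such a call might be assembled from Python, the sketch below builds the argument list using only the flags shown in the preparation section (`-ecosystem`, `-package`, `-version`, `-local`). The `build_command` helper is hypothetical and does not reflect how `simu_run.py` is actually written.

```python
def build_command(ecosystem, package, version=None, local=None,
                  script="scripts/run_analysis.sh"):
    """Build the argument list for one run_analysis.sh invocation."""
    cmd = [script, "-ecosystem", ecosystem, "-package", package]
    if local is not None:
        # Local instance: analyze a package file on disk.
        cmd += ["-local", local]
    elif version is not None:
        # Live instance: analyze a published package version.
        cmd += ["-version", version]
    return cmd


if __name__ == "__main__":
    print(build_command("pypi", "Django", version="4.1.3"))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; passing arguments as a list avoids shell quoting issues with unusual package names.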
Owner
- Name: Wapiti
- Login: Wapiti08
- Kind: user
- Location: Glasgow
- Company: UofG
- Website: https://newt-tan.medium.com/
- Repositories: 59
- Profile: https://github.com/Wapiti08
Building and researching cyber security and machine learning technology
Citation (CITATION.cff)
cff-version: 1.0.0-beta
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Tan"
    given-names: "Zhuoran"
    orcid: "https://orcid.org/0000-0002-0809-0376"
  - family-names: "Anagnostopoulos"
    given-names: "Christos"
    orcid: "https://orcid.org/0000-0003-1517-6757"
  - family-names: "Singer"
    given-names: "Jeremy"
    orcid: "https://orcid.org/0000-0001-9462-6802"
title: "OSPtrack: A Labelled Dataset Targeting Simulated Open-Source Package Execution "
version: 1.0.0-beta
doi: 10.5281/zenodo.14197321
date-released: 2024-11-21
url: "https://github.com/Wapiti08/OSPTrack"
GitHub Events
Total
- Release event: 2
- Watch event: 3
- Public event: 1
- Push event: 5
- Create event: 1
Last Year
- Release event: 2
- Watch event: 3
- Public event: 1
- Push event: 5
- Create event: 1
Issues and Pull Requests
Last synced: 11 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Dependencies
- dask ==2024.8.1
- db-dtypes ==1.2.0
- fastparquet ==2024.5.0
- google-cloud-bigquery ==3.25.0
- pandas ==2.2.2
- pyarrow ==17.0.0
- python-dotenv ==1.0.1
- tqdm ==4.66.5