baler-compressor
Repository of Baler, a machine learning based data compression tool
Science Score: 46.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org, zenodo.org
- ✓ Committers with academic emails: 13 of 29 committers (44.8%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (13.7%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: baler-collaboration
- License: apache-2.0
- Language: Python
- Default Branch: main
- Homepage: https://github.com/baler-collaboration/baler.github.io
- Size: 175 MB
Statistics
- Stars: 42
- Watchers: 7
- Forks: 49
- Open Issues: 37
- Releases: 4
Metadata Files
README.md
Introduction
Baler is a tool used to test the feasibility of compressing different types of scientific data using machine learning-based autoencoders. Baler provides you with an easy way to:
1. Train a machine learning model on your data
2. Compress your data with that model. This will also save the compressed file and model
3. Decompress the file using the model at a later time
4. Plot the performance of the compression/decompression
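The four steps above can be illustrated with a small, self-contained sketch. This is not Baler's code or API: the function names `train`, `compress` and `decompress` are hypothetical, the data is a toy example, and a one-component linear autoencoder fitted by power iteration (equivalent to one-dimensional PCA) stands in for the neural-network autoencoders that Baler trains. The plotting step is replaced by a printed reconstruction error.

```python
import math


def train(rows, iterations=50):
    """Fit a one-component linear autoencoder (equivalent to 1-D PCA)."""
    n, dim = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(dim)]
    centered = [[r[j] - mean[j] for j in range(dim)] for r in rows]
    # Covariance matrix of the centered data.
    cov = [
        [sum(c[i] * c[j] for c in centered) / n for j in range(dim)]
        for i in range(dim)
    ]
    # Power iteration: repeatedly apply cov and renormalise, which
    # converges to the direction of largest variance.
    w = [1.0] * dim
    for _ in range(iterations):
        w = [sum(cov[i][j] * w[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    return {"mean": mean, "weights": w}


def compress(model, rows):
    """Encode each row as a single latent value (the encoder half)."""
    mean, w = model["mean"], model["weights"]
    return [sum((r[j] - mean[j]) * w[j] for j in range(len(w))) for r in rows]


def decompress(model, latents):
    """Map latent values back to full rows (the decoder half)."""
    mean, w = model["mean"], model["weights"]
    return [[mean[j] + z * w[j] for j in range(len(w))] for z in latents]


# Toy data (an assumption for illustration): three perfectly correlated
# columns, so a single latent value per row is enough to restore them.
rows = [[t, 2 * t + 1, -t + 3] for t in (0.0, 0.5, 1.0, 1.5, 2.0)]

model = train(rows)                       # step 1: train
compressed = compress(model, rows)        # step 2: compress
restored = decompress(model, compressed)  # step 3: decompress

# Step 4: evaluate. The ratio ignores the size of the stored model.
ratio = len(rows[0]) / 1
max_error = max(
    abs(a - b) for r, s in zip(rows, restored) for a, b in zip(r, s)
)
print(f"compression ratio: {ratio}, max abs error: {max_error:.2e}")
```

On real scientific data the columns are not perfectly correlated, so the reconstruction is lossy and the error has to be weighed against the compression ratio, which is the trade-off the performance plots in step 4 are meant to show.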
Getting Started
NOTE: For the same performance and version as presented in our arXiv paper, please use release v1.0.0 and the setup instructions given there. v1.0.0 also has a working Docker implementation. We are currently experiencing some performance issues on the main branch compared to v1.0.0.
In the links below we offer instructions on how to set up Baler and working tutorial examples to get you started. We offer two ways to run Baler:
- Python
- Docker/Singularity/Apptainer
Contributing
If you wish to contribute, please see the contribution guidelines.
Owner
- Name: Baler-collaboration
- Login: baler-collaboration
- Kind: organization
- Email: alexander.ekman@hep.lu.se
- Location: Sweden
- Website: https://github.com/baler-collaboration/baler
- Repositories: 1
- Profile: https://github.com/baler-collaboration
Machine Learning Based Compression of Scientific Data
GitHub Events
Total
- Issues event: 1
- Watch event: 11
- Issue comment event: 3
- Push event: 1
- Pull request review event: 1
- Pull request review comment event: 1
- Pull request event: 2
- Fork event: 21
Last Year
- Issues event: 1
- Watch event: 11
- Issue comment event: 3
- Push event: 1
- Pull request review event: 1
- Pull request review comment event: 1
- Pull request event: 2
- Fork event: 21
Committers
Last synced: about 2 years ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Alexander Ekman | a****5@g****m | 194 |
| agallen | a****n@c****h | 102 |
| Oliver Woolland | o****d@m****k | 56 |
| Fritjof Bengtsson | 6****b | 55 |
| Fritjof Bengtsson | f****n@g****m | 51 |
| Alexander Ekman | p****n@c****h | 23 |
| singh96aman | s****n@g****m | 22 |
| Marta Camps Santasmasas | m****s@p****k | 15 |
| Mattias Ellert | m****t@p****e | 14 |
| Pratik Jawahar | 5****4 | 12 |
| OliverWoolland | 8****d | 10 |
| Pratik Jawahar | p****a@l****h | 8 |
| Axel Gallén | 7****l | 6 |
| sekkar | s****u@g****m | 5 |
| Caterina | c****i@c****h | 4 |
| Pratik Jawahar | p****a@l****h | 3 |
| Maayan | m****i@c****h | 3 |
| Pratik Jawahar | p****a@l****h | 3 |
| root | r****t@L****E | 3 |
| AkkiG2401 | 7****1 | 3 |
| Pratik Jawahar | p****a@l****h | 2 |
| TamariMaayan | m****i@m****l | 2 |
| Radek Skaroupka | s****k@g****m | 1 |
| ShawnXu613 | 1****3 | 1 |
| malenaduroux | 1****x | 1 |
| Pratik Jawahar | p****a@l****h | 1 |
| Leonid Didukh | l****h@l****h | 1 |
| Leonid Didukh | l****h@g****m | 1 |
| Fritjof Bengtsson | f****n@n****m | 1 |
Issues and Pull Requests
Last synced: 9 months ago
All Time
- Total issues: 116
- Total pull requests: 80
- Average time to close issues: 4 months
- Average time to close pull requests: 7 days
- Total issue authors: 11
- Total pull request authors: 18
- Average comments per issue: 0.66
- Average comments per pull request: 0.61
- Merged pull requests: 61
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 5
- Pull requests: 7
- Average time to close issues: N/A
- Average time to close pull requests: about 13 hours
- Issue authors: 1
- Pull request authors: 4
- Average comments per issue: 0.0
- Average comments per pull request: 0.57
- Merged pull requests: 2
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- exook (63)
- singh96aman (12)
- urania277 (8)
- neogyk (7)
- fritjof-b (5)
- gallenaxel (5)
- nskidmor (3)
- ellert (1)
- OliverWoolland (1)
- TamariMaayan (1)
- PRAkTIKal24 (1)
Pull Request Authors
- exook (13)
- fritjof-b (12)
- singh96aman (11)
- gallenaxel (10)
- neogyk (8)
- sanam2405 (6)
- ellert (5)
- jlsmith-hep (3)
- AkkiG2401 (3)
- kka011098 (2)
- PRAkTIKal24 (2)
- OscarrrFuentes (2)
- nskidmor (2)
- ayushb03 (2)
- TamariMaayan (1)
Packages
- Total packages: 1
- Total downloads: 32 last month (pypi)
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 6
- Total maintainers: 1
pypi.org: baler-compressor
Machine Learning Based Data Compression
- Homepage: https://github.com/baler-collaboration/baler
- Documentation: https://baler-compressor.readthedocs.io/
- License: Apache Software License
- Latest release: 0.0.6 (published about 2 years ago)
Dependencies
- actions/cache v2 composite
- actions/checkout v3 composite
- actions/setup-python v2 composite
- hls4ml ^0.7.1
- matplotlib ^3.6.2
- numpy 1.23.5
- python >=3.8, <3.11
- scikit-learn ^1.2.0
- tensorflow ^2.12.0
- torch >=2.0.0, !=2.0.1
- tqdm ^4.64.1
- cmake ==3.27.0
- colorama ==0.4.6
- contourpy ==1.1.0
- cycler ==0.11.0
- filelock ==3.12.2
- fonttools ==4.41.1
- importlib-resources ==6.0.0
- jinja2 ==3.1.2
- joblib ==1.3.1
- kiwisolver ==1.4.4
- lit ==16.0.6
- markupsafe ==2.1.3
- matplotlib ==3.7.2
- mpmath ==1.3.0
- networkx ==3.1
- numpy ==1.23.5
- nvidia-cublas-cu11 ==11.10.3.66
- nvidia-cuda-cupti-cu11 ==11.7.101
- nvidia-cuda-nvrtc-cu11 ==11.7.99
- nvidia-cuda-runtime-cu11 ==11.7.99
- nvidia-cudnn-cu11 ==8.5.0.96
- nvidia-cufft-cu11 ==10.9.0.58
- nvidia-curand-cu11 ==10.2.10.91
- nvidia-cusolver-cu11 ==11.4.0.1
- nvidia-cusparse-cu11 ==11.7.4.91
- nvidia-nccl-cu11 ==2.14.3
- nvidia-nvtx-cu11 ==11.7.91
- packaging ==23.1
- pillow ==10.0.0
- pyparsing ==3.0.9
- python-dateutil ==2.8.2
- scikit-learn ==1.3.0
- scipy ==1.10.1
- setuptools ==68.0.0
- six ==1.16.0
- sympy ==1.12
- threadpoolctl ==3.2.0
- torch ==2.0.0
- tqdm ==4.65.0
- triton ==2.0.0
- typing-extensions ==4.5.0
- wheel ==0.41.0
- zipp ==3.16.2
- python 3.8-slim build
- 118 dependencies