baler-compressor

Repository of Baler, a machine learning based data compression tool

https://github.com/baler-collaboration/baler

Science Score: 46.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org, zenodo.org
  • Committers with academic emails
    13 of 29 committers (44.8%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.7%) to scientific vocabulary

Keywords

computational-fluid-dynamics machine-learning particle-physics
Last synced: 7 months ago

Repository

Statistics
  • Stars: 42
  • Watchers: 7
  • Forks: 49
  • Open Issues: 37
  • Releases: 4
Topics
computational-fluid-dynamics machine-learning particle-physics
Created over 3 years ago · Last pushed about 1 year ago
Metadata Files
  • Readme
  • Contributing
  • License
  • Code of conduct
  • Citation

README.md

Introduction

Baler is a tool used to test the feasibility of compressing different types of scientific data using machine learning-based autoencoders. Baler provides you with an easy way to:
  1. Train a machine learning model on your data
  2. Compress your data with that model. This will also save the compressed file and model
  3. Decompress the file using the model at a later time
  4. Plot the performance of the compression/decompression
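
The train, compress, decompress, and evaluate steps can be illustrated with a toy stand-in. The sketch below is not Baler's code or API: it uses a linear autoencoder fitted with NumPy's SVD (equivalent to PCA) in place of a neural autoencoder, and it prints a reconstruction error instead of producing plots. The function names `train`, `compress`, and `decompress`, the dictionary-based model, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def train(data, latent_dim):
    """Fit a linear autoencoder (hypothetical helper, not Baler's API).

    The top principal directions of the centered data serve as both
    encoder and decoder weights.
    """
    mean = data.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return {"mean": mean, "weights": vt[:latent_dim]}


def compress(data, model):
    """Project the data onto the latent space (the 'compressed' representation)."""
    return (data - model["mean"]) @ model["weights"].T


def decompress(latent, model):
    """Map latent vectors back to the original column space."""
    return latent @ model["weights"] + model["mean"]


# Synthetic data (an assumption): 8 columns driven by 3 hidden variables plus small noise.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(1000, 3))
mixing = rng.normal(size=(3, 8))
data = hidden @ mixing + 0.01 * rng.normal(size=(1000, 8))

model = train(data, latent_dim=3)      # step 1: train
latent = compress(data, model)         # step 2: compress
restored = decompress(latent, model)   # step 3: decompress

# Step 4 stand-in: report numbers rather than plot them.
ratio = data.size / latent.size        # ignores the storage cost of the model itself
rmse = float(np.sqrt(np.mean((data - restored) ** 2)))
print(f"compression ratio: {ratio:.2f}, RMSE: {rmse:.4f}")
```

The compression is lossy: the reconstruction error reflects whatever the three latent dimensions cannot represent, which here is only the added noise. A real autoencoder replaces the two matrix products with nonlinear encoder and decoder networks, but the workflow has the same shape.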

Getting Started

NOTE: For the same performance and version as presented in our arXiv paper, please use release v1.0.0 and the setup instructions given there. v1.0.0 also has a working Docker implementation. We are currently experiencing some performance issues on the main branch compared to v1.0.0.

In the links below we offer instructions on how to set up Baler and working tutorial examples to get you started. We offer two ways to run Baler:
  • Python
  • Docker/Singularity/Apptainer

Contributing

If you wish to contribute, please see the contribution guidelines.

Owner

  • Name: Baler-collaboration
  • Login: baler-collaboration
  • Kind: organization
  • Email: alexander.ekman@hep.lu.se
  • Location: Sweden

Machine Learning Based Compression of Scientific Data

GitHub Events

Total
  • Issues event: 1
  • Watch event: 11
  • Issue comment event: 3
  • Push event: 1
  • Pull request review event: 1
  • Pull request review comment event: 1
  • Pull request event: 2
  • Fork event: 21
Last Year
  • Issues event: 1
  • Watch event: 11
  • Issue comment event: 3
  • Push event: 1
  • Pull request review event: 1
  • Pull request review comment event: 1
  • Pull request event: 2
  • Fork event: 21

Committers

Last synced: about 2 years ago

All Time
  • Total Commits: 603
  • Total Committers: 29
  • Avg Commits per committer: 20.793
  • Development Distribution Score (DDS): 0.678
Past Year
  • Commits: 534
  • Committers: 28
  • Avg Commits per committer: 19.071
  • Development Distribution Score (DDS): 0.64
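
The all-time figures above can be reproduced from the listed counts. The sketch below assumes the commonly used definition of the Development Distribution Score, namely one minus the share of commits made by the single most active committer identity; the function names are hypothetical. The past-year DDS cannot be rechecked this way because per-committer counts for the past year are not listed.

```python
def average_commits(total_commits, committers):
    """Mean number of commits per committer."""
    return total_commits / committers


def development_distribution_score(top_committer_commits, total_commits):
    """Assumed definition: 1 minus the top committer's share of all commits."""
    return 1 - top_committer_commits / total_commits


# All-time values from the listing: 603 commits, 29 committers,
# and 194 commits for the most active committer identity.
all_time_avg = round(average_commits(603, 29), 3)
all_time_dds = round(development_distribution_score(194, 603), 3)

# Past-year values from the listing: 534 commits, 28 committers.
past_year_avg = round(average_commits(534, 28), 3)

print(all_time_avg, all_time_dds, past_year_avg)
```

Note that the committer table counts each name and email combination separately, so one person committing under several addresses appears as several committers, which affects both the average and the score.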
Top Committers
Name Email Commits
Alexander Ekman a****5@g****m 194
agallen a****n@c****h 102
Oliver Woolland o****d@m****k 56
Fritjof Bengtsson 6****b 55
Fritjof Bengtsson f****n@g****m 51
Alexander Ekman p****n@c****h 23
singh96aman s****n@g****m 22
Marta Camps Santasmasas m****s@p****k 15
Mattias Ellert m****t@p****e 14
Pratik Jawahar 5****4 12
OliverWoolland 8****d 10
Pratik Jawahar p****a@l****h 8
Axel Gallén 7****l 6
sekkar s****u@g****m 5
Caterina c****i@c****h 4
Pratik Jawahar p****a@l****h 3
Maayan m****i@c****h 3
Pratik Jawahar p****a@l****h 3
root r****t@L****E 3
AkkiG2401 7****1 3
Pratik Jawahar p****a@l****h 2
TamariMaayan m****i@m****l 2
Radek Skaroupka s****k@g****m 1
ShawnXu613 1****3 1
malenaduroux 1****x 1
Pratik Jawahar p****a@l****h 1
Leonid Didukh l****h@l****h 1
Leonid Didukh l****h@g****m 1
Fritjof Bengtsson f****n@n****m 1

Issues and Pull Requests

Last synced: 9 months ago

All Time
  • Total issues: 116
  • Total pull requests: 80
  • Average time to close issues: 4 months
  • Average time to close pull requests: 7 days
  • Total issue authors: 11
  • Total pull request authors: 18
  • Average comments per issue: 0.66
  • Average comments per pull request: 0.61
  • Merged pull requests: 61
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 5
  • Pull requests: 7
  • Average time to close issues: N/A
  • Average time to close pull requests: about 13 hours
  • Issue authors: 1
  • Pull request authors: 4
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.57
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • exook (63)
  • singh96aman (12)
  • urania277 (8)
  • neogyk (7)
  • fritjof-b (5)
  • gallenaxel (5)
  • nskidmor (3)
  • ellert (1)
  • OliverWoolland (1)
  • TamariMaayan (1)
  • PRAkTIKal24 (1)
Pull Request Authors
  • exook (13)
  • fritjof-b (12)
  • singh96aman (11)
  • gallenaxel (10)
  • neogyk (8)
  • sanam2405 (6)
  • ellert (5)
  • jlsmith-hep (3)
  • AkkiG2401 (3)
  • kka011098 (2)
  • PRAkTIKal24 (2)
  • OscarrrFuentes (2)
  • nskidmor (2)
  • ayushb03 (2)
  • TamariMaayan (1)
Top Labels
Issue Labels
  • Development (17)
  • High Priority (14)
  • Backburner (13)
  • Good first issue (12)
  • HEP (8)
  • CFD (4)
  • ECO (2)
  • FPGA (2)
  • discussion (1)
Pull Request Labels
  • HEP (1)
  • documentation (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 32 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 6
  • Total maintainers: 1
pypi.org: baler-compressor

Machine Learning Based Data Compression

  • Versions: 6
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 32 Last month
Rankings
Dependent packages count: 10.0%
Average: 38.7%
Dependent repos count: 67.4%
Maintainers (1)
Last synced: 7 months ago

Dependencies

.github/workflows/test_and_lint.yaml actions
  • actions/cache v2 composite
  • actions/checkout v3 composite
  • actions/setup-python v2 composite
pyproject.toml pypi
  • hls4ml ^0.7.1
  • matplotlib ^3.6.2
  • numpy 1.23.5
  • python >=3.8, <3.11
  • scikit-learn ^1.2.0
  • tensorflow ^2.12.0
  • torch >=2.0.0, !=2.0.1
  • tqdm ^4.64.1
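
Several of the pyproject.toml entries use caret constraints. In Poetry, a caret constraint allows any version that keeps the leftmost non-zero component unchanged, so `^3.6.2` means `>=3.6.2, <4.0.0` and `^0.7.1` means `>=0.7.1, <0.8.0`. The following is a minimal standard-library sketch of that rule for plain dotted numeric versions only; the helpers `caret_range` and `satisfies` are hypothetical, and pre-release or local version tags are not handled.

```python
def caret_range(spec):
    """Return (lower, upper) version tuples for a caret constraint such as '^0.7.1'."""
    lower = tuple(int(part) for part in spec.lstrip("^").split("."))
    for i, part in enumerate(lower):
        if part != 0:
            # Bump the leftmost non-zero component and zero everything after it.
            upper = lower[:i] + (part + 1,) + (0,) * (len(lower) - i - 1)
            break
    else:
        # All components are zero: bump the last one.
        upper = lower[:-1] + (lower[-1] + 1,)
    return lower, upper


def satisfies(version, spec):
    """Check whether a dotted numeric version falls inside the caret range."""
    lower, upper = caret_range(spec)
    candidate = tuple(int(part) for part in version.split("."))
    return lower <= candidate < upper


# Pins from requirements.txt checked against the pyproject.toml constraints.
print(satisfies("3.7.2", "^3.6.2"))   # matplotlib
print(satisfies("1.3.0", "^1.2.0"))   # scikit-learn
print(satisfies("4.65.0", "^4.64.1")) # tqdm
print(satisfies("0.8.0", "^0.7.1"))   # would fall outside the hls4ml range
```

For real projects, a maintained parser such as the one in the `packaging` library is a safer choice than hand-rolled comparison.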
requirements.txt pypi
  • cmake ==3.27.0
  • colorama ==0.4.6
  • contourpy ==1.1.0
  • cycler ==0.11.0
  • filelock ==3.12.2
  • fonttools ==4.41.1
  • importlib-resources ==6.0.0
  • jinja2 ==3.1.2
  • joblib ==1.3.1
  • kiwisolver ==1.4.4
  • lit ==16.0.6
  • markupsafe ==2.1.3
  • matplotlib ==3.7.2
  • mpmath ==1.3.0
  • networkx ==3.1
  • numpy ==1.23.5
  • nvidia-cublas-cu11 ==11.10.3.66
  • nvidia-cuda-cupti-cu11 ==11.7.101
  • nvidia-cuda-nvrtc-cu11 ==11.7.99
  • nvidia-cuda-runtime-cu11 ==11.7.99
  • nvidia-cudnn-cu11 ==8.5.0.96
  • nvidia-cufft-cu11 ==10.9.0.58
  • nvidia-curand-cu11 ==10.2.10.91
  • nvidia-cusolver-cu11 ==11.4.0.1
  • nvidia-cusparse-cu11 ==11.7.4.91
  • nvidia-nccl-cu11 ==2.14.3
  • nvidia-nvtx-cu11 ==11.7.91
  • packaging ==23.1
  • pillow ==10.0.0
  • pyparsing ==3.0.9
  • python-dateutil ==2.8.2
  • scikit-learn ==1.3.0
  • scipy ==1.10.1
  • setuptools ==68.0.0
  • six ==1.16.0
  • sympy ==1.12
  • threadpoolctl ==3.2.0
  • torch ==2.0.0
  • tqdm ==4.65.0
  • triton ==2.0.0
  • typing-extensions ==4.5.0
  • wheel ==0.41.0
  • zipp ==3.16.2
Dockerfile docker
  • python 3.8-slim build
poetry.lock pypi
  • 118 dependencies