JetNet

JetNet: A Python package for accessing open datasets and benchmarking machine learning methods in high energy physics - Published in JOSS (2023)

https://github.com/jet-net/jetnet

Science Score: 100.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 14 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: arxiv.org, joss.theoj.org, zenodo.org
  • Committers with academic emails
    1 of 9 committers (11.1%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Scientific Fields

Mathematics Computer Science - 84% confidence
Last synced: 6 months ago

Repository

For developing and reproducing ML + HEP projects.

Basic Info
Statistics
  • Stars: 23
  • Watchers: 2
  • Forks: 15
  • Open Issues: 3
  • Releases: 17
Created over 4 years ago · Last pushed 6 months ago
Metadata Files
Readme License Citation

README.md

For developing and reproducing ML + HEP projects.




JetNet

JetNet is an effort to increase accessibility and reproducibility in jet-based machine learning.

Currently we provide:

  • Easy-to-access and standardised interfaces for jet datasets, such as JetNet and TopTagging (both used in the Quickstart).
  • Standard implementations of generative evaluation metrics (Ref. [1, 2]), including:
    • Fréchet physics distance (FPD)
    • Kernel physics distance (KPD)
    • Wasserstein-1 (W1)
    • Fréchet ParticleNet Distance (FPND)
    • coverage and minimum matching distance (MMD)
  • Loss functions:
    • Differentiable implementation of the energy mover's distance [3]
  • And more general jet utilities.
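To give a feel for what a Wasserstein-1 score measures, here is a minimal NumPy sketch for two equal-size one-dimensional samples. It is an illustration only, not the library's implementation: the function name `wasserstein1_1d` and the toy "jet mass" values are assumptions made for this example, and the metrics in jetnet.evaluation may differ in inputs, batching, and error estimation.

```python
import numpy as np


def wasserstein1_1d(real, generated) -> float:
    """W1 distance between two equal-size 1D samples (illustrative helper).

    For empirical distributions with the same number of points, the optimal
    transport plan pairs the sorted values, so W1 is the mean absolute
    difference between the two sorted samples.
    """
    real = np.sort(np.asarray(real, dtype=float))
    generated = np.sort(np.asarray(generated, dtype=float))
    if real.ndim != 1 or real.shape != generated.shape:
        raise ValueError("expected two 1D samples of equal length")
    return float(np.mean(np.abs(real - generated)))


rng = np.random.default_rng(0)
real_mass = rng.normal(loc=0.10, scale=0.02, size=10_000)  # toy values, not real jets
shifted_mass = real_mass + 0.01  # same shape, every value shifted by 0.01

# A rigid shift moves every point by the same amount, so W1 equals the shift.
print(wasserstein1_1d(real_mass, shifted_mass))
```

A lower score means the generated feature distribution sits closer to the real one; identical samples give exactly zero.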

Additional functionality is under development. Please reach out if you're interested in contributing!

Installation

JetNet can be installed with pip:

```bash
pip install jetnet
```

To use the differentiable EMD loss jetnet.losses.EMDLoss, additional libraries must be installed via

```bash
pip install "jetnet[emdloss]"
```

Finally, PyTorch Geometric must be installed independently for the Fréchet ParticleNet Distance metric jetnet.evaluation.fpnd (see the PyTorch Geometric installation instructions).
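Because these extras are optional, it can help to check which ones are present before calling the features that need them. The sketch below uses only the standard library. The import names `qpth`, `cvxpy`, and `torch_geometric` are assumptions about how those packages are exposed, and `check_optional_dependencies` is a hypothetical helper, not part of JetNet.

```python
from importlib.util import find_spec

# Assumed import names for the optional extras; adjust if your install differs.
OPTIONAL_MODULES = {
    "EMD loss (qpth backend)": "qpth",
    "EMD loss (cvxpy backend)": "cvxpy",
    "FPND (PyTorch Geometric)": "torch_geometric",
}


def check_optional_dependencies(modules: dict[str, str]) -> dict[str, bool]:
    """Map each feature label to True if its top-level module can be found."""
    return {label: find_spec(name) is not None for label, name in modules.items()}


for label, available in check_optional_dependencies(OPTIONAL_MODULES).items():
    print(f"{label}: {'available' if available else 'missing'}")
```

`find_spec` only looks the module up without importing it, so the check stays cheap even for heavy packages.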

Quickstart

Datasets can be downloaded and accessed quickly, for example:

```python
from jetnet.datasets import JetNet, TopTagging

# as numpy arrays:
particle_data, jet_data = JetNet.getData(
    jet_type=["g", "q"], data_dir="./datasets/jetnet/", download=True
)

# or as a PyTorch dataset:
dataset = TopTagging(
    jet_type="all", data_dir="./datasets/toptagging/", split="train", download=True
)
```
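As a rough illustration of working with the particle array returned above, the sketch below computes a per-jet invariant mass from a padded array. It assumes, and this is an assumption to check against the dataset documentation, that each particle carries the features (relative eta, relative phi, relative pT, mask) in that order, with mask 0 marking padding. The helper `relative_jet_mass` and the toy array are made up for this example; the library's own jet utilities may be the better choice in practice.

```python
import numpy as np


def relative_jet_mass(particles: np.ndarray) -> np.ndarray:
    """Invariant mass per jet from (eta_rel, phi_rel, pt_rel, mask) particles.

    `particles` has shape (num_jets, num_particles, 4). Particles are treated
    as massless, and padded entries (mask == 0) get zero pT so they do not
    contribute to the sums.
    """
    eta, phi, pt, mask = np.moveaxis(particles, -1, 0)
    pt = pt * (mask > 0.5)
    energy = (pt * np.cosh(eta)).sum(axis=1)
    px = (pt * np.cos(phi)).sum(axis=1)
    py = (pt * np.sin(phi)).sum(axis=1)
    pz = (pt * np.sinh(eta)).sum(axis=1)
    mass_sq = energy**2 - px**2 - py**2 - pz**2
    # Clip tiny negative values caused by floating-point rounding.
    return np.sqrt(np.clip(mass_sq, 0.0, None))


# Toy stand-in for the real particle array: 2 jets, 3 particle slots each.
toy = np.array([
    # Jet 0: a single massless particle, so the jet mass is 0.
    [[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]],
    # Jet 1: two particles at phi = +/-0.1; the third slot is padding and ignored.
    [[0.0, 0.1, 0.5, 1.0], [0.0, -0.1, 0.5, 1.0], [0.3, 0.3, 9.9, 0.0]],
])
print(relative_jet_mass(toy))
```

Multiplying by the mask before summing is the key step: without it, padded slots with non-zero placeholder values would leak into the jet-level quantities.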

Evaluation metrics can be used as follows:

```python
import numpy as np
import jetnet

generated_jets = np.random.rand(50000, 30, 3)
fpnd_score = jetnet.evaluation.fpnd(generated_jets, jet_type="g")
```
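The Fréchet-style metrics compare Gaussians fitted to features of real and generated samples. The following NumPy sketch shows only that core calculation, under stated assumptions: `frechet_gaussian_distance` is a hypothetical name, the random features are placeholders, and the published metrics involve further choices (which features are used, how finite sample sizes are handled) that are not reproduced here.

```python
import numpy as np


def frechet_gaussian_distance(x: np.ndarray, y: np.ndarray) -> float:
    """Squared Fréchet distance between Gaussians fitted to two feature sets.

    x, y: arrays of shape (num_samples, num_features).
    d^2 = |mu_x - mu_y|^2 + Tr(S_x + S_y - 2 (S_x S_y)^(1/2))
    """
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)
    cov_x = np.atleast_2d(np.cov(x, rowvar=False))
    cov_y = np.atleast_2d(np.cov(y, rowvar=False))
    # Tr((S_x S_y)^(1/2)) is the sum of the square roots of the eigenvalues of
    # S_x S_y, which are real and non-negative for covariance matrices.
    eigvals = np.linalg.eigvals(cov_x @ cov_y)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    mean_term = np.sum((mu_x - mu_y) ** 2)
    return float(mean_term + np.trace(cov_x) + np.trace(cov_y) - 2.0 * trace_sqrt)


rng = np.random.default_rng(0)
real_features = rng.normal(size=(5000, 3))                   # placeholder features
shifted_features = real_features + np.array([1.0, 0.0, 0.0])  # mean shifted by 1

print(frechet_gaussian_distance(real_features, real_features))     # ~0 for identical samples
print(frechet_gaussian_distance(real_features, shifted_features))  # ~1: only the mean term differs
```

Using the eigenvalues of the covariance product avoids an explicit matrix square root, which keeps the sketch free of dependencies beyond NumPy.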

Loss functions can be initialized and used similarly to standard PyTorch in-built losses such as MSE:

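```python
import jetnet

# real_jets and generated_jets are assumed to be batches of jets as PyTorch tensors
emd_loss = jetnet.losses.EMDLoss(num_particles=30)
loss = emd_loss(real_jets, generated_jets)
loss.backward()
```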
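For orientation, the energy mover's distance of Ref. [3] between two jets with particle energies $E_i$ and $E'_j$ can be written as the optimal-transport problem below. The notation ($f_{ij}$ for the transported energy, $\theta_{ij}$ for the angular distance between particles, $R$ for a jet-radius scale) follows that paper as best recalled here, so treat it as a summary and consult [3] for the authoritative definition. How EMDLoss maps particle features onto these quantities is not covered by this sketch.

```latex
\begin{aligned}
\mathrm{EMD}(\mathcal{E}, \mathcal{E}') &=
  \min_{\{f_{ij} \ge 0\}} \sum_{i}\sum_{j} f_{ij}\,\frac{\theta_{ij}}{R}
  + \Bigl|\sum_{i} E_i - \sum_{j} E'_j\Bigr|, \\
\text{subject to}\quad
  &\sum_{j} f_{ij} \le E_i, \qquad
  \sum_{i} f_{ij} \le E'_j, \qquad
  \sum_{i}\sum_{j} f_{ij} = \min\Bigl(\sum_{i} E_i,\ \sum_{j} E'_j\Bigr).
\end{aligned}
```

The first term is the cost of moving energy between the two jets, and the second penalises any difference in their total energy.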

Documentation

The full API reference and tutorials are available at jetnet.readthedocs.io. Tutorial notebooks are in the tutorials folder, with more to come.

Contributing

We welcome feedback and contributions! Please feel free to create an issue for bugs or functionality requests, or open pull requests from your forked repo to solve them.

Building and testing locally

Perform an editable installation of the package from inside your forked repo and install the pytest package for unit testing:

```bash
pip install -e .
pip install pytest
```

Run the test suite to ensure everything is working as expected:

```bash
pytest tests                # tests all datasets
pytest tests -m "not slow"  # tests only on the JetNet dataset for convenience
```

Citation

If you use this library for your research, please cite our article in the Journal of Open Source Software:

```bibtex
@article{Kansal_JetNet_2023,
  author = {Kansal, Raghav and Pareja, Carlos and Hao, Zichun and Duarte, Javier},
  doi = {10.21105/joss.05789},
  journal = {Journal of Open Source Software},
  number = {90},
  pages = {5789},
  title = {{JetNet: A Python package for accessing open datasets and benchmarking machine learning methods in high energy physics}},
  url = {https://joss.theoj.org/papers/10.21105/joss.05789},
  volume = {8},
  year = {2023}
}
```

Please further cite the following if you use these components of the library.

JetNet dataset or FPND

```bibtex
@inproceedings{Kansal_MPGAN_2021,
  author = {Kansal, Raghav and Duarte, Javier and Su, Hao and Orzari, Breno and Tomei, Thiago and Pierini, Maurizio and Touranakou, Mary and Vlimant, Jean-Roch and Gunopulos, Dimitrios},
  booktitle = "{Advances in Neural Information Processing Systems}",
  editor = {M. Ranzato and A. Beygelzimer and Y. Dauphin and P.S. Liang and J. Wortman Vaughan},
  pages = {23858--23871},
  publisher = {Curran Associates, Inc.},
  title = {Particle Cloud Generation with Message Passing Generative Adversarial Networks},
  url = {https://proceedings.neurips.cc/paper_files/paper/2021/file/c8512d142a2d849725f31a9a7a361ab9-Paper.pdf},
  volume = {34},
  year = {2021},
  eprint = {2106.11535},
  archivePrefix = {arXiv},
}
```

FPD or KPD

```bibtex
@article{Kansal_Evaluating_2023,
  author = {Kansal, Raghav and Li, Anni and Duarte, Javier and Chernyavskaya, Nadezda and Pierini, Maurizio and Orzari, Breno and Tomei, Thiago},
  title = {Evaluating generative models in high energy physics},
  reportNumber = "FERMILAB-PUB-22-872-CMS-PPD",
  doi = "10.1103/PhysRevD.107.076017",
  journal = "{Phys. Rev. D}",
  volume = "107",
  number = "7",
  pages = "076017",
  year = "2023",
  eprint = "2211.10295",
  archivePrefix = "arXiv",
}
```

EMD Loss

Please cite the respective qpth or cvxpy libraries, depending on the method used (qpth by default), as well as the original EMD paper [3].

References

[1] R. Kansal et al., Particle Cloud Generation with Message Passing Generative Adversarial Networks, NeurIPS 2021 [2106.11535].

[2] R. Kansal et al., Evaluating Generative Models in High Energy Physics, Phys. Rev. D 107 (2023) 076017 [2211.10295].

[3] P. T. Komiske, E. M. Metodiev, and J. Thaler, The Metric Space of Collider Events, Phys. Rev. Lett. 123 (2019) 041801 [1902.02346].

Owner

  • Name: jet-net
  • Login: jet-net
  • Kind: organization

JOSS Publication

JetNet: A Python package for accessing open datasets and benchmarking machine learning methods in high energy physics
Published: October 30, 2023
Volume 8, Issue 90, Page 5789
Authors
  • Raghav Kansal (UC San Diego, USA; Fermilab, USA)
  • Carlos Pareja (UC San Diego, USA)
  • Zichun Hao (California Institute of Technology, USA)
  • Javier Duarte (UC San Diego, USA)
Editor
  • Matthew Feickert
Tags
  • PyTorch
  • high energy physics
  • machine learning
  • jets

Citation (CITATION.cff)

cff-version: "1.2.0"
authors:
- family-names: Kansal
  given-names: Raghav
  orcid: "https://orcid.org/0000-0003-2445-1060"
- family-names: Pareja
  given-names: Carlos
  orcid: "https://orcid.org/0000-0002-9022-2349"
- family-names: Hao
  given-names: Zichun
  orcid: "https://orcid.org/0000-0002-5624-4907"
- family-names: Duarte
  given-names: Javier
  orcid: "https://orcid.org/0000-0002-5076-7096"
contact:
- family-names: Kansal
  given-names: Raghav
  orcid: "https://orcid.org/0000-0003-2445-1060"
doi: 10.5281/zenodo.10044601
message: If you use this library for your research, please cite our article in the Journal of Open Source Software.
preferred-citation:
  authors:
  - family-names: Kansal
    given-names: Raghav
    orcid: "https://orcid.org/0000-0003-2445-1060"
  - family-names: Pareja
    given-names: Carlos
    orcid: "https://orcid.org/0000-0002-9022-2349"
  - family-names: Hao
    given-names: Zichun
    orcid: "https://orcid.org/0000-0002-5624-4907"
  - family-names: Duarte
    given-names: Javier
    orcid: "https://orcid.org/0000-0002-5076-7096"
  date-published: 2023-10-30
  doi: 10.21105/joss.05789
  issn: 2475-9066
  issue: 90
  journal: Journal of Open Source Software
  publisher:
    name: Open Journals
  start: 5789
  title: "JetNet: A Python package for accessing open datasets and
    benchmarking machine learning methods in high energy physics"
  type: article
  url: "https://joss.theoj.org/papers/10.21105/joss.05789"
  volume: 8
title: "JetNet: A Python package for accessing open datasets and
  benchmarking machine learning methods in high energy physics"
version: "v0.2.4"

GitHub Events

Total
  • Delete event: 5
  • Push event: 14
  • Pull request event: 8
  • Create event: 3
Last Year
  • Delete event: 5
  • Push event: 14
  • Pull request event: 8
  • Create event: 3

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 274
  • Total Committers: 9
  • Avg Commits per committer: 30.444
  • Development Distribution Score (DDS): 0.274
Past Year
  • Commits: 12
  • Committers: 3
  • Avg Commits per committer: 4.0
  • Development Distribution Score (DDS): 0.333
Top Committers
Name                  Email            Commits
rkansal47             r****7@y****n    199
pre-commit-ci[bot]    6****]           28
Javier Duarte         j****e@u****u    21
Lint Action           l****n@s****m    11
cpareja3025           c****5@g****m    6
Zichun Hao            z****0@g****m    3
Joosep Pata           j****a@g****m    3
mova                  m****a           2
Kyle Niemeyer         k****r@f****m    1
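The all-time summary figures can be reproduced from the commit counts listed above. The sketch below assumes that the Development Distribution Score is defined as one minus the top committer's share of all commits. That definition is an assumption rather than something stated on this page, but it does reproduce the reported 0.274.

```python
# Commit counts copied from the "Top Committers" list above.
commits = {
    "rkansal47": 199,
    "pre-commit-ci[bot]": 28,
    "Javier Duarte": 21,
    "Lint Action": 11,
    "cpareja3025": 6,
    "Zichun Hao": 3,
    "Joosep Pata": 3,
    "mova": 2,
    "Kyle Niemeyer": 1,
}


def development_distribution_score(commit_counts: dict[str, int]) -> float:
    """Assumed definition: 1 - (commits by the top committer / total commits)."""
    total = sum(commit_counts.values())
    return 1.0 - max(commit_counts.values()) / total


total_commits = sum(commits.values())
average_commits = total_commits / len(commits)
dds = development_distribution_score(commits)

print(total_commits, round(average_commits, 3), round(dds, 3))  # 274 30.444 0.274
```

A score near 0 means one person wrote almost every commit; higher values indicate that work is spread across more committers.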
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 10
  • Total pull requests: 79
  • Average time to close issues: 9 days
  • Average time to close pull requests: 24 days
  • Total issue authors: 5
  • Total pull request authors: 8
  • Average comments per issue: 1.4
  • Average comments per pull request: 0.25
  • Merged pull requests: 72
  • Bot issues: 0
  • Bot pull requests: 24
Past Year
  • Issues: 0
  • Pull requests: 11
  • Average time to close issues: N/A
  • Average time to close pull requests: about 1 month
  • Issue authors: 0
  • Pull request authors: 2
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 9
  • Bot issues: 0
  • Bot pull requests: 9
Top Authors
Issue Authors
  • rkansal47 (3)
  • mova (2)
  • kaechb (2)
  • matthewfeickert (2)
  • jpata (1)
Pull Request Authors
  • rkansal47 (44)
  • pre-commit-ci[bot] (23)
  • zichunhao (2)
  • mova (2)
  • jmduarte (2)
  • cpareja3025 (2)
  • jpata (1)
Top Labels
Issue Labels
  • documentation (2)
  • good first issue (1)
Pull Request Labels
  • enhancement (2)
  • documentation (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 512 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 3
  • Total versions: 28
  • Total maintainers: 1
pypi.org: jetnet

Jets + ML integration

  • Versions: 28
  • Dependent Packages: 0
  • Dependent Repositories: 3
  • Downloads: 512 Last month
Rankings
Dependent repos count: 9.0%
Dependent packages count: 10.0%
Forks count: 10.2%
Average: 11.4%
Stargazers count: 13.6%
Downloads: 14.4%
Maintainers (1)
Last synced: 6 months ago

Dependencies

docs/requirements.txt pypi
  • autodocsumm ==0.2.7
  • ipykernel *
  • m2r2 *
  • nbsphinx *
  • numpy *
  • readthedocs-sphinx-search ==0.1.0rc3
  • scipy *
  • sphinx ==4.2.0
  • sphinx_rtd_theme ==1.0.0
  • torch *
  • tqdm *
setup.py pypi
  • awkward *
  • coffea *
  • energyflow *
  • h5py *
  • numpy *
  • pandas *
  • requests *
  • scipy *
  • tables *
  • torch *
  • tqdm *
.github/workflows/ci.yml actions
  • actions/checkout v2 composite
  • actions/checkout v3 composite
  • actions/setup-python v1 composite
  • actions/setup-python v4 composite
  • wearerequired/lint-action v2 composite
.github/workflows/python-publish.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • pypa/gh-action-pypi-publish release/v1 composite
pyproject.toml pypi
.github/workflows/JOSS-pdf.yml actions
  • actions/checkout v3 composite
  • actions/upload-artifact v1 composite
  • openjournals/openjournals-draft-action master composite