dunedn

Deep Learning applications on ProtoDUNE raw data denoising

https://github.com/marcorossi5/dunedn

Science Score: 77.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 10 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org, zenodo.org
  • Committers with academic emails
    1 of 2 committers (50.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (17.2%) to scientific vocabulary

Keywords

deep-learning pytorch
Last synced: 6 months ago

Repository

Deep Learning applications on ProtoDUNE raw data denoising

Basic Info
  • Host: GitHub
  • Owner: marcorossi5
  • License: gpl-3.0
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 22.9 MB
Statistics
  • Stars: 0
  • Watchers: 2
  • Forks: 0
  • Open Issues: 6
  • Releases: 3
Topics
deep-learning pytorch
Created almost 6 years ago · Last pushed over 3 years ago
Metadata Files
Readme License Citation

README.md

DUNEdn


If you use this software, please cite this paper:

```bibtex
@article{dunedn,
    author   = {Rossi, Marco and Vallecorsa, Sofia},
    title    = {Deep Learning Strategies for ProtoDUNE Raw Data Denoising},
    journal  = {Computing and Software for Big Science},
    year     = {2022},
    month    = {Jan},
    day      = {07},
    volume   = {6},
    number   = {1},
    numpages = {9},
    issn     = {2510-2044},
    doi      = {10.1007/s41781-021-00077-9},
    url      = {https://doi.org/10.1007/s41781-021-00077-9}
}
```

DUNEdn is a denoising algorithm for ProtoDUNE-SP raw data with Neural Networks.

Documentation

The documentation for DUNEdn is available on Read the Docs: dunedn.readthedocs.io.

Installation

The package can be installed with Python's pip package manager.

From PyPI:

```bash
pip install dunedn
```

or manually:

```bash
git clone https://github.com/marcorossi5/DUNEdn.git
cd DUNEdn
pip install .
```

This installs the DUNEdn package into your environment's Python path.

Requirements

DUNEdn requires the following packages:

  • python3
  • numpy
  • pytorch
  • torchvision
  • matplotlib
  • hyperopt

Download large dataset files

Large files like dataset samples and model checkpoints are not included in the repository, but are available on Zenodo.

The download_dataset.sh convenience script automates the download of those files, populating the saved_models and examples directories with data to reproduce the results presented in arXiv:2103.01596.

Launch the following command to start the job:

```bash
bash ./download_dataset.sh
```

Running the code

Launch the code with:

```bash
dunedn <subcommand> [options]
```

Valid subcommands are: preprocess|train|inference.
Use `dunedn <subcommand> --help` to print the corresponding help message.
For example, the help message for the `train` subcommand is:

```bash
$ dunedn train --help
usage: dunedn train [-h] [--output OUTPUT] [--force] configcard

Train model loading settings from configcard.

positional arguments:
  configcard       yaml configcard path

optional arguments:
  -h, --help       show this help message and exit
  --output OUTPUT  output folder
  --force          overwrite existing files if present
```

Configuration cards

Models' parameter settings are stored in configcards. The configcards folder contains some examples, and these can be extended by providing the path to user-defined cards directly to the command-line interface.

By setting the DUNEDN_SEARCH_PATH environment variable, it is possible to let DUNEdn look for configcards in additional directories automatically. More on the search behavior can be found in the docstring of the get_configcard_path function in the utils/utils.py file.
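The search behavior described above can be sketched as follows. This is a hypothetical illustration only: the function name `find_configcard` and the exact fallback order are assumptions, not taken from the package (the real logic lives in get_configcard_path).

```python
import os
from pathlib import Path

def find_configcard(name, env_var="DUNEDN_SEARCH_PATH"):
    """Illustrative lookup: try the path as given first, then every
    directory listed in the search-path environment variable."""
    candidate = Path(name)
    if candidate.is_file():
        return candidate
    for folder in os.environ.get(env_var, "").split(os.pathsep):
        if folder and (Path(folder) / name).is_file():
            return Path(folder) / name
    raise FileNotFoundError(f"configcard {name!r} not found")
```

With this sketch, `export DUNEDN_SEARCH_PATH=/path/to/my/cards` would make a bare card name resolve against that directory when it is not found locally.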

Preprocess a dataset

At first, a dataset directory should have the following structure:

Dataset directory tree structure:

```text
dataset_dir
|-- train
|   |-- evts
|-- val
|   |-- evts
|-- test
|   |-- evts
```

where each evts folder contains a collection of ProtoDUNE events stored as raw digits (numpy array format).
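A minimal sketch of building this layout with toy events follows; the event shape (64, 128) and the file name are placeholders, not the real ProtoDUNE geometry:

```python
from pathlib import Path
import numpy as np

def make_dataset_skeleton(root):
    """Create the train/val/test -> evts layout and drop one toy
    event per split; shapes and names are illustrative only."""
    root = Path(root)
    for split in ("train", "val", "test"):
        evts = root / split / "evts"
        evts.mkdir(parents=True, exist_ok=True)
        # toy "raw digits": random values standing in for ADC counts
        np.save(evts / "event_0.npy", np.random.rand(64, 128))
    return root
```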

It is possible to generate the corresponding dataset to train a USCG or a GCNN network with the command:

```bash
dunedn preprocess <configcard.yaml> --dir_name <dataset directory>
```

This will modify the dataset directory tree in the following way:

Dataset directory tree structure after preprocessing:

```text
dataset_dir
|-- train
|   |-- evts
|   |-- planes (preprocess product)
|   |-- crops (preprocess product)
|-- val
|   |-- evts
|   |-- planes (preprocess product)
|-- test
|   |-- evts
|   |-- planes (preprocess product)
```

Training a model

After specifying parameters inside a configuration card, leverage DUNEdn to train the corresponding model with:

```bash
dunedn train <configcard.yaml>
```

The output directory defaults to output. Optionally, the DUNEDN_OUTPUT_PATH environment variable can be set to override this choice.

Inference

```bash
dunedn inference -i <input.npy> -o <output.npy> -m <modeltype> [--model_path <checkpoint.pth>]
```

DUNEdn inference takes the input.npy array and forwards it to the chosen model modeltype. The denoised output array is saved to output.npy.
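The .npy round trip can be sketched as follows; the array shape is illustrative, and loading input.npy back here merely stands in for reading the real output.npy:

```python
import numpy as np

# prepare a toy noisy event to feed to `dunedn inference -i input.npy ...`
noisy = np.random.rand(64, 128).astype(np.float32)
np.save("input.npy", noisy)

# after inference, the denoised event is read back the same way
# (loading input.npy here only to demonstrate the format)
restored = np.load("input.npy")
```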

If a checkpoint directory path is given with the optional --model_path flag, a saved model checkpoint is loaded for inference.
The checkpoint directory should have the following structure:

```text
model_path
|-- collection
|   |-- <ckpt directory name>_dn_collection.pth
|-- induction
|   |-- <ckpt directory name>_dn_induction.pth
```

On the other hand, if --model_path is not specified, an untrained network is used.
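A hedged sketch of assembling the expected checkpoint layout follows; the directory name "mycheckpoint" is a placeholder, and the files are created empty where real checkpoints would be PyTorch .pth files:

```python
from pathlib import Path

def make_checkpoint_layout(model_path, name="mycheckpoint"):
    """Create the collection/induction layout expected by --model_path.
    Files are empty placeholders for real .pth checkpoints."""
    root = Path(model_path)
    for plane in ("collection", "induction"):
        d = root / plane
        d.mkdir(parents=True, exist_ok=True)
        (d / f"{name}_dn_{plane}.pth").touch()
    return root
```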

Benchmark

The paper results can be reproduced through the computedenoisingperformance.py benchmark script.
See the script's docstring for further information.

Owner

  • Name: Marco Rossi
  • Login: marcorossi5
  • Kind: user
  • Company: CERN openlab

Doctoral student in Physics at University of Milan and CERN openlab

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: DUNEdn
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Marco
    family-names: Rossi
    email: rssmrc.11@gmail.com
    affiliation: CERN openlab
    orcid: 'https://orcid.org/0000-0002-7882-2798'
identifiers:
  - type: doi
    value: 10.5281/zenodo.5821521
    description: >-
      The package DOI for all the versions. Resolves
      to the latest version of the work.
repository-code: 'https://github.com/marcorossi5/DUNEdn'
url: 'https://dunedn.readthedocs.io/en/latest/'
abstract: >-
  In this work, we investigate different machine
  learning-based strategies for denoising raw
  simulation data from the ProtoDUNE experiment. The
  ProtoDUNE detector is hosted by CERN and it aims to
  test and calibrate the technologies for DUNE, a
  forthcoming experiment in neutrino physics. The
  reconstruction workchain consists of converting
  digital detector signals into physical high-level
  quantities. We address the first step in
  reconstruction, namely raw data denoising,
  leveraging deep learning algorithms. We design two
  architectures based on graph neural networks,
  aiming to enhance the receptive field of basic
  convolutional neural networks. We benchmark this
  approach against traditional algorithms implemented
  by the DUNE collaboration. We test the capabilities
  of graph neural network hardware accelerator setups
  to speed up training and inference processes.
keywords:
  - Deep learning
  - ProtoDUNE
  - Denoising
license: GPL-3.0
version: 2.0.0
date-released: '2022-06-13'

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 393
  • Total Committers: 2
  • Avg Commits per committer: 196.5
  • Development Distribution Score (DDS): 0.005
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
marcorossi5 r****1@g****m 391
Marco Rossi r****o@i****h 2

Issues and Pull Requests

Last synced: 7 months ago

All Time
  • Total issues: 5
  • Total pull requests: 12
  • Average time to close issues: 7 months
  • Average time to close pull requests: 2 months
  • Total issue authors: 1
  • Total pull request authors: 1
  • Average comments per issue: 0.4
  • Average comments per pull request: 0.58
  • Merged pull requests: 7
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • marcorossi5 (5)
Pull Request Authors
  • marcorossi5 (12)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 12 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 1
  • Total maintainers: 1
pypi.org: dunedn

ProtoDUNE raw data denoising with DL

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 12 Last month
Rankings
Dependent packages count: 10.1%
Dependent repos count: 21.6%
Forks count: 29.8%
Average: 31.1%
Stargazers count: 38.8%
Downloads: 55.3%
Maintainers (1)
Last synced: 6 months ago

Dependencies

.github/workflows/pytest.yaml actions
  • actions/checkout v1 composite
  • conda-incubator/setup-miniconda v2 composite
.github/workflows/pythonpublish.yaml actions
  • actions/checkout v1 composite
  • actions/setup-python v1 composite
pyproject.toml pypi