dunedn
Deep Learning applications on ProtoDUNE raw data denoising
Science Score: 77.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ✓ DOI references: found 10 DOI reference(s) in README
- ✓ Academic publication links: links to arxiv.org, zenodo.org
- ✓ Committers with academic emails: 1 of 2 committers (50.0%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (17.2%) to scientific vocabulary
Keywords
Repository
Deep Learning applications on ProtoDUNE raw data denoising
Basic Info
Statistics
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 6
- Releases: 3
Topics
Metadata Files
README.md
DUNEdn
If you use this software, please cite this paper:
```bibtex
@article{dunedn,
  author={Rossi, Marco
          and Vallecorsa, Sofia},
  title={Deep Learning Strategies for ProtoDUNE Raw Data Denoising},
  journal={Computing and Software for Big Science},
  year={2022},
  month={Jan},
  day={07},
  volume={6},
  number={1},
  numpages={9},
  issn={2510-2044},
  doi={10.1007/s41781-021-00077-9},
  url={https://doi.org/10.1007/s41781-021-00077-9}
}
```
DUNEdn is a denoising algorithm for ProtoDUNE-SP raw data with Neural Networks.
Documentation
The documentation for DUNEdn is available on Read the Docs: dunedn.readthedocs.io.
Installation
The package can be installed with Python's pip package manager.
From PyPI:
```bash
pip install dunedn
```
or manually:
```bash
git clone https://github.com/marcorossi5/DUNEdn.git
cd DUNEdn
pip install .
```
This installs the DUNEdn package into your Python environment.
Requirements
DUNEdn requires the following packages:
- python3
- numpy
- pytorch
- torchvision
- matplotlib
- hyperopt
Download large dataset files
Large files such as dataset samples and model checkpoints are not included in the repository, but are available on Zenodo.
The download_dataset.sh convenience script automates the download of those
files, populating the saved_models and examples
directories with data to reproduce the results presented in
arXiv:2103.01596.
Launch the following command to start the job:
```bash
bash ./download_dataset.sh
```
Running the code
Launch the code with:

```bash
dunedn <subcommand> [options]
```
Valid subcommands are: preprocess|train|inference.
Use dunedn <subcommand> --help to print the corresponding help message.
For example, the help message for the train subcommand is:
```bash
$ dunedn train --help
usage: dunedn train [-h] [--output OUTPUT] [--force] configcard

Train model loading settings from configcard.

positional arguments:
  configcard       yaml configcard path

optional arguments:
  -h, --help       show this help message and exit
  --output OUTPUT  output folder
  --force          overwrite existing files if present
```
Configuration cards
Models' parameter settings are stored in configcards. The configcards folder contains some examples. These can be extended by passing the path to a user-defined card directly to the command line interface.
By setting the DUNEDN_SEARCH_PATH environment variable, DUNEdn can be made to
look for configcards in additional directories automatically. More on the
search behavior can be found in the get_configcard_path function's docstring
in the utils/utils.py file.
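The actual lookup logic lives in get_configcard_path; as an illustration only (this sketch is not the package's implementation), the search might resolve a card name like this, checking the working directory first and then each directory listed in DUNEDN_SEARCH_PATH:

```python
import os
from pathlib import Path

def find_configcard(card_name: str) -> Path:
    """Illustrative configcard lookup: try the current directory first,
    then every directory listed in the DUNEDN_SEARCH_PATH environment
    variable (separated by os.pathsep, i.e. ':' on Linux)."""
    search_dirs = [Path.cwd()]
    env = os.environ.get("DUNEDN_SEARCH_PATH", "")
    search_dirs += [Path(d) for d in env.split(os.pathsep) if d]
    for directory in search_dirs:
        candidate = directory / card_name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"configcard {card_name} not found in {search_dirs}")
```

The real function may differ in search order and fallbacks; consult its docstring for the authoritative behavior.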
Preprocess a dataset
Initially, the dataset directory should have the following structure:
```text
dataset_dir
|-- train
|   |-- evts
|-- val
|   |-- evts
|-- test
|   |-- evts
```
where each evts folder contains a collection of ProtoDUNE events stored as raw
digits (numpy array format).
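The expected layout can be sketched as follows. This is a minimal illustration, not part of the package: the file name evt_0.npy and the (8, 16) array shape are placeholders, since real ProtoDUNE raw-digit events have detector-specific shapes.

```python
import numpy as np
from pathlib import Path

def make_dataset_skeleton(root: str, n_dummy_events: int = 1) -> Path:
    """Create the train/val/test -> evts layout expected by
    `dunedn preprocess` and drop a dummy raw-digits array in each split."""
    root = Path(root)
    for split in ("train", "val", "test"):
        evts = root / split / "evts"
        evts.mkdir(parents=True, exist_ok=True)
        for i in range(n_dummy_events):
            # placeholder event: real events are much larger ADC matrices
            np.save(evts / f"evt_{i}.npy", np.zeros((8, 16), dtype=np.float32))
    return root
```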
It is possible to generate the corresponding dataset to train a USCG or a GCNN network with the command:
```bash
dunedn preprocess <configcard.yaml> --dir_name <dataset directory>
```
This will modify the dataset directory tree in the following way:
```text
dataset_dir
|-- train
|   |-- evts
|   |-- planes (preprocess product)
|   |-- crops (preprocess product)
|-- val
|   |-- evts
|   |-- planes (preprocess product)
|-- test
|   |-- evts
|   |-- planes (preprocess product)
```
Training a model
After specifying parameters inside a configuration card, leverage DUNEdn to train the corresponding model with:
```bash
dunedn train <configcard.yaml>
```
The output directory defaults to output. Optionally, the
DUNEDN_OUTPUT_PATH environment variable can be set to override this choice.
Inference
```bash
dunedn inference -i <input.npy> -o <output.npy> -m <modeltype> [--model_path <checkpoint.pth>]
```
DUNEdn inference takes the input.npy array and forwards it through the desired
model modeltype. The output array is saved to output.npy.
If a checkpoint directory path is given with the optional --model_path flag, the
saved model checkpoint is loaded for inference.
The checkpoint directory should have the following structure:
```text
model_path
|-- collection
|   |-- <ckpt directory name>_dn_collection.pth
|-- induction
|   |-- <ckpt directory name>_dn_induction.pth
```
On the other hand, if --model_path is not specified, an untrained network is used.
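Since the inference interface is array-in/array-out via .npy files, preparing an input is just a matter of saving a numpy array. A minimal sketch, in which the (64, 128) channels-by-ticks shape is an assumption and not the real ProtoDUNE event geometry:

```python
import numpy as np

def save_event(path: str, channels: int = 64, ticks: int = 128) -> np.ndarray:
    """Write a synthetic raw-digits event to `path` in the .npy format
    that `dunedn inference -i` consumes. Shape is a placeholder."""
    rng = np.random.default_rng(0)
    event = rng.normal(size=(channels, ticks)).astype(np.float32)
    np.save(path, event)
    return event

# After running:  dunedn inference -i input.npy -o output.npy -m <modeltype>
# the denoised event can be loaded back with np.load("output.npy").
```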
Benchmark
The paper results can be reproduced with the
computedenoisingperformance.py benchmark script.
See the script's docstring for further information.
Owner
- Name: Marco Rossi
- Login: marcorossi5
- Kind: user
- Company: CERN openlab
- Repositories: 4
- Profile: https://github.com/marcorossi5
Doctoral student in Physics at University of Milan and CERN openlab
Citation (CITATION.cff)
```yaml
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: DUNEdn
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Marco
    family-names: Rossi
    email: rssmrc.11@gmail.com
    affiliation: CERN openlab
    orcid: 'https://orcid.org/0000-0002-7882-2798'
identifiers:
  - type: doi
    value: 10.5281/zenodo.5821521
    description: >-
      The package DOI for all the versions. Resolves
      to the latest version of the work.
repository-code: 'https://github.com/marcorossi5/DUNEdn'
url: 'https://dunedn.readthedocs.io/en/latest/'
abstract: >-
  In this work, we investigate different machine
  learning-based strategies for denoising raw
  simulation data from the ProtoDUNE experiment. The
  ProtoDUNE detector is hosted by CERN and it aims to
  test and calibrate the technologies for DUNE, a
  forthcoming experiment in neutrino physics. The
  reconstruction workchain consists of converting
  digital detector signals into physical high-level
  quantities. We address the first step in
  reconstruction, namely raw data denoising,
  leveraging deep learning algorithms. We design two
  architectures based on graph neural networks,
  aiming to enhance the receptive field of basic
  convolutional neural networks. We benchmark this
  approach against traditional algorithms implemented
  by the DUNE collaboration. We test the capabilities
  of graph neural network hardware accelerator setups
  to speed up training and inference processes.
keywords:
  - Deep learning
  - ProtoDUNE
  - Denoising
license: GPL-3.0
version: 2.0.0
date-released: '2022-06-13'
```
GitHub Events
Committers
Last synced: 7 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| marcorossi5 | r****1@g****m | 391 |
| Marco Rossi | r****o@i****h | 2 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 7 months ago
All Time
- Total issues: 5
- Total pull requests: 12
- Average time to close issues: 7 months
- Average time to close pull requests: 2 months
- Total issue authors: 1
- Total pull request authors: 1
- Average comments per issue: 0.4
- Average comments per pull request: 0.58
- Merged pull requests: 7
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- marcorossi5 (5)
Pull Request Authors
- marcorossi5 (12)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads: 12 last-month (pypi)
- Total dependent packages: 0
- Total dependent repositories: 1
- Total versions: 1
- Total maintainers: 1
pypi.org: dunedn
ProtoDUNE raw data denoising with DL
- Homepage: https://github.com/marcorossi5/DUNEdn.git
- Documentation: https://dunedn.readthedocs.io/
- License: gpl-3.0
- Latest release: 1.0.1 (published about 4 years ago)
Rankings
Maintainers (1)
Dependencies
- actions/checkout v1 composite
- conda-incubator/setup-miniconda v2 composite
- actions/checkout v1 composite
- actions/setup-python v1 composite