master-thesis-public

:book: My master thesis at the University of Firenze

https://github.com/mbarbetti/master-thesis-public

Science Score: 41.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org, iop.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.0%) to scientific vocabulary

Keywords

deep-learning flash-simulation gan generative-adversarial-network lhcb-experiment machine-learning master-degree master-degree-thesis master-thesis particle-physics thesis ultrafast-simulation
Last synced: 6 months ago

Repository

:book: My master thesis at the University of Firenze

Basic Info
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Topics
deep-learning flash-simulation gan generative-adversarial-network lhcb-experiment machine-learning master-degree master-degree-thesis master-thesis particle-physics thesis ultrafast-simulation
Created over 2 years ago · Last pushed over 2 years ago
Metadata Files
  • Readme
  • Citation

README.md

Master Thesis

Report number: CERN-THESIS-2020-416

Title

  • [ENG] Techniques for parametric simulation with deep neural networks and implementation for the LHCb experiment at CERN and its future upgrades
  • [ITA] Tecniche di simulazione parametrica con reti neurali profonde e loro implementazione per l'esperimento LHCb al CERN e le sue future evoluzioni

Abstract

The LHCb experiment [1] is one of the four detectors along the accelerator ring of the Large Hadron Collider (LHC) at CERN, and is designed for the study of heavy flavour physics in $pp$ collisions. Its primary goal is to look for indirect evidence of phenomena beyond the Standard Model in $CP$ violation and in rare decays of $b$- and $c$-hadrons. The upgraded LHCb detector will resume data taking in 2021 with the LHC Run 3, increasing its statistical power by at least an order of magnitude thanks to a fully software trigger system [2]. This will allow unprecedented accuracy to be reached, provided the Collaboration is able to produce simulated samples of comparable size. As a direct consequence, the production of such simulated samples will dominate the computing effort of the experiment [3]. Accurately reproducing all the physics processes, from the $pp$ collisions to the radiation-matter interactions within the detectors (the full simulation approach), is already unable to sustain the analysis demands of the various LHCb physics groups, and it is therefore necessary to adopt faster solutions to take full advantage of the upgraded detector [3]. Ultra-fast simulation requires far fewer computing resources: it gives up reproducing radiation-matter interactions and directly parameterizes the high-level response of the detector. The LHCb subsystems rely on a variety of very different physics processes, which makes building high-level parameterizations non-trivial: for example, the Particle Identification (PID) system combines information from the RICH, calorimeter, and muon detectors. This task can be carried out effectively by Generative Adversarial Networks (GANs), a powerful class of deep learning algorithms able to reproduce probability distributions with high fidelity and diversity, thanks to a generative model learned directly from data [4].
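
For reference, the adversarial objective introduced in Ref. [4] is a minimax game between a generator $G$ and a discriminator $D$,

$$\min_G \max_D \; V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big],$$

where $p_z$ is a prior noise distribution. In the conditional setting discussed below, both networks additionally receive the conditioning variables (e.g. particle type, kinematics, and track multiplicity), so that the generator learns the detector-response distribution given those conditions rather than an unconditional one.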

A large part of this thesis concerns the development and implementation of state-of-the-art GAN algorithms [5] to provide the high-level response of the PID subsystems of LHCb. Following Ref. [6], where GANs are successfully used to reproduce the response of the RICH system to traversing particles, I have generalized and formalized those results, building neural-network-based generative models able to faithfully parameterize the high-level response of the Global PID system of LHCb. These neural networks were trained on the calibration samples collected in 2016, which provide datasets composed of an unbiased selection of long-lived particles. I have modified the learning procedure to statistically subtract the residual background within the training data, and I have developed an independent algorithm to measure the quality of the generated samples. This strategy has allowed building models able not only to parameterize the high-level response of specific detectors (such as the RICH detectors and the muon system) to the different particles traversing them, but also to reproduce the distributions of variables resulting from the combination of various detector responses (the Global PID system). Therefore, given a few basic pieces of information, such as the particle type, its kinematics, and the total number of tracks within the detector, the trained models are able to accurately synthesize a wide range of probability distributions representing the response obtained from a single detector or from their combination.
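
As an illustration only, and not the code developed in the thesis, a minimal sketch of such a conditional generator and discriminator could look as follows in Python with PyTorch; the conditioning set, output variables, and network sizes are purely hypothetical assumptions.

# Minimal, hypothetical sketch of a conditional GAN for parameterizing
# high-level PID responses. Names, shapes, and training details are
# illustrative assumptions, not the thesis implementation.
import torch
import torch.nn as nn

N_COND = 4    # assumed conditioning variables: particle type, p, eta, nTracks
N_OUT = 2     # assumed number of high-level PID output variables
N_NOISE = 64  # assumed latent-noise dimension


class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_NOISE + N_COND, 128), nn.LeakyReLU(0.1),
            nn.Linear(128, 128), nn.LeakyReLU(0.1),
            nn.Linear(128, N_OUT),
        )

    def forward(self, noise, cond):
        # Concatenate latent noise with the conditioning variables
        return self.net(torch.cat([noise, cond], dim=1))


class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_OUT + N_COND, 128), nn.LeakyReLU(0.1),
            nn.Linear(128, 128), nn.LeakyReLU(0.1),
            nn.Linear(128, 1),
        )

    def forward(self, x, cond):
        # Score how plausible the PID response x is for the given conditions
        return self.net(torch.cat([x, cond], dim=1))


# Usage sketch: sample PID responses for a batch of conditions
generator = Generator()
cond = torch.randn(32, N_COND)     # placeholder conditioning batch
noise = torch.randn(32, N_NOISE)
fake_pid = generator(noise, cond)  # shape: (32, N_OUT)

In a real training loop, per-event weights could be applied in the discriminator loss to statistically subtract the residual background mentioned above; the sketch omits this and the adversarial optimization itself.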

The second important personal contribution concerns the design and development of the mambah framework, a Python package aimed at providing and managing user-friendly data structures for High Energy Physics applications. All mambah objects were designed to take full advantage of a batch-grained framework, using modern parallel-computing software and efficiently exploiting hardware accelerators such as GPUs or FPGAs. Within the mambah project, I have worked on the implementation of database-management functions and on the design of a simulation framework based on mambah and named mambah.sim. The mambah.sim module allows particles to be generated selectively with the kinematics set by the $pp$ collision and to be propagated within the detector through custom parameterization functions for efficiencies and resolutions: I have developed the parameterization for the PID system of LHCb. I have verified the correctness of the implemented models, showing the generalization capabilities of GANs in describing decay channels different from the one used for training. Lastly, I have shown that the samples produced by mambah.sim are competitive with fully simulated ones, while ensuring a significant reduction of the computing cost.
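
Since the mambah.sim interface is not documented in this repository, the following is only a conceptual sketch, with invented names, of what a user-defined efficiency and resolution parameterization applied to generated particles might look like; it does not reflect the actual mambah.sim API.

# Conceptual sketch of an efficiency + resolution parameterization applied to
# generated particles. All function and variable names are hypothetical and
# do not correspond to the actual mambah.sim interface.
import numpy as np

rng = np.random.default_rng(seed=42)


def efficiency(pt):
    """Hypothetical reconstruction efficiency vs transverse momentum (GeV)."""
    return 0.95 / (1.0 + np.exp(-(pt - 0.5) / 0.1))  # assumed smooth turn-on curve


def smear_momentum(pt, resolution=0.005):
    """Apply a Gaussian relative momentum resolution (assumed 0.5%)."""
    return pt * (1.0 + resolution * rng.standard_normal(pt.shape))


# Toy generator-level particles: transverse momenta in GeV (placeholder values)
pt_true = rng.uniform(0.2, 5.0, size=10_000)

# Keep each particle with a probability given by the efficiency curve
kept = rng.uniform(size=pt_true.shape) < efficiency(pt_true)

# Smear the momenta of the surviving particles to mimic detector resolution
pt_reco = smear_momentum(pt_true[kept])

print(f"average efficiency ~ {kept.mean():.2%}, reconstructed particles: {pt_reco.size}")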

References

  1. LHCb Collaboration, A. A. Alves Jr. et al., The LHCb Detector at the LHC, JINST 3 (2008) S08005
  2. LHCb Collaboration, LHCb Trigger and Online Upgrade Technical Design Report, LHCB-TDR-016, CERN, 2014
  3. LHCb Collaboration, Upgrade Software and Computing, LHCB-TDR-017, CERN, 2018
  4. I. J. Goodfellow et al., Generative Adversarial Networks, arXiv:1406.2661
  5. M. G. Bellemare et al., The Cramer Distance as a Solution to Biased Wasserstein Gradients, arXiv:1705.10743
  6. A. Maevskiy et al., Fast Data-Driven Simulation of Cherenkov Detectors Using Generative Adversarial Networks, arXiv:1905.11825

Cite me

Are you referring to my research project? Please cite me!

M. Barbetti, Techniques for parametric simulation with deep neural networks and implementation for the LHCb experiment at CERN and its future upgrades, Master's thesis, University of Florence, 2020

@mastersthesis{Barbetti:2826210,
    author = "Barbetti, Matteo",
    title  = "{Techniques for parametric simulation with deep neural networks and implementation for the LHCb experiment at CERN and its future upgrades}",
    school = "University of Florence",
    year   = "2020",
    url    = "https://cds.cern.ch/record/2826210",
}

Owner

  • Name: Matteo Barbetti
  • Login: mbarbetti
  • Kind: user
  • Location: Firenze, Italy
  • Company: University of Florence

PhD student in Smart Computing @ UniFi

Citation (CITATION.bib)

@mastersthesis{Barbetti:2826210,
    author = "Barbetti, Matteo",
    title  = "{Techniques for parametric simulation with deep neural networks and implementation for the LHCb experiment at CERN and its future upgrades}",
    school = "University of Florence",
    year   = "2020",
    url    = "https://cds.cern.ch/record/2826210",
}

Committers

Last synced: 8 months ago

All Time
  • Total Commits: 4
  • Total Committers: 1
  • Avg Commits per committer: 4.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
  • Matteo Barbetti (m****4@g****m): 4 commits

Issues and Pull Requests

Last synced: 8 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0