https://github.com/cern-it-innovation/gqc
Guided Quantum Compression (GQC) network for simultaneous dimensionality reduction and classification of high-dimensional data.
Science Score: 36.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ○ .zenodo.json file
- ✓ DOI references: found 5 DOI reference(s) in README
- ✓ Academic publication links: links to iop.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (16.5%) to scientific vocabulary
Repository
Basic Info
Statistics
- Stars: 4
- Watchers: 3
- Forks: 5
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Guided Quantum Compression for Higgs Identification
Many data sets are too complex for currently available quantum computers. Consequently, quantum machine learning applications conventionally resort to dimensionality reduction algorithms, e.g., auto-encoders, before passing data through the quantum models. We show that using a classical auto-encoder as an independent preprocessing step can significantly decrease the classification performance of a quantum machine learning algorithm. To ameliorate this issue, we design an architecture that unifies the preprocessing and quantum classification algorithms into a single trainable model: the guided quantum compression model. The utility of this model is demonstrated by using it to identify the Higgs boson in proton-proton collisions at the LHC, where the conventional approach proves ineffective. In contrast, the guided quantum compression model excels at solving this classification problem, achieving good accuracy. Additionally, the model developed herein shows better performance compared to the classical benchmark when using only low-level kinematic features.
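As a rough sketch of the idea (illustrative only, not the authors' implementation; the layer sizes, names, and loss weighting below are assumptions), the model can be pictured as an auto-encoder whose latent space also feeds a classifier, with the two objectives optimised jointly:

```python
import torch.nn as nn
import torch.nn.functional as F

class GuidedCompressionSketch(nn.Module):
    """Toy stand-in for the GQC idea: an auto-encoder whose latent
    space also feeds a classifier, trained with a joint objective."""

    def __init__(self, n_features=16, latent_dim=4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_features, 64), nn.ELU(), nn.Linear(64, latent_dim)
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ELU(), nn.Linear(64, n_features)
        )
        # A classical head stands in for the quantum classifier here.
        self.classifier = nn.Sequential(nn.Linear(latent_dim, 1), nn.Sigmoid())

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), self.classifier(z).squeeze(-1)

def joint_loss(x, y, x_hat, y_hat, clf_weight=1.0):
    # The reconstruction term drives the compression, while the
    # classification term guides the latent space towards features
    # that separate signal from background.
    return F.mse_loss(x_hat, x) + clf_weight * F.binary_cross_entropy(y_hat, y)
```

Training both terms together is what distinguishes the guided model from running an auto-encoder as an independent preprocessing step.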
This repository contains the source code for the paper Guided quantum compression for high dimensional data classification.
If you use or build on any part of this code, please cite it as:
@article{Belis_2024,
  title={Guided quantum compression for high dimensional data classification},
  volume={5},
  ISSN={2632-2153},
  url={http://dx.doi.org/10.1088/2632-2153/ad5fdd},
  DOI={10.1088/2632-2153/ad5fdd},
  number={3},
  journal={Machine Learning: Science and Technology},
  publisher={IOP Publishing},
  author={Belis, Vasilis and Odagiu, Patrick and Grossi, Michele and Reiter, Florentin and Dissertori, Günther and Vallecorsa, Sofia},
  year={2024},
  month=jul,
  pages={035010}
}
Installing Dependencies
We strongly recommend using conda to install the dependencies for this repo.
If you have conda, go into the folder with the code you want to run, create an
environment from the .yml file in that folder (conda env create -f <file>.yml),
and activate it (conda activate <env>). Now you can run the code! See the
Running the Code section for further instructions.
If you do not want to use conda, here is a list of the packages you
would need to install:
Pre-processing
* numpy
* pandas
* pytables
* matplotlib
* scikit-learn
Auto-encoders
* numpy
* matplotlib
* scikit-learn
* pytorch (follow the official installation instructions for your platform)
* torchinfo
* pykeops
* g++ compiler version >= 7
* cudatoolkit version >= 10
* geomloss
Pennylane VQC
* numpy
* matplotlib
* scikit-learn
* pytorch (follow the official installation instructions for your platform)
* torchinfo
* pykeops
* g++ compiler version >= 7
* cudatoolkit version >= 10
* geomloss
* pennylane
* pennylane-qiskit
* pennylane-lightning[gpu]
* NVidia cuQuantum SDK
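For orientation, here is a minimal sketch of a variational quantum classifier built with the stack above, using PennyLane's PyTorch interface; the qubit count, embedding, and ansatz are illustrative assumptions, not the exact architecture from the paper:

```python
import pennylane as qml

n_qubits, n_layers = 4, 2
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev, interface="torch")
def circuit(inputs, weights):
    # Encode the compressed latent features as qubit rotations,
    # then apply a trainable entangling ansatz.
    qml.AngleEmbedding(inputs, wires=range(n_qubits))
    qml.StronglyEntanglingLayers(weights, wires=range(n_qubits))
    return qml.expval(qml.PauliZ(0))

# Wrap the circuit as a torch layer so gradients can flow through
# both the circuit and the upstream compression network.
weight_shapes = {"weights": (n_layers, n_qubits, 3)}
vqc = qml.qnn.TorchLayer(circuit, weight_shapes)
```

On a GPU, swapping the device for the one provided by pennylane-lightning[gpu] (qml.device("lightning.gpu", wires=n_qubits)) runs the simulation on the cuQuantum backend.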
The pykeops package is required to run the Sinkhorn auto-encoder. However, it is a tricky package to manage, so make sure that the gcc and g++ compilers on your path are compatible with the CUDA version you are running. We recommend conda for exactly this reason, since it sets the environment variables such that everything is configured correctly and pykeops can compile against CUDA.
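Once the environment is set up, a quick sanity check along these lines confirms that pykeops compiles and that the Sinkhorn loss from geomloss is usable (the tensor shapes here are illustrative):

```python
import torch
import pykeops
from geomloss import SamplesLoss

# Check that pykeops can compile its torch bindings before training.
pykeops.test_torch_bindings()

# Sinkhorn divergence between two point clouds, differentiable in x,
# as used by a Sinkhorn auto-encoder to compare sample distributions.
sinkhorn = SamplesLoss("sinkhorn", p=2, blur=0.05)
x = torch.randn(128, 8, requires_grad=True)
y = torch.randn(128, 8)
sinkhorn(x, y).backward()
```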
If you encounter any bugs, please contact us at the email addresses listed on this repository.
Running the Code
The data preprocessing scripts are run from inside the preprocessing folder. These scripts were customised for the specific data set the authors are using; for access to this data, please contact us.
The preprocessing scripts produce normalised numpy arrays saved to three different files for training, validation, and testing.
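The exact scripts are tied to the authors' data set, but the shape of their output can be reproduced along these lines (the file names and normalisation choice are illustrative assumptions):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

def save_splits(features, labels, outdir="."):
    # Split into train/validation/test, fit the scaler on the
    # training set only, and save one normalised array per split.
    x_train, x_tmp, y_train, y_tmp = train_test_split(
        features, labels, test_size=0.2, random_state=42
    )
    x_valid, x_test, y_valid, y_test = train_test_split(
        x_tmp, y_tmp, test_size=0.5, random_state=42
    )
    scaler = MinMaxScaler().fit(x_train)
    for name, x, y in [
        ("train", x_train, y_train),
        ("valid", x_valid, y_valid),
        ("test", x_test, y_test),
    ]:
        np.save(f"{outdir}/x_{name}.npy", scaler.transform(x))
        np.save(f"{outdir}/y_{name}.npy", y)
```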
The scripts to launch the autoencoder training on the data are in the bin
folder. Look for the run.snip files to see the basic run cases for the
code and customise from there.
Owner
- Name: CERN-IT-INNOVATION
- Login: CERN-IT-INNOVATION
- Kind: organization
- Repositories: 6
- Profile: https://github.com/CERN-IT-INNOVATION
GitHub Events
Total
- Watch event: 1
- Push event: 3
- Fork event: 3
Last Year
- Watch event: 1
- Push event: 3
- Fork event: 3
Dependencies
- cmake >=3.21.4
- geomloss >=0.2.4
- matplotlib >=3.4.3
- numpy >=1.21.4
- optuna >=2.10.0
- pandas >=1.3.4
- pykeops >=1.5
- pyparsing <3
- qiskit >=0.32.0
- qiskit-machine-learning >=0.2.1
- scikit-learn >=1.0.1
- tables >=3.6.1
- torch >=1.10.0
- torchaudio >=0.10.0
- torchinfo >=1.5.3
- torchvision >=0.11.1