itwinai-mnist-plugin
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
✓CITATION.cff file
Found CITATION.cff file -
✓codemeta.json file
Found codemeta.json file -
✓.zenodo.json file
Found .zenodo.json file -
○DOI references
-
○Academic publication links
-
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (12.1%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: matbun
- License: apache-2.0
- Language: Python
- Default Branch: main
- Size: 182 KB
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Pure torch example on MNIST dataset
Integration author(s): Matteo Bunino (CERN)
In this simple use case integration we demostrate how to use itwinai for a set of simple use cases based on the popular MNIST dataset.
Training a CNN classifier
It is possible to launch the training of a CNN classifier on the MNIST dataset using the YAML configuration file describing the whole training workflow.
```bash
Run the whole training pipeline
itwinai exec-pipeline --config-name config.yaml ```
Notice that the training "pipeline" starts by downloading the dataset if not available locally.
Since on some HPC systems there is no internet connection on the compute nodes, it is
advisable to run the dataloading step on the login node to download the dataset and, later,
the whole pipeline on the compute nodes. To do that, you can use the pipe_steps option as
below:
```bash
Download dataset and exit
itwinai exec-pipeline --config-name config.yaml +pipesteps=[dataloadingstep]
Run the whole pipeline
itwinai exec-pipeline --config-name config.yaml ```
[!NOTE] Setting
HYDRA_FULL_ERROR=1environment variable can be convenient when debugging errors that originate during the instantiation of the pipeline.
View training logs on MLFLow server (if activated from the configuration):
bash
mlflow ui --backend-store-uri mllogs/mlflow/
Hyper-parameter optimization
The CNN classifier can undergo hyper-parameter optimization (HPO) to find the hyper-parameters, such as learning rate and batch size, that result in the best validation performances.
To do so, it is enough to correctly set the search_space and the tune_config in the trainer
configuration in the config.yaml file.
Please refer to the Ray's official documentation to know more about
RunConfig,
TuneConfig,
ScalingConfig,
and search spaces.
Inference
Now you can use the trained model to make predictions on the MNIST dataset.
Notice that the inference is defined by using a different pipeline in the config.yaml file.
By default, the training_pipeline is executed, but you can run other piplines by explicitly
setting the +pipe_key option.
Create sample dataset
python from dataloader import InferenceMNIST InferenceMNIST.generate_jpg_sample('mnist-sample-data/', 10)Generate a dummy pre-trained neural network
python import torch from model import Net dummy_nn = Net() torch.save(dummy_nn, 'mnist-pre-trained.pth')Run inference command. This will generate a "mnist-predictions" folder containing a CSV file with the predictions as rows.
bash itwinai exec-pipeline --config-name config.yaml +pipe_key=inference_pipeline
Note the same entry point as for training.
Training a GAN
In this use case you can also find an example on how to train a Generative Adversarial Network
(GAN). All you need to do is specify that you wish to use the GAN by setting the +pipe_key
option.
```bash
Train a GAN
itwinai exec-pipeline --config-name config.yaml +pipekey=trainingpipeline_gan ```
Docker image
Build from project root with
```bash
Local
docker buildx build -t itwinai:0.0.1-mnist-torch-0.1 -f use-cases/mnist/torch/Dockerfile .
Ghcr.io
docker buildx build -t ghcr.io/intertwin-eu/itwinai:0.0.1-mnist-torch-0.1 -f use-cases/mnist/torch/Dockerfile . docker push ghcr.io/intertwin-eu/itwinai:0.0.1-mnist-torch-0.1 ```
Training with Docker container
bash
docker run -it --rm --name running-inference \
-v "$PWD":/usr/data ghcr.io/intertwin-eu/itwinai:0.01-mnist-torch-0.1 \
/bin/bash -c "itwinai exec-pipeline \
--config-path /usr/src/app \
+pipe_key=training_pipeline \
dataset_root=/usr/data/mnist-dataset "
Inference with Docker container
From wherever a sample of MNIST jpg images is available (folder called 'mnist-sample-data/'):
text
├── $PWD
│ ├── mnist-sample-data
| │ ├── digit_0.jpg
| │ ├── digit_1.jpg
| │ ├── digit_2.jpg
...
| │ ├── digit_N.jpg
bash
docker run -it --rm --name running-inference \
-v "$PWD":/usr/data ghcr.io/intertwin-eu/itwinai:0.01-mnist-torch-0.1 \
/bin/bash -c "itwinai exec-pipeline \
--config-path /usr/src/app \
+pipe_key=inference_pipeline \
test_data_path=/usr/data/mnist-sample-data \
inference_model_mlflow_uri=/usr/src/app/mnist-pre-trained.pth \
predictions_dir=/usr/data/mnist-predictions "
This command will store the results in a folder called "mnist-predictions":
text
├── $PWD
│ ├── mnist-predictions
| │ ├── predictions.csv
Owner
- Name: Matteo Bunino
- Login: matbun
- Kind: user
- Website: matbun.github.io
- Repositories: 1
- Profile: https://github.com/matbun
Fellow @ CERN Openlab. Former data Science student @ {EURECOM, Polytechnic University of Turin}
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: itwinai-plugin-template
message: >-
If you use this software, please cite it using the
metadata from this file.
type: software
authors:
- given-names: Matteo
family-names: Bunino
email: matteo.bunino@cern.ch
affiliation: CERN
orcid: 'https://orcid.org/0009-0008-5100-9300'
repository-code: 'https://github.com/interTwin-eu/itwinai-plugin-template'
url: 'https://itwinai.readthedocs.io/'
abstract: AI on cloud and HPC made simple for science
keywords:
- Artificial intelligence
- Machine learning
- Digital twins
- Climate research
- Physics research
license: Apache-2.0
GitHub Events
Total
- Delete event: 1
- Member event: 1
- Push event: 2
- Pull request event: 2
- Create event: 2
Last Year
- Delete event: 1
- Member event: 1
- Push event: 2
- Pull request event: 2
- Create event: 2
Dependencies
- actions/checkout v4 composite
- gaurav-nelson/github-action-markdown-link-check v1 composite
- actions/checkout v4 composite
- github/super-linter/slim v7 composite
- actions/checkout v4 composite
- eosc-synergy/sqaaas-assessment-action v2 composite
- eosc-synergy/sqaaas-step-action v1 composite
- ghcr.io/intertwin-eu/itwinai torch-slim-latest build
- itwinai [torch] @ git+https://github.com/interTwin-eu/itwinai.git@main
- pytest >=8.3.4
- 158 dependencies