Science Score: 57.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ✓ DOI references: found 1 DOI reference in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (9.8%) to scientific vocabulary
Repository
Federated Learning with StreamFlow
Basic Info
- Host: GitHub
- Owner: alpha-unito
- License: lgpl-3.0
- Language: Python
- Default Branch: master
- Size: 25.4 KB
Statistics
- Stars: 2
- Watchers: 6
- Forks: 3
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Federated Learning with StreamFlow
This repository contains a StreamFlow Federated Learning (FL) pipeline based on PyTorch. The workflow trains a VGG16 model with Group Normalization over two datasets, MNIST and SVHN.
The workflow is described with an extended version of CWL that introduces support for the Loop construct, necessary to describe the training-aggregate iteration of FL workloads.
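The training-aggregate iteration that the Loop construct expresses can be sketched in plain Python. This is only an illustration under stated assumptions: the aggregation rule shown is an unweighted federated average, which this README does not specify, and the names `aggregate`, `run_rounds`, and `make_trainer` are hypothetical, as are the toy list-of-floats "models" standing in for PyTorch state dicts.

```python
from typing import Callable, Dict, List

Model = Dict[str, List[float]]  # toy stand-in for a PyTorch state dict


def aggregate(models: List[Model]) -> Model:
    """Unweighted element-wise mean of the workers' parameters (assumed rule)."""
    if not models:
        raise ValueError("at least one model is required")
    n = len(models)
    return {
        name: [sum(values) / n for values in zip(*(m[name] for m in models))]
        for name in models[0]
    }


def run_rounds(
    initial: Model,
    trainers: List[Callable[[Model], Model]],
    rounds: int,
) -> Model:
    """Repeat the train-then-aggregate step a fixed number of times."""
    global_model = initial
    for _ in range(rounds):
        # Each worker trains on its own private data, starting from the global model.
        local_models = [train(global_model) for train in trainers]
        # The aggregator merges the local models into the next global model.
        global_model = aggregate(local_models)
    return global_model


def make_trainer(shift: float) -> Callable[[Model], Model]:
    """Hypothetical 'training' step: adds a constant to every parameter."""

    def local_train(model: Model) -> Model:
        return {name: [v + shift for v in values] for name, values in model.items()}

    return local_train


if __name__ == "__main__":
    start = {"w": [0.0, 0.0]}
    print(run_rounds(start, [make_trainer(1.0), make_trainer(3.0)], rounds=2))
```

In the actual pipeline the loop body is a pair of workflow steps rather than Python function calls, and each training step runs on a different site.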
The datasets were placed on two different HPC facilities:
- MNIST was used for training on the EPITO cluster at the University of Torino (one 80-core Arm Neoverse N1, 512GB RAM, and 2 NVIDIA A100 GPUs per node);
- SVHN was used for training on the CINECA MARCONI100 cluster in Bologna (two 16-core IBM POWER9 AC922, 256GB RAM, and 4 NVIDIA V100 GPUs per node).
Since HPC worker nodes cannot access the Internet through outbound connections, this workload cannot be managed by FL frameworks that require direct bidirectional connections between worker and aggregator nodes. Conversely, StreamFlow relies on a pull-based data transfer mechanism that overcomes this limitation.
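The connectivity constraint can be illustrated with a toy model in which every transfer is initiated from the orchestrator's side, so workers never open a connection themselves. This is a conceptual sketch only, not StreamFlow's API: the `Worker` and `Orchestrator` classes and their methods are hypothetical, and the in-memory dictionaries stand in for whatever inbound channel (for example SSH) a site actually exposes.

```python
from typing import Dict, List


class Worker:
    """A site that accepts inbound requests but never initiates a connection."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.files: Dict[str, bytes] = {}

    def receive(self, path: str, data: bytes) -> None:
        # Invoked by the orchestrator to place a file on the worker.
        self.files[path] = data

    def run_training(self, model_path: str, output_path: str) -> None:
        # Placeholder for local training: tag the model with the worker name.
        self.files[output_path] = self.files[model_path] + b"+" + self.name.encode()

    def read(self, path: str) -> bytes:
        # Invoked by the orchestrator to pull a result from the worker.
        return self.files[path]


class Orchestrator:
    """Drives every step, so all connections originate from its side."""

    def __init__(self, workers: List[Worker]) -> None:
        self.workers = workers
        self.log: List[str] = []

    def round(self, model: bytes) -> List[bytes]:
        results = []
        for worker in self.workers:
            self.log.append(f"orchestrator -> {worker.name}: send model")
            worker.receive("model.bin", model)
            self.log.append(f"orchestrator -> {worker.name}: start training")
            worker.run_training("model.bin", "update.bin")
            self.log.append(f"orchestrator -> {worker.name}: pull update")
            results.append(worker.read("update.bin"))
        return results
```

Every log entry starts with the orchestrator, which is the property that lets this pattern work when worker nodes have no outbound Internet access.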
To allow a direct comparison between StreamFlow and the Intel OpenFL framework, the pipeline has also been executed with two VMs (8 cores, 32GB RAM, 1 NVIDIA T4 GPU each) hosted on the HPC4AI Cloud at the University of Torino acting as workers. In all configurations, the aggregation plane has been placed on the Cloud.
If you want to cite this work, please use the reference below:
```bibtex
@inproceedings{22:ml4astro,
  location  = {Catania, Italy},
  author    = {Iacopo Colonnelli and
               Bruno Casella and
               Gianluca Mittone and
               Yasir Arfat and
               Barbara Cantalupo and
               Roberto Esposito and
               Alberto Riccardo Martinelli and
               Doriana Medi\'{c} and
               Marco Aldinucci},
  booktitle = {Astrophysics and Space Science Proceedings},
  doi       = {10.1007/978-3-031-34167-0_39},
  editor    = {Filomena Bufano and
               Simone Riggi and
               Eva Sciacca and
               Francesco Schillir\`{o}},
  isbn      = {978-3-031-34167-0},
  pages     = {193--199},
  publisher = {Springer},
  address   = {Cham, Switzerland},
  title     = {Federated Learning meets {HPC} and cloud},
  volume    = {60},
  year      = {2023}
}
```
Usage
To run the experiment as is, clone this repository on the aggregator node and use the following commands:
```bash
python -m venv venv
source venv/bin/activate
pip install "streamflow==0.2.0.dev2"
pip install -r requirements.txt
streamflow run streamflow.yml
```
Reproducing the experiments in the same environment requires access to both HPC facilities and the HPC4AI Cloud. However, interested users can run the same pipeline on their preferred infrastructure by changing the deployment definitions in the streamflow.yml file and the corresponding Slurm/SSH scripts inside the environments folder.
Note also that the Python dependencies listed in the requirements.txt file must be manually installed at every involved location (both the workers and the aggregator), and that the datasets are expected to be already present on the worker nodes.
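Because the dependencies have to be installed by hand at each location, a small check can help confirm that an environment matches the pinned versions before launching the workflow. The script below is a hypothetical helper, not part of this repository; it assumes that requirements.txt uses exact `name==version` pins and it ignores any other line. Pins that are not exact versions are simply reported as mismatches.

```python
from importlib import metadata
from typing import Callable, Dict, Iterable, Optional, Tuple


def parse_pin(line: str) -> Optional[Tuple[str, str]]:
    """Return (name, version) for a 'name==version' line, else None."""
    line = line.split("#", 1)[0].strip()  # drop comments and whitespace
    if "==" not in line:
        return None
    name, version = line.split("==", 1)
    return name.strip(), version.strip()


def installed_version(name: str) -> Optional[str]:
    """Version of the installed distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(
    lines: Iterable[str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each mismatching package to (pinned version, installed version)."""
    mismatches = {}
    for line in lines:
        pin = parse_pin(line)
        if pin is None:
            continue
        name, wanted = pin
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    with open("requirements.txt", encoding="utf-8") as handle:
        for name, (wanted, found) in find_mismatches(handle).items():
            print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

Running the script on the aggregator and on each worker, inside the same virtual environment used for the workflow, lists every package whose installed version differs from the pin.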
Contributors
Iacopo Colonnelli iacopo.colonnelli@unito.it
Bruno Casella bruno.casella@unito.it
Marco Aldinucci marco.aldinucci@unito.it
Owner
- Name: Parallel programming: Alpha group
- Login: alpha-unito
- Kind: organization
- Location: Torino, IT
- Website: http://alpha.di.unito.it
- Repositories: 9
- Profile: https://github.com/alpha-unito
Parallel Computing research cluster, Department of Computer Science, University of Torino
Citation (CITATION.cff)
```yaml
cff-version: 1.2.0
message: "If you want to cite StreamFlow FL, please refer to the article below."
authors:
  - family-names: "Colonnelli"
    given-names: "Iacopo"
    orcid: "https://orcid.org/0000-0001-9290-2017"
title: "Federated Learning meets HPC and cloud"
version: 0.1
url: "https://github.com/alpha-unito/xffl"
preferred-citation:
  type: conference-paper
  authors:
    - family-names: "Colonnelli"
      given-names: "Iacopo"
      orcid: "https://orcid.org/0000-0001-9290-2017"
    - family-names: "Casella"
      given-names: "Bruno"
      orcid: "https://orcid.org/0000-0002-9513-6087"
    - family-names: "Mittone"
      given-names: "Gianluca"
      orcid: "https://orcid.org/0000-0002-1887-6911"
    - family-names: "Arfat"
      given-names: "Yasir"
      orcid: "https://orcid.org/0000-0002-6330-0399"
    - family-names: "Cantalupo"
      given-names: "Barbara"
      orcid: "https://orcid.org/0000-0001-7575-3902"
    - family-names: "Esposito"
      given-names: "Roberto"
      orcid: "https://orcid.org/0000-0003-4708-6860"
    - family-names: "Martinelli"
      given-names: "Alberto Riccardo"
      orcid: "https://orcid.org/0000-0002-3707-7015"
    - family-names: "Medić"
      given-names: "Doriana"
      orcid: "https://orcid.org/0000-0002-7163-5375"
    - family-names: "Aldinucci"
      given-names: "Marco"
      orcid: "https://orcid.org/0000-0001-8788-0829"
  doi: 10.1007/978-3-031-34167-0_39
  collection-title: "Astrophysics and Space Science Proceedings"
  editors:
    - family-names: "Bufano"
      given-names: "Filomena"
      orcid: "https://orcid.org/0000-0002-3429-2481"
    - family-names: "Riggi"
      given-names: "Simone"
      orcid: "https://orcid.org/0000-0001-6368-8330"
    - family-names: "Sciacca"
      given-names: "Eva"
      orcid: "https://orcid.org/0000-0002-5574-2787"
    - family-names: "Schillirò"
      given-names: "Francesco"
      orcid: "https://orcid.org/0000-0001-5106-2277"
  isbn: 978-3-031-34167-0
  publisher:
    name: Springer
    city: Cham
    country: CH
  start: 193
  end: 199
  title: "Federated Learning meets HPC and cloud"
  volume: 60
  year: 2023
```
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1
Dependencies
- Babel ==2.9.1
- Jinja2 ==3.1.1
- Keras-Preprocessing ==1.1.2
- Markdown ==3.3.6
- MarkupSafe ==2.1.1
- Pillow ==9.2.0
- PyYAML ==6.0
- Pygments ==2.11.2
- Pympler ==1.0.1
- Send2Trash ==1.8.0
- Werkzeug ==2.1.0
- absl-py ==1.0.0
- anyio ==3.5.0
- argon2-cffi ==21.3.0
- argon2-cffi-bindings ==21.2.0
- asttokens ==2.0.5
- astunparse ==1.6.3
- attrs ==21.4.0
- backcall ==0.2.0
- beautifulsoup4 ==4.10.0
- bleach ==4.1.0
- brotlipy ==0.7.0
- cachetools ==5.0.0
- certifi ==2021.10.8
- click ==8.0.1
- cloudpickle ==2.0.0
- colorama ==0.4.4
- commonmark ==0.9.1
- cycler ==0.11.0
- debugpy ==1.6.0
- decorator ==5.1.1
- defusedxml ==0.7.1
- docker ==5.0.3
- dynaconf ==3.1.7
- entrypoints ==0.4
- executing ==0.8.3
- flatten-json ==0.1.13
- fonttools ==4.34.4
- gast ==0.3.3
- google-auth ==2.6.2
- google-auth-oauthlib ==0.4.6
- google-pasta ==0.2.0
- grpcio ==1.34.1
- grpcio-tools ==1.34.1
- h5py ==2.10.0
- importlib-metadata ==4.11.3
- importlib-resources ==5.6.0
- ipykernel ==6.11.0
- ipython ==8.2.0
- ipython-genutils ==0.2.0
- jedi ==0.18.1
- joblib ==1.1.0
- json5 ==0.9.6
- jsonschema ==4.4.0
- jupyter-client ==7.2.1
- jupyter-core ==4.9.2
- jupyter-server ==1.16.0
- jupyterlab ==3.3.2
- jupyterlab-pygments ==0.1.2
- jupyterlab-server ==2.12.0
- keras ==2.8.0
- kiwisolver ==1.4.4
- matplotlib ==3.5.2
- matplotlib-inline ==0.1.3
- mistune ==0.8.4
- mkl-fft ==1.3.1
- mkl-service ==2.4.0
- nbclassic ==0.3.7
- nbclient ==0.5.13
- nbconvert ==6.4.5
- nbformat ==5.2.0
- nest-asyncio ==1.5.4
- notebook ==6.4.10
- notebook-shim ==0.1.0
- numpy ==1.18.5
- oauthlib ==3.2.0
- opencv-python ==4.6.0.66
- openfl ==1.3
- opt-einsum ==3.3.0
- packaging ==21.3
- pandas ==1.4.1
- pandocfilters ==1.5.0
- parso ==0.8.3
- pexpect ==4.8.0
- pickleshare ==0.7.5
- pip ==22.1.2
- prometheus-client ==0.13.1
- prompt-toolkit ==3.0.28
- protobuf ==3.19.4
- psutil ==5.9.0
- ptyprocess ==0.7.0
- pure-eval ==0.2.2
- pyasn1 ==0.4.8
- pyasn1-modules ==0.2.8
- pyparsing ==3.0.7
- pyrsistent ==0.18.1
- python-dateutil ==2.8.2
- pytz ==2022.1
- pyzmq ==22.3.0
- requests-oauthlib ==1.3.1
- rich ==9.1.0
- rsa ==4.8
- scikit-learn ==1.0.2
- scipy ==1.8.0
- seaborn ==0.11.2
- setuptools ==61.2.0
- sniffio ==1.2.0
- soupsieve ==2.3.1
- stack-data ==0.2.0
- tensorboard ==2.8.0
- tensorboard-data-server ==0.6.1
- tensorboard-plugin-wit ==1.8.1
- tensorboardX ==2.5
- tensorflow ==2.3.1
- tensorflow-estimator ==2.3.0
- termcolor ==1.1.0
- terminado ==0.13.3
- testpath ==0.6.0
- threadpoolctl ==3.1.0
- torch ==1.11.0
- torchaudio ==0.11.0
- torchsummary ==1.5.1
- torchvision ==0.12.0
- tornado ==6.1
- tqdm ==4.63.1
- traitlets ==5.1.1
- typing-extensions ==3.10.0.2
- wcwidth ==0.2.5
- webencodings ==0.5.1
- websocket-client ==1.3.2
- wheel ==0.37.1
- wrapt ==1.14.0
- zipp ==3.7.0
- scipy ==1.9.
- torch ==1.12.
- torchvision ==0.13.