dynamicfl

Towards Fairness-aware and Privacy-preserving Enhanced Collaborative Learning for Healthcare

https://github.com/paridis-11/dynamicfl

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.7%) to scientific vocabulary
Last synced: 6 months ago

Repository

Towards Fairness-aware and Privacy-preserving Enhanced Collaborative Learning for Healthcare

Basic Info
  • Host: GitHub
  • Owner: paridis-11
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Size: 2.52 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 1
Created about 2 years ago · Last pushed about 1 year ago
Metadata Files
Readme License Citation

README.md

DynamicFL

Towards Fairness-aware and Privacy-preserving Enhanced Collaborative Learning for Healthcare [Under Review]


Getting Started

1. Data

This project uses three publicly available datasets. Please follow the steps below to download and prepare the datasets:

Dataset 1: ChestXray

  1. Download the dataset from Kaggle.
  2. Create a folder named data in the root directory of this project.
  3. Extract the downloaded dataset into the data/COVID-19 directory.
    • After extraction, the folder structure should look like:
      data/
        COVID-19/
          COVID/
          Normal/
          Lung_Opacity/
          Viral Pneumonia/
  4. Inside each of the class folders (COVID, Normal, Lung_Opacity, Viral Pneumonia), delete the masks subfolder; this project focuses only on classification and does not involve segmentation.
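Step 4 can be scripted. The sketch below is a minimal helper for removing the masks subfolders; the root path and class names are assumed from the layout above, not taken from the project code:

```python
from pathlib import Path
import shutil

def remove_mask_folders(root, class_names):
    """Delete the 'masks' subfolder inside each class directory,
    since only classification (not segmentation) is needed."""
    removed = []
    for cls in class_names:
        masks = Path(root) / cls / "masks"
        if masks.is_dir():
            shutil.rmtree(masks)
            removed.append(str(masks))
    return removed

# Example (paths assumed from the layout above):
# remove_mask_folders("data/COVID-19",
#                     ["COVID", "Normal", "Lung_Opacity", "Viral Pneumonia"])
```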

Dataset 2 & Dataset 3

  1. Download the second and third datasets from Zenodo.
  2. Extract these datasets into the data directory.

2. Install Dependencies

To set up the environment, make sure Python and Anaconda are installed, then install the required libraries from the requirements.txt file. A CUDA 12.1 environment is recommended:

```bash
conda create -n dyfl python=3.9
conda activate dyfl
pip install -r requirements.txt
pip install --upgrade wandb protobuf
```

3. Usage

3.1 Training with Different ViT Architectures

You can train the model with different ViT (Vision Transformer) architectures by specifying the --model parameter. Below are some examples:

```bash
python dyflvit.py --dataset bloodcell --nodenum 30 --model Vit_tiny --device cuda:0
python dyflvit.py --dataset bloodcell --nodenum 30 --model Vit_small --device cuda:0
python dyflvit.py --dataset bloodcell --nodenum 30 --model Vit_Base --device cuda:0
python dyflvit.py --dataset bloodcell --nodenum 30 --model Vit_1B --device cuda:0
```

3.2 Training with Non-IID Data Partitions

You can simulate non-IID (non-Independent and Identically Distributed) data by using the --partition parameter and setting the Dirichlet distribution parameter (--dir). Below are some examples:

```bash
python dyflvit.py --dataset bloodcell --nodenum 30 --model Vit_Base --device cuda:0 --partition dir --dir 0.1
python dyflvit.py --dataset bloodcell --nodenum 30 --model Vit_Base --device cuda:0 --partition dir --dir 0.3
python dyflvit.py --dataset bloodcell --nodenum 30 --model Vit_Base --device cuda:0 --partition dir --dir 0.5
python dyflvit.py --dataset bloodcell --nodenum 30 --model Vit_Base --device cuda:0 --partition dir --dir 0.7
python dyflvit.py --dataset bloodcell --nodenum 30 --model Vit_Base --device cuda:0 --partition dir --dir 0.9
```
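For reference, Dirichlet label partitioning of this kind is commonly implemented as below. This is a generic sketch of the technique, not the project's own partitioner, and the function name is hypothetical; smaller values of alpha (the --dir parameter) produce more skewed, more strongly non-IID splits:

```python
import numpy as np

def dirichlet_partition(labels, num_nodes, alpha, seed=0):
    """Split sample indices across nodes using a class-wise
    Dirichlet distribution; smaller alpha -> more label skew."""
    rng = np.random.default_rng(seed)
    node_indices = [[] for _ in range(num_nodes)]
    for cls in np.unique(labels):
        cls_idx = np.flatnonzero(labels == cls)
        rng.shuffle(cls_idx)
        # Proportion of this class assigned to each node
        props = rng.dirichlet(alpha * np.ones(num_nodes))
        cuts = (np.cumsum(props)[:-1] * len(cls_idx)).astype(int)
        for node, chunk in enumerate(np.split(cls_idx, cuts)):
            node_indices[node].extend(chunk.tolist())
    return node_indices

# Toy example: 1200 samples over 4 classes, 30 nodes, strong skew
labels = np.random.randint(0, 4, size=1200)
parts = dirichlet_partition(labels, num_nodes=30, alpha=0.1)
```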

3.3 Training with Different Device Ratios

You can allocate computational resources across devices in specific ratios using the --deviceratio parameter. Below are some examples:

```bash
python dyflvit.py --dataset bloodcell --nodenum 30 --model Vit_Base --device cuda:0 --deviceratio 7:2:1
python dyflvit.py --dataset bloodcell --nodenum 30 --model Vit_Base --device cuda:0 --deviceratio 5:2:3
python dyflvit.py --dataset bloodcell --nodenum 30 --model Vit_Base --device cuda:0 --deviceratio 4:1:5
python dyflvit.py --dataset bloodcell --nodenum 30 --model Vit_Base --device cuda:0 --deviceratio 4:3:3
python dyflvit.py --dataset bloodcell --nodenum 30 --model Vit_Base --device cuda:0 --deviceratio 3:6:1
```
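A ratio string such as 7:2:1 is typically parsed into per-tier node counts. The sketch below shows one plausible way to do it; the function name and the interpretation of the tiers (e.g. strong:medium:weak devices) are assumptions, not taken from dyflvit.py:

```python
def split_by_ratio(node_num, ratio_str):
    """Assign node_num nodes to capability tiers according to a
    ratio string like '7:2:1'; remainders go to the first tier."""
    parts = [int(p) for p in ratio_str.split(":")]
    total = sum(parts)
    counts = [node_num * p // total for p in parts]
    counts[0] += node_num - sum(counts)  # absorb rounding remainder
    return counts

print(split_by_ratio(30, "7:2:1"))  # -> [21, 6, 3]
```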

3.4 Training with Different Device Counts

You can vary the number of devices (or nodes) in your distributed setup using the --nodenum parameter. Below are some examples:

```bash
python dyflvit.py --dataset bloodcell --nodenum 90 --model Vit_Base --device cuda:0
python dyflvit.py --dataset bloodcell --nodenum 180 --model Vit_Base --device cuda:0
python dyflvit.py --dataset bloodcell --nodenum 360 --model Vit_Base --device cuda:0
```

3.5 Running on CPU Only (High Memory Requirement)

To run on CPU only, set the --device parameter to cpu. Note that CPU-only runs require a very large amount of RAM, especially for the larger ViT models; for CPU testing, ViT-Tiny with at least 128 GB of RAM is recommended.

```bash
python dyflvit.py --dataset bloodcell --device cpu --model Vit_tiny --wandb 0
```

4. Contact

For any questions or issues, please open an issue on the GitHub repository.

Owner

  • Name: paridis
  • Login: paridis-11
  • Kind: user

Citation (CITATION.cff)

message: "If you use this software, please cite it as below."
authors:
  - name: Feilong Zhang
    orcid: "https://orcid.org/0000-0002-6335-2543"
title: Towards Fairness-aware and Privacy-preserving Enhanced Collaborative Learning for Healthcare
version: NC_Final
date-released: 2025-02-28

GitHub Events

Total
  • Release event: 1
  • Watch event: 1
  • Push event: 49
  • Create event: 1
Last Year
  • Release event: 1
  • Watch event: 1
  • Push event: 49
  • Create event: 1