https://github.com/aiot-mlsys-lab/meit

[ACL 2025 Findings🔥] Official implementation of "Multi-Modal Electrocardiogram Instruction Tuning on Large Language Models for Report Generation"

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file (found codemeta.json file)
✓ .zenodo.json file (found .zenodo.json file)
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity (low similarity, 7.3%, to scientific vocabulary)

Last synced: 9 months ago · JSON representation

Repository


Basic Info

Host: GitHub
Owner: AIoT-MLSys-Lab
Language: Python
Default Branch: main
Homepage:
Size: 1.65 MB

Statistics

Stars: 2
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created about 2 years ago · Last pushed 12 months ago

Metadata Files

Readme

MEIT: Multi-Modal Electrocardiogram Instruction Tuning on Large Language Models for Report Generation (ACL 2025 Findings🔥)

Easy steps for efficient implementations

Step 1: Download and preprocess the data

1. Download the data from Google Drive to your Linux device. Google Drive link: #############.
2. Preprocess the data: open the 'config.yaml' file and set the paths to the downloaded data:

```
if mimic:
dataset:
  datasetname: 'mimic'  ## this is for mimic dataset 21k
  ecgpath: 'xxxx'  # add your image file path here
  textpath: 'xxxtrain.csv'

if ptbxl:
dataset:
  datasetname: 'ptbxl'  ## this is for PTB-XL dataset 21k
  ecgpath: '/fs/scratch/PAS2473/ptb-xl-a-large-publicly-available-electrocardiography-dataset-1.0.3/'  # add your image file path here
  textpath: '/users/PAS2473/brucewan666/ECG/ECG/instructdata/RGenptbxl_train.csv'
```

3. Run the data preprocessing:

Open preprocess_ecg.py and set your own save path (an example for ptbxl):

```
buildinstructdataset(ecgname='ptbxl', savepath='/users/PAS2473/brucewan666/ECG/ECG/instructdata/ptbxlecg_train.jsonl')  # mimic
```
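After preprocessing, it can help to sanity-check the generated `.jsonl` file before training. The following is a minimal sketch using only the standard library; `peek_jsonl` is a hypothetical helper, and the field name in the demo file is a placeholder, not the schema that preprocess_ecg.py actually writes.

```python
import json
import os
import tempfile


def peek_jsonl(path, n=3):
    """Return the first n records of a JSON Lines file as Python objects."""
    records = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            records.append(json.loads(line))
            if len(records) >= n:
                break
    return records


# Demo on a throwaway file. "example_field" is a placeholder key,
# not the real schema of the instruction dataset.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.jsonl")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write(json.dumps({"example_field": "value 1"}) + "\n")
    f.write(json.dumps({"example_field": "value 2"}) + "\n")

for record in peek_jsonl(demo_path):
    print(sorted(record.keys()))
```

To inspect a real output file, pass the save path used in the preprocessing call instead of `demo_path`.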

4. Set up the environment:

pip install -r requirements.txt

5. Run ECG instruction tuning and inference with a single ECG instruction. An example for the MIMIC-ECG data:

```
export CUDA_VISIBLE_DEVICES=0

MODELSIZE=7B
NUMGPUS=1
BATCHSIZEPERGPU=16
TOTALBATCHSIZE=64 # 144 50277
GRADIENTACCSTEPS=$(($TOTALBATCHSIZE/$NUMGPUS/$BATCHSIZEPERGPU))
echo "Training llama model ${MODELSIZE} using $NUMGPUS GPUs, $BATCHSIZEPERGPU batch size per GPU, $GRADIENTACCSTEPS gradient accumulation steps"
# --usedeepspeed \
# --deepspeedconfigfile /home/wan.512/ECGLLMs/open-instruct/dsconfigs/stage3nooffloadingaccelerate.conf \

# LoRA training
accelerate launch --main_process_port 31225 \
    --mixed_precision bf16 \
    --num_machines 1 \
    --num_processes $NUMGPUS \
    /users/PAS2473/brucewan666/ECG/ECG/finetuneecgllmwithloramimic.py \
    --modelnameorpath meta-llama/Llama-2-7b-hf \
    --uselora \
    --lorarank 64 \
    --loraalpha 128 \
    --loradropout 0.1 \
    --tokenizername meta-llama/Llama-2-7b-hf \
    --useslowtokenizer \
    --trainfile /users/PAS2473/brucewan666/ECG/ECG/instructdata/mimicecg.jsonl \
    --maxseqlength 128 \
    --preprocessingnumworkers 16 \
    --checkpointingsteps epoch \
    --perdevicetrainbatchsize $BATCHSIZEPERGPU \
    --gradientaccumulationsteps $GRADIENTACCSTEPS \
    --learningrate 2e-5 \
    --lrschedulertype linear \
    --warmupratio 0.03 \
    --weightdecay 0. \
    --numtrainepochs 5 \
    --outputdir /fs/scratch/PAS2473/zhongweisaveckpt/gpt2largelorackpt \
    --withtracking \
    --reportto tensorboard \
    --useecgllm \
    --devratio 0.1 \
    --valtestratio 0.1 \
    --loggingsteps 100 \
    --evalstep 3200 \
    --teststep 4000 \
    --llmtype llama2 \
    --cachedir /fs/scratch/PAS2473/zhongweimodels
```
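The script derives the number of gradient accumulation steps from the total batch size, the number of GPUs, and the per-GPU batch size. The arithmetic can be sketched in Python as follows; the function name is hypothetical, and unlike the shell's integer division, this sketch raises an error when the values do not divide evenly instead of silently truncating.

```python
def gradient_accumulation_steps(total_batch_size, num_gpus, batch_size_per_gpu):
    """Steps to accumulate so that one optimizer update sees total_batch_size samples."""
    samples_per_step = num_gpus * batch_size_per_gpu
    if total_batch_size % samples_per_step != 0:
        raise ValueError(
            f"total batch size {total_batch_size} is not a multiple of "
            f"{samples_per_step} (num_gpus * batch_size_per_gpu)"
        )
    return total_batch_size // samples_per_step


# Values from the example script: TOTALBATCHSIZE=64, NUMGPUS=1, BATCHSIZEPERGPU=16
steps = gradient_accumulation_steps(64, 1, 16)
print(steps)  # 4
```

With the example settings, each optimizer update accumulates gradients over 4 forward/backward passes of 16 samples on a single GPU, giving an effective batch size of 64.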

Owner

Name: OSU AIoT-MLSys Lab
Login: AIoT-MLSys-Lab
Kind: organization
Location: United States of America

Website: https://aiot-mlsys-lab.github.io/
Repositories: 15
Profile: https://github.com/AIoT-MLSys-Lab

GitHub Events

Total

Issues event: 2
Watch event: 2
Push event: 1
Public event: 1

Last Year

Issues event: 2
Watch event: 2
Push event: 1
Public event: 1

Dependencies

requirements.txt pypi

accelerate ==0.31.0
alpaca-eval ==0.6.2
antlr4-python3-runtime ==4.11.0
autoflake *
beaker-py *
bitsandbytes >=0.41.1
black *
datasets *
deepspeed ==0.15.0
einops *
evaluate >=0.4.0
fire *
flake8 *
flash-attn ==2.6.3
flask *
gradio >=3.50.2
hf_transfer *
immutabledict *
isort *
jsonlines *
langdetect *
mpmath ==1.3.0
nltk ==3.8.1
openai >=1.0.0
openpyxl *
packaging *
peft >=0.11.1
protobuf *
pytest *
rouge_score *
scipy *
sentencepiece *
sympy ==1.12.0
tensorboard *
termcolor *
tiktoken *
tokenizers ==0.19.1
torch ==2.4.0
transformers ==4.43.4
unidic-lite *
vllm >=0.5.4
wandb *
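
Several of the dependencies above are pinned to exact versions. A minimal sketch for checking an existing environment against those pins, using only the standard library, might look like this; `check_pins` and `PINS` are hypothetical names, and the comparison is a strict string match, so a build with a local version suffix (for example a CUDA-specific torch wheel) would be reported as a mismatch.

```python
from importlib import metadata

# Exact pins copied from the dependency list above.
PINS = {
    "accelerate": "0.31.0",
    "deepspeed": "0.15.0",
    "flash-attn": "2.6.3",
    "tokenizers": "0.19.1",
    "torch": "2.4.0",
    "transformers": "4.43.4",
}


def check_pins(pins, get_version=metadata.version):
    """Return {name: (expected, installed)} for packages that do not match their pin.

    installed is None when the package is not installed at all.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


for name, (expected, installed) in check_pins(PINS).items():
    print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty result means every pinned package is installed at the listed version.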
