Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (11.1%) to scientific vocabulary
Repository
FMEngine [PyTorch version]
Basic Info
- Host: GitHub
- Owner: LorrinWWW
- Language: Python
- Default Branch: init
- Homepage: https://docs.yao.sh/docs/projects/fmengine/
- Size: 142 KB
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
FMEngine
Training preparation
- Prepare checkpoints. As the first step, you will need to split a large model checkpoint into smaller per-layer pieces. This can be done by running the following command (a standalone sketch of the splitting idea appears after this list):
```bash
python scripts/conversions/llama/from_hf.py \
    --model_name_or_path meta-llama/Llama-2-7b-hf \
    --output_dir path_to_outdir/llama2-7b \
    --mp_world_size 1
```
You can download pre-configured checkpoints here: Google Drive.
- Prepare datasets. We currently only support the `.jsonl` format: a file of JSON objects, one per line, each containing a `text` field (a small validation sketch follows this list). For example, a sample of the dataset can be:
```json
{"text": "I love this movie!"}
{"text": "I hate this movie!"}
{"text": "I don't know."}
```
Training
In /scripts, we show some example training scripts. For instance, to fine-tune a Pythia model (here pythia-160m-deduped), you can run the following command:
```bash
deepspeed --num_gpus 4 --num_nodes 1 starter.py \
    --output_dir .cache/models \
    --init_ckpt /pretrained_weights/pythia-160m-deduped \
    --data_path /datasets/quantitative_natural_instructions/train/all.train.jsonl \
    --max_seq_len 1024 \
    --train_steps 1000 \
    --eval_steps 10 \
    --save_steps 100 \
    --log_steps 1 \
    --pipe_parallel_size 1 \
    --model_parallel_size 1 \
    --use_flash_attn true \
    --deepspeed_config ./configs/pythia.json
```
You are also advised to read ./configs/pythia.json for the DeepSpeed configuration, which covers the learning rate, batch size, etc.
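The repo's ./configs/pythia.json is authoritative. Purely as a hedged illustration of the kinds of fields a DeepSpeed config carries (every value below is made up, and the output file name is a placeholder), such a config might be built like this:

```python
# Illustrative only: common DeepSpeed config fields. The real settings live
# in ./configs/pythia.json; every value here is an assumption.
import json

ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "gradient_accumulation_steps": 8,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-5}},
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 1},
    "steps_per_print": 10,
}

with open("pythia.example.json", "w") as f:  # placeholder file name
    json.dump(ds_config, f, indent=2)
```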
Supported Models
(We have only tried fine-tuning, not pretraining, but pretraining should also work.)
| Model | #Params | #Layers | #Heads | #Dim | Pretrained Checkpoint | Flash Attention |
| --- | --- | --- | --- | --- | --- | --- |
| Pythia-160M | 85M | 12 | 12 | 768 | Download | Yes |
| Pythia-1.4B | 1.2B | 24 | 16 | 2048 | Download | Yes |
| Pythia-2.8B | 2.5B | 32 | 32 | 2560 | Download | Yes |
| OpenLlama-3B | tba | tba | tba | tba | Download | Yes |
Multi-host training
We support multi-host training with DeepSpeed. To run multi-host training, you first need to install pdsh, which you can build from source with the following commands:
```bash
git clone https://github.com/chaos/pdsh.git
cd pdsh
./configure --enable-static-modules --without-rsh --with-ssh --without-ssh-connect-timeout-option --prefix=/your/preferred/path
make
make install
```
If you have root access, it may be easier to install pdsh via your system's package manager instead. Once pdsh is installed, DeepSpeed learns which machines to launch on from a hostfile; a sketch follows.
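As a hedged sketch (the hostnames and GPU counts below are placeholders, and the launch line simply mirrors the single-node command above), DeepSpeed multi-node runs read a hostfile with one `<hostname> slots=<num_gpus>` entry per line:

```python
# Sketch: write a DeepSpeed-style hostfile. Each line names a host reachable
# over ssh/pdsh and how many GPU "slots" it offers. Hostnames are placeholders.
hosts = {"worker-1": 4, "worker-2": 4}  # hypothetical nodes with 4 GPUs each

with open("hostfile", "w") as f:
    for host, slots in hosts.items():
        f.write(f"{host} slots={slots}\n")

# Then launch with, e.g.:
#   deepspeed --hostfile hostfile --num_nodes 2 --num_gpus 4 starter.py ...
```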
Owner
- Name: Jue WANG
- Login: LorrinWWW
- Kind: user
- Location: Hangzhou
- Company: Zhejiang University
- Website: https://juewang.me/about/
- Repositories: 3
- Profile: https://github.com/LorrinWWW
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Yao"
    given-names: "Xiaozhe"
    orcid: "https://orcid.org/0000-0002-4661-533X"
title: "FMEngine: Library for Training/Serving Foundation Models"
version: 0.0.1
doi: 10.5281/zenodo.8314779
date-released: 2023-09-04
url: "https://github.com/eth-easl/fmengine"
GitHub Events
Committers
Last synced: about 1 year ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Xiaozhe Yao | a****o@g****m | 54 |
| Jue Wang | j****e@c****i | 1 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 12 months ago
All Time
- Total issues: 0
- Total pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: 3 minutes
- Total issue authors: 0
- Total pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- LorrinWWW (1)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- nvcr.io/nvidia/pytorch 23.08-py3 build
- Cython *
- accelerate *
- datasets *
- deepspeed *
- diffusers *
- evaluate *
- loguru *
- numpy *
- pandas *
- peft *
- scikit-build *
- scikit-learn *
- sentencepiece *
- tabulate *
- tokenizers *
- transformers *
- wandb *