gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
Science Score: 64.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org
- ✓ Committers with academic emails: 8 of 130 committers (6.2%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (7.8%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: EleutherAI
- License: apache-2.0
- Language: Python
- Default Branch: main
- Homepage: https://www.eleuther.ai/
- Size: 114 MB
Statistics
- Stars: 7,266
- Watchers: 127
- Forks: 1,072
- Open Issues: 85
- Releases: 3
Metadata Files
README-MUP.md
How to use Mup (https://github.com/microsoft/mup)
Add mup neox args to your config
```
# mup
"use-mup": true,
"save-base-shapes": false, # this only needs to be enabled once in order to generate the base-shapes-file on each rank
"base-shapes-file": "base-shapes", # load base shapes from this file
"coord-check": false, # generate coord check plots to verify mup's implementation in neox

# mup hp search
"mup-init-scale": 1.0,
"mup-attn-temp": 1.0,
"mup-output-temp": 1.0,
"mup-embedding-mult": 1.0,
"mup-rp-embedding-mult": 1.0,
```
Generate base shapes
- Set use-mup to true
- Set save-base-shapes to true
- Run once. gpt-neox will instantiate a base model and a delta model, save one base-shapes file per rank, and then exit immediately.
- Set save-base-shapes to false
Generate coord check plots (optional)
- Keep use-mup true
- Set coord-check to true
- Run once. gpt-neox will output jpg images similar to those shown at https://github.com/microsoft/mutransformers/blob/main/README.md#coord-check and then exit immediately.
- Set coord-check to false
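The two one-off runs above differ from a normal training run only in which flags are enabled. A minimal Python sketch of those flag combinations follows. The helper `mup_phase_flags` and its phase names are hypothetical and not part of gpt-neox; only the four config keys come from the section above.

```python
# Hypothetical helper: maps each step of the workflow to the mup flags it needs.
# Only the config keys are taken from the README; the phase names are made up.
def mup_phase_flags(phase, base_shapes_file="base-shapes"):
    flags = {
        "use-mup": True,
        "save-base-shapes": False,
        "base-shapes-file": base_shapes_file,
        "coord-check": False,
    }
    if phase == "save-base-shapes":
        flags["save-base-shapes"] = True  # one-off run that saves the base shapes
    elif phase == "coord-check":
        flags["coord-check"] = True  # optional one-off run that writes the plots
    elif phase != "train":
        raise ValueError(f"unknown phase: {phase!r}")
    return flags


if __name__ == "__main__":
    for phase in ("save-base-shapes", "coord-check", "train"):
        print(phase, mup_phase_flags(phase))
```

Keeping the defaults in one place makes it harder to forget to set save-base-shapes or coord-check back to false after the one-off run.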
Tune mup hyperparameters and LR
The values under mup hp search correspond to Appendix F.4 of https://arxiv.org/pdf/2203.03466.pdf. Tune them together with the learning rate (LR) by random search, using the scaled-up config (tested with 6-7B.yml) but with hidden-size set to the value from the scaled-down config (125M.yml).
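One way to organise such a random search is sketched below in Python. The sampling ranges, the log-uniform distribution, the flat `"lr"` key, and the example hidden-size value are illustrative assumptions, not settings taken from gpt-neox or the paper. Only the `mup-*` key names come from the config above.

```python
import math
import random

# Key names from the "mup hp search" part of the config.
MUP_HP_KEYS = [
    "mup-init-scale",
    "mup-attn-temp",
    "mup-output-temp",
    "mup-embedding-mult",
    "mup-rp-embedding-mult",
]


def log_uniform(rng, low, high):
    """Sample between low and high, uniformly in log space."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def sample_trial(rng, small_hidden_size, lr_range=(1e-4, 1e-2), hp_range=(0.25, 4.0)):
    """Return config overrides for one random-search trial.

    The overrides are meant to be applied on top of the scaled-up config,
    with hidden-size replaced by the scaled-down value.
    """
    trial = {"hidden-size": small_hidden_size}
    trial["lr"] = log_uniform(rng, *lr_range)  # assumed flat key, for illustration only
    for key in MUP_HP_KEYS:
        trial[key] = log_uniform(rng, *hp_range)  # assumed range
    return trial


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the trials are reproducible
    for _ in range(3):
        print(sample_trial(rng, small_hidden_size=768))  # 768 is an assumed value
```

Each returned dict would be merged into a copy of the scaled-up config before launching a short training run; how the runs are launched and scored is left out here.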
Transfer
Once the best LR and mup HPs are set, restore hidden-size in the scaled-up config to its original value and run again.
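The transfer step amounts to overlaying the tuned values on the unmodified scaled-up config. A minimal sketch, assuming the configs are already loaded as Python dicts; `build_transfer_config`, the flat `"lr"` key, and the example numbers are all hypothetical.

```python
def build_transfer_config(scaled_up_config, best_trial):
    """Overlay the tuned LR and mup HPs on the scaled-up config.

    hidden-size is deliberately taken from scaled_up_config, which reverts
    the scaled-down value used during the search.
    """
    config = dict(scaled_up_config)  # shallow copy; the input is left untouched
    for key, value in best_trial.items():
        if key != "hidden-size":
            config[key] = value
    return config


if __name__ == "__main__":
    scaled_up = {"hidden-size": 4096, "use-mup": True}  # illustrative values
    best = {"hidden-size": 768, "lr": 3e-3, "mup-attn-temp": 2.0}  # illustrative values
    print(build_transfer_config(scaled_up, best))
```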
Owner
- Name: EleutherAI
- Login: EleutherAI
- Kind: organization
- Email: contact@eleuther.ai
- Location: The Internet
- Website: www.eleuther.ai
- Repositories: 51
- Profile: https://github.com/EleutherAI
Citation (CITATION.cff)
# YAML 1.2
---
authors:
  - affiliation: EleutherAI
    family-names: Andonian
    given-names: Alex
  - affiliation: EleutherAI
    family-names: Anthony
    given-names: Quentin
  - affiliation: EleutherAI
    family-names: Biderman
    given-names: Stella
  - affiliation: EleutherAI
    family-names: Black
    given-names: Sid
  - affiliation: EleutherAI
    family-names: Gali
    given-names: Preetham
  - affiliation: EleutherAI
    family-names: Gao
    given-names: Leo
  - affiliation: EleutherAI
    family-names: Hallahan
    given-names: Eric
  - affiliation: EleutherAI
    family-names: Levy-Kramer
    given-names: Josh
  - affiliation: EleutherAI
    family-names: Leahy
    given-names: Connor
  - affiliation: EleutherAI
    family-names: Nestler
    given-names: Lucas
  - affiliation: EleutherAI
    family-names: Parker
    given-names: Kip
  - affiliation: EleutherAI
    family-names: Pieler
    given-names: Michael
  - affiliation: EleutherAI
    family-names: Phang
    given-names: Jason
  - affiliation: EleutherAI
    family-names: Purohit
    given-names: Shivanshu
  - affiliation: EleutherAI
    family-names: Schoelkopf
    given-names: Hailey
  - affiliation: EleutherAI
    family-names: Stander
    given-names: Dashiell
  - affiliation: EleutherAI
    family-names: Songz
    given-names: Tri
  - affiliation: EleutherAI
    family-names: Tigges
    given-names: Curt
  - affiliation: EleutherAI
    family-names: Thérien
    given-names: Benjamin
  - affiliation: EleutherAI
    family-names: Wang
    given-names: Phil
  - affiliation: EleutherAI
    family-names: Weinbach
    given-names: Samuel
cff-version: "1.1.0"
keywords:
- "Transformers"
- "Massive language model"
- "Autoregressive language model"
license: "Apache-2.0"
message: "If you use this software, please cite it using these metadata."
repository-code: "https://www.github.com/eleutherai/gpt-neox"
title: "GPT-NeoX: Large Scale Autoregressive Language Modeling in PyTorch"
version: "2.0.0"
doi: "10.5281/zenodo.5879544"
date-released: 2021-08-23
...
GitHub Events
Total
- Create event: 27
- Commit comment event: 1
- Issues event: 45
- Watch event: 407
- Delete event: 15
- Issue comment event: 75
- Push event: 86
- Pull request review comment event: 25
- Pull request review event: 46
- Pull request event: 62
- Fork event: 102
Last Year
- Create event: 27
- Commit comment event: 1
- Issues event: 45
- Watch event: 407
- Delete event: 15
- Issue comment event: 75
- Push event: 86
- Pull request review comment event: 25
- Pull request review event: 46
- Pull request event: 62
- Fork event: 102
Committers
Last synced: 9 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Stella Biderman | s****n@g****m | 395 |
| sdtblck | 4****k | 312 |
| Josh Levy-Kramer | j****h@l****k | 222 |
| Samuel Weinbach | s****h@g****m | 206 |
| sid | s****k@a****e | 120 |
| github-actions | g****s@g****m | 73 |
| Quentin Anthony | q****y@y****m | 52 |
| Hailey Schoelkopf | 6****f | 45 |
| trisongz | t****i@s****m | 32 |
| Dashiell Stander | d****r@p****m | 32 |
| Leo Gao | 5****2 | 29 |
| Shivanshu Purohit | 4****t | 27 |
| dmahan93 | 4****3 | 15 |
| Jacob Hatef | 7****f | 15 |
| Phil Wang | l****s@g****m | 14 |
| jack | j****r@a****e | 14 |
| Xu Song | x****p@g****m | 11 |
| yang | 7****g | 10 |
| Aurelion | 3****e | 9 |
| Kyle1668 | k****1@g****m | 9 |
| Samuel Weinbach | s****h@g****m | 9 |
| Eric Hallahan | e****c@h****e | 9 |
| AI-WAIFU | 6****U | 9 |
| haileyschoelkopf | h****f@y****u | 9 |
| Michael Pieler | M****r@G****m | 8 |
| curt-tigges | ct@c****m | 8 |
| connor | c****5@g****m | 7 |
| Jason Phang | j****g@n****u | 7 |
| jaimemcc | 9****l | 7 |
| Satpal Singh Rathore | s****e@g****m | 7 |
| and 100 more... | | |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 79
- Total pull requests: 208
- Average time to close issues: about 2 months
- Average time to close pull requests: about 1 month
- Total issue authors: 38
- Total pull request authors: 52
- Average comments per issue: 2.14
- Average comments per pull request: 0.91
- Merged pull requests: 146
- Bot issues: 0
- Bot pull requests: 1
Past Year
- Issues: 26
- Pull requests: 82
- Average time to close issues: 22 days
- Average time to close pull requests: 14 days
- Issue authors: 17
- Pull request authors: 20
- Average comments per issue: 0.96
- Average comments per pull request: 0.67
- Merged pull requests: 54
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- StellaAthena (18)
- Quentin-Anthony (14)
- sdtblck (8)
- mackmake (4)
- fxnie (4)
- exnx (4)
- iPRET (3)
- jahatef (3)
- anthony-dipofi (2)
- tf-nv (2)
- Kyle1668 (2)
- lieh1203 (2)
- tijmen (2)
- srivassid (2)
- Carolingliang (2)
Pull Request Authors
- Quentin-Anthony (42)
- dmahan93 (34)
- jahatef (31)
- StellaAthena (26)
- sdtblck (17)
- AI-WAIFU (17)
- aurelion-source (15)
- haileyschoelkopf (14)
- yang (13)
- lucidrains (11)
- R0n12 (11)
- segyges (11)
- bclyang (10)
- jaimemcc-intel (8)
- DayOfThePenguin (8)
Dependencies
- autopep8 ==1.5.6 development
- clang-format ==13.0.1 development
- pre-commit * development
- pytest ==6.2.3 development
- pytest-cov ==2.11.1 development
- pytest-forked ==1.3.0 development
- pytest-xdist * development
- transformers * development
- cupy-cuda111 ==8.6.0
- triton ==0.4.2
- tensorboard ==2.5.0
- deepspeed eb7f5cff36678625d23db8a8fe78b4a93e5d2c75
- einops ==0.3.0
- ftfy ==6.0.1
- lm_dataformat ==0.0.20
- lm_eval ==0.2.0
- mpi4py ==3.0.3
- numpy ==1.22.0
- pybind11 ==2.6.2
- regex *
- sentencepiece *
- six *
- tokenizers ==0.10.2
- transformers *
- wandb ==0.10.28
- actions/checkout v3 composite
- actions/setup-python v4 composite
- actions/checkout v2 composite
- crazy-max/ghaction-docker-meta v1 composite
- docker/build-push-action v2 composite
- docker/login-action v1 composite
- docker/setup-buildx-action v1 composite
- docker/setup-qemu-action v1 composite
- actions/checkout v2 composite
- actions/checkout v3 composite
- actions/setup-python v2 composite
- pre-commit/action v2.0.3 composite
- nvidia/cuda 11.1.1-devel-ubuntu20.04 build
- flash-attn ==0.2.2
- boto3 *
- hf-transfer >=0.1.3
- wandb >=0.10.28
- actions/checkout v2 composite
- actions/upload-artifact v3 composite