gpt-neox-llama
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (6.7%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: LorrinWWW
- License: apache-2.0
- Language: Python
- Default Branch: main
- Size: 45.7 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README-MUP.md
How to use Mup (https://github.com/microsoft/mup)
Add mup neox args to your config
```
# mup
"use-mup": true,
"save-base-shapes": false, # this only needs to be enabled once in order to generate the base-shapes-file on each rank
"base-shapes-file": "base-shapes", # load base shapes from this file
"coord-check": false, # generate coord check plots to verify mup's implementation in neox

# mup hp search
"mup-init-scale": 1.0,
"mup-attn-temp": 1.0,
"mup-output-temp": 1.0,
"mup-embedding-mult": 1.0,
"mup-rp-embedding-mult": 1.0,
```
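As a rough illustration, the args above can be layered onto an existing config programmatically. This is a minimal sketch that treats the config as a plain Python dict; the helper name `with_mup_args` and the example `hidden-size` value are hypothetical and not part of gpt-neox.

```python
# Defaults copied from the config fragment above.
MUP_DEFAULTS = {
    "use-mup": True,
    "save-base-shapes": False,
    "base-shapes-file": "base-shapes",
    "coord-check": False,
    "mup-init-scale": 1.0,
    "mup-attn-temp": 1.0,
    "mup-output-temp": 1.0,
    "mup-embedding-mult": 1.0,
    "mup-rp-embedding-mult": 1.0,
}


def with_mup_args(config, overrides=None):
    """Hypothetical helper: return a new dict with the mup args filled in.

    Precedence, lowest to highest: MUP_DEFAULTS, values already present in
    `config`, then `overrides`. The input dict is left unchanged.
    """
    overrides = overrides or {}
    unknown = sorted(set(overrides) - set(MUP_DEFAULTS))
    if unknown:
        raise KeyError(f"not a mup arg: {unknown}")
    return {**MUP_DEFAULTS, **config, **overrides}


if __name__ == "__main__":
    base = {"hidden-size": 768}  # illustrative value only
    cfg = with_mup_args(base, {"mup-attn-temp": 2.0})
    print(cfg["use-mup"], cfg["mup-attn-temp"], cfg["hidden-size"])
```

Rejecting unknown override keys is a design choice of the sketch: it catches a misspelled arg name before a run is launched.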
Generate base shapes
- Set use-mup to true
- Set save-base-shapes to true
- Run once. gpt-neox will instantiate a base model and a delta model, then save one base-shapes file per rank and exit immediately.
- Set save-base-shapes to false
Generate coord check plots (optional)
- Keep use-mup true
- Set coord-check to true
- Run once. gpt-neox will output JPG images similar to https://github.com/microsoft/mutransformers/blob/main/README.md#coord-check and then exit immediately.
- Set coord-check to false
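The two run-once phases above differ from a normal training run only in which flags are switched on. The sketch below records the flag settings for each phase. The phase names and the `config_for_phase` helper are hypothetical. It also assumes the two run-once flags are never enabled together, which the steps above imply but do not state outright.

```python
# Flag settings per phase, following the step lists above.
PHASE_FLAGS = {
    # Run once: writes the base-shapes files, then gpt-neox exits.
    "base-shapes": {"use-mup": True, "save-base-shapes": True, "coord-check": False},
    # Optional, run once: writes coord check plots, then gpt-neox exits.
    "coord-check": {"use-mup": True, "save-base-shapes": False, "coord-check": True},
    # Normal training with mup enabled.
    "train": {"use-mup": True, "save-base-shapes": False, "coord-check": False},
}


def config_for_phase(config, phase):
    """Hypothetical helper: return a copy of `config` with one phase's flags set."""
    if phase not in PHASE_FLAGS:
        raise ValueError(f"unknown phase: {phase!r}")
    return {**config, **PHASE_FLAGS[phase]}


if __name__ == "__main__":
    base = {"base-shapes-file": "base-shapes"}
    for phase in ("base-shapes", "coord-check", "train"):
        print(phase, config_for_phase(base, phase))
```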
Tune mup hyperparameters and LR
The values under mup hp search correspond to Appendix F.4 of https://arxiv.org/pdf/2203.03466.pdf. Tune them, together with the learning rate (LR), by random search using the scaled-up config (tested with 6-7B.yml) but with hidden-size set to the value from the scaled-down config (125M.yml).
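A random search over these values could be sketched as follows. The search ranges, the log-uniform sampling, and the flat `"lr"` key are all assumptions made for illustration: the README does not specify them, and a real gpt-neox config may store the learning rate elsewhere.

```python
import math
import random

# Hypothetical (low, high) ranges; the README does not give any.
SEARCH_SPACE = {
    "lr": (1e-4, 1e-2),
    "mup-init-scale": (0.25, 4.0),
    "mup-attn-temp": (0.25, 4.0),
    "mup-output-temp": (0.25, 4.0),
    "mup-embedding-mult": (0.25, 4.0),
    "mup-rp-embedding-mult": (0.25, 4.0),
}


def sample_trial(rng):
    """Draw one candidate, sampling each value log-uniformly in its range."""
    trial = {}
    for name, (low, high) in SEARCH_SPACE.items():
        trial[name] = math.exp(rng.uniform(math.log(low), math.log(high)))
    return trial


def random_search(n_trials, seed=0):
    """Return `n_trials` independent candidates; a fixed seed makes it repeatable."""
    rng = random.Random(seed)
    return [sample_trial(rng) for _ in range(n_trials)]


if __name__ == "__main__":
    for trial in random_search(3):
        print({k: round(v, 5) for k, v in trial.items()})
```

Each candidate would then be trained on the narrow proxy model, and the one with the best validation loss kept.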
Transfer
With the best LR and the best mup HPs set, restore hidden-size in the scaled-up config to its original value and run again.
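The tune-then-transfer flow can be sketched as two small config transformations. The helper names are hypothetical, configs are treated as plain dicts, and the `hidden-size` and `num-layers` values in the usage example are illustrative rather than read from 125M.yml or 6-7B.yml.

```python
def make_proxy_config(scaled_up, scaled_down):
    """Scaled-up config with only hidden-size taken from the scaled-down config.

    This is the narrow model on which LR and the mup HPs are tuned.
    """
    proxy = dict(scaled_up)
    proxy["hidden-size"] = scaled_down["hidden-size"]
    return proxy


def make_transfer_config(scaled_up, best_hps):
    """Scaled-up config at its original hidden-size with the tuned values applied."""
    if "hidden-size" in best_hps:
        raise ValueError("hidden-size is not a tuned hyperparameter")
    final = dict(scaled_up)
    final.update(best_hps)
    return final


if __name__ == "__main__":
    # Illustrative values only.
    large = {"hidden-size": 4096, "num-layers": 32, "use-mup": True}
    small = {"hidden-size": 768, "num-layers": 12}

    proxy = make_proxy_config(large, small)
    best = {"lr": 3e-3, "mup-attn-temp": 2.0}  # stand-in for the search result
    final = make_transfer_config(large, best)
    print(proxy["hidden-size"], proxy["num-layers"])
    print(final["hidden-size"], final["lr"], final["mup-attn-temp"])
```

Only hidden-size changes between the proxy and the final run; every other setting of the scaled-up config stays the same in both.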
Owner
- Name: Jue WANG
- Login: LorrinWWW
- Kind: user
- Location: Hangzhou
- Company: Zhejiang University
- Website: https://juewang.me/about/
- Repositories: 3
- Profile: https://github.com/LorrinWWW
Citation (CITATION.cff)
# YAML 1.2
---
authors:
  - affiliation: EleutherAI
    family-names: Andonian
    given-names: Alex
  - affiliation: EleutherAI
    family-names: Biderman
    given-names: Stella
  - affiliation: EleutherAI
    family-names: Black
    given-names: Sid
  - affiliation: EleutherAI
    family-names: Gali
    given-names: Preetham
  - affiliation: EleutherAI
    family-names: Gao
    given-names: Leo
  - affiliation: EleutherAI
    family-names: Hallahan
    given-names: Eric
  - affiliation: EleutherAI
    family-names: Levy-Kramer
    given-names: Josh
  - affiliation: EleutherAI
    family-names: Leahy
    given-names: Connor
  - affiliation: EleutherAI
    family-names: Nestler
    given-names: Lucas
  - affiliation: EleutherAI
    family-names: Parker
    given-names: Kip
  - affiliation: EleutherAI
    family-names: Pieler
    given-names: Michael
  - affiliation: EleutherAI
    family-names: Purohit
    given-names: Shivanshu
  - affiliation: EleutherAI
    family-names: Songz
    given-names: Tri
  - affiliation: EleutherAI
    family-names: Phil
    given-names: Wang
  - affiliation: EleutherAI
    family-names: Weinbach
    given-names: Samuel
cff-version: "1.1.0"
keywords:
- "Transformers"
- "Massive language model"
- "Autoregressive language model"
license: "Apache-2.0"
message: "If you use this software, please cite it using these metadata."
repository-code: "https://www.github.com/eleutherai/gpt-neox"
title: "GPT-NeoX: Large Scale Autoregressive Language Modeling in PyTorch"
version: "0.0.1"
doi: "10.5281/zenodo.5879544"
date-released: 2021-08-23
...
Committers
Last synced: over 1 year ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Jue WANG | z****e@g****m | 2 |
| Jue Wang | j****e@c****i | 2 |
Issues and Pull Requests
Last synced: 12 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0