llm-gpt-neox

https://github.com/tilde-nlp/llm-gpt-neox

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
✓ Academic publication links: Links to arxiv.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (6.7%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: tilde-nlp
License: apache-2.0
Language: Python
Default Branch: main
Size: 73 MB

Statistics

Stars: 0
Watchers: 4
Forks: 1
Open Issues: 0
Releases: 0

Created over 1 year ago · Last pushed 10 months ago

Metadata Files

Readme Contributing License Citation Codeowners

How to use Mup (https://github.com/microsoft/mup)

Add mup neox args to your config

```
# mup

"use-mup": true,
"save-base-shapes": false, # this only needs to be enabled once in order to generate the base-shapes-file on each rank
"base-shapes-file": "base-shapes", # load base shapes from this file
"coord-check": false, # generate coord check plots to verify mup's implementation in neox

# mup hp search

"mup-init-scale": 1.0,
"mup-attn-temp": 1.0,
"mup-output-temp": 1.0,
"mup-embedding-mult": 1.0,
"mup-rp-embedding-mult": 1.0,
```

Generate base shapes

Set use-mup to true
Set save-base-shapes to true
Run once. gpt-neox will instantiate a base model and a delta model, then save one file per rank named .. gpt-neox will exit immediately.
Set save-base-shapes to false

Generate coord check plots (optional)

Keep use-mup true
Set coord-check to true
Run once. gpt-neox will output jpg images similar to https://github.com/microsoft/mutransformers/blob/main/README.md#coord-check, then exit immediately.
Set coord-check to false
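
The one-off runs above differ from a normal training run only in the values of three flags. As a rough sketch, the flag combinations can be written down as a small lookup table. The helper name `mup_overrides` and the stage names are hypothetical and not part of gpt-neox; only the config keys and their values come from the steps above.

```python
# Flag values per stage, taken from the steps above.
# The stage names are illustrative labels, not gpt-neox options.
STAGES = {
    "save-base-shapes": {"use-mup": True, "save-base-shapes": True, "coord-check": False},
    "coord-check": {"use-mup": True, "save-base-shapes": False, "coord-check": True},
    "train": {"use-mup": True, "save-base-shapes": False, "coord-check": False},
}


def mup_overrides(stage, base_shapes_file="base-shapes"):
    """Return the mup flags for one stage (hypothetical helper)."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage!r}")
    overrides = dict(STAGES[stage])  # copy so callers cannot mutate the table
    overrides["base-shapes-file"] = base_shapes_file
    return overrides
```

Merging the returned dictionary into the config before each run keeps the two setup runs from being left enabled by accident.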

Tune mup hyperparameters and LR

The values under mup hp search were added and correspond to Appendix F.4 of https://arxiv.org/pdf/2203.03466.pdf. Tune them together with the learning rate (LR) by random search, using the scaled-up config (tested with 6-7B.yml) but with hidden-size set to the value from the scaled-down config (125M.yml).
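
A random search over these values could be sketched as follows. The search ranges, the log-uniform sampling, and the function names are assumptions made for illustration; the source does not specify them. Each sampled trial would then be written into the narrowed config and trained separately.

```python
import math
import random

# ASSUMED search ranges (low, high); not taken from the source.
SEARCH_SPACE = {
    "mup-init-scale": (0.1, 10.0),
    "mup-attn-temp": (0.1, 10.0),
    "mup-output-temp": (0.1, 10.0),
    "mup-embedding-mult": (0.1, 10.0),
    "mup-rp-embedding-mult": (0.1, 10.0),
}
LR_RANGE = (1e-5, 1e-2)  # ASSUMED learning-rate range


def log_uniform(rng, low, high):
    """Sample so that log(value) is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def sample_trial(rng):
    """Draw one set of mup hyperparameters plus a learning rate."""
    trial = {name: log_uniform(rng, low, high)
             for name, (low, high) in SEARCH_SPACE.items()}
    trial["lr"] = log_uniform(rng, *LR_RANGE)
    return trial


def make_trials(n, seed=0):
    """Return n independent trials; a fixed seed makes the search reproducible."""
    rng = random.Random(seed)
    return [sample_trial(rng) for _ in range(n)]
```

Log-uniform sampling is a common choice for multiplicative hyperparameters because it spreads trials evenly across orders of magnitude, but a uniform or grid search would fit the same workflow.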

Transfer

With the best LR and the best mup HPs set, restore hidden-size in the scaled-up config to its original value and run again.
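
One way to keep the search run and the transfer run consistent is to derive both from the same scaled-up config, as in the sketch below. The helper names are hypothetical. The nested `optimizer` / `params` / `lr` layout is an assumption about where the learning rate lives in the config, so adjust the path if your config differs.

```python
import copy


def proxy_config(scaled_up, proxy_hidden_size):
    """Scaled-up config narrowed to the scaled-down width, for the HP search."""
    cfg = copy.deepcopy(scaled_up)
    cfg["hidden-size"] = proxy_hidden_size
    return cfg


def transfer_config(scaled_up, best_mup_hps, best_lr):
    """Scaled-up config at its original width with the tuned values applied."""
    if "hidden-size" in best_mup_hps:
        raise ValueError("hidden-size must not be overridden during transfer")
    cfg = copy.deepcopy(scaled_up)  # leave the caller's config untouched
    cfg.update(best_mup_hps)
    # ASSUMPTION: the learning rate is stored under optimizer -> params -> lr.
    cfg.setdefault("optimizer", {}).setdefault("params", {})["lr"] = best_lr
    return cfg
```

Because `transfer_config` starts from the unmodified scaled-up config, hidden-size is restored automatically and cannot be carried over from the search run by mistake.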

Owner

Name: Tilde
Login: tilde-nlp
Kind: organization
Location: Riga, Latvia

Website: http://www.tilde.com
Repositories: 58
Profile: https://github.com/tilde-nlp

Citation (CITATION.cff)

# YAML 1.2
---
authors:
  - affiliation: EleutherAI
    family-names: Andonian
    given-names: Alex
  - affiliation: EleutherAI
    family-names: Anthony
    given-names: Quentin
  - affiliation: EleutherAI
    family-names: Biderman
    given-names: Stella
  - affiliation: EleutherAI
    family-names: Black
    given-names: Sid
  - affiliation: EleutherAI
    family-names: Gali
    given-names: Preetham
  - affiliation: EleutherAI
    family-names: Gao
    given-names: Leo
  - affiliation: EleutherAI
    family-names: Hallahan
    given-names: Eric
  - affiliation: EleutherAI
    family-names: Levy-Kramer
    given-names: Josh
  - affiliation: EleutherAI
    family-names: Leahy
    given-names: Connor
  - affiliation: EleutherAI
    family-names: Nestler
    given-names: Lucas
  - affiliation: EleutherAI
    family-names: Parker
    given-names: Kip
  - affiliation: EleutherAI
    family-names: Pieler
    given-names: Michael
  - affiliation: EleutherAI
    family-names: Phang
    given-names: Jason
  - affiliation: EleutherAI
    family-names: Purohit
    given-names: Shivanshu
  - affiliation: EleutherAI
    family-names: Schoelkopf
    given-names: Hailey
  - affiliation: EleutherAI
    family-names: Stander
    given-names: Dashiell
  - affiliation: EleutherAI
    family-names: Songz
    given-names: Tri
  - affiliation: EleutherAI
    family-names: Tigges
    given-names: Curt
  - affiliation: EleutherAI
    family-names: Thérien
    given-names: Benjamin
  - affiliation: EleutherAI
    family-names: Wang
    given-names: Phil
  - affiliation: EleutherAI
    family-names: Weinbach
    given-names: Samuel
cff-version: "1.1.0"
keywords:
  - "Transformers"
  - "Massive language model"
  - "Autoregressive language model"
license: "Apache-2.0"
message: "If you use this software, please cite it using these metadata."
repository-code: "https://www.github.com/eleutherai/gpt-neox"
title: "GPT-NeoX: Large Scale Autoregressive Language Modeling in PyTorch"
version: "2.0.0"
doi: "10.5281/zenodo.5879544"
date-released: 2021-08-23
...

GitHub Events

Total

Delete event: 3
Issue comment event: 1
Member event: 2
Push event: 158
Pull request review event: 5
Pull request review comment event: 3
Pull request event: 7
Fork event: 1
Create event: 5

Last Year

Delete event: 3
Issue comment event: 1
Member event: 2
Push event: 158
Pull request review event: 5
Pull request review comment event: 3
Pull request event: 7
Fork event: 1
Create event: 5

Issues and Pull Requests

Last synced: 10 months ago

All Time

Total issues: 0
Total pull requests: 5
Average time to close issues: N/A
Average time to close pull requests: 3 days
Total issue authors: 0
Total pull request authors: 2
Average comments per issue: 0
Average comments per pull request: 0.0
Merged pull requests: 2
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 5
Average time to close issues: N/A
Average time to close pull requests: 3 days
Issue authors: 0
Pull request authors: 2
Average comments per issue: 0
Average comments per pull request: 0.0
Merged pull requests: 2
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

Pull Request Authors

SilverSulfide (4)
iPRET (1)

Top Labels

Issue Labels

Pull Request Labels

Dependencies

.github/workflows/.cpu_ci_on_pr.yml actions

./tests/cpu_tests * composite
actions/checkout v4 composite

.github/workflows/coverity_scan.yml actions

actions/checkout v2 composite
actions/upload-artifact v3 composite

.github/workflows/cpu_ci.yml actions

actions/checkout v3 composite
actions/setup-python v4 composite

.github/workflows/cpu_ci_dispatch.yml actions

./tests/cpu_tests * composite
actions/checkout v4 composite

.github/workflows/docker_build.yml actions

actions/checkout v2 composite
crazy-max/ghaction-docker-meta v1 composite
docker/build-push-action v2 composite
docker/login-action v1 composite
docker/setup-buildx-action v1 composite
docker/setup-qemu-action v1 composite

.github/workflows/pull_request.yml actions

actions/checkout v2 composite
actions/checkout v3 composite
actions/setup-python v4 composite
docker/build-push-action v2 composite
docker/setup-buildx-action v1 composite
pre-commit/action v2.0.3 composite

tests/cpu_tests/action.yml actions

actions/checkout v4 composite
actions/setup-python v4 composite

Dockerfile docker

nvcr.io/nvidia/pytorch 24.02-py3 build

docker-compose-dockerhub.yml docker

leogao2/gpt-neox main

docker-compose.yml docker

gpt-neox latest

tests/cpu_tests/docker-compose.yml docker

gpt-neox latest

megatron/fused_kernels/setup.py pypi

requirements/pyproject.toml pypi

requirements/requirements-apex-pip.txt pypi

pip ==23.3.2

requirements/requirements-comet.txt pypi

comet_ml >=3.45.0

requirements/requirements-dev.txt pypi

autopep8 >=1.5.6 development
clang-format >=13.0.1 development
packaging >=23.0 development
pre-commit >=2.17.0 development
pytest >=6.2.3 development
pytest-cov >=2.11.1 development
pytest-forked >=1.3.0 development
pytest-html ==4.1.1 development
pytest-xdist * development
toml >=0.10.2 development

requirements/requirements-flashattention.txt pypi

flash-attn ==2.5.6

requirements/requirements-mamba.txt pypi

causal_conv1d >=1.1.0
einops *
mamba_ssm >=1.2.0.post1

requirements/requirements-onebitadam.txt pypi

cupy-cuda111 >=8.6.0

requirements/requirements-s3.txt pypi

boto3 *
hf-transfer >=0.1.3

requirements/requirements-sparseattention.txt pypi

triton ==2.1.0

requirements/requirements-tensorboard.txt pypi

tensorboard ==2.13.0

requirements/requirements-transformerengine.txt pypi

requirements/requirements-wandb.txt pypi

wandb >=0.10.28

requirements/requirements.txt pypi

ftfy >=6.0.1
huggingface_hub >=0.11.0
jinja2 ==3.1.4
lm_eval >=0.4.0,<=0.4.1
mpi4py >=3.0.3
numpy <2.0
pybind11 >=2.6.2
regex *
sentencepiece *
six *
tiktoken >=0.1.2
tokenizers >=0.12.1
transformers ==4.38.0
