kerple

https://github.com/chijames/kerple

Repository

Basic Info

Host: GitHub
Owner: chijames
License: apache-2.0
Language: Python
Default Branch: main
Size: 212 KB

Statistics

Stars: 19
Watchers: 1
Forks: 1
Open Issues: 0
Releases: 1

Created over 3 years ago · Last pushed over 3 years ago

Metadata Files

Readme License Citation Codeowners

KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation

PyTorch implementation of the paper KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation accepted at NeurIPS 2022. This repository is adapted from the awesome gpt-neox library.

Important Changes and Information

This repository was developed from commit 450b58c4ad7f36c319ca0b2f089c7349f34d8c3b of gpt-neox. We then bumped it to commit 738b87e73775e2cef4ea0a898b655f5d717cb8a0 to pick up some bug fixes that are unrelated to this project. We keep only the main branch.
We remove the .github/ folder as it is not needed in our experiments.
The original gpt-neox readme is renamed as READMEgptneox.md.
The config files used in our experiments are stored in kerple_configs/.
The two proposed positional embeddings are called ParallelKerplePower and ParallelKerpleLog in this repository. A simple grep will point you to our implementation.
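
For orientation, here is a minimal pure-Python sketch of the two bias shapes as we understand them from the paper: the logarithmic variant adds -r1 * log(1 + r2 * |m - n|) to the attention logit between positions m and n, and the power variant adds -r1 * |m - n| ** r2. The function names and the plain-list representation are illustrative assumptions. The repository's ParallelKerpleLog and ParallelKerplePower classes are PyTorch modules with learnable parameters and may differ in detail.

```python
import math

def kerple_log_bias(seq_len, r1, r2):
    """Hypothetical helper: seq_len x seq_len matrix of -r1 * log(1 + r2 * |m - n|)."""
    return [[-r1 * math.log1p(r2 * abs(m - n)) for n in range(seq_len)]
            for m in range(seq_len)]

def kerple_power_bias(seq_len, r1, r2):
    """Hypothetical helper: seq_len x seq_len matrix of -r1 * |m - n| ** r2."""
    return [[-r1 * abs(m - n) ** r2 for n in range(seq_len)]
            for m in range(seq_len)]

# The bias depends only on the distance |m - n|, so the same formula can be
# evaluated at sequence lengths longer than those seen in training.
log_bias = kerple_log_bias(4, r1=1.0, r2=1.0)
power_bias = kerple_power_bias(4, r1=1.0, r2=1.0)
```

Both matrices are zero on the diagonal and grow more negative with distance, so distant tokens are penalized more heavily. The logarithmic variant grows more slowly than the power variant with r2 = 1.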

Installation

Please refer to the original readme READMEgptneox.md for details. We use the Host Setup without fused kernels.

Data Preparation

Warning: These datasets are huge! Please make sure you have at least 250 GB of disk space before downloading them all.

We use the three preconfigured datasets in the original gpt-neox repository:

python prepare_data.py -d ./data openwebtext2
python prepare_data.py -d ./data arxiv
python prepare_data.py -d ./data github

Please refer to the original readme READMEgptneox.md for details.
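
If it is more convenient to drive the three downloads from one script, a small wrapper along these lines could work. It is only a sketch: the helper name build_prepare_commands is hypothetical, and it assumes prepare_data.py is run from the repository root with exactly the arguments shown above.

```python
import subprocess

DATASETS = ["openwebtext2", "arxiv", "github"]

def build_prepare_commands(datasets=DATASETS, data_dir="./data"):
    """Hypothetical helper: one prepare_data.py command line per dataset."""
    return [["python", "prepare_data.py", "-d", data_dir, name]
            for name in datasets]

def prepare_all(datasets=DATASETS, data_dir="./data"):
    """Run the commands one after another, stopping at the first failure."""
    for cmd in build_prepare_commands(datasets, data_dir):
        subprocess.run(cmd, check=True)

if __name__ == "__main__":
    prepare_all()
```

Running the downloads sequentially with check=True means a failed download stops the script rather than silently continuing to the next dataset.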

Config Preparation

python generate_ymls.py

Training

bash train.sh

Testing

bash test.sh

Pretrained Models

We release 6 pretrained checkpoints: kerple_log and kerple_power pretrained on the above three datasets.

Please navigate to Releases to download the checkpoints.
You can right-click a filename, copy its link address, and use wget to download it directly from the command line.
Once the files are downloaded, unzip them and leave them in the current directory.
Run test.sh, and the extrapolation performance should be very close to the numbers reported in Table 3 of the paper.
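
The unzip step can also be scripted. The sketch below assumes the downloaded checkpoints are ordinary .zip archives sitting in the current directory. The helper name extract_checkpoints is hypothetical, and the archive names are not assumed.

```python
import zipfile
from pathlib import Path

def extract_checkpoints(directory="."):
    """Extract every .zip archive found directly in `directory`, in place.

    Returns the names of the archives that were extracted.
    """
    extracted = []
    for archive in sorted(Path(directory).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(directory)
        extracted.append(archive.name)
    return extracted

if __name__ == "__main__":
    for name in extract_checkpoints("."):
        print("extracted", name)
```

Extracting in place leaves the checkpoint folders in the current directory, which is where the instructions above say to keep them before running test.sh.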

Owner

Login: chijames
Kind: user

Repositories: 3
Profile: https://github.com/chijames

Citation (CITATION.cff)

# YAML 1.2
---
authors:
  - affiliation: EleutherAI
    family-names: Andonian
    given-names: Alex
  - affiliation: EleutherAI
    family-names: Biderman
    given-names: Stella
  - affiliation: EleutherAI
    family-names: Black
    given-names: Sid
  - affiliation: EleutherAI
    family-names: Gali
    given-names: Preetham
  - affiliation: EleutherAI
    family-names: Gao
    given-names: Leo
  - affiliation: EleutherAI
    family-names: Hallahan
    given-names: Eric
  - affiliation: EleutherAI
    family-names: Levy-Kramer
    given-names: Josh
  - affiliation: EleutherAI
    family-names: Leahy
    given-names: Connor
  - affiliation: EleutherAI
    family-names: Nestler
    given-names: Lucas
  - affiliation: EleutherAI
    family-names: Parker
    given-names: Kip
  - affiliation: EleutherAI
    family-names: Pieler
    given-names: Michael
  - affiliation: EleutherAI
    family-names: Purohit
    given-names: Shivanshu
  - affiliation: EleutherAI
    family-names: Songz
    given-names: Tri
  - affiliation: EleutherAI
    family-names: Phil
    given-names: Wang
  - affiliation: EleutherAI
    family-names: Weinbach
    given-names: Samuel
cff-version: "1.1.0"
keywords:
  - "Transformers"
  - "Massive language model"
  - "Autoregressive language model"
license: "Apache-2.0"
message: "If you use this software, please cite it using these metadata."
repository-code: "https://www.github.com/eleutherai/gpt-neox"
title: "GPT-NeoX: Large Scale Autoregressive Language Modeling in PyTorch"
version: "0.0.1"
doi: "10.5281/zenodo.5879544"
date-released: 2021-08-23
...

GitHub Events

Total

Watch event: 3

Last Year

Watch event: 3

Dependencies

Dockerfile docker

nvidia/cuda 11.1.1-devel-ubuntu20.04 build

requirements/requirements-dev.txt pypi

autopep8 ==1.5.6 development
clang-format ==13.0.1 development
pre-commit * development
pytest ==6.2.3 development
pytest-cov ==2.11.1 development
pytest-forked ==1.3.0 development
pytest-xdist * development
transformers * development

requirements/requirements-onebitadam.txt pypi

cupy-cuda111 ==8.6.0

requirements/requirements-sparseattention.txt pypi

triton ==0.4.2

requirements/requirements-tensorboard.txt pypi

tensorboard ==2.5.0

requirements/requirements.txt pypi

deepspeed eb7f5cff36678625d23db8a8fe78b4a93e5d2c75
einops ==0.3.0
ftfy ==6.0.1
lm_eval ==0.2.0
mpi4py ==3.0.3
numpy ==1.22.0
pybind11 ==2.6.2
regex *
sentencepiece *
six *
tokenizers ==0.10.2
transformers *
wandb ==0.10.28
