Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.5%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: chijames
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Size: 212 KB
Statistics
  • Stars: 19
  • Watchers: 1
  • Forks: 1
  • Open Issues: 0
  • Releases: 1
Created over 3 years ago · Last pushed over 3 years ago
Metadata Files
Readme License Citation Codeowners

README.md

KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation

PyTorch implementation of the paper "KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation", accepted at NeurIPS 2022. This repository is adapted from the awesome gpt-neox library.

Important Changes and Information

  1. This repository was developed from commit 450b58c4ad7f36c319ca0b2f089c7349f34d8c3b of gpt-neox. We then bumped it to commit 738b87e73775e2cef4ea0a898b655f5d717cb8a0 to pick up some bug fixes that are unrelated to this project. We keep only the main branch.
  2. We remove the .github/ folder as it is not needed in our experiments.
  3. The original gpt-neox readme is renamed as READMEgptneox.md.
  4. The config files used in our experiments are stored in kerple_configs/.
  5. The two proposed positional embeddings are called ParallelKerplePower and ParallelKerpleLog in this repository. A simple grep will point you to our implementation.
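As a rough orientation for what the two embeddings compute, the sketch below builds the relative-position bias that is added to the attention logits before the softmax, assuming the forms described in the paper: a power variant, -r1 * |m - n|^r2 with r1 > 0 and 0 < r2 <= 2, and a logarithmic variant, -r1 * log(1 + r2 * |m - n|) with r1, r2 > 0. The function names are hypothetical and this is not the code behind ParallelKerplePower or ParallelKerpleLog. The repository's classes are PyTorch modules with learnable parameters, whereas this uses fixed scalars and plain Python lists.

```python
import math


def kerple_power_bias(seq_len, r1, r2):
    """Power variant (hypothetical helper): bias[m][n] = -r1 * |m - n| ** r2."""
    if r1 <= 0 or not (0 < r2 <= 2):
        raise ValueError("expected r1 > 0 and 0 < r2 <= 2")
    return [
        [-r1 * abs(m - n) ** r2 for n in range(seq_len)]
        for m in range(seq_len)
    ]


def kerple_log_bias(seq_len, r1, r2):
    """Log variant (hypothetical helper): bias[m][n] = -r1 * log(1 + r2 * |m - n|)."""
    if r1 <= 0 or r2 <= 0:
        raise ValueError("expected r1 > 0 and r2 > 0")
    return [
        [-r1 * math.log(1 + r2 * abs(m - n)) for n in range(seq_len)]
        for m in range(seq_len)
    ]


if __name__ == "__main__":
    # The bias depends only on the distance |m - n|, so it is defined for
    # any sequence length, including lengths longer than those seen in training.
    for row in kerple_log_bias(4, r1=1.0, r2=1.0):
        print([round(v, 3) for v in row])
```

Both biases are zero on the diagonal and become more negative as the distance grows, so distant tokens are penalized more strongly. The log variant grows much more slowly with distance than the power variant.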

Installation

Please refer to the original readme READMEgptneox.md for details. We use the Host Setup without fused kernels.

Data Preparation

Warning: These datasets are huge! Please make sure you have at least 250 GB of free disk space before downloading them all.

We use the three preconfigured datasets in the original gpt-neox repository:

python prepare_data.py -d ./data openwebtext2
python prepare_data.py -d ./data arxiv
python prepare_data.py -d ./data github

Please refer to the original readme READMEgptneox.md for details.
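Since the downloads need a lot of space, it can help to check free disk space before starting. The sketch below is one possible wrapper around the three commands above. The helper names are hypothetical, it assumes it is run from the repository root, and it treats 1 GB as 10^9 bytes.

```python
import shutil
import subprocess
import sys

DATASETS = ["openwebtext2", "arxiv", "github"]
MIN_FREE_GB = 250  # threshold taken from the warning above


def has_enough_space(path=".", min_free_gb=MIN_FREE_GB):
    """Return True if the filesystem holding `path` has enough free space."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= min_free_gb * 10**9


def build_prepare_commands(data_dir="./data", datasets=DATASETS):
    """Build one prepare_data.py invocation per dataset."""
    return [
        [sys.executable, "prepare_data.py", "-d", data_dir, name]
        for name in datasets
    ]


def prepare_all(data_dir="./data"):
    """Check disk space, then run each preparation command in turn."""
    if not has_enough_space("."):
        raise RuntimeError(f"need at least {MIN_FREE_GB} GB of free disk space")
    for cmd in build_prepare_commands(data_dir):
        # check=True stops at the first failing download.
        subprocess.run(cmd, check=True)
```

Calling `prepare_all()` from the repository root would then run the three downloads one after another.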

Config Preparation

python generate_ymls.py

Training

bash train.sh

Testing

bash test.sh

Pretrained Models

We release six pretrained checkpoints: kerple_log and kerple_power, each pretrained on the three datasets above.

  1. Please navigate to Releases to download the checkpoints.
  2. You can right-click a filename, copy the link address, and use wget to download it directly from the command line.
  3. Once the files are downloaded, unzip them and leave them in the current directory.
  4. Run test.sh, and the extrapolation performance should be very close to the numbers reported in Table 3 of the paper.
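The download and unzip steps can also be scripted. The following is a minimal sketch using only the Python standard library, assuming the release assets are .zip archives. The helper names are hypothetical, and the URL must be the link address copied from the Releases page, since none is given here.

```python
import urllib.request
import zipfile
from pathlib import Path


def download(url, dest):
    """Fetch `url` (a link copied from the Releases page) and save it to `dest`."""
    urllib.request.urlretrieve(url, dest)
    return Path(dest)


def unzip_here(zip_path, target_dir="."):
    """Extract the archive into `target_dir` and return the names it contained."""
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(target_dir)
        return zf.namelist()
```

For example, `unzip_here(download(url, "checkpoint.zip"))` would leave the extracted files in the current directory, where step 3 expects them. The file name `checkpoint.zip` is only a placeholder.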

Owner

  • Login: chijames
  • Kind: user

Citation (CITATION.cff)

# YAML 1.2
---
authors:
  - affiliation: EleutherAI
    family-names: Andonian
    given-names: Alex
  - affiliation: EleutherAI
    family-names: Biderman
    given-names: Stella
  - affiliation: EleutherAI
    family-names: Black
    given-names: Sid
  - affiliation: EleutherAI
    family-names: Gali
    given-names: Preetham
  - affiliation: EleutherAI
    family-names: Gao
    given-names: Leo
  - affiliation: EleutherAI
    family-names: Hallahan
    given-names: Eric
  - affiliation: EleutherAI
    family-names: Levy-Kramer
    given-names: Josh
  - affiliation: EleutherAI
    family-names: Leahy
    given-names: Connor
  - affiliation: EleutherAI
    family-names: Nestler
    given-names: Lucas
  - affiliation: EleutherAI
    family-names: Parker
    given-names: Kip
  - affiliation: EleutherAI
    family-names: Pieler
    given-names: Michael
  - affiliation: EleutherAI
    family-names: Purohit
    given-names: Shivanshu
  - affiliation: EleutherAI
    family-names: Songz
    given-names: Tri
  - affiliation: EleutherAI
    family-names: Phil
    given-names: Wang
  - affiliation: EleutherAI
    family-names: Weinbach
    given-names: Samuel
cff-version: "1.1.0"
keywords:
  - "Transformers"
  - "Massive language model"
  - "Autoregressive language model"
license: "Apache-2.0"
message: "If you use this software, please cite it using these metadata."
repository-code: "https://www.github.com/eleutherai/gpt-neox"
title: "GPT-NeoX: Large Scale Autoregressive Language Modeling in PyTorch"
version: "0.0.1"
doi: "10.5281/zenodo.5879544"
date-released: 2021-08-23
...

GitHub Events

Total
  • Watch event: 3
Last Year
  • Watch event: 3

Dependencies

Dockerfile docker
  • nvidia/cuda 11.1.1-devel-ubuntu20.04 build
requirements/requirements-dev.txt pypi
  • autopep8 ==1.5.6 development
  • clang-format ==13.0.1 development
  • pre-commit * development
  • pytest ==6.2.3 development
  • pytest-cov ==2.11.1 development
  • pytest-forked ==1.3.0 development
  • pytest-xdist * development
  • transformers * development
requirements/requirements-onebitadam.txt pypi
  • cupy-cuda111 ==8.6.0
requirements/requirements-sparseattention.txt pypi
  • triton ==0.4.2
requirements/requirements-tensorboard.txt pypi
  • tensorboard ==2.5.0
requirements/requirements.txt pypi
  • deepspeed eb7f5cff36678625d23db8a8fe78b4a93e5d2c75
  • einops ==0.3.0
  • ftfy ==6.0.1
  • lm_eval ==0.2.0
  • mpi4py ==3.0.3
  • numpy ==1.22.0
  • pybind11 ==2.6.2
  • regex *
  • sentencepiece *
  • six *
  • tokenizers ==0.10.2
  • transformers *
  • wandb ==0.10.28