mep

Multiple Kernel Learning Enhances Relative Positional Encoding Length Extrapolation: The MEP Approach

https://github.com/vangogh0318/mep

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
✓ Academic publication links: Links to arxiv.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (8.7%) to scientific vocabulary

Last synced: 11 months ago

Repository

Multiple Kernel Learning Enhances Relative Positional Encoding Length Extrapolation: The MEP Approach

Basic Info

Host: GitHub
Owner: vangogh0318
License: apache-2.0
Language: Python
Default Branch: main
Size: 1.04 MB

Statistics

Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created about 2 years ago · Last pushed about 2 years ago

Metadata Files

Readme License Citation Codeowners

Multiple Kernel Learning Enhances Relative Positional Encoding Length Extrapolation: The MEP Approach

PyTorch implementation of the paper "Multiple Kernel Learning Enhances Relative Positional Encoding Length Extrapolation: The MEP Approach". This repository is adapted from the awesome gpt-neox library.

Important Changes and Information

This repository was developed based on commit 450b58c4ad7f36c319ca0b2f089c7349f34d8c3b of gpt-neox. We bumped it to commit 738b87e73775e2cef4ea0a898b655f5d717cb8a0 to include some bug fixes that are unrelated to this project. We only keep the main branch. (See https://github.com/chijames/KERPLE.)
We remove the .github/ folder as it is not needed in our experiments.
The original gpt-neox readme is renamed as READMEgptneox.md.
The config files used in our experiments are stored in mep_configs/.

Installation

Please refer to the original readme READMEgptneox.md for details. We use the Host Setup without fused kernels.

Data Preparation

Warning: These datasets are huge! Please make sure you have at least 250 GB of disk space before downloading them all.

We use the three preconfigured datasets in the original gpt-neox repository:
```
python prepare_data.py -d ./data openwebtext2
python prepare_data.py -d ./data arxiv
python prepare_data.py -d ./data github
```

Datasets:
openwebtext2: https://openwebtext2.readthedocs.io
arxiv: https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T/blob/main/urls/arxiv.txt
github: https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T/blob/main/urls/github.txt

Please refer to the original readme READMEgptneox.md for details.
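Since the warning above asks for at least 250 GB of free disk space, it may help to check that before starting the downloads. The following is a minimal sketch using only the Python standard library; the helper name `has_enough_space` and the choice of decimal gigabytes (10^9 bytes) are assumptions made for illustration and are not part of this repository.

```python
import shutil

REQUIRED_GB = 250  # figure taken from the warning in this section


def has_enough_space(path=".", required_gb=REQUIRED_GB):
    """Return True if the filesystem holding `path` has at least `required_gb` GB free.

    Hypothetical helper: `path` must already exist, and 1 GB is counted as 10**9 bytes.
    """
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 10**9


if __name__ == "__main__":
    if has_enough_space("."):
        print("Enough free space to download all three datasets.")
    else:
        print(f"Less than {REQUIRED_GB} GB free; consider preparing one dataset at a time.")
```

Pass the directory that will hold `./data` (or its parent, if `./data` does not exist yet) so that the check runs against the right filesystem.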

Training

bash train.sh

Testing

bash test.sh

Main classes

ParallelSNOPE
ParallelSNOPEKerpleLog
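The README does not describe what these classes compute. As background only, the name `ParallelSNOPEKerpleLog` suggests a connection to the logarithmic kernel from KERPLE, which biases attention between positions i and j by -r1 * log(1 + r2 * |i - j|) with positive parameters r1 and r2. The sketch below illustrates that formula in plain Python; the function name `kerple_log_bias` and the default parameter values are hypothetical, and this should not be read as the implementation used in this repository.

```python
import math


def kerple_log_bias(seq_len, r1=1.0, r2=1.0):
    """Build a seq_len x seq_len matrix of biases -r1 * log(1 + r2 * |i - j|).

    Illustrative only: r1 and r2 are learnable per-head parameters in KERPLE;
    here they are fixed scalars (an assumption made to keep the sketch short).
    """
    if r1 <= 0 or r2 <= 0:
        raise ValueError("r1 and r2 must be positive")
    return [
        [-r1 * math.log1p(r2 * abs(i - j)) for j in range(seq_len)]
        for i in range(seq_len)
    ]


bias = kerple_log_bias(4)
for row in bias:
    print(["%.3f" % value for value in row])
```

The bias is zero on the diagonal and decays only logarithmically with distance, so far-away tokens are penalized more gently than under a linear penalty.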

Owner

Login: vangogh0318
Kind: user

Repositories: 188
Profile: https://github.com/vangogh0318

Citation (CITATION.cff)

# YAML 1.2
---
authors:
  - affiliation: EleutherAI
    family-names: Andonian
    given-names: Alex
  - affiliation: EleutherAI
    family-names: Biderman
    given-names: Stella
  - affiliation: EleutherAI
    family-names: Black
    given-names: Sid
  - affiliation: EleutherAI
    family-names: Gali
    given-names: Preetham
  - affiliation: EleutherAI
    family-names: Gao
    given-names: Leo
  - affiliation: EleutherAI
    family-names: Hallahan
    given-names: Eric
  - affiliation: EleutherAI
    family-names: Levy-Kramer
    given-names: Josh
  - affiliation: EleutherAI
    family-names: Leahy
    given-names: Connor
  - affiliation: EleutherAI
    family-names: Nestler
    given-names: Lucas
  - affiliation: EleutherAI
    family-names: Parker
    given-names: Kip
  - affiliation: EleutherAI
    family-names: Pieler
    given-names: Michael
  - affiliation: EleutherAI
    family-names: Purohit
    given-names: Shivanshu
  - affiliation: EleutherAI
    family-names: Songz
    given-names: Tri
  - affiliation: EleutherAI
    family-names: Phil
    given-names: Wang
  - affiliation: EleutherAI
    family-names: Weinbach
    given-names: Samuel
cff-version: "1.1.0"
keywords:
  - "Transformers"
  - "Massive language model"
  - "Autoregressive language model"
license: "Apache-2.0"
message: "If you use this software, please cite it using these metadata."
repository-code: "https://www.github.com/eleutherai/gpt-neox"
title: "GPT-NeoX: Large Scale Autoregressive Language Modeling in PyTorch"
version: "0.0.1"
doi: "10.5281/zenodo.5879544"
date-released: 2021-08-23
...

Dependencies

Dockerfile docker

nvidia/cuda 11.1.1-devel-ubuntu20.04 build

megatron/fused_kernels/setup.py pypi

requirements/requirements-dev.txt pypi

autopep8 ==1.5.6 development
clang-format ==13.0.1 development
pre-commit * development
pytest ==6.2.3 development
pytest-cov ==2.11.1 development
pytest-forked ==1.3.0 development
pytest-xdist * development
transformers * development

requirements/requirements-my.txt pypi

deepspeed eb7f5cff36678625d23db8a8fe78b4a93e5d2c75

requirements/requirements-onebitadam.txt pypi

cupy-cuda111 ==8.6.0

requirements/requirements-sparseattention.txt pypi

triton ==0.4.2

requirements/requirements-tensorboard.txt pypi

tensorboard ==2.5.0

requirements/requirements.txt pypi

deepspeed eb7f5cff36678625d23db8a8fe78b4a93e5d2c75
einops ==0.3.0
ftfy ==6.0.1
lm_eval ==0.2.0
mpi4py ==3.0.3
numpy ==1.22.0
pybind11 ==2.6.2
regex *
sentencepiece *
six *
tokenizers ==0.10.2
transformers *
wandb ==0.10.28
