mep
Multiple Kernel Learning Enhances Relative Positional Encoding Length Extrapolation: The MEP Approach
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (8.7%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: vangogh0318
- License: apache-2.0
- Language: Python
- Default Branch: main
- Size: 1.04 MB
Statistics
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Multiple Kernel Learning Enhances Relative Positional Encoding Length Extrapolation: The MEP Approach
PyTorch implementation of the paper Multiple Kernel Learning Enhances Relative Positional Encoding Length Extrapolation: The MEP Approach. This repository is adapted from the awesome gpt-neox library.
Important Changes and Information
- This repository was developed from commit 450b58c4ad7f36c319ca0b2f089c7349f34d8c3b of gpt-neox. We bumped it to commit 738b87e73775e2cef4ea0a898b655f5d717cb8a0 to pick up some bug fixes that are unrelated to this project. We keep only the main branch. (See https://github.com/chijames/KERPLE.)
- We remove the .github/ folder as it is not needed in our experiments.
- The original gpt-neox readme is renamed as READMEgptneox.md.
- The config files used in our experiments are stored in mep_configs/.
Installation
Please refer to the original readme READMEgptneox.md for details. We use the Host Setup without fused kernels.
Data Preparation
Warning: These datasets are huge! Please make sure you have at least 250 GB of disk space before downloading them all.
We use the three preconfigured datasets in the original gpt-neox repository:
```
python prepare_data.py -d ./data openwebtext2
python prepare_data.py -d ./data arxiv
python prepare_data.py -d ./data github

# datasets:
# openwebtext2: https://openwebtext2.readthedocs.io
# arxiv: https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T/blob/main/urls/arxiv.txt
# github: https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T/blob/main/urls/github.txt
```
Please refer to the original readme READMEgptneox.md for details.
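Since the download needs roughly 250 GB, it can help to check free space before running the preparation commands. The following is a minimal sketch using only the Python standard library; the helper name `has_enough_space` is hypothetical and not part of this repository, and it assumes "GB" means 10^9 bytes.

```python
import shutil

# Figure taken from the warning above; interpreted as decimal gigabytes (assumption).
REQUIRED_GB = 250


def has_enough_space(path=".", required_gb=REQUIRED_GB):
    """Return True if the filesystem holding `path` has at least `required_gb` GB free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 10**9


if __name__ == "__main__":
    # "." stands in for the ./data target directory, which may not exist yet.
    if has_enough_space("."):
        print("Enough free space for all three datasets.")
    else:
        print("Less than 250 GB free; consider preparing one dataset at a time.")
```

Point `path` at the disk that will actually hold `./data` if it differs from the current working directory.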
Training
bash train.sh
Testing
bash test.sh
Main classes
- ParallelSNOPE
- ParallelSNOPEKerpleLog
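The README does not say what these classes compute, so the following is only an illustration of the general idea suggested by the paper title and the `KerpleLog` suffix: a relative positional bias built from more than one kernel of the token distance. It is a sketch in plain Python, not the repository's implementation. The function names, the choice of a logarithmic kernel (the form -r1 * log(1 + r2 * d) used by KERPLE) plus a linear kernel, and the fixed equal weights are all assumptions made for the example.

```python
import math


def log_kernel_bias(distance, r1=1.0, r2=1.0):
    """Logarithmic bias of KERPLE form: -r1 * log(1 + r2 * distance)."""
    return -r1 * math.log(1.0 + r2 * distance)


def linear_kernel_bias(distance, slope=0.5):
    """Linear distance penalty (hypothetical second kernel for this sketch)."""
    return -slope * distance


def mixed_bias(distance, kernels, weights):
    """Weighted average of several kernel biases at one distance."""
    total = sum(weights)
    return sum(w * k(distance) for k, w in zip(kernels, weights)) / total


def causal_bias_matrix(seq_len, kernels, weights):
    """Bias added to attention logits: row i (query) attends to columns j <= i."""
    neg_inf = float("-inf")
    return [
        [mixed_bias(i - j, kernels, weights) if j <= i else neg_inf
         for j in range(seq_len)]
        for i in range(seq_len)
    ]


if __name__ == "__main__":
    kernels = (log_kernel_bias, linear_kernel_bias)
    weights = (0.5, 0.5)  # fixed here; a trained model would learn such parameters
    for row in causal_bias_matrix(4, kernels, weights):
        print(["%.3f" % v for v in row])
```

Because the bias depends only on the distance i - j, the same function can be evaluated for sequences longer than those seen in training, which is the property that length extrapolation relies on.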
Owner
- Login: vangogh0318
- Kind: user
- Repositories: 188
- Profile: https://github.com/vangogh0318
Citation (CITATION.cff)
# YAML 1.2
---
authors:
  - affiliation: EleutherAI
    family-names: Andonian
    given-names: Alex
  - affiliation: EleutherAI
    family-names: Biderman
    given-names: Stella
  - affiliation: EleutherAI
    family-names: Black
    given-names: Sid
  - affiliation: EleutherAI
    family-names: Gali
    given-names: Preetham
  - affiliation: EleutherAI
    family-names: Gao
    given-names: Leo
  - affiliation: EleutherAI
    family-names: Hallahan
    given-names: Eric
  - affiliation: EleutherAI
    family-names: Levy-Kramer
    given-names: Josh
  - affiliation: EleutherAI
    family-names: Leahy
    given-names: Connor
  - affiliation: EleutherAI
    family-names: Nestler
    given-names: Lucas
  - affiliation: EleutherAI
    family-names: Parker
    given-names: Kip
  - affiliation: EleutherAI
    family-names: Pieler
    given-names: Michael
  - affiliation: EleutherAI
    family-names: Purohit
    given-names: Shivanshu
  - affiliation: EleutherAI
    family-names: Songz
    given-names: Tri
  - affiliation: EleutherAI
    family-names: Phil
    given-names: Wang
  - affiliation: EleutherAI
    family-names: Weinbach
    given-names: Samuel
cff-version: "1.1.0"
keywords:
  - "Transformers"
  - "Massive language model"
  - "Autoregressive language model"
license: "Apache-2.0"
message: "If you use this software, please cite it using these metadata."
repository-code: "https://www.github.com/eleutherai/gpt-neox"
title: "GPT-NeoX: Large Scale Autoregressive Language Modeling in PyTorch"
version: "0.0.1"
doi: "10.5281/zenodo.5879544"
date-released: 2021-08-23
...
Dependencies
- nvidia/cuda 11.1.1-devel-ubuntu20.04 build
- autopep8 ==1.5.6 development
- clang-format ==13.0.1 development
- pre-commit * development
- pytest ==6.2.3 development
- pytest-cov ==2.11.1 development
- pytest-forked ==1.3.0 development
- pytest-xdist * development
- transformers * development
- deepspeed eb7f5cff36678625d23db8a8fe78b4a93e5d2c75
- cupy-cuda111 ==8.6.0
- triton ==0.4.2
- tensorboard ==2.5.0
- deepspeed eb7f5cff36678625d23db8a8fe78b4a93e5d2c75
- einops ==0.3.0
- ftfy ==6.0.1
- lm_eval ==0.2.0
- mpi4py ==3.0.3
- numpy ==1.22.0
- pybind11 ==2.6.2
- regex *
- sentencepiece *
- six *
- tokenizers ==0.10.2
- transformers *
- wandb ==0.10.28