longmem

Official implementation of our NeurIPS 2023 paper "Augmenting Language Models with Long-Term Memory".

https://github.com/victorwz/longmem

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.9%) to scientific vocabulary

Keywords

large-language-models long-context-modeling long-term-memory
Last synced: 8 months ago

Repository

Official implementation of our NeurIPS 2023 paper "Augmenting Language Models with Long-Term Memory".

Basic Info
Statistics
  • Stars: 789
  • Watchers: 24
  • Forks: 70
  • Open Issues: 12
  • Releases: 0
Topics
large-language-models long-context-modeling long-term-memory
Created almost 3 years ago · Last pushed about 2 years ago
Metadata Files
Readme License Citation

README.md

LongMem

Official implementation of our paper "Augmenting Language Models with Long-Term Memory".

Please cite our paper if you find this repository interesting or helpful:

```bibtex
@article{LongMem,
  title={Augmenting Language Models with Long-Term Memory},
  author={Wang, Weizhi and Dong, Li and Cheng, Hao and Liu, Xiaodong and Yan, Xifeng and Gao, Jianfeng and Wei, Furu},
  journal={arXiv preprint arXiv:2306.07174},
  year={2023}
}
```

Environment Setup

  • torch: Please follow the official PyTorch installation guide. We recommend torch>=1.8.0. Select the GPU build that matches your CUDA driver version.

  • Faiss-GPU: For Nvidia V100 GPUs, simply install via pip install faiss-gpu. For Nvidia A100 and A6000 GPUs, run conda install faiss-gpu cudatoolkit=11.0 -c pytorch. The A100 GPU is not officially supported by faiss-gpu and can sometimes raise errors; refer to the relevant faiss GitHub issue for help.

  • fairseq: Run pip install --editable ./fairseq to install the revised fairseq and its dependencies. We strongly recommend Python 3.8 for stability.

  • other packages: pip install -r requirements.txt

Project Structure

Memory-Augmented Adaptation Training

Data collection and Preprocessing

Please download the Pile from the official release. Each sub-dataset in the Pile is organized as a set of jsonlines splits. You can refer to preprocess/filter_shard_tnlg.py for how we sample the training set and binarize it following the standard fairseq preprocessing process.
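The sampling step above can be sketched as follows. This is a minimal, hypothetical illustration of filtering a Pile jsonlines shard; the actual logic lives in preprocess/filter_shard_tnlg.py, and the function name, sampling rate, and the assumption that each record carries a "text" field are ours, not the repository's.

```python
import json
import random

def sample_jsonl(in_path, out_path, rate=0.1, seed=0):
    """Randomly keep roughly `rate` of the documents from a jsonlines shard.

    Hypothetical sketch: assumes each input line is a JSON object with a
    "text" field, as in the Pile's jsonlines splits.
    """
    rng = random.Random(seed)
    kept = 0
    with open(in_path) as fin, open(out_path, "w") as fout:
        for line in fin:
            if rng.random() < rate:
                doc = json.loads(line)
                # Keep only the raw text; fairseq preprocessing binarizes it later.
                fout.write(json.dumps({"text": doc["text"]}) + "\n")
                kept += 1
    return kept
```

The resulting sampled split would then be binarized with the standard fairseq preprocessing pipeline before adaptation training.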

Memory-Augmented Adaptation Training: bash train_scripts/train_longmem.sh

Evaluation

Please first download the checkpoints for the pre-trained GPT-2-medium model and the LongMem model to checkpoints/.

Memory-Augmented In-Context Learning

```
# Evaluate the GPT-2 baseline
python eval_scripts/eval_longmem_icl.py --path /path/to/gpt2_pretrained_model

# Evaluate the LongMem model
python eval_scripts/eval_longmem_icl.py --path /path/to/longmem_model --pretrained-model-path /path/to/gpt2_pretrained_model
```

Credits

LongMem is developed based on fairseq. Thanks to the EleutherAI team, who constructed the Pile, one of the largest high-quality open corpora.

Owner

  • Name: Weizhi Wang
  • Login: Victorwz
  • Kind: user
  • Company: University of California, Santa Barbara

CS Ph.D. Student @UCSB

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Wang"
    given-names: "Weizhi"
  - family-names: "Dong"
    given-names: "Li"
  - family-names: "Cheng"
    given-names: "Hao"
  - family-names: "Liu"
    given-names: "Xiaodong"
  - family-names: "Yan"
    given-names: "Xifeng"
  - family-names: "Gao"
    given-names: "Jianfeng"
  - family-names: "Wei"
    given-names: "Furu"
title: "Augmenting Language Models with Long-Term Memory"
doi: "https://doi.org/10.48550/arXiv.2306.07174"
repository-code: "https://github.com/Victorwz/LongMem"
license: "Apache-2.0"
date-released: 2023
abstract: "Official implementation of our paper 'Augmenting Language Models with Long-Term Memory'."

GitHub Events

Total
  • Watch event: 44
  • Fork event: 5
Last Year
  • Watch event: 44
  • Fork event: 5