https://github.com/aiot-mlsys-lab/d2o
[ICLR 2025🔥] D2O: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models
Repository
Basic Info
Statistics
- Stars: 18
- Watchers: 1
- Forks: 2
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
$D_{2}O$: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models
The code for the ICLR 2025 paper "D2O: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models".
📃 [Paper] • 💻 [Github] • 🤗 [Huggingface]
If you find our project helpful, please give us a star ⭐ on GitHub to stay updated.
Setup Environment
We recommend using Anaconda to create a new environment and install the required packages with the following commands:
bash
conda create -n d2o_v2 python=3.10
conda activate d2o_v2
pip install --upgrade pip # enable PEP 660 support
pip install -r requirements.txt
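After installation, a quick check along the following lines can confirm that the active environment exposes the expected interpreter and packages. This is a minimal sketch: the helper name `check_environment` and the packages it probes by default (`torch`, `transformers`) are assumptions chosen for illustration, not part of the repository.

```python
import importlib.util
import sys


def check_environment(packages=("torch", "transformers"), min_python=(3, 10)):
    """Report whether the interpreter and the named packages are available.

    `packages` and `min_python` are illustrative defaults, not values
    mandated by the repository.
    """
    report = {"python_ok": sys.version_info[:2] >= tuple(min_python)}
    for name in packages:
        # find_spec returns None when a top-level package cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'ok' if ok else 'MISSING'}")
```

Running it inside the activated conda environment should flag any package that `pip` failed to install before a long inference job is launched.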
Quick Step to Run the Code
You can run inference on the LongBench sample with the following command:
bash
CUDA_VISIBLE_DEVICES=0 python run_pred_long_bench_sample.py --model_name_or_path meta-llama/Meta-Llama-3-8B \
    --cache_dir /your_hf_home_path \
    --use_d2o True \
    --model_type llama3 \
    --hh_ratio 0.1 \
    --recent_ratio 0.1 \
    --action_name d2o_0.2 \
    --e True
- `--cache_dir` stores your model weights.
- `--use_d2o` specifies the execution strategy name.
- `--hh_ratio` is the ratio of important tokens, as defined in our main paper.
- `--recent_ratio` represents the proportion of the window closest to the generated token.
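To make the two ratio flags concrete, the sketch below converts them into token counts for a given prompt length. It assumes that each ratio is a fraction of the prompt length and that the result is rounded down; this README does not confirm either detail, and `cache_budget` is a hypothetical helper rather than a function from the repository. The combined 0.1 + 0.1 budget appears to correspond to the 0.2 in the `d2o_0.2` action name.

```python
def cache_budget(prompt_len, hh_ratio, recent_ratio):
    """Return (important, recent, total) KV-cache token budgets.

    Assumption: each ratio is a fraction of the prompt length and is
    rounded down; the real implementation may round differently.
    """
    if not (0.0 <= hh_ratio <= 1.0 and 0.0 <= recent_ratio <= 1.0):
        raise ValueError("ratios must lie in [0, 1]")
    if hh_ratio + recent_ratio > 1.0:
        raise ValueError("combined ratio cannot exceed 1")
    hh_tokens = int(prompt_len * hh_ratio)          # important-token budget
    recent_tokens = int(prompt_len * recent_ratio)  # recent-window budget
    return hh_tokens, recent_tokens, hh_tokens + recent_tokens


if __name__ == "__main__":
    # Ratios taken from the command above; the prompt length is illustrative.
    hh, recent, total = cache_budget(prompt_len=8192, hh_ratio=0.1, recent_ratio=0.1)
    print(f"important={hh} recent={recent} total={total}")
```

Under these assumptions, raising either ratio increases the retained cache linearly with prompt length, which is the memory/accuracy trade-off the two flags control.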
Then, evaluate the results:
bash
python eval_long_bench.py --model Meta-Llama-3-8B_d2o_0.2 --e
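The `--model` value passed to the evaluation script appears to combine the last component of the model path with the action name used for prediction (`Meta-Llama-3-8B` + `_` + `d2o_0.2`). The helper below sketches that naming rule; it is inferred from the two commands in this README rather than taken from the repository, and `result_name` is a hypothetical function.

```python
def result_name(model_name_or_path, action_name):
    """Build the results identifier as '<model basename>_<action name>'.

    Assumption: the prediction script names its output after the final
    path component of the model ID plus the action name.
    """
    basename = model_name_or_path.rstrip("/").split("/")[-1]
    return f"{basename}_{action_name}"


if __name__ == "__main__":
    print(result_name("meta-llama/Meta-Llama-3-8B", "d2o_0.2"))
```

If the evaluation script cannot find results, comparing this identifier against the actual output directory name is a reasonable first check.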
For tasks that rely on the lm-evaluation-harness GitHub repository,
we recommend using the latest version, which you can obtain by running:
bash
git clone https://github.com/EleutherAI/lm-evaluation-harness.git
Then, follow the installation instructions provided in the repository and execute our algorithm accordingly.
Citation
bibtex
@article{wan2024d2o,
title={D2o: Dynamic discriminative operations for efficient generative inference of large language models},
author={Wan, Zhongwei and Wu, Xinjian and Zhang, Yu and Xin, Yi and Tao, Chaofan and Zhu, Zhihong and Wang, Xin and Luo, Siqi and Xiong, Jing and Zhang, Mi},
journal={arXiv preprint arXiv:2406.13035},
year={2024}
}
or
bibtex
@inproceedings{wan2025text,
title={$\text{D}_{2}\text{O}$: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models},
author={Wan, Zhongwei and Wu, Xinjian and Zhang, Yu and Xin, Yi and Tao, Chaofan and Zhu, Zhihong and Wang, Xin and Luo, Siqi and Xiong, Jing and Wang, Longyue and others},
booktitle={The Thirteenth International Conference on Learning Representations}
}
Owner
- Name: OSU AIoT-MLSys Lab
- Login: AIoT-MLSys-Lab
- Kind: organization
- Location: United States of America
- Website: https://aiot-mlsys-lab.github.io/
- Repositories: 15
- Profile: https://github.com/AIoT-MLSys-Lab
GitHub Events
Total
- Watch event: 10
- Push event: 3
- Fork event: 2
Last Year
- Watch event: 10
- Push event: 3
- Fork event: 2
Dependencies
- mkdocs ==1.4.2
- mkdocs-include-markdown-plugin ==4.0.0
- mkdocs-macros-plugin ==0.7.0
- mkdocstrings ==0.19.0
- black * development
- flake8 * development
- mypy * development
- pre-commit * development
- pytest * development
- 2captcha-python ==1.1.3
- Cython ==0.29.32
- Jinja2 ==3.1.2
- Mako ==1.2.3
- MarkupSafe ==2.1.1
- Pillow ==9.3.0
- PySocks ==1.7.1
- PyYAML ==6.0
- absl-py ==1.2.0
- aiodns ==3.0.0
- aiohttp ==3.8.3
- aiohttp-retry ==2.8.3
- aiosignal ==1.2.0
- aleph-alpha-client ==2.14.0
- anthropic ==0.2.5
- async-generator ==1.10
- async-timeout ==4.0.2
- attrs ==22.1.0
- beautifulsoup4 ==4.11.1
- bert-score ==0.3.11
- bitarray ==2.7.3
- black ==22.10.0
- blanc ==0.2.7
- blis ==0.7.8
- boto3 ==1.24.89
- botocore ==1.27.89
- bottle ==0.12.23
- cachetools ==5.2.0
- catalogue ==2.0.8
- cattrs ==22.2.0
- certifi ==2022.12.7
- cffi ==1.15.1
- cfgv ==3.3.1
- charset-normalizer ==2.1.1
- click ==8.0.4
- colorama ==0.4.5
- contourpy ==1.0.5
- cycler ==0.11.0
- cymem ==2.0.6
- dacite ==1.6.0
- datasets ==2.5.2
- dill ==0.3.5.1
- distlib ==0.3.6
- emoji ==2.1.0
- et-xmlfile ==1.1.0
- exceptiongroup ==1.1.0
- filelock ==3.8.0
- flake8 ==5.0.4
- fonttools ==4.37.4
- frozenlist ==1.3.1
- fsspec ==2022.8.2
- gdown ==4.4.0
- gevent ==21.12.0
- gin-config ==0.5.0
- google-api-core ==2.10.1
- google-api-python-client ==2.64.0
- google-auth ==2.12.0
- google-auth-httplib2 ==0.1.0
- googleapis-common-protos ==1.56.4
- greenlet ==1.1.3
- gunicorn ==20.1.0
- h11 ==0.14.0
- httplib2 ==0.20.4
- huggingface-hub ==0.11.0
- icetk ==0.0.4
- identify ==2.5.6
- idna ==3.4
- importlib-metadata ==6.0.0
- importlib-resources ==5.10.0
- iniconfig ==1.1.1
- jmespath ==1.0.1
- joblib ==1.2.0
- jsonlines ==3.1.0
- kiwisolver ==1.4.4
- langcodes ==3.3.0
- llvmlite ==0.39.1
- lxml ==4.9.1
- matplotlib ==3.6.0
- mccabe ==0.7.0
- moverscore ==1.0.3
- mpmath ==1.2.1
- multidict ==6.0.2
- multiprocess ==0.70.13
- murmurhash ==1.0.8
- mypy ==0.982
- mypy-extensions ==0.4.3
- networkx ==2.8.7
- nltk ==3.7
- nodeenv ==1.7.0
- numba ==0.56.4
- numpy ==1.23.3
- openai ==0.27.0
- openpyxl ==3.0.10
- outcome ==1.2.0
- packaging ==21.3
- pandas ==1.5.0
- pandas-stubs ==1.5.0.221003
- parameterized ==0.8.1
- pathspec ==0.10.1
- pathy ==0.6.2
- platformdirs ==2.5.2
- pluggy ==1.0.0
- portalocker ==2.5.1
- pre-commit ==2.20.0
- preshed ==3.0.7
- protobuf ==3.20.2
- psutil ==5.9.2
- pyarrow ==9.0.0
- pyasn1 ==0.4.8
- pyasn1-modules ==0.2.8
- pycares ==4.3.0
- pycodestyle ==2.9.1
- pycparser ==2.21
- pydantic ==1.8.2
- pyemd ==0.5.1
- pyext ==0.7
- pyflakes ==2.5.0
- pyhocon ==0.3.59
- pymongo ==4.2.0
- pyparsing ==2.4.7
- pytest ==7.2.0
- python-dateutil ==2.8.2
- pytorch-pretrained-bert ==0.6.2
- pytrec-eval ==0.5
- pytz ==2022.4
- regex ==2022.9.13
- requests ==2.28.1
- responses ==0.18.0
- retrying ==1.3.3
- revChatGPT ==0.1.1
- rouge-score ==0.1.2
- rsa ==4.9
- s3transfer ==0.6.0
- sacrebleu ==2.2.1
- sacremoses ==0.0.53
- scikit-learn ==1.1.2
- scipy ==1.9.1
- selenium ==4.8.0
- sentencepiece ==0.1.97
- six ==1.16.0
- sklearn ==0.0
- smart-open ==5.2.1
- sniffio ==1.3.0
- sortedcontainers ==2.4.0
- soupsieve ==2.3.2.post1
- spacy ==3.2.4
- spacy-legacy ==3.0.10
- spacy-loggers ==1.0.3
- sqlitedict ==1.7.0
- srsly ==2.4.4
- stanza ==1.4.2
- summ-eval ==0.892
- surge-api ==1.1.0
- sympy ==1.11.1
- tabulate ==0.9.0
- thinc ==8.0.17
- threadpoolctl ==3.1.0
- tiktoken ==0.3.3
- tls-client ==0.1.8
- tokenizers ==0.13.2
- toml ==0.10.2
- tomli ==2.0.1
- torch ==1.12.1
- torchvision ==0.13.1
- tqdm ==4.64.1
- transformers ==4.28.1
- trio ==0.22.0
- trio-websocket ==0.9.2
- typer ==0.4.2
- types-pytz ==2022.4.0.0
- types-redis ==4.3.21.1
- types-requests ==2.28.11.2
- types-tabulate ==0.9.0.0
- types-urllib3 ==1.26.25
- typing ==3.7.4.3
- typing_extensions ==4.4.0
- uncertainty-calibration ==0.1.3
- undetected-chromedriver ==3.2.1
- uritemplate ==4.1.1
- urllib3 ==1.26.12
- virtualenv ==20.16.5
- wasabi ==0.10.1
- websocket-client ==1.3.2
- websockets ==10.4
- wsproto ==1.2.0
- xlrd ==2.0.1
- xxhash ==3.0.0
- yarl ==1.8.1
- zipp ==3.11.0
- zope.event ==4.5.0
- zope.interface ==5.4.0
- zstandard ==0.18.0
- Mako *
- aleph-alpha-client *
- anthropic *
- bottle *
- cattrs *
- colorcet *
- dacite *
- datasets *
- gdown *
- google-api-python-client *
- gunicorn *
- icetk *
- importlib-resources *
- jsonlines *
- matplotlib *
- nltk *
- numba *
- numpy *
- openai *
- protobuf *
- pyext *
- pyhocon *
- pymongo *
- pytrec_eval ==0.5
- retrying *
- revChatGPT *
- rouge-score *
- sacrebleu *
- scikit-learn *
- scipy *
- seaborn *
- sentencepiece *
- spacy *
- sqlitedict *
- summ-eval *
- surge-api *
- sympy *
- tiktoken *
- tokenizers *
- torch *
- torchvision *
- tqdm *
- transformers *
- uncertainty-calibration *
- websocket-client *
- xlrd *
- zstandard *
- accelerate >=0.21.0
- datasets >=2.0.0
- evaluate *
- evaluate >=0.4.0
- jsonlines *
- numexpr *
- peft >=0.2.0
- pybind11 >=2.6.2
- pytablewriter *
- rouge-score >=0.0.4
- sacrebleu >=1.5.0
- scikit-learn >=0.24.1
- sqlitedict *
- torch >=1.8
- tqdm-multiprocess *
- transformers >=4.1
- zstandard *
- DataProperty ==1.0.1
- Jinja2 ==3.1.2
- MarkupSafe ==2.1.3
- PyYAML ==6.0.1
- Pygments ==2.17.2
- absl-py ==2.0.0
- accelerate ==0.25.0
- aiofiles ==23.2.1
- aiohttp ==3.9.1
- aiosignal ==1.3.1
- altair ==5.2.0
- annotated-types ==0.6.0
- anyio ==4.3.0
- asttokens ==2.4.1
- async-timeout ==4.0.3
- attributedict ==0.3.0
- attrs ==23.2.0
- bitsandbytes ==0.43.0
- blessings ==1.7
- cachetools ==5.3.2
- certifi ==2023.11.17
- chardet ==5.2.0
- charset-normalizer ==3.3.2
- click ==8.1.7
- codecov ==2.1.13
- colorama ==0.4.6
- coloredlogs ==15.0.1
- colour-runner ==0.1.1
- contourpy ==1.2.0
- coverage ==7.4.0
- cycler ==0.12.1
- datasets ==2.16.1
- decorator ==5.1.1
- deepdiff ==6.7.1
- deepspeed ==0.12.6
- dill ==0.3.7
- distlib ==0.3.8
- distro ==1.9.0
- einops ==0.7.0
- evaluate ==0.4.1
- exceptiongroup ==1.2.0
- executing ==2.0.1
- fastapi ==0.110.0
- ffmpy ==0.3.2
- filelock ==3.13.1
- flash-attn ==2.5.6
- fonttools ==4.50.0
- frozenlist ==1.4.1
- fsspec ==2023.10.0
- fuzzywuzzy ==0.18.0
- gradio ==3.35.2
- gradio_client ==0.2.9
- h11 ==0.14.0
- hjson ==3.1.0
- httpcore ==1.0.4
- httpx ==0.27.0
- huggingface-hub ==0.20.2
- humanfriendly ==10.0
- idna ==3.6
- inspecta ==0.1.3
- ipdb ==0.13.13
- ipython ==8.19.0
- jedi ==0.19.1
- jieba ==0.42.1
- joblib ==1.3.2
- jsonlines ==4.0.0
- jsonschema ==4.21.1
- jsonschema-specifications ==2023.12.1
- kiwisolver ==1.4.5
- linkify-it-py ==2.0.3
- lxml ==5.0.1
- markdown-it-py ==2.2.0
- matplotlib ==3.8.3
- matplotlib-inline ==0.1.6
- mbstrdecoder ==1.1.3
- mdit-py-plugins ==0.3.3
- mdurl ==0.1.2
- mpmath ==1.3.0
- multidict ==6.0.4
- multiprocess ==0.70.15
- networkx ==3.2.1
- ninja ==1.11.1.1
- nltk ==3.8.1
- numexpr ==2.8.8
- numpy ==1.26.3
- nvidia-cublas-cu12 ==12.1.3.1
- nvidia-cuda-cupti-cu12 ==12.1.105
- nvidia-cuda-nvrtc-cu12 ==12.1.105
- nvidia-cuda-runtime-cu12 ==12.1.105
- nvidia-cudnn-cu12 ==8.9.2.26
- nvidia-cufft-cu12 ==11.0.2.54
- nvidia-curand-cu12 ==10.3.2.106
- nvidia-cusolver-cu12 ==11.4.5.107
- nvidia-cusparse-cu12 ==12.1.0.106
- nvidia-nccl-cu12 ==2.18.1
- nvidia-nvjitlink-cu12 ==12.3.101
- nvidia-nvtx-cu12 ==12.1.105
- openai ==1.14.2
- ordered-set ==4.1.0
- orjson ==3.9.15
- packaging ==24.0
- packaging ==23.2
- pandas ==2.1.4
- parso ==0.8.3
- pathvalidate ==3.2.0
- peft ==0.7.1
- pexpect ==4.9.0
- pillow ==10.2.0
- platformdirs ==4.1.0
- pluggy ==1.3.0
- portalocker ==2.8.2
- prompt-toolkit ==3.0.43
- protobuf ==4.25.1
- psutil ==5.9.7
- ptyprocess ==0.7.0
- pure-eval ==0.2.2
- py-cpuinfo ==9.0.0
- pyarrow ==14.0.2
- pyarrow-hotfix ==0.6
- pybind11 ==2.11.1
- pycountry ==23.12.11
- pydantic ==1.10.14
- pydantic_core ==2.14.6
- pydub ==0.25.1
- pynvml ==11.5.0
- pyparsing ==3.1.2
- pyproject-api ==1.6.1
- pytablewriter ==1.2.0
- python-dateutil ==2.8.2
- python-multipart ==0.0.9
- pytz ==2023.3.post1
- referencing ==0.34.0
- regex ==2023.12.25
- requests ==2.31.0
- responses ==0.18.0
- rootpath ==0.1.1
- rouge ==1.0.1
- rouge-score ==0.1.2
- rpds-py ==0.18.0
- sacrebleu ==1.5.0
- safetensors ==0.4.1
- scikit-learn ==1.3.2
- scipy ==1.11.4
- semantic-version ==2.10.0
- sentencepiece ==0.1.99
- six ==1.16.0
- sniffio ==1.3.1
- sqlitedict ==2.1.0
- stack-data ==0.6.3
- starlette ==0.36.3
- sympy ==1.12
- tabledata ==1.3.3
- tabulate ==0.9.0
- tcolorpy ==0.1.4
- termcolor ==2.4.0
- texttable ==1.7.0
- threadpoolctl ==3.2.0
- tokenizers ==0.15.0
- toml ==0.10.2
- tomli ==2.0.1
- toolz ==0.12.1
- torch ==2.1.2
- torchvision ==0.16.2
- tox ==4.11.4
- tqdm ==4.66.1
- tqdm-multiprocess ==0.0.11
- traitlets ==5.14.1
- transformers ==4.37.2
- triton ==2.1.0
- typepy ==1.3.2
- typing_extensions ==4.9.0
- tzdata ==2023.4
- uc-micro-py ==1.0.3
- urllib3 ==2.1.0
- uvicorn ==0.29.0
- virtualenv ==20.25.0
- wcwidth ==0.2.13
- websockets ==12.0
- xxhash ==3.4.1
- yarl ==1.9.4
- zstandard ==0.22.0