https://github.com/aiot-mlsys-lab/d2o

[ICLR 2025🔥] D2O: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models


Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.0%) to scientific vocabulary
Last synced: 9 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: AIoT-MLSys-Lab
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 23.7 MB
Statistics
  • Stars: 18
  • Watchers: 1
  • Forks: 2
  • Open Issues: 0
  • Releases: 0
Created almost 2 years ago · Last pushed 11 months ago
Metadata Files
Readme License

README.md

$D_{2}O$: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models

This repository contains the code for the ICLR 2025 paper "D2O: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models".

📃 [Paper] • 💻 [Github] • 🤗 [Huggingface]

If you find our project helpful, please give us a star ⭐ on GitHub to stay updated.


Setup Environment

We recommend using Anaconda to create a new environment and then installing the required packages inside it:

```bash
conda create -n d2o_v2 python=3.10
conda activate d2o_v2
pip install --upgrade pip  # enable PEP 660 support
pip install -r requirements.txt
```
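After installation, it can help to confirm that the environment actually matches the key pins. The following is a minimal sketch, assuming the versions listed for the top-level `requirements.txt` (torch 2.1.2, transformers 4.37.2, flash-attn 2.5.6); the helper name `check_pins` is hypothetical and not part of the repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the repository's top-level requirements.txt listing.
EXPECTED = {"torch": "2.1.2", "transformers": "4.37.2", "flash-attn": "2.5.6"}


def check_pins(expected, get_version=version):
    """Return {package: (expected, installed_or_None)} for every mismatch.

    `get_version` is injectable so the check can be exercised without
    the packages being installed (hypothetical helper, not a repo API).
    """
    mismatches = {}
    for name, want in expected.items():
        try:
            have = get_version(name)
        except PackageNotFoundError:
            have = None
        # Ignore local build tags such as "2.1.2+cu121".
        if have is None or have.split("+")[0] != want:
            mismatches[name] = (want, have)
    return mismatches


if __name__ == "__main__":
    for name, (want, have) in check_pins(EXPECTED).items():
        print(f"{name}: expected {want}, found {have}")
```

An empty result means the three packages match their pins; it does not verify the rest of the dependency list.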

Quick Step to Run the Code

You can run the inference code on the LongBench sample with the following command:

```bash
CUDA_VISIBLE_DEVICES=0 python run_pred_long_bench_sample.py --model_name_or_path meta-llama/Meta-Llama-3-8B \
    --cache_dir /your_hf_home_path \
    --use_d2o True \
    --model_type llama3 \
    --hh_ratio 0.1 \
    --recent_ratio 0.1 \
    --action_name d2o_0.2 \
    --e True
```

- `cache_dir` stores your model weights.
- `use_d2o` specifies the execution strategy name.
- `hh_ratio` refers to the important tokens described in our main paper.
- `recent_ratio` represents the proportion of the window closest to the generated token.
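To make the two ratios concrete, the sketch below converts them into token counts. It assumes both ratios are fractions of the prompt length, which the action name `d2o_0.2` (0.1 + 0.1) suggests but the README does not state; the function name `kv_cache_budget` is hypothetical and the repository may compute the budget differently.

```python
def kv_cache_budget(prompt_len: int, hh_ratio: float, recent_ratio: float) -> dict:
    """Translate the two ratios into token counts (illustrative only).

    Assumption: each ratio is a fraction of the prompt length, and the
    retained cache is the sum of the important and recent tokens.
    """
    if hh_ratio < 0 or recent_ratio < 0 or hh_ratio + recent_ratio > 1:
        raise ValueError("ratios must be non-negative and sum to at most 1")
    important = int(prompt_len * hh_ratio)   # important-token budget
    recent = int(prompt_len * recent_ratio)  # recent-window budget
    return {"important": important, "recent": recent, "kept": important + recent}


budget = kv_cache_budget(prompt_len=4096, hh_ratio=0.1, recent_ratio=0.1)
print(budget)
```

Under this assumption, a 4096-token prompt with both ratios at 0.1 keeps 409 important tokens and 409 recent tokens, 818 in total.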

Then, evaluate the results:

```bash
python eval_long_bench.py --model Meta-Llama-3-8B_d2o_0.2 --e
```
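The `--model` value passed to the evaluation script appears to combine the base name of the model from the inference step with its action name. A small sketch of that naming pattern, inferred only from the two commands in this README; the helper `result_name` is hypothetical, and the scripts may build the name differently.

```python
import posixpath


def result_name(model_name_or_path: str, action_name: str) -> str:
    """Join the model's base name and the action name with an underscore.

    Assumption: this mirrors how the README's example value
    "Meta-Llama-3-8B_d2o_0.2" relates to the inference arguments.
    """
    base = posixpath.basename(model_name_or_path.rstrip("/"))
    return f"{base}_{action_name}"


print(result_name("meta-llama/Meta-Llama-3-8B", "d2o_0.2"))
```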

For tasks that rely on the lm-evaluation-harness GitHub repository, we recommend using the latest version:

```bash
git clone https://github.com/EleutherAI/lm-evaluation-harness.git
```

Then, follow the installation instructions provided in that repository and execute our algorithm accordingly.

Citation

```bibtex
@article{wan2024d2o,
  title={D2o: Dynamic discriminative operations for efficient generative inference of large language models},
  author={Wan, Zhongwei and Wu, Xinjian and Zhang, Yu and Xin, Yi and Tao, Chaofan and Zhu, Zhihong and Wang, Xin and Luo, Siqi and Xiong, Jing and Zhang, Mi},
  journal={arXiv preprint arXiv:2406.13035},
  year={2024}
}
```

or

```bibtex
@inproceedings{wan2025text,
  title={$\text{D}_{2}\text{O}$: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models},
  author={Wan, Zhongwei and Wu, Xinjian and Zhang, Yu and Xin, Yi and Tao, Chaofan and Zhu, Zhihong and Wang, Xin and Luo, Siqi and Xiong, Jing and Wang, Longyue and others},
  booktitle={The Thirteenth International Conference on Learning Representations}
}
```

Owner

  • Name: OSU AIoT-MLSys Lab
  • Login: AIoT-MLSys-Lab
  • Kind: organization
  • Location: United States of America

GitHub Events

Total
  • Watch event: 10
  • Push event: 3
  • Fork event: 2
Last Year
  • Watch event: 10
  • Push event: 3
  • Fork event: 2

Dependencies

LLM_merge_new/helm/docs/requirements.txt pypi
  • mkdocs ==1.4.2
  • mkdocs-include-markdown-plugin ==4.0.0
  • mkdocs-macros-plugin ==0.7.0
  • mkdocstrings ==0.19.0
LLM_merge_new/helm/pyproject.toml pypi
LLM_merge_new/helm/requirements-dev.txt pypi
  • black * development
  • flake8 * development
  • mypy * development
  • pre-commit * development
  • pytest * development
LLM_merge_new/helm/requirements-freeze.txt pypi
  • 2captcha-python ==1.1.3
  • Cython ==0.29.32
  • Jinja2 ==3.1.2
  • Mako ==1.2.3
  • MarkupSafe ==2.1.1
  • Pillow ==9.3.0
  • PySocks ==1.7.1
  • PyYAML ==6.0
  • absl-py ==1.2.0
  • aiodns ==3.0.0
  • aiohttp ==3.8.3
  • aiohttp-retry ==2.8.3
  • aiosignal ==1.2.0
  • aleph-alpha-client ==2.14.0
  • anthropic ==0.2.5
  • async-generator ==1.10
  • async-timeout ==4.0.2
  • attrs ==22.1.0
  • beautifulsoup4 ==4.11.1
  • bert-score ==0.3.11
  • bitarray ==2.7.3
  • black ==22.10.0
  • blanc ==0.2.7
  • blis ==0.7.8
  • boto3 ==1.24.89
  • botocore ==1.27.89
  • bottle ==0.12.23
  • cachetools ==5.2.0
  • catalogue ==2.0.8
  • cattrs ==22.2.0
  • certifi ==2022.12.7
  • cffi ==1.15.1
  • cfgv ==3.3.1
  • charset-normalizer ==2.1.1
  • click ==8.0.4
  • colorama ==0.4.5
  • contourpy ==1.0.5
  • cycler ==0.11.0
  • cymem ==2.0.6
  • dacite ==1.6.0
  • datasets ==2.5.2
  • dill ==0.3.5.1
  • distlib ==0.3.6
  • emoji ==2.1.0
  • et-xmlfile ==1.1.0
  • exceptiongroup ==1.1.0
  • filelock ==3.8.0
  • flake8 ==5.0.4
  • fonttools ==4.37.4
  • frozenlist ==1.3.1
  • fsspec ==2022.8.2
  • gdown ==4.4.0
  • gevent ==21.12.0
  • gin-config ==0.5.0
  • google-api-core ==2.10.1
  • google-api-python-client ==2.64.0
  • google-auth ==2.12.0
  • google-auth-httplib2 ==0.1.0
  • googleapis-common-protos ==1.56.4
  • greenlet ==1.1.3
  • gunicorn ==20.1.0
  • h11 ==0.14.0
  • httplib2 ==0.20.4
  • huggingface-hub ==0.11.0
  • icetk ==0.0.4
  • identify ==2.5.6
  • idna ==3.4
  • importlib-metadata ==6.0.0
  • importlib-resources ==5.10.0
  • iniconfig ==1.1.1
  • jmespath ==1.0.1
  • joblib ==1.2.0
  • jsonlines ==3.1.0
  • kiwisolver ==1.4.4
  • langcodes ==3.3.0
  • llvmlite ==0.39.1
  • lxml ==4.9.1
  • matplotlib ==3.6.0
  • mccabe ==0.7.0
  • moverscore ==1.0.3
  • mpmath ==1.2.1
  • multidict ==6.0.2
  • multiprocess ==0.70.13
  • murmurhash ==1.0.8
  • mypy ==0.982
  • mypy-extensions ==0.4.3
  • networkx ==2.8.7
  • nltk ==3.7
  • nodeenv ==1.7.0
  • numba ==0.56.4
  • numpy ==1.23.3
  • openai ==0.27.0
  • openpyxl ==3.0.10
  • outcome ==1.2.0
  • packaging ==21.3
  • pandas ==1.5.0
  • pandas-stubs ==1.5.0.221003
  • parameterized ==0.8.1
  • pathspec ==0.10.1
  • pathy ==0.6.2
  • platformdirs ==2.5.2
  • pluggy ==1.0.0
  • portalocker ==2.5.1
  • pre-commit ==2.20.0
  • preshed ==3.0.7
  • protobuf ==3.20.2
  • psutil ==5.9.2
  • pyarrow ==9.0.0
  • pyasn1 ==0.4.8
  • pyasn1-modules ==0.2.8
  • pycares ==4.3.0
  • pycodestyle ==2.9.1
  • pycparser ==2.21
  • pydantic ==1.8.2
  • pyemd ==0.5.1
  • pyext ==0.7
  • pyflakes ==2.5.0
  • pyhocon ==0.3.59
  • pymongo ==4.2.0
  • pyparsing ==2.4.7
  • pytest ==7.2.0
  • python-dateutil ==2.8.2
  • pytorch-pretrained-bert ==0.6.2
  • pytrec-eval ==0.5
  • pytz ==2022.4
  • regex ==2022.9.13
  • requests ==2.28.1
  • responses ==0.18.0
  • retrying ==1.3.3
  • revChatGPT ==0.1.1
  • rouge-score ==0.1.2
  • rsa ==4.9
  • s3transfer ==0.6.0
  • sacrebleu ==2.2.1
  • sacremoses ==0.0.53
  • scikit-learn ==1.1.2
  • scipy ==1.9.1
  • selenium ==4.8.0
  • sentencepiece ==0.1.97
  • six ==1.16.0
  • sklearn ==0.0
  • smart-open ==5.2.1
  • sniffio ==1.3.0
  • sortedcontainers ==2.4.0
  • soupsieve ==2.3.2.post1
  • spacy ==3.2.4
  • spacy-legacy ==3.0.10
  • spacy-loggers ==1.0.3
  • sqlitedict ==1.7.0
  • srsly ==2.4.4
  • stanza ==1.4.2
  • summ-eval ==0.892
  • surge-api ==1.1.0
  • sympy ==1.11.1
  • tabulate ==0.9.0
  • thinc ==8.0.17
  • threadpoolctl ==3.1.0
  • tiktoken ==0.3.3
  • tls-client ==0.1.8
  • tokenizers ==0.13.2
  • toml ==0.10.2
  • tomli ==2.0.1
  • torch ==1.12.1
  • torchvision ==0.13.1
  • tqdm ==4.64.1
  • transformers ==4.28.1
  • trio ==0.22.0
  • trio-websocket ==0.9.2
  • typer ==0.4.2
  • types-pytz ==2022.4.0.0
  • types-redis ==4.3.21.1
  • types-requests ==2.28.11.2
  • types-tabulate ==0.9.0.0
  • types-urllib3 ==1.26.25
  • typing ==3.7.4.3
  • typing_extensions ==4.4.0
  • uncertainty-calibration ==0.1.3
  • undetected-chromedriver ==3.2.1
  • uritemplate ==4.1.1
  • urllib3 ==1.26.12
  • virtualenv ==20.16.5
  • wasabi ==0.10.1
  • websocket-client ==1.3.2
  • websockets ==10.4
  • wsproto ==1.2.0
  • xlrd ==2.0.1
  • xxhash ==3.0.0
  • yarl ==1.8.1
  • zipp ==3.11.0
  • zope.event ==4.5.0
  • zope.interface ==5.4.0
  • zstandard ==0.18.0
LLM_merge_new/helm/requirements.txt pypi
  • Mako *
  • aleph-alpha-client *
  • anthropic *
  • bottle *
  • cattrs *
  • colorcet *
  • dacite *
  • datasets *
  • gdown *
  • google-api-python-client *
  • gunicorn *
  • icetk *
  • importlib-resources *
  • jsonlines *
  • matplotlib *
  • nltk *
  • numba *
  • numpy *
  • openai *
  • protobuf *
  • pyext *
  • pyhocon *
  • pymongo *
  • pytrec_eval ==0.5
  • retrying *
  • revChatGPT *
  • rouge-score *
  • sacrebleu *
  • scikit-learn *
  • scipy *
  • seaborn *
  • sentencepiece *
  • spacy *
  • sqlitedict *
  • summ-eval *
  • surge-api *
  • sympy *
  • tiktoken *
  • tokenizers *
  • torch *
  • torchvision *
  • tqdm *
  • transformers *
  • uncertainty-calibration *
  • websocket-client *
  • xlrd *
  • zstandard *
LLM_merge_new/helm/setup.py pypi
LLM_merge_new/lm-evaluation-harness/pyproject.toml pypi
  • accelerate >=0.21.0
  • datasets >=2.0.0
  • evaluate *
  • evaluate >=0.4.0
  • jsonlines *
  • numexpr *
  • peft >=0.2.0
  • pybind11 >=2.6.2
  • pytablewriter *
  • rouge-score >=0.0.4
  • sacrebleu >=1.5.0
  • scikit-learn >=0.24.1
  • sqlitedict *
  • torch >=1.8
  • tqdm-multiprocess *
  • transformers >=4.1
  • zstandard *
LLM_merge_new/lm-evaluation-harness/requirements.txt pypi
LLM_merge_new/lm-evaluation-harness/setup.py pypi
LLM_merge_new/lm-evaluation-harness/src/lm-eval/pyproject.toml pypi
  • accelerate >=0.21.0
  • datasets >=2.0.0
  • evaluate *
  • evaluate >=0.4.0
  • jsonlines *
  • numexpr *
  • peft >=0.2.0
  • pybind11 >=2.6.2
  • pytablewriter *
  • rouge-score >=0.0.4
  • sacrebleu >=1.5.0
  • scikit-learn >=0.24.1
  • sqlitedict *
  • torch >=1.8
  • tqdm-multiprocess *
  • transformers >=4.1
  • zstandard *
LLM_merge_new/lm-evaluation-harness/src/lm-eval/requirements.txt pypi
LLM_merge_new/lm-evaluation-harness/src/lm-eval/setup.py pypi
requirements.txt pypi
  • DataProperty ==1.0.1
  • Jinja2 ==3.1.2
  • MarkupSafe ==2.1.3
  • PyYAML ==6.0.1
  • Pygments ==2.17.2
  • absl-py ==2.0.0
  • accelerate ==0.25.0
  • aiofiles ==23.2.1
  • aiohttp ==3.9.1
  • aiosignal ==1.3.1
  • altair ==5.2.0
  • annotated-types ==0.6.0
  • anyio ==4.3.0
  • asttokens ==2.4.1
  • async-timeout ==4.0.3
  • attributedict ==0.3.0
  • attrs ==23.2.0
  • bitsandbytes ==0.43.0
  • blessings ==1.7
  • cachetools ==5.3.2
  • certifi ==2023.11.17
  • chardet ==5.2.0
  • charset-normalizer ==3.3.2
  • click ==8.1.7
  • codecov ==2.1.13
  • colorama ==0.4.6
  • coloredlogs ==15.0.1
  • colour-runner ==0.1.1
  • contourpy ==1.2.0
  • coverage ==7.4.0
  • cycler ==0.12.1
  • datasets ==2.16.1
  • decorator ==5.1.1
  • deepdiff ==6.7.1
  • deepspeed ==0.12.6
  • dill ==0.3.7
  • distlib ==0.3.8
  • distro ==1.9.0
  • einops ==0.7.0
  • evaluate ==0.4.1
  • exceptiongroup ==1.2.0
  • executing ==2.0.1
  • fastapi ==0.110.0
  • ffmpy ==0.3.2
  • filelock ==3.13.1
  • flash-attn ==2.5.6
  • fonttools ==4.50.0
  • frozenlist ==1.4.1
  • fsspec ==2023.10.0
  • fuzzywuzzy ==0.18.0
  • gradio ==3.35.2
  • gradio_client ==0.2.9
  • h11 ==0.14.0
  • hjson ==3.1.0
  • httpcore ==1.0.4
  • httpx ==0.27.0
  • huggingface-hub ==0.20.2
  • humanfriendly ==10.0
  • idna ==3.6
  • inspecta ==0.1.3
  • ipdb ==0.13.13
  • ipython ==8.19.0
  • jedi ==0.19.1
  • jieba ==0.42.1
  • joblib ==1.3.2
  • jsonlines ==4.0.0
  • jsonschema ==4.21.1
  • jsonschema-specifications ==2023.12.1
  • kiwisolver ==1.4.5
  • linkify-it-py ==2.0.3
  • lxml ==5.0.1
  • markdown-it-py ==2.2.0
  • matplotlib ==3.8.3
  • matplotlib-inline ==0.1.6
  • mbstrdecoder ==1.1.3
  • mdit-py-plugins ==0.3.3
  • mdurl ==0.1.2
  • mpmath ==1.3.0
  • multidict ==6.0.4
  • multiprocess ==0.70.15
  • networkx ==3.2.1
  • ninja ==1.11.1.1
  • nltk ==3.8.1
  • numexpr ==2.8.8
  • numpy ==1.26.3
  • nvidia-cublas-cu12 ==12.1.3.1
  • nvidia-cuda-cupti-cu12 ==12.1.105
  • nvidia-cuda-nvrtc-cu12 ==12.1.105
  • nvidia-cuda-runtime-cu12 ==12.1.105
  • nvidia-cudnn-cu12 ==8.9.2.26
  • nvidia-cufft-cu12 ==11.0.2.54
  • nvidia-curand-cu12 ==10.3.2.106
  • nvidia-cusolver-cu12 ==11.4.5.107
  • nvidia-cusparse-cu12 ==12.1.0.106
  • nvidia-nccl-cu12 ==2.18.1
  • nvidia-nvjitlink-cu12 ==12.3.101
  • nvidia-nvtx-cu12 ==12.1.105
  • openai ==1.14.2
  • ordered-set ==4.1.0
  • orjson ==3.9.15
  • packaging ==24.0
  • packaging ==23.2
  • pandas ==2.1.4
  • parso ==0.8.3
  • pathvalidate ==3.2.0
  • peft ==0.7.1
  • pexpect ==4.9.0
  • pillow ==10.2.0
  • platformdirs ==4.1.0
  • pluggy ==1.3.0
  • portalocker ==2.8.2
  • prompt-toolkit ==3.0.43
  • protobuf ==4.25.1
  • psutil ==5.9.7
  • ptyprocess ==0.7.0
  • pure-eval ==0.2.2
  • py-cpuinfo ==9.0.0
  • pyarrow ==14.0.2
  • pyarrow-hotfix ==0.6
  • pybind11 ==2.11.1
  • pycountry ==23.12.11
  • pydantic ==1.10.14
  • pydantic_core ==2.14.6
  • pydub ==0.25.1
  • pynvml ==11.5.0
  • pyparsing ==3.1.2
  • pyproject-api ==1.6.1
  • pytablewriter ==1.2.0
  • python-dateutil ==2.8.2
  • python-multipart ==0.0.9
  • pytz ==2023.3.post1
  • referencing ==0.34.0
  • regex ==2023.12.25
  • requests ==2.31.0
  • responses ==0.18.0
  • rootpath ==0.1.1
  • rouge ==1.0.1
  • rouge-score ==0.1.2
  • rpds-py ==0.18.0
  • sacrebleu ==1.5.0
  • safetensors ==0.4.1
  • scikit-learn ==1.3.2
  • scipy ==1.11.4
  • semantic-version ==2.10.0
  • sentencepiece ==0.1.99
  • six ==1.16.0
  • sniffio ==1.3.1
  • sqlitedict ==2.1.0
  • stack-data ==0.6.3
  • starlette ==0.36.3
  • sympy ==1.12
  • tabledata ==1.3.3
  • tabulate ==0.9.0
  • tcolorpy ==0.1.4
  • termcolor ==2.4.0
  • texttable ==1.7.0
  • threadpoolctl ==3.2.0
  • tokenizers ==0.15.0
  • toml ==0.10.2
  • tomli ==2.0.1
  • toolz ==0.12.1
  • torch ==2.1.2
  • torchvision ==0.16.2
  • tox ==4.11.4
  • tqdm ==4.66.1
  • tqdm-multiprocess ==0.0.11
  • traitlets ==5.14.1
  • transformers ==4.37.2
  • triton ==2.1.0
  • typepy ==1.3.2
  • typing_extensions ==4.9.0
  • tzdata ==2023.4
  • uc-micro-py ==1.0.3
  • urllib3 ==2.1.0
  • uvicorn ==0.29.0
  • virtualenv ==20.25.0
  • wcwidth ==0.2.13
  • websockets ==12.0
  • xxhash ==3.4.1
  • yarl ==1.9.4
  • zstandard ==0.22.0
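
The listing for the top-level `requirements.txt` shows `packaging` pinned twice (24.0 and 23.2), which is the kind of conflict that can make an install fail. The sketch below scans requirement lines for packages pinned more than once. It only handles simple `name==version` lines, and the function name `duplicate_pins` is hypothetical.

```python
import re
from collections import defaultdict


def duplicate_pins(lines):
    """Map each package pinned more than once to its list of versions."""
    seen = defaultdict(list)
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        match = re.match(r"^([A-Za-z0-9._-]+)\s*==\s*(\S+)$", line)
        if match:
            # Normalize names as package indexes do: lowercase, with
            # runs of "-", "_" and "." treated as a single "-".
            name = re.sub(r"[-_.]+", "-", match.group(1)).lower()
            seen[name].append(match.group(2))
    return {name: versions for name, versions in seen.items() if len(versions) > 1}


sample = ["packaging ==24.0", "packaging ==23.2", "pandas ==2.1.4"]
print(duplicate_pins(sample))
```

To check a real file, pass its lines, for example `duplicate_pins(open("requirements.txt"))`.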