https://github.com/amazon-science/dq-bart

DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization (ACL 2022)

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.4%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: amazon-science
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 29.3 KB
Statistics
  • Stars: 50
  • Watchers: 3
  • Forks: 10
  • Open Issues: 1
  • Releases: 0
Created over 4 years ago · Last pushed about 3 years ago
Metadata Files
  • Readme
  • Contributing
  • License
  • Code of conduct

README.md

DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization

This repository contains the authors' implementation of the ACL 2022 paper "DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization."

Requirements

  • Install PyTorch from the official website.
  • Install dependencies via pip install -r requirements.txt.
  • The teacher model should be available locally, e.g., by downloading it manually from the Hugging Face model hub.
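
Before launching a long training run, it can help to confirm that the local teacher directory looks like a saved checkpoint. The sketch below is a hypothetical helper, not part of this repository. It assumes the conventional Hugging Face file names (`config.json` plus a single, unsharded weights file) and does not check tokenizer files.

```python
from pathlib import Path

# Assumption: a locally saved checkpoint holds config.json and one weights
# file under one of these conventional names (sharded checkpoints differ).
WEIGHT_FILES = ("pytorch_model.bin", "model.safetensors")


def missing_teacher_files(model_dir):
    """Return the names of expected files that are absent from model_dir."""
    root = Path(model_dir)
    missing = []
    if not (root / "config.json").is_file():
        missing.append("config.json")
    if not any((root / name).is_file() for name in WEIGHT_FILES):
        missing.append(" or ".join(WEIGHT_FILES))
    return missing
```

An empty list from `missing_teacher_files("./bart-base-cnn")` (the path is only an example) suggests the directory can be passed to `--model_name_or_path`.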

Sample Command

  • The following command trains an 8-8-8 3-1 model on the CNN/DailyMail dataset. Going by the flags, 8 is the quantization bit width (--weight_bits 8) and 3-1 is the number of distilled encoder and decoder layers (--distill_encoder 3, --distill_decoder 1). You may use accelerate for distributed training.

```bash
python3 run_summarization_no_trainer.py \
  --model_name_or_path ainize/bart-base-cnn \
  --dataset_name cnn_dailymail \
  --dataset_config_name 3.0.0 \
  --pred_distill \
  --intermediate_distill \
  --num_train_epochs 20 \
  --weight_bits 8 \
  --do_train \
  --do_test \
  --distill_encoder 3 \
  --distill_decoder 1 \
  --learning_rate 3e-5
```

Citation

You may cite our work using

```
@inproceedings{li2022dqbart,
  title={DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization},
  author={Li, Zheng and Wang, Zijian and Tan, Ming and Nallapati, Ramesh and Bhatia, Parminder and Arnold, Andrew and Xiang, Bing and Roth, Dan},
  booktitle={Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
  pages={203--211},
  year={2022}
}
```
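
On the accelerate note under Sample Command: the README does not show a distributed invocation, so the following is only a sketch of one plausible form. It assumes the script can be started with the `accelerate launch` subcommand of the accelerate CLI (usually after running `accelerate config` once). `build_command` is a hypothetical helper, and the flag values are copied from the sample command.

```python
import shlex

# Flags copied from the sample command; None marks a switch with no value.
SAMPLE_ARGS = {
    "--model_name_or_path": "ainize/bart-base-cnn",
    "--dataset_name": "cnn_dailymail",
    "--dataset_config_name": "3.0.0",
    "--pred_distill": None,
    "--intermediate_distill": None,
    "--num_train_epochs": "20",
    "--weight_bits": "8",
    "--do_train": None,
    "--do_test": None,
    "--distill_encoder": "3",
    "--distill_decoder": "1",
    "--learning_rate": "3e-5",
}


def build_command(args=SAMPLE_ARGS, distributed=False):
    """Return the argv list for a single-process or accelerate-launched run."""
    launcher = ["accelerate", "launch"] if distributed else ["python3"]
    argv = launcher + ["run_summarization_no_trainer.py"]
    for flag, value in args.items():
        argv.append(flag)
        if value is not None:
            argv.append(value)
    return argv


if __name__ == "__main__":
    # Print a copy-pasteable shell line for the distributed variant.
    print(shlex.join(build_command(distributed=True)))
```

The returned list can also be passed directly to `subprocess.run`, which avoids shell quoting issues.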

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.

Owner

  • Name: Amazon Science
  • Login: amazon-science
  • Kind: organization

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1

Issues and Pull Requests

Last synced: over 1 year ago

All Time
  • Total issues: 7
  • Total pull requests: 2
  • Average time to close issues: 25 days
  • Average time to close pull requests: 2 months
  • Total issue authors: 6
  • Total pull request authors: 2
  • Average comments per issue: 1.86
  • Average comments per pull request: 0.5
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 1
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • raypretam (2)
  • wnntju (1)
  • rana-alshaikh (1)
  • sanchit-gandhi (1)
  • jmdu99 (1)
  • sinking-point (1)
Pull Request Authors
  • dependabot[bot] (1)
  • Gooogr (1)
Top Labels
Issue Labels
Pull Request Labels
  • dependencies (1)

Dependencies

requirements.txt pypi
  • accelerate ==0.5.1
  • datasets ==1.18.4
  • nltk *
  • rouge_score *
  • sacrebleu ==2.0
  • setuptools <50
  • tensorboard *
  • transformers ==4.17.0
  • wandb *
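
Several of these dependencies are pinned to exact versions, so a mismatched environment is a plausible source of errors. The sketch below is not part of the repository. It is a hypothetical check that compares installed versions against the exact pins listed above, using the standard-library `importlib.metadata`. Version comparison is deliberately simplified (trailing `.0` components are ignored) and is not a full PEP 440 implementation. The `setuptools <50` constraint and the unpinned packages are not checked.

```python
from importlib import metadata

# Exact pins copied from the requirements.txt listing above.
PINNED = {
    "accelerate": "0.5.1",
    "datasets": "1.18.4",
    "sacrebleu": "2.0",
    "transformers": "4.17.0",
}


def _normalize(version):
    """Drop trailing zero components so that '2.0' and '2.0.0' compare equal."""
    parts = version.split(".")
    while len(parts) > 1 and parts[-1] == "0":
        parts.pop()
    return parts


def pin_mismatches(pins=PINNED, installed_version=metadata.version):
    """Map each package that misses its pin to a (wanted, found) tuple."""
    problems = {}
    for name, wanted in pins.items():
        try:
            found = installed_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found is None or _normalize(found) != _normalize(wanted):
            problems[name] = (wanted, found)
    return problems
```

Calling `pin_mismatches()` with no arguments inspects the current environment. An empty dictionary means every exact pin is satisfied.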