https://github.com/amazon-science/dq-bart
DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization (ACL 2022)
Science Score: 36.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (8.4%) to scientific vocabulary
Last synced: 10 months ago
Repository
Basic Info
Statistics
- Stars: 50
- Watchers: 3
- Forks: 10
- Open Issues: 1
- Releases: 0
- Created: over 4 years ago
- Last pushed: about 3 years ago
Metadata Files
Readme
Contributing
License
Code of conduct
README.md
DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization
This repository contains the authors' implementation of the ACL 2022 paper "DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization."
Requirements
- Install PyTorch from the official website.
- Install dependencies via `pip install -r requirements.txt`.
- The teacher model should be available locally, e.g., by downloading it manually from the Hugging Face model hub.
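One way to make the teacher model available locally is to save a copy from the hub with the pinned `transformers` release. The following is a minimal sketch, not the authors' procedure: the helper names `teacher_dir` and `save_teacher_locally` and the `teachers/` folder are hypothetical, and it assumes the training script accepts a local directory for `--model_name_or_path`, as that option name conventionally allows.

```python
from pathlib import Path


def teacher_dir(model_id: str, root: str = "teachers") -> Path:
    """Map a hub model id to a local folder (hypothetical layout)."""
    return Path(root) / model_id.replace("/", "__")


def save_teacher_locally(model_id: str, root: str = "teachers") -> Path:
    """Download the teacher once and store the tokenizer and weights on disk."""
    # Imported lazily so the path helper works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    target = teacher_dir(model_id, root)
    target.mkdir(parents=True, exist_ok=True)
    AutoTokenizer.from_pretrained(model_id).save_pretrained(target)
    AutoModelForSeq2SeqLM.from_pretrained(model_id).save_pretrained(target)
    return target


if __name__ == "__main__":
    # Requires network access; the model id is the one used in the sample command.
    print(save_teacher_locally("ainize/bart-base-cnn"))
```

The printed directory could then be passed in place of the hub id, assuming the script resolves local paths.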
Sample Command
- The following command will train an `8-8-8 3-1` model on the CNN/DailyMail dataset. You may use accelerate for distributed training.

```bash
python3 run_summarization_no_trainer.py \
  --model_name_or_path ainize/bart-base-cnn \
  --dataset_name cnn_dailymail \
  --dataset_config_name 3.0.0 \
  --pred_distill \
  --intermediate_distill \
  --num_train_epochs 20 \
  --weight_bits 8 \
  --do_train \
  --do_test \
  --distill_encoder 3 \
  --distill_decoder 1 \
  --learning_rate 3e-5
```

Citation

You may cite our work using

```
@inproceedings{li2022dqbart,
  title={DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization},
  author={Li, Zheng and Wang, Zijian and Tan, Ming and Nallapati, Ramesh and Bhatia, Parminder and Arnold, Andrew and Xiang, Bing and Roth, Dan},
  booktitle={Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
  pages={203--211},
  year={2022}
}
```
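Returning to the sample command under Sample Command: since the README notes that accelerate may be used for distributed training, the same arguments can be assembled programmatically and prefixed with `accelerate launch`. This is only a sketch. The `build_command` helper is hypothetical, the option values are copied from the sample command, and it assumes `accelerate` has already been configured on the machine.

```python
import shlex


def build_command(script, options, flags, use_accelerate=False):
    """Assemble an argv list for the training script (hypothetical helper)."""
    launcher = ["accelerate", "launch"] if use_accelerate else ["python3"]
    argv = launcher + [script]
    for name, value in options.items():
        argv += [f"--{name}", str(value)]
    argv += [f"--{flag}" for flag in flags]
    return argv


# Values taken from the sample command; kept as strings to preserve formatting.
OPTIONS = {
    "model_name_or_path": "ainize/bart-base-cnn",
    "dataset_name": "cnn_dailymail",
    "dataset_config_name": "3.0.0",
    "num_train_epochs": "20",
    "weight_bits": "8",
    "distill_encoder": "3",
    "distill_decoder": "1",
    "learning_rate": "3e-5",
}
FLAGS = ["pred_distill", "intermediate_distill", "do_train", "do_test"]

cmd = build_command("run_summarization_no_trainer.py", OPTIONS, FLAGS, use_accelerate=True)

if __name__ == "__main__":
    # Print a copy-pastable shell line; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Building the arguments as a list rather than a single string avoids shell-quoting mistakes when the command is later run with `subprocess.run(cmd)`.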
Security
See CONTRIBUTING for more information.
License
This project is licensed under the Apache-2.0 License.
Owner
- Name: Amazon Science
- Login: amazon-science
- Kind: organization
- Website: https://amazon.science
- Twitter: AmazonScience
- Repositories: 80
- Profile: https://github.com/amazon-science
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1
Issues and Pull Requests
Last synced: over 1 year ago
All Time
- Total issues: 7
- Total pull requests: 2
- Average time to close issues: 25 days
- Average time to close pull requests: 2 months
- Total issue authors: 6
- Total pull request authors: 2
- Average comments per issue: 1.86
- Average comments per pull request: 0.5
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 1
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- raypretam (2)
- wnntju (1)
- rana-alshaikh (1)
- sanchit-gandhi (1)
- jmdu99 (1)
- sinking-point (1)
Pull Request Authors
- dependabot[bot] (1)
- Gooogr (1)
Top Labels
Issue Labels
Pull Request Labels
- dependencies (1)
Dependencies
requirements.txt
pypi
- accelerate ==0.5.1
- datasets ==1.18.4
- nltk *
- rouge_score *
- sacrebleu ==2.0
- setuptools <50
- tensorboard *
- transformers ==4.17.0
- wandb *