https://github.com/amazon-science/adasum
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found)
- ✓ .zenodo.json file (found)
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (12.2%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: amazon-science
- License: other
- Language: Python
- Default Branch: main
- Size: 439 KB
Statistics
- Stars: 5
- Watchers: 8
- Forks: 1
- Open Issues: 3
- Releases: 0
Metadata Files
README.md
Few-shot Fine-tuning for Opinion Summarization
This repository contains the main codebase for the corresponding Findings of NAACL 2022 paper. In this work, we store in-domain knowledge in adapters by pre-training them on customer reviews with a leave-one-out objective. We then fine-tune the pre-trained adapters on a handful of summaries. This method yields state-of-the-art ROUGE scores and reduces semantic mistakes in generated summaries.
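As a rough illustration of a leave-one-out objective over customer reviews, the sketch below builds training pairs in which each review of a product takes a turn as the target while the remaining reviews serve as the source. The function name `leave_one_out_pairs` and the sample reviews are hypothetical and are not taken from this codebase, which may construct and filter its pairs differently.

```python
from typing import List, Tuple


def leave_one_out_pairs(reviews: List[str]) -> List[Tuple[List[str], str]]:
    """Build (source_reviews, target_review) pairs for one product.

    Each review is held out once as a pseudo-summary target; the other
    reviews form the source. Illustrative only, not the repository's code.
    """
    pairs = []
    for i, target in enumerate(reviews):
        # All reviews except the i-th one become the source.
        source = reviews[:i] + reviews[i + 1:]
        pairs.append((source, target))
    return pairs


# Hypothetical reviews for a single product.
reviews = ["Great battery life.", "Screen is too dim.", "Fast shipping."]
for source, target in leave_one_out_pairs(reviews):
    print(source, "->", target)
```

A product with N reviews yields N pairs under this scheme, so no labeled summaries are needed at the pre-training stage.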
1. Conda environment
In this project, we used conda for environments. To re-create the environment, use the command below.
conda env create --file environment.yml
Then, activate it:
conda activate adasum
2. FAIRSEQ
The codebase relies on FAIRSEQ, which can be downloaded and installed in a parent folder as follows.
```
git clone https://github.com/pytorch/fairseq.git
mv fairseq fairseqlib
cd fairseqlib
git reset --hard 81046fc
pip install --editable ./
```
Make sure you use the commit shown above to avoid incompatibility issues. Also, set the following environment variable.
export MKL_THREADING_LAYER=GNU
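If exporting the variable in the shell is inconvenient (for example in a notebook), a minimal sketch of setting and checking it from Python follows. The helper `mkl_layer_is_gnu` is a hypothetical name, not part of this repository. The sketch assumes the variable has to be in place before NumPy or PyTorch is imported in the same process, since MKL typically reads it when it is first loaded.

```python
import os

# Set the variable only if the shell has not already provided a value.
# Assumption: this runs before NumPy/PyTorch are imported in this process.
os.environ.setdefault("MKL_THREADING_LAYER", "GNU")


def mkl_layer_is_gnu(env=os.environ) -> bool:
    """Return True when MKL_THREADING_LAYER is set to GNU in `env`."""
    return env.get("MKL_THREADING_LAYER") == "GNU"


if not mkl_layer_is_gnu():
    print("Warning: MKL_THREADING_LAYER is not GNU:",
          os.environ.get("MKL_THREADING_LAYER"))
```

Note that `setdefault` leaves an existing value untouched, so a conflicting setting from the shell is reported rather than silently overwritten.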
3. Folder structure
The main codebase is stored in adasum.
- artifacts: checkpoints and model-generated summaries (checkpoints need to be downloaded separately);
- data: contains pre-training and fine-tuning datasets (see the preprocessing folder for instructions on how to obtain the data);
- adasum: fairseq files for adasum and adaqsum models;
- preprocessing: scripts for data pre-processing;
- shared: files shared between adasum and preprocessing scripts.
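To confirm that a checkout matches the layout listed above, a small check such as the following can be run. It is only a sketch: the helper `missing_dirs` is hypothetical, and the directory that contains these folders is an assumption, so pass whichever root applies to your checkout.

```python
from pathlib import Path
from typing import Iterable, List

# Folder names taken from the list above.
EXPECTED_DIRS = ("artifacts", "data", "adasum", "preprocessing", "shared")


def missing_dirs(root: Path, expected: Iterable[str] = EXPECTED_DIRS) -> List[str]:
    """Return the expected folder names that do not exist under `root`."""
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    # Assumption: the script is run from the directory holding the folders.
    missing = missing_dirs(Path("."))
    print("missing:", ", ".join(missing) if missing else "none")
```

Missing artifacts or data folders are expected on a fresh clone, since checkpoints and datasets are obtained separately.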
4. Citation
@inproceedings{brazinskas-etal-2022-efficient,
title = "Efficient Few-Shot Fine-Tuning for Opinion Summarization",
author = "Brazinskas, Arthur and
Nallapati, Ramesh and
Bansal, Mohit and
Dreyer, Markus",
booktitle = "Findings of the Association for Computational Linguistics: NAACL 2022",
month = jul,
year = "2022",
address = "Seattle, United States",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.findings-naacl.113",
pages = "1509--1523"
}
5. Security
See CONTRIBUTING for more information.
6. License
This project is licensed under the CC-BY-NC-4.0 License.
Owner
- Name: Amazon Science
- Login: amazon-science
- Kind: organization
- Website: https://amazon.science
- Twitter: AmazonScience
- Repositories: 80
- Profile: https://github.com/amazon-science
Issues and Pull Requests
Last synced: over 1 year ago
All Time
- Total issues: 10
- Total pull requests: 4
- Average time to close issues: 5 days
- Average time to close pull requests: about 18 hours
- Total issue authors: 3
- Total pull request authors: 1
- Average comments per issue: 0.6
- Average comments per pull request: 0.0
- Merged pull requests: 3
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- avshalomman (3)
- Zypressen021 (2)
- pulkitbv (1)
Pull Request Authors
- abrazinskas (3)
Dependencies
- antlr4-python3-runtime ==4.8
- cffi ==1.14.5
- click ==8.0.1
- cython ==0.29.23
- filelock ==3.0.12
- hydra-core ==1.0.6
- importlib-metadata ==4.5.0
- importlib-resources ==5.1.4
- joblib ==1.0.1
- nltk ==3.6.2
- omegaconf ==2.0.6
- pandas ==1.2.5
- portalocker ==2.0.0
- psutil ==5.8.0
- pycparser ==2.20
- pyrouge ==0.1.3
- python-dateutil ==2.8.1
- pytz ==2021.1
- pyyaml ==5.4.1
- regex ==2021.4.4
- sacrebleu ==1.5.1
- scipy ==1.7.1
- tqdm ==4.61.1
- zipp ==3.4.1