https://github.com/amazon-science/ceret-llm-refine
This is the code repository for paper CERET: Cost-Effective Extrinsic Refinement for Text Generation (NAACL 2024).
Science Score: 23.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found codemeta.json file)
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links (links to: arxiv.org)
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity (low similarity, 9.4%, to scientific vocabulary)
Repository
Statistics
- Stars: 5
- Watchers: 3
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
CERET
Authors: Jason Cai, Hang Su, Monica Sunkara, Igor Shalyminov, Saab Mansour
This is the code repository for the paper CERET: Cost-Effective Extrinsic Refinement for Text Generation (NAACL 2024). Paper: https://arxiv.org/abs/2406.05588
CERET is a framework for refining LLM predictions by considering semantic stability, entailment, and inter-sample uncertainty measures. This approach requires neither additional training nor iterative LLM inference.
Experimental results show that CERET significantly outperforms Self-consistency and Self-rerank baselines for abstractive summarization and question answering. Compared to various LLM self-improvement / self-reflection methods, our approach has lower latency and is more cost-effective.
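As a rough illustration of the idea, the sketch below reranks a set of candidate generations by a weighted sum of three per-candidate scores and keeps the highest-scoring one. This is a minimal sketch and not the repository's implementation: the function names `combine_scores` and `pick_best`, the assumption that every score is oriented so that higher is better, and the choice to give the third score the remaining weight `1 - coeff1 - coeff2` are all assumptions made for the example.

```python
from typing import List, Sequence


def combine_scores(
    stability: Sequence[float],
    entailment: Sequence[float],
    uncertainty: Sequence[float],
    coeff1: float = 0.3333,
    coeff2: float = 0.3333,
) -> List[float]:
    """Weighted sum of three per-candidate scores (hypothetical helper).

    Assumption: the third weight is whatever remains after coeff1 and coeff2,
    and all three scores are oriented so that higher means better.
    """
    coeff3 = 1.0 - coeff1 - coeff2
    return [
        coeff1 * s + coeff2 * e + coeff3 * u
        for s, e, u in zip(stability, entailment, uncertainty)
    ]


def pick_best(candidates: Sequence[str], scores: Sequence[float]) -> str:
    """Return the candidate with the highest combined score."""
    best_idx = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best_idx]


if __name__ == "__main__":
    # Toy values, invented for illustration only.
    candidates = ["summary A", "summary B", "summary C"]
    combined = combine_scores(
        stability=[0.2, 0.9, 0.5],
        entailment=[0.1, 0.8, 0.9],
        uncertainty=[0.3, 0.7, 0.2],
    )
    print(pick_best(candidates, combined))
```

Because the candidates are generated once and only rescored, a selection step of this kind needs no further LLM calls, which is consistent with the latency claim above.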
Setup
You can create a conda environment with the provided YAML file.
conda env create -f env/ceret_env_v05.yml
conda activate ceret_env
Alternatively, if you choose to configure the environment yourself, please ensure the following major libraries are installed:
bert_score
nltk
rouge
sentencepiece
torch
transformers
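For a manually configured environment, a quick check like the following can confirm that the libraries are importable before running anything heavier. It is a minimal sketch: it assumes each package's import name matches the name listed above, and `missing_modules` is a hypothetical helper, not part of the repository.

```python
import importlib.util
from typing import Iterable, List

# Assumption: import names are identical to the names listed in the README.
REQUIRED = ["bert_score", "nltk", "rouge", "sentencepiece", "torch", "transformers"]


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All required libraries are importable.")
```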
Inference Command
Below is the command for refining a subset of DialogSum predictions produced by Vicuna. Replace /PATH/TO/CERET-LLM-refine with the path to the CERET directory. For the input data schema, please refer to ${code_dir}/exp/dialog_sum/vicuna13b_dialogsum_test_1107/.
```
code_dir="/PATH/TO/CERET-LLM-refine"
cd "$code_dir"

conda activate ceret_env

stage1dir="${code_dir}/exp/dialog_sum/vicuna13b_dialogsum_test_1107"
stage2dir="${stage1dir}_refine1"

evalmode="gen"
mkdir -p "$stage2dir"
export PYTHONPATH="$code_dir"

# Flag spellings follow this page as extracted; confirm them against eval/refiner.py.
python eval/refiner.py \
    --inpath "${stage1dir}/data.json" \
    --ingroupedhypspath "${stage1dir}/groupedhyps.json" \
    --outpath "${stage2dir}/data.json" \
    --outgroupedhypspath "${stage2dir}/groupedhyps.json" \
    --tunecoeffrespath "${stage2dir}/tunecoeffres.json" \
    --logoutpath "${stage2dir}/log.txt" \
    --embsavepath "${stage2dir}/semanticemb.npy" \
    --nlisavepath "${stage2dir}/nlisaveres.json" \
    --evalmode "$evalmode" \
    --scorecoeff1 0.3333 \
    --scorecoeff2 0.3333 \
    --docoefficient_tuning true
```
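The command takes two score coefficients and a coefficient-tuning switch. How the script tunes them is not described here, so the following is only a sketch of one plausible approach: a grid search over non-negative weight triples that sum to 1, keeping the triple that maximizes an average quality metric. The names `weight_grid` and `tune`, the grid step, and the shape of `examples` are assumptions made for illustration.

```python
from itertools import product
from typing import Callable, Iterator, List, Sequence, Tuple

Weights = Tuple[float, float, float]
# One example: (candidate texts, one score triple per candidate, reference text).
Example = Tuple[Sequence[str], Sequence[Weights], str]


def weight_grid(step: float = 0.1) -> Iterator[Weights]:
    """Yield (w1, w2, w3) with non-negative weights that sum to 1."""
    n = round(1 / step)
    for i, j in product(range(n + 1), repeat=2):
        if i + j <= n:
            yield i / n, j / n, (n - i - j) / n


def tune(
    examples: List[Example],
    metric: Callable[[str, str], float],
    step: float = 0.1,
) -> Tuple[Weights, float]:
    """Return the weight triple with the best average metric, and that average."""
    best_w, best_total = (0.0, 0.0, 1.0), float("-inf")
    for w in weight_grid(step):
        total = 0.0
        for candidates, triples, reference in examples:
            # Combined score per candidate under the current weights.
            combined = [sum(wk * sk for wk, sk in zip(w, t)) for t in triples]
            chosen = candidates[max(range(len(candidates)), key=combined.__getitem__)]
            total += metric(chosen, reference)
        if total > best_total:
            best_w, best_total = w, total
    return best_w, best_total / len(examples)
```

In practice `metric` would be something like ROUGE against the reference; any callable that takes a hypothesis and a reference and returns a float fits this sketch.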
Example Output
2024-04-23 19:20:12,388 [INFO] ### Evaluating with best coefficient combo.
2024-04-23 19:20:12,426 [INFO] src_len: 128.76
2024-04-23 19:20:12,426 [INFO] ref_len: 20.914
2024-04-23 19:20:12,426 [INFO] hyp_len: 36.394
2024-04-23 19:20:12,690 [INFO] rouge_1: 35.302
2024-04-23 19:20:12,690 [INFO] rouge_2: 11.692
2024-04-23 19:20:12,690 [INFO] rouge_l: 28.918
2024-04-23 19:20:37,850 [INFO] bert_score, P: 25.202
2024-04-23 19:20:37,850 [INFO] bert_score, R: 36.440
2024-04-23 19:20:37,850 [INFO] bert_score, F1: 30.775
2024-04-23 19:20:37,850 [INFO] rouge_1/rouge_2/rouge_l/bert_score_F1/hyp_len:
2024-04-23 19:20:37,850 [INFO] 35.3, 11.7, 28.9, 30.8, 36.4
2024-04-23 19:20:37,943 [INFO] Oracle performance -- rouge_1: 43.098599764120834
2024-04-23 19:20:37,944 [INFO] ### Evaluating with input coefficient combo.
2024-04-23 19:20:37,977 [INFO] src_len: 128.76
2024-04-23 19:20:37,977 [INFO] ref_len: 20.914
2024-04-23 19:20:37,977 [INFO] hyp_len: 42.122
2024-04-23 19:20:38,271 [INFO] rouge_1: 34.688
2024-04-23 19:20:38,271 [INFO] rouge_2: 11.691
2024-04-23 19:20:38,271 [INFO] rouge_l: 28.443
2024-04-23 19:21:04,944 [INFO] bert_score, P: 21.965
2024-04-23 19:21:04,945 [INFO] bert_score, R: 36.807
2024-04-23 19:21:04,945 [INFO] bert_score, F1: 29.306
2024-04-23 19:21:04,945 [INFO] rouge_1/rouge_2/rouge_l/bert_score_F1/hyp_len:
2024-04-23 19:21:04,945 [INFO] 34.7, 11.7, 28.4, 29.3, 42.1
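To compare the two coefficient settings programmatically, the log can be parsed into one metrics dictionary per evaluation section. The sketch below assumes the log format is exactly as shown in the example output above; `parse_log` and the two regular expressions are illustrative helpers, not part of the repository.

```python
import re
from typing import Dict, Iterable

# Matches lines such as "[INFO] rouge_1: 35.302" or "[INFO] bert_score, F1: 30.775".
METRIC = re.compile(r"\[INFO\] (?P<name>\w+(?:, \w+)?): (?P<value>-?\d+(?:\.\d+)?)$")
# Matches section headers such as "[INFO] ### Evaluating with best coefficient combo."
SECTION = re.compile(r"\[INFO\] ### Evaluating with (?P<label>.+?) coefficient combo\.$")


def parse_log(lines: Iterable[str]) -> Dict[str, Dict[str, float]]:
    """Group metric values by the evaluation section they appear under."""
    results: Dict[str, Dict[str, float]] = {}
    current = None
    for raw in lines:
        line = raw.strip()
        section = SECTION.search(line)
        if section:
            current = results.setdefault(section.group("label"), {})
            continue
        metric = METRIC.search(line)
        if metric and current is not None:
            current[metric.group("name")] = float(metric.group("value"))
    return results
```

With the output above, `parse_log(open(path))` would give a dictionary keyed by `"best"` and `"input"`, so the ROUGE-1 difference between tuned and input coefficients (35.302 versus 34.688) can be read off directly. Summary lines and the oracle line are skipped because they do not match the metric pattern.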
Security
See CONTRIBUTING for more information.
License
This project is licensed under the Apache-2.0 License.
Citation
```
@article{jcai2024ceret,
  title={CERET: Cost-Effective Extrinsic Refinement for Text Generation},
  author={Cai, Jason and Su, Hang and Sunkara, Monica and Shalyminov, Igor and Mansour, Saab},
  journal={arXiv preprint arXiv:2406.05588},
  year={2024},
  url={https://arxiv.org/abs/2406.05588}
}
```
Owner
- Name: Amazon Science
- Login: amazon-science
- Kind: organization
- Website: https://amazon.science
- Twitter: AmazonScience
- Repositories: 80
- Profile: https://github.com/amazon-science
Issues and Pull Requests
Last synced: over 1 year ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0