feedbackpreference
This is the repo for our proposed Feedback Preference corpus
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (9.6%) to scientific vocabulary
Keywords
Repository
This is the repo for our proposed Feedback Preference corpus
Basic Info
Statistics
- Stars: 4
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Topics
Metadata Files
README.md
A Preference-based Feedback Corpus
Tian Lan, Ziao Ma, RongCheng Tu, Chen Xu, Heyan Huang, Xian-ling Mao
Beijing Institute of Technology
TL;DR: We introduce a corpus for critique tuning that contains pairs of feedbacks for preference learning (e.g., PPO or DPO in RLHF), aiming to improve the alignment between generated feedback and human judgments.
Introduction
The self-critique capability of large-scale language models is currently a very active research topic. The self-critique ability of GPT-4, a powerful proprietary model, is particularly impressive. In contrast, the abilities of current open-source models, such as the Llama 2 series, are relatively limited. To address this issue, many efforts, such as UltraCM-13B and CritiqueLLM, have trained open-source models on data generated with the GPT-4 API, aiming to distill its strong self-critique capabilities. However, the community still lacks preference-based self-critique data, in which each sample pairs a high-quality feedback with a relatively low-quality one; this leaves a gap between existing critique-tuned open-source models and human preferences. To fill this gap, we use GPT-4 to collect a preference-based feedback (critique) dataset built on the Feedback-Collection corpus, further enhancing the self-critique capabilities of open-source models.
Besides the Feedback-Collection corpus, there are other open-source critique-tuning datasets, such as UltraFeedback and Auto-J. However, these datasets lack explicit scoring rubrics and reference responses, making it challenging to assess the quality of their feedback. Unlike these works, Feedback-Collection not only defines strict scoring criteria but also provides reference responses, high-quality feedback generated by GPT-4, and critique-tuned 7B/13B open-source models. This comprehensive set of information supports a more thorough evaluation of critique capabilities, so we build our preference-based feedback corpus on top of Feedback-Collection.
Specifically, we first run inference with their 7B and 13B critique-tuned LLMs on the training set of Feedback-Collection. We then keep the generated feedbacks whose scores differ by 2 or more points from the scores in the corresponding GPT-4 feedbacks (the overall scoring range is 1-5). These feedbacks show large score disparities compared with those generated by GPT-4, which makes the quality difference between them relatively easy to judge. Finally, we prompt GPT-4 to choose which feedback is better using Chain-of-Thought, i.e., by generating a rationale about the two feedbacks. The complete prompt is shown below:

```python
'''
Task Description:
An instruction (might include an Input inside it), a response to evaluate, a reference answer that gets a score of 5, a score rubric representing a evaluation criteria, and two generated feedbacks are given.
1. Write a detailed analysis that compare qualities of two feedbacks strictly based on the given score rubric (meta-feedback), not evaluating in general.
2. Each feedback contains the [RESULT] to provide their scores for response, ranging from 1 to 5 (5 is perfect and 1 is very bad).
3. After writing an analysis (meta-feedback), write a preference label indicates which feedback is better. You should refer to the score rubric.
4. The output format should look as follows: \"Meta-Feedback: (write an analysis for two feedbacks) LABEL\"
5. Please do not generate any other opening, closing, and explanations.
The instruction to evaluate:
{orig_instruction}
Response to evaluate:
{orig_response}
Reference Answer (Score 5):
{orig_reference_answer}
Score Rubrics:
[{orig_criteria}]
Score 1: {orig_score1_description}
Score 2: {orig_score2_description}
Score 3: {orig_score3_description}
Score 4: {orig_score4_description}
Score 5: {orig_score5_description}
Feedbacks to evaluate:
A: {feedback_a}
B: {feedback_b}
Meta-Feedbacks:
'''
```
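The selection and labeling steps described above might be scripted roughly as follows. This is a minimal sketch, not the authors' actual pipeline: the input file name, the record fields (`model_score`, `gpt4_score`, `gpt4_feedback`, `model_feedback`, ...), and the use of the `openai` Python client are assumptions made for illustration.

```python
import json

from openai import OpenAI  # assumed tooling; the authors' actual client may differ

# Paste the full meta-feedback prompt shown above (with its {placeholders}) here.
PROMPT_TEMPLATE = """Task Description: ..."""

# Placeholder names follow the prompt above.
RUBRIC_KEYS = (
    "orig_instruction", "orig_response", "orig_reference_answer", "orig_criteria",
    "orig_score1_description", "orig_score2_description", "orig_score3_description",
    "orig_score4_description", "orig_score5_description",
)

client = OpenAI()


def select_pairs(records, min_gap=2):
    """Keep samples whose critique-tuned model score differs from the GPT-4
    score by at least `min_gap` points (scores range from 1 to 5)."""
    return [r for r in records if abs(r["model_score"] - r["gpt4_score"]) >= min_gap]


def label_preference(record):
    """Ask GPT-4 for a chain-of-thought meta-feedback plus a preference label."""
    prompt = PROMPT_TEMPLATE.format(
        feedback_a=record["gpt4_feedback"],
        feedback_b=record["model_feedback"],
        **{k: record[k] for k in RUBRIC_KEYS},
    )
    completion = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return completion.choices[0].message.content


if __name__ == "__main__":
    # Hypothetical merged file containing both GPT-4 and model-generated feedbacks.
    with open("feedback_collection_with_model_scores.json", encoding="utf-8") as f:
        records = json.load(f)
    for record in select_pairs(records):
        record["meta_feedback"] = label_preference(record)
```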
Note that this is only the initial phase of our work. In the future, we plan to build on this foundation by annotating a high-quality preference-based critique test set and conducting comprehensive tests of existing critique-tuned open-source models.
Road Map
- [ ] Collect more preference-based feedback samples from existing critique-tuned corpora
- [x] Annotate High-quality Test Set
- [ ] Examine the existing reward models and LLMs on our annotated feedback preference corpus
- [ ] Train a Llama2-7B model with SFT and DPO (see the sketch after this list)
- [ ] Reward Model for Feedback Preference
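For the SFT + DPO roadmap item above, each sample's preferred and dispreferred feedback maps directly onto DPO's chosen/rejected format. The following is a minimal sketch of the DPO objective in plain PyTorch, not the planned training code; the log-probability tensors are illustrative placeholders.

```python
import torch
import torch.nn.functional as F


def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss for a batch of feedback pairs.

    The "chosen" sequence is the preferred (higher-quality) feedback and the
    "rejected" sequence is the dispreferred one; each *_logps tensor holds the
    summed token log-probabilities under the policy or the frozen reference model."""
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()


# Toy usage with made-up log-probabilities for two preference pairs.
loss = dpo_loss(
    policy_chosen_logps=torch.tensor([-12.3, -9.8]),
    policy_rejected_logps=torch.tensor([-14.1, -11.0]),
    ref_chosen_logps=torch.tensor([-12.9, -10.2]),
    ref_rejected_logps=torch.tensor([-13.5, -10.7]),
)
print(loss.item())
```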
Dataset Link
Our preference feedback dataset can be found as a Hugging Face dataset: FeedbackPreference
You can also find it in data/processed_feedback_preference.json.
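To take a quick look at the released file, something like the following should work, assuming the JSON is a top-level list of records (a sketch, not documented usage):

```python
import json

# Load the processed preference pairs shipped with the repo.
with open("data/processed_feedback_preference.json", encoding="utf-8") as f:
    samples = json.load(f)

print(f"{len(samples)} preference samples")
# Print the keys of the first record to discover the actual schema,
# rather than assuming particular field names.
print(sorted(samples[0].keys()))
```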
Citation
Please cite our work if Feedback Preference contributes to your research:
```bibtex
@misc{Tian_Feedback_Preference_2023,
  author = {Tian, Lan},
  month  = dec,
  title  = {{Feedback Preference}},
  url    = {https://github.com/gmftbyGMFTBY/FeedbackPreference},
  year   = {2023}
}
```
Owner
- Name: GMFTBY
- Login: gmftbyGMFTBY
- Kind: user
- Location: China - Beijing
- Company: Beijing Institute of Technology
- Repositories: 8
- Profile: https://github.com/gmftbyGMFTBY
- Bio: Those who are crazy enough to think they can change the world are the ones who can.
Citation (CITATION.cff)
```yaml
cff-version: 0.0.1
message: "If you use this software, please cite it as below."
type: data
authors:
  - family-names: "Tian"
    given-names: "Lan"
    orcid: "https://orcid.org/my-orcid?orcid=0000-0002-5200-1537"
title: "Feedback Preference"
version: 0.0.1
date-released: 2023.12.29
url: "https://github.com/gmftbyGMFTBY/FeedbackPreference"
```