feedbackpreference

This is the repo for our proposed Feedback Preference corpus

https://github.com/gmftbygmftby/feedbackpreference

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.6%) to scientific vocabulary

Keywords

critique feedback llm preference
Last synced: 6 months ago

Repository

This is the repo for our proposed Feedback Preference corpus

Basic Info
  • Host: GitHub
  • Owner: gmftbyGMFTBY
  • License: mit
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 2.88 MB
Statistics
  • Stars: 4
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Topics
critique feedback llm preference
Created about 2 years ago · Last pushed about 2 years ago
Metadata Files
Readme License Citation

README.md

A Preference-based Feedback Corpus

Tian Lan, Ziao Ma, RongCheng Tu, Chen Xu, Heyan Huang, Xian-ling Mao

Beijing Institute of Technology


TL;DR: We introduce a corpus for critique-tuning that contains pairs of feedbacks for preference learning, e.g., PPO or DPO (RLHF), aiming to improve the alignment between generated feedback and human judgments.

Introduction

The self-critique capability of large language models is currently a very popular research topic. The self-critique capability of GPT-4, a powerful proprietary model, is particularly impressive. In contrast, the capabilities of open-source models, such as the Llama 2 series, remain relatively limited. To address this gap, many efforts, such as UltraCM-13B and CritiqueLLM, train open-source models on data generated by the GPT-4 API, aiming to distill its strong self-critique capabilities. However, the community still lacks preference-based self-critique data, i.e., pairs consisting of a high-quality feedback and a relatively low-quality feedback, which leaves a gap between existing critique-tuned open-source models and human preferences. To fill this gap, we use GPT-4 to collect a preference-based feedback (critique) dataset built on the Feedback-Collection corpus, further enhancing the self-critique capabilities of open-source models.

In addition to the Feedback-Collection corpus, there are other open-source critique-tuning datasets such as UltraFeedback and Auto-J. However, these datasets lack essential scoring rubrics and reference responses, making it challenging to assess the quality of their feedbacks. Unlike these works, Feedback-Collection not only defines strict scoring criteria but also provides reference responses, high-quality feedbacks generated by GPT-4, and critique-tuned 7B/13B open-source models. This comprehensive set of information enables a more thorough evaluation of critique capabilities. We therefore build our preference-based feedback corpus on top of the Feedback-Collection corpus.

Specifically, we first run inference with their 7B and 13B critique-tuned LLMs on the training set of Feedback-Collection. Then, we collect the generated feedbacks whose scores differ by 2 or more points from the ratings given by the GPT-4 feedbacks (the overall scoring range is 1-5 points). We made this choice because these feedbacks exhibit significant score disparities compared to those generated by GPT-4, making it relatively easy to assess the quality differences between them. Finally, we prompt GPT-4 to choose which feedback is better using Chain-of-Thought, i.e., generating a rationale about the two feedbacks. The complete prompt is shown as follows:

```python
'''
Task Description:
An instruction (might include an Input inside it), a response to evaluate, a reference answer that gets a score of 5, a score rubric representing a evaluation criteria, and two generated feedbacks are given.
1. Write a detailed analysis that compare qualities of two feedbacks strictly based on the given score rubric (meta-feedback), not evaluating in general.
2. Each feedback contains the [RESULT] to provide their scores for response, ranging from 1 to 5 (5 is perfect and 1 is very bad).
3. After writing an analysis (meta-feedback), write a preference label indicates which feedback is better. You should refer to the score rubric.
4. The output format should look as follows: \"Meta-Feedback: (write an analysis for two feedbacks) LABEL\"
5. Please do not generate any other opening, closing, and explanations.

The instruction to evaluate:
{orig_instruction}

Response to evaluate:
{orig_response}

Reference Answer (Score 5):
{orig_reference_answer}

Score Rubrics:
[{orig_criteria}]
Score 1: {orig_score1_description}
Score 2: {orig_score2_description}
Score 3: {orig_score3_description}
Score 4: {orig_score4_description}
Score 5: {orig_score5_description}

Feedbacks to evaluate:
A: {feedback_a}
B: {feedback_b}

Meta-Feedbacks:
'''
```
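
For illustration, the collection step described above can be sketched as follows. This is a minimal sketch, not the released pipeline: the input file name, the record field names, and the `query_gpt4` helper are hypothetical placeholders, and the prompt above is assumed to be saved to `prompt_template.txt`.

```python
import json

# Hypothetical helper: wrap your own GPT-4 chat-completion call here.
def query_gpt4(prompt: str) -> str:
    raise NotImplementedError("plug in an OpenAI GPT-4 API call")

# The preference prompt shown above, saved to a plain-text file.
with open("prompt_template.txt", encoding="utf-8") as f:
    PROMPT_TEMPLATE = f.read()

# Hypothetical input: Feedback-Collection training samples together with the
# feedbacks (and scores) generated by the 7B/13B critique-tuned models.
with open("feedback_collection_with_model_feedback.json", encoding="utf-8") as f:
    records = json.load(f)

# Keep only samples whose model score differs from the GPT-4 score by 2 or
# more points (scores range from 1 to 5), as described in the text above.
candidates = [r for r in records if abs(r["model_score"] - r["gpt4_score"]) >= 2]

for r in candidates:
    prompt = PROMPT_TEMPLATE.format(
        orig_instruction=r["instruction"],
        orig_response=r["response"],
        orig_reference_answer=r["reference_answer"],
        orig_criteria=r["criteria"],
        orig_score1_description=r["score1_description"],
        orig_score2_description=r["score2_description"],
        orig_score3_description=r["score3_description"],
        orig_score4_description=r["score4_description"],
        orig_score5_description=r["score5_description"],
        feedback_a=r["gpt4_feedback"],
        feedback_b=r["model_feedback"],
    )
    # GPT-4 returns a Chain-of-Thought meta-feedback followed by a preference label.
    r["meta_feedback"] = query_gpt4(prompt)

with open("preference_feedback_raw.json", "w", encoding="utf-8") as f:
    json.dump(candidates, f, ensure_ascii=False, indent=2)
```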

Note that this is just the initial phase of our work. In the future, we plan to build on this foundation by annotating a high-quality preference-based critique test set and conducting comprehensive tests on existing critique-tuned open-source models.

Road Map

  • [ ] Collecting more preference-based feedback samples from existing critique-tuned corpus
  • [x] Annotate High-quality Test Set
  • [ ] Examine the existing reward models and LLMs on our annotated feedback preference corpus
  • [ ] Training Llama2-7B Model with SFT and DPO (see the sketch after this list)
  • [ ] Reward Model for Feedback Preference
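
As a sketch of the DPO step in the roadmap, the snippet below shows how preference pairs of feedbacks could be fed to Hugging Face TRL's DPOTrainer. This is a hedged illustration, not the project's training code: the triplet file path, column mapping, and hyperparameters are assumptions, and the exact constructor argument names vary across TRL versions.

```python
# Sketch: preference-tuning a critique model with DPO using Hugging Face TRL.
# Assumptions (not from this repo): the corpus has been converted to the
# standard (prompt, chosen, rejected) triplet format expected by DPOTrainer,
# and the recent DPOConfig-based TRL API is used.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_name = "meta-llama/Llama-2-7b-hf"  # base model named in the roadmap
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Hypothetical triplet file: each record holds the evaluation prompt plus the
# preferred ("chosen") and dispreferred ("rejected") feedback texts.
train_dataset = load_dataset("json", data_files="data/dpo_triplets.json", split="train")

trainer = DPOTrainer(
    model=model,
    args=DPOConfig(output_dir="dpo-feedback-preference", beta=0.1),
    train_dataset=train_dataset,
    processing_class=tokenizer,  # named `tokenizer` in older TRL releases
)
trainer.train()
```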

Dataset Link

Our preference feedback dataset can be found on the Hugging Face Hub as the FeedbackPreference dataset. You can also find it in data/processed_feedback_preference.json.
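
A quick way to inspect the released file is to load the JSON directly. The snippet below only assumes the path given above and that the file stores a list of records; it does not assume any particular field names.

```python
import json

# Load the preference corpus shipped with the repository.
with open("data/processed_feedback_preference.json", encoding="utf-8") as f:
    data = json.load(f)

print(f"{len(data)} preference samples")
# Peek at the keys of the first record to see the schema.
print(sorted(data[0].keys()))
```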

Citation

Please cite our work if Feedback Preference contributes to your work:

```bibtex
@misc{Tian_Feedback_Preference_2023,
  author = {Tian, Lan},
  month  = dec,
  title  = {{Feedback Preference}},
  url    = {https://github.com/gmftbyGMFTBY/FeedbackPreference},
  year   = {2023}
}
```

Owner

  • Name: GMFTBY
  • Login: gmftbyGMFTBY
  • Kind: user
  • Location: China - Beijing
  • Company: Beijing Institute of Technology

Those who are crazy enough to think they can change the world are the ones who can.

Citation (CITATION.cff)

cff-version: 0.0.1
message: "If you use this software, please cite it as below."
type: data
authors:
- family-names: "Tian"
  given-names: "Lan"
  orcid: "https://orcid.org/my-orcid?orcid=0000-0002-5200-1537"
title: "Feedback Preference"
version: 0.0.1
date-released: 2023.12.29
url: "https://github.com/gmftbyGMFTBY/FeedbackPreference"
