https://github.com/betswish/mirage-reproduce

Code for reproducing all experimental results in our MIRAGE paper: https://arxiv.org/abs/2406.13663

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
○ .zenodo.json file
✓ DOI references: Found 1 DOI reference(s) in README
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (13.0%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: Betswish
Language: Python
Default Branch: main
Homepage:
Size: 10.1 MB

Statistics

Stars: 2
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created almost 2 years ago · Last pushed over 1 year ago

Metadata Files

Readme

Toward faithful answer attribution with model internals 🌴

Authors (* Equal contribution): Jirui Qi* • Gabriele Sarti* • Raquel Fernández • Arianna Bisazza

Tip: This is the repository for reproducing all experimental results in our MIRAGE paper, accepted to the EMNLP 2024 Main Conference. Also, check out our demo!

If you find the paper helpful and use its content, please cite it as:

@inproceedings{qi-etal-2024-model,
    title = "Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation",
    author = "Qi, Jirui and Sarti, Gabriele and Fern{\'a}ndez, Raquel and Bisazza, Arianna",
    editor = "Al-Onaizan, Yaser and Bansal, Mohit and Chen, Yun-Nung",
    booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2024",
    address = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.emnlp-main.347/",
    doi = "10.18653/v1/2024.emnlp-main.347",
    pages = "6037--6053",
    abstract = "Ensuring the verifiability of model answers is a fundamental challenge for retrieval-augmented generation (RAG) in the question answering (QA) domain. Recently, self-citation prompting was proposed to make large language models (LLMs) generate citations to supporting documents along with their answers. However, self-citing LLMs often struggle to match the required format, refer to non-existent sources, and fail to faithfully reflect LLMs' context usage throughout the generation. In this work, we present MIRAGE {--} Model Internals-based RAG Explanations {--} a plug-and-play approach using model internals for faithful answer attribution in RAG applications. MIRAGE detects context-sensitive answer tokens and pairs them with retrieved documents contributing to their prediction via saliency methods. We evaluate our proposed approach on a multilingual extractive QA dataset, finding high agreement with human answer attribution. On open-ended QA, MIRAGE achieves citation quality and efficiency comparable to self-citation while also allowing for a finer-grained control of attribution parameters. Our qualitative evaluation highlights the faithfulness of MIRAGE's attributions and underscores the promising application of model internals for RAG answer attribution. Code and data released at https://github.com/Betswish/MIRAGE."
}

Environment:

For a quick start, create our environment with Conda: conda env create -f MIRAGE.yaml

Alternatively, you can install all packages yourself:

Python: 3.9.19

Packages: pip install -r requirements.txt
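
If you install the packages manually, it may help to confirm that the active interpreter matches the Python version listed above before running pip. The snippet below is a minimal sketch of such a check, not part of the repository: the helper name version_matches and the choice to compare only the major and minor version (3.9) are assumptions made for illustration.

```python
import sys

# Major and minor version taken from the README ("Python: 3.9.19").
# Comparing only major.minor is an assumption; patch releases are ignored.
EXPECTED = (3, 9)


def version_matches(version_info=sys.version_info, expected=EXPECTED):
    """Return True if the major.minor part of version_info equals expected.

    version_matches is a hypothetical helper, not a function from the repository.
    """
    return tuple(version_info[:2]) == tuple(expected)


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if version_matches():
        print(f"Python {found} matches the expected 3.9.x series.")
    else:
        print(f"Python {found} differs from the expected 3.9.x series; "
              "consider using the Conda environment instead.")
```

Run it with the interpreter you plan to use for pip install -r requirements.txt; a mismatch does not necessarily mean the code will fail, only that the environment differs from the one the authors list.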

Reproduction of the alignment with human annotations (Experiments in Section 4)

The code is in the folder sec4_alignment. See the README.MD in that folder for detailed instructions.

Reproduction of citation generation and comparison with self-citation on long-form QA dataset ELI5 (Experiments in Section 5)

The code is in the folder sec5_longQA. See the README.MD in that folder for detailed instructions.

Owner

Login: Betswish
Kind: user

Repositories: 1
Profile: https://github.com/Betswish

GitHub Events

Total

Watch event: 2
Push event: 20

Last Year

Watch event: 2
Push event: 20
