https://github.com/amazon-science/irgr
Science Score: 39.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ✓ DOI references: found 1 DOI reference in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (13.0%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: amazon-science
- License: apache-2.0
- Language: Python
- Default Branch: main
- Size: 2.06 MB
Statistics
- Stars: 12
- Watchers: 3
- Forks: 0
- Open Issues: 2
- Releases: 0
Metadata Files
README.md
Iterative Retrieval-Generation Reasoner (NAACL 2022)
This repository contains the code and data for the paper:
Entailment Tree Explanations via Iterative Retrieval-Generation Reasoner
An entailment tree represents a chain of reasoning that shows how a hypothesis (or an answer to a question) can be explained from simpler textual evidence.
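To make the structure concrete, here is a minimal sketch of an entailment tree written as a list of steps. It assumes a proof string of the form `sent1 & sent2 -> int1: ...; int1 & sent3 -> hypothesis`, which is our reading of the EntailmentBank proof notation rather than something this README specifies. The names `Step`, `parse_proof`, and `leaves`, and the example sentence, are hypothetical and are not taken from the repository.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One entailment step: several premises jointly entail a conclusion."""
    premises: list[str]  # ids of leaf sentences or earlier conclusions
    conclusion: str      # id of the conclusion, e.g. "int1" or "hypothesis"


def parse_proof(proof: str) -> list[Step]:
    """Split a proof string (assumed format, see above) into steps."""
    steps = []
    for chunk in proof.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        left, right = chunk.split("->")
        premises = [p.strip() for p in left.split("&")]
        # Keep only the id, dropping any sentence text after "int1:".
        conclusion = right.split(":")[0].strip()
        steps.append(Step(premises, conclusion))
    return steps


def leaves(steps: list[Step]) -> set[str]:
    """Premises that no step produces are the leaves of the tree."""
    produced = {s.conclusion for s in steps}
    return {p for s in steps for p in s.premises if p not in produced}


# Made-up example proof with one intermediate conclusion.
proof = "sent1 & sent2 -> int1: a sparrow has feathers; int1 & sent3 -> hypothesis"
steps = parse_proof(proof)
print(steps)
print(leaves(steps))
```

Each step consumes leaf sentences or earlier conclusions, so the list of steps, read in order, describes the tree from the leaves up to the hypothesis.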
Iterative Retrieval-Generation Reasoner (IRGR) is our proposed architecture. It iteratively searches for suitable premises, constructing a single entailment step at a time. At every generation step, the model searches for a distinct set of premises that will support the generation of that step, which mitigates the language model's input size limit and improves generation correctness.
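The loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the notebooks: `build_tree`, `toy_retrieve`, and `toy_generate` are hypothetical names, the word-overlap ranking stands in for the trained retriever, and the small rule table stands in for the language model that generates each step.

```python
CORPUS = [
    "a sparrow is a kind of bird",
    "a bird has feathers",
    "feathers help an animal stay warm",
    "a rock is a kind of object",
]

# Stand-in for the generator: (premises, conclusion) pairs it can produce.
RULES = [
    (("a sparrow is a kind of bird", "a bird has feathers"),
     "a sparrow has feathers"),
    (("a sparrow has feathers", "feathers help an animal stay warm"),
     "a sparrow can stay warm"),
]


def toy_retrieve(hypothesis, conclusions, k):
    """Rank corpus sentences by word overlap with the hypothesis and the
    conclusions derived so far (assumption: replaces the trained retriever)."""
    query = set(" ".join([hypothesis, *conclusions]).split())
    ranked = sorted(CORPUS, key=lambda s: -len(query & set(s.split())))
    return ranked[:k]


def toy_generate(hypothesis, candidates, conclusions):
    """Return one step whose premises are all available (assumption:
    replaces the language model; `hypothesis` is unused in this toy)."""
    available = set(candidates) | set(conclusions)
    for premises, conclusion in RULES:
        if conclusion not in conclusions and set(premises) <= available:
            return list(premises), conclusion
    raise ValueError("no step can be generated from the retrieved premises")


def build_tree(hypothesis, retrieve, generate, k=3, max_steps=5):
    """Alternate retrieval and generation, one entailment step per round."""
    steps, conclusions = [], []
    for _ in range(max_steps):
        # Retrieval is repeated every round, conditioned on partial results.
        candidates = retrieve(hypothesis, conclusions, k)
        premises, conclusion = generate(hypothesis, candidates, conclusions)
        steps.append((premises, conclusion))
        if conclusion == hypothesis:
            break
        conclusions.append(conclusion)  # reusable as a premise later
    return steps


hypothesis = "a sparrow can stay warm"
steps = build_tree(hypothesis, toy_retrieve, toy_generate)
for premises, conclusion in steps:
    print(" & ".join(premises), "->", conclusion)
```

Because only `k` premises are passed to the generator in each round, the input stays small regardless of corpus size, and intermediate conclusions feed back into the next retrieval query.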
Setting Up Environment
First you need to install the dependencies of the project:
bash
conda env create --file irgr.yml
pip install -r requirements.txt
Then activate your conda environment:
bash
conda activate irgr
Setting up Jupyter
Most of the code is wrapped inside Jupyter Notebooks.
You can either start a Jupyter server locally or follow the AWS instructions on how to set up the Jupyter notebook on EC2 instances and access it through your browser:
https://docs.aws.amazon.com/dlami/latest/devguide/setup-jupyter.html
Data Folder Structure
You can download the EntailmentBank data and evaluation code by running:
python setup.py
Running Experiments
You can restart the kernel and run the whole notebook to execute data loading, training, and evaluation.
Data loading has to be done before executing training and evaluation.
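If you prefer to run a notebook end to end without opening the browser, one option is `jupyter nbconvert --execute`, which starts a fresh kernel and runs every cell in order. The sketch below only builds and optionally launches that command. It assumes `jupyter` is installed in the active environment; the helper names and the output file name are hypothetical and not part of this repository.

```python
import subprocess


def nbconvert_command(notebook: str, output: str) -> list[str]:
    """Build a command that runs every cell of `notebook` in a fresh kernel."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",  # keep the result as a notebook
        "--execute",         # run all cells from top to bottom
        notebook,
        "--output", output,  # hypothetical name for the executed copy
    ]


def run_notebook(notebook: str, output: str) -> None:
    """Run the command; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(nbconvert_command(notebook, output), check=True)


command = nbconvert_command(
    "src/entailment_iterative.ipynb", "entailment_iterative_executed"
)
print(" ".join(command))
# To actually execute the notebook, uncomment the next line:
# run_notebook("src/entailment_iterative.ipynb", "entailment_iterative_executed")
```

Since the cells run in order, the data-loading cells execute before training and evaluation, matching the requirement above.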
Entailment Tree Generation
The main model's code is in src/entailment_iterative.ipynb.
The model can generate explanations and proofs for the EntailmentBank dataset.
Premise Retrieval
The main model's code is in src/entailment_retrieval.ipynb.
This model retrieves a set of premises from the corpus. Training uses the EntailmentBank + World Tree V2 corpus.
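As a rough illustration of what retrieving a set of premises means, the sketch below scores every corpus sentence against a query and keeps the top `k`. It deliberately swaps the trained retriever for a bag-of-words cosine similarity, so it shows the interface rather than the model. The function names, the corpus, and the query are made up for the example.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)  # Counter returns 0 for missing terms
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, corpus: list[str], k: int = 2) -> list[tuple[float, str]]:
    """Return the k highest-scoring (score, sentence) pairs."""
    q = Counter(query.lower().split())
    scored = [(cosine(q, Counter(s.lower().split())), s) for s in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


corpus = [
    "the sun is a kind of star",
    "a star produces light",
    "a dog is a kind of animal",
]
top = retrieve("the sun produces light", corpus, k=2)
for score, sentence in top:
    print(f"{score:.3f}  {sentence}")
```

A learned retriever replaces the word-count vectors with sentence embeddings, but the ranking and top-`k` selection follow the same pattern.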
Citation
@inproceedings{neves-ribeiro-etal-2022-entailment,
title = "Entailment Tree Explanations via Iterative Retrieval-Generation Reasoner",
author = "Neves Ribeiro, Danilo and
Wang, Shen and
Ma, Xiaofei and
Dong, Rui and
Wei, Xiaokai and
Zhu, Henghui and
Chen, Xinchi and
Xu, Peng and
Huang, Zhiheng and
Arnold, Andrew and
Roth, Dan",
booktitle = "Findings of the Association for Computational Linguistics: NAACL 2022",
month = jul,
year = "2022",
address = "Seattle, United States",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.findings-naacl.35",
doi = "10.18653/v1/2022.findings-naacl.35",
pages = "465--475",
abstract = "Large language models have achieved high performance on various question answering (QA) benchmarks, but the explainability of their output remains elusive. Structured explanations, called entailment trees, were recently suggested as a way to explain the reasoning behind a QA system{'}s answer. In order to better generate such entailment trees, we propose an architecture called Iterative Retrieval-Generation Reasoner (IRGR). Our model is able to explain a given hypothesis by systematically generating a step-by-step explanation from textual premises. The IRGR model iteratively searches for suitable premises, constructing a single entailment step at a time. Contrary to previous approaches, our method combines generation steps and retrieval of premises, allowing the model to leverage intermediate conclusions, and mitigating the input size limit of baseline encoder-decoder models. We conduct experiments using the EntailmentBank dataset, where we outperform existing benchmarks on premise retrieval and entailment tree generation, with around 300{\%} gain in overall correctness.",
}
Owner
- Name: Amazon Science
- Login: amazon-science
- Kind: organization
- Website: https://amazon.science
- Twitter: AmazonScience
- Repositories: 80
- Profile: https://github.com/amazon-science
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1
Issues and Pull Requests
Last synced: over 1 year ago
All Time
- Total issues: 2
- Total pull requests: 2
- Average time to close issues: about 2 months
- Average time to close pull requests: 13 minutes
- Total issue authors: 2
- Total pull request authors: 1
- Average comments per issue: 3.0
- Average comments per pull request: 0.5
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 2
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- unnormalization (1)
- Raising-hrx (1)
Pull Request Authors
- dependabot[bot] (1)