https://github.com/amazon-science/irgr

Repository

Basic Info

Host: GitHub
Owner: amazon-science
License: apache-2.0
Language: Python
Default Branch: main
Size: 2.06 MB

Statistics

Stars: 12
Watchers: 3
Forks: 0
Open Issues: 2
Releases: 0

Created about 4 years ago · Last pushed almost 3 years ago

Iterative Retrieval-Generation Reasoner (NAACL 2022)

This repository contains the code and data for the paper:

Entailment Tree Explanations via Iterative Retrieval-Generation Reasoner

Entailment trees represent a chain of reasoning that shows how a hypothesis (or an answer to a question) can be explained from simpler textual evidence.
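To make the structure concrete, here is a minimal sketch of how an entailment tree could be held in memory and linearized. The identifiers (`sent1`, `int1`, `hypothesis`), the toy sentences, and the string format are assumptions for illustration, not necessarily what the notebooks use.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class EntailmentStep:
    """One inference step: a set of premises entails a single conclusion."""
    premises: List[str]   # ids of leaf sentences or earlier conclusions
    conclusion_id: str    # "hypothesis" marks the final step (assumed convention)
    conclusion: str       # text of the conclusion


# Toy tree with illustrative sentences (not taken from the dataset).
sentences: Dict[str, str] = {
    "sent1": "the sun is a kind of star",
    "sent2": "stars produce light",
    "sent3": "light is a kind of energy",
}
steps = [
    EntailmentStep(["sent1", "sent2"], "int1", "the sun produces light"),
    EntailmentStep(["int1", "sent3"], "hypothesis", "the sun produces energy"),
]


def is_well_formed(steps, sentences):
    """Every premise must be a leaf or an earlier conclusion; the last step must reach the hypothesis."""
    known = set(sentences)
    for step in steps:
        if not all(p in known for p in step.premises):
            return False
        known.add(step.conclusion_id)
    return bool(steps) and steps[-1].conclusion_id == "hypothesis"


def linearize(steps):
    """Render the tree as one proof string, one clause per step."""
    return "; ".join(
        f"{' & '.join(s.premises)} -> {s.conclusion_id}: {s.conclusion}" for s in steps
    )
```

Calling `linearize(steps)` on the toy tree yields a two-clause proof in which the intermediate conclusion `int1` is reused as a premise of the final step.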

Task definition

Iterative Retrieval-Generation Reasoner (IRGR) is our proposed architecture. It iteratively searches for suitable premises, constructing a single entailment step at a time. At every generation step, the model searches for a distinct set of premises that will support the generation of that step, which mitigates the language model’s input size limit and improves generation correctness.
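The control flow described above can be sketched as a loop that alternates retrieval and generation. This is a toy illustration only: the token-overlap retriever, the scripted generator, the query construction, and all function names are hypothetical stand-ins, not the trained models or code from this repository.

```python
def token_overlap(a, b):
    """Number of distinct lowercase tokens shared by two strings."""
    return len(set(a.lower().split()) & set(b.lower().split()))


def retrieve(query, pool, used, k=2):
    """Hypothetical retriever: top-k unused premises by token overlap with the query."""
    candidates = [(pid, text) for pid, text in pool.items() if pid not in used]
    candidates.sort(key=lambda item: token_overlap(query, item[1]), reverse=True)
    return candidates[:k]


def make_scripted_generator(conclusions):
    """Stand-in for the language model: replays pre-written conclusions in order."""
    def generate_step(hypothesis, premises, steps):
        index = len(steps)
        text = conclusions[index]
        return {
            "premises": [pid for pid, _ in premises],
            "conclusion_id": "hypothesis" if text == hypothesis else f"int{index + 1}",
            "conclusion": text,
        }
    return generate_step


def iterative_reasoner(hypothesis, corpus, generate_step, max_steps=5, k=2):
    """Alternate retrieval and generation until the hypothesis is concluded."""
    steps = []           # partial proof built so far
    used = set()         # premises consumed by earlier steps
    pool = dict(corpus)  # grows as intermediate conclusions are added
    for _ in range(max_steps):
        # Assumption: the query is the hypothesis plus the conclusions so far.
        query = " ".join([hypothesis] + [s["conclusion"] for s in steps])
        premises = retrieve(query, pool, used, k)
        if not premises:
            break
        step = generate_step(hypothesis, premises, steps)
        steps.append(step)
        used.update(step["premises"])
        if step["conclusion_id"] == "hypothesis":
            break
        # Intermediate conclusions become retrievable premises for later steps.
        pool[step["conclusion_id"]] = step["conclusion"]
    return steps


corpus = {
    "sent1": "the sun is a star",
    "sent2": "the star produces light",
    "sent3": "light is a kind of energy",
}
hypothesis = "the sun produces energy"
generator = make_scripted_generator(["the sun produces light", hypothesis])
proof = iterative_reasoner(hypothesis, corpus, generator)
```

Because each step only sees the `k` premises retrieved for it, the generator's input stays small no matter how large the corpus is, which is the property the paragraph above attributes to the iterative design.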

System overview

Setting Up Environment

First you need to install the dependencies of the project:

conda env create --file irgr.yml
pip install -r requirements.txt

Then activate your conda environment:

conda activate irgr

Setting up Jupyter

Most of the code is wrapped inside Jupyter Notebooks.

You can either start a Jupyter server locally or follow the AWS instructions on how to set up a Jupyter notebook on EC2 instances and access it through your browser:

https://docs.aws.amazon.com/dlami/latest/devguide/setup-jupyter.html

Data Folder Structure

You can download the EntailmentBank data and evaluation code by running:

python setup.py

Running Experiments

You can restart the kernel and run the whole notebook to execute data loading, training, and evaluation.

Data loading has to be done before executing training and evaluation.
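If you prefer to run a notebook from top to bottom without opening the browser, one option is `jupyter nbconvert --execute`, which runs all cells in order in a fresh kernel. The helper names below are hypothetical, and the sketch assumes `nbconvert` is available in the active environment.

```python
import subprocess


def build_nbconvert_command(notebook, output_name):
    """Build a `jupyter nbconvert` command that executes every cell in order."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",          # write the result as a notebook
        "--execute", str(notebook),  # run all cells before writing
        "--output", output_name,     # keep the original notebook untouched
    ]


def run_notebook(notebook, output_name="executed.ipynb"):
    """Execute the notebook; raises CalledProcessError if any cell fails."""
    subprocess.run(build_nbconvert_command(notebook, output_name), check=True)
```

For example, `run_notebook("src/entailment_iterative.ipynb")` would execute the generation notebook and save the executed copy next to it. Since cells run in order, the data-loading cells execute before training and evaluation.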

Entailment Tree Generation

The main model's code is in src/entailment_iterative.ipynb.

The model can generate explanations and proofs for the EntailmentBank dataset.

Premise Retrieval

The main model's code is in src/entailment_retrieval.ipynb.

This model retrieves a set of premises from the corpus. Training uses the EntailmentBank + World Tree V2 corpus.
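As a rough illustration of what retrieving a set of premises means, the sketch below ranks corpus sentences against a hypothesis with bag-of-words cosine similarity. This lexical scorer is a simple stand-in chosen for readability, not the trained retriever from the notebook, and the function names and toy corpus are assumptions.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase whitespace tokenization into token counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_premises(hypothesis, corpus, k=2):
    """Return the ids of the k corpus sentences most similar to the hypothesis."""
    query = bag_of_words(hypothesis)
    scored = [(cosine(query, bag_of_words(text)), pid) for pid, text in corpus.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [pid for _, pid in scored[:k]]


# Toy corpus (illustrative sentences, not from the real corpus).
toy_corpus = {
    "sent1": "plants need sunlight to grow",
    "sent2": "a dog is a kind of animal",
    "sent3": "sunlight is a kind of light",
}
top = retrieve_premises("plants need light to grow", toy_corpus)
```

On this toy corpus the unrelated sentence about dogs shares no tokens with the hypothesis, so it scores zero and is left out of the top two.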

Citation

@inproceedings{neves-ribeiro-etal-2022-entailment,
    title = "Entailment Tree Explanations via Iterative Retrieval-Generation Reasoner",
    author = "Neves Ribeiro, Danilo and Wang, Shen and Ma, Xiaofei and Dong, Rui and Wei, Xiaokai and Zhu, Henghui and Chen, Xinchi and Xu, Peng and Huang, Zhiheng and Arnold, Andrew and Roth, Dan",
    booktitle = "Findings of the Association for Computational Linguistics: NAACL 2022",
    month = jul,
    year = "2022",
    address = "Seattle, United States",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.findings-naacl.35",
    doi = "10.18653/v1/2022.findings-naacl.35",
    pages = "465--475",
    abstract = "Large language models have achieved high performance on various question answering (QA) benchmarks, but the explainability of their output remains elusive. Structured explanations, called entailment trees, were recently suggested as a way to explain the reasoning behind a QA system{'}s answer. In order to better generate such entailment trees, we propose an architecture called Iterative Retrieval-Generation Reasoner (IRGR). Our model is able to explain a given hypothesis by systematically generating a step-by-step explanation from textual premises. The IRGR model iteratively searches for suitable premises, constructing a single entailment step at a time. Contrary to previous approaches, our method combines generation steps and retrieval of premises, allowing the model to leverage intermediate conclusions, and mitigating the input size limit of baseline encoder-decoder models. We conduct experiments using the EntailmentBank dataset, where we outperform existing benchmarks on premise retrieval and entailment tree generation, with around 300{\%} gain in overall correctness.",
}

Owner

Name: Amazon Science
Login: amazon-science
Kind: organization

Website: https://amazon.science
Twitter: AmazonScience
Repositories: 80
Profile: https://github.com/amazon-science
