deepa2
Resources for creating, importing and using DeepA2 Argument Analysis Framework datasets
Statistics
- Stars: 6
- Watchers: 2
- Forks: 0
- Open Issues: 16
- Releases: 16
README.md
Deep Argument Analysis (deepa2)
This project provides deepa2, which
- 🥚 takes NLP data (e.g. NLI, argument mining) as ingredients;
- 🎂 bakes DeepA2 datasets conforming to the Deep Argument Analysis Framework;
- 🍰 serves DeepA2 data as text2text datasets suitable for training language models.
There's a public collection of 🎂 DeepA2 datasets baked with deepa2 at the HF hub.
The Documentation describes usage options and gives background info on the Deep Argument Analysis Framework.
Quickstart
Integrating deepa2 into Your Training Pipeline
- Install `deepa2` into your ML project's virtual environment, e.g.:
```bash
source my-projects-venv/bin/activate
python --version # should be ^3.7
python -m pip install deepa2
```
- Add the `deepa2` preprocessor to your training pipeline. Your training script may look like this, for example:
```sh
#!/bin/bash

# configure and activate environment
...

# download deepa2 datasets (🎂) and
# prepare them for text2text training (🍰 is written to t2t/)
deepa2 serve \
    --path some-deepa2-dataset \
    --export-format csv \
    --export-path t2t

# run default training script on the exported data (🍰),
# e.g., with 🤗 Transformers
python .../run_summarization.py \
    --train_file t2t/train.csv \
    --text_column "text" \
    --summary_column "target" \
    --...

# clean-up
rm -r t2t
```
- That's it.
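Before handing the export to a training script, it can help to check that the CSV has the expected shape. The following is a minimal sketch, assuming the export is a plain CSV with a `text` and a `target` column, as the training command above suggests. The helper `load_text2text_pairs` and the demo rows are hypothetical and not part of deepa2; the demo writes its own stand-in file so that it runs without any exported data.

```python
import csv
import tempfile
from pathlib import Path


def load_text2text_pairs(csv_path, text_column="text", target_column="target"):
    """Read (input, target) pairs from a text2text CSV file.

    Hypothetical helper: the column names are assumptions taken from the
    training command, not a documented deepa2 contract.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        missing = {text_column, target_column} - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"missing columns: {sorted(missing)}")
        return [(row[text_column], row[target_column]) for row in reader]


# Demo with a stand-in file (made-up rows), so the sketch runs on its own.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "train.csv"
    with open(demo_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["text", "target"])
        writer.writerow(["made-up input text", "made-up target text"])
    pairs = load_text2text_pairs(demo_path)

print(len(pairs), pairs[0])
```

In a real pipeline one would point the helper at `t2t/train.csv` after the `serve` step and before the clean-up.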
Create DeepA2 datasets with deepa2 from existing NLP data
Install poetry.
Clone the repository:
```bash
git clone https://github.com/debatelab/deepa2-datasets.git
```
Install this package from within the repo's root folder:
```bash
poetry install
```
Bake a DeepA2 dataset, e.g.:
```bash
# 🥚 esnli goes in, 🎂 is written to ./data/processed
poetry run deepa2 bake \
    --name esnli \
    --debug-size 100 \
    --export-path ./data/processed
```
Contribute a DeepA2Builder for another Dataset
We welcome contributions to this repository, especially scripts that port existing datasets to the DeepA2 Framework. Within this repo, a code module that transforms data into the DeepA2 format contains:
- a Builder class that describes how DeepA2 examples will be constructed and that implements the abstract `builder.Builder` interface (such as, e.g., `builder.entailmentbank_builder.EnBankBuilder`);
- a DataLoader which provides a method for loading the raw data as a 🤗 Dataset object (such as, for example, `builder.entailmentbank_builder.EnBankLoader`); you may use `deepa2.DataLoader` as is if the data is available in a way compatible with 🤗 Dataset;
- dataclasses which describe the features of the raw data and the preprocessed data, and which extend the dummy classes `deepa2.RawExample` and `deepa2.PreprocessedExample`;
- a collection of unit tests that check the concrete Builder's methods (such as, e.g., `tests/test_enbank.py`);
- documentation of the pipeline (as, for example, in `docs/esnli.md`).
Consider opening a new issue to propose building such a pipeline collaboratively.
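To make the division of labour between these components more concrete, here is a rough, self-contained sketch. It does not import deepa2: the base classes below are simplified stand-ins for `deepa2.RawExample`, `deepa2.PreprocessedExample` and the abstract builder interface, and the method names `preprocess` and `build`, the `Toy*` classes and the output keys are all hypothetical. The actual interface is described in the project documentation and may differ.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RawExample:
    """Stand-in for deepa2.RawExample (simplified)."""


@dataclass
class PreprocessedExample:
    """Stand-in for deepa2.PreprocessedExample (simplified)."""


class Builder(ABC):
    """Stand-in for the abstract builder interface; method names are hypothetical."""

    @abstractmethod
    def preprocess(self, raw: RawExample) -> PreprocessedExample:
        ...

    @abstractmethod
    def build(self, example: PreprocessedExample) -> Dict[str, str]:
        ...


@dataclass
class ToyRawExample(RawExample):
    """Features of the raw data (here: an NLI-style pair)."""
    premise: str
    hypothesis: str


@dataclass
class ToyPreprocessedExample(PreprocessedExample):
    """Features of the preprocessed data."""
    premises: List[str]
    conclusion: str


class ToyBuilder(Builder):
    """Concrete builder: normalizes whitespace, then emits one flat record."""

    def preprocess(self, raw: ToyRawExample) -> ToyPreprocessedExample:
        return ToyPreprocessedExample(
            premises=[raw.premise.strip()],
            conclusion=raw.hypothesis.strip(),
        )

    def build(self, example: ToyPreprocessedExample) -> Dict[str, str]:
        # The output keys are illustrative, not the DeepA2 schema.
        return {
            "source_text": " ".join(example.premises),
            "conclusion": example.conclusion,
        }


builder = ToyBuilder()
raw = ToyRawExample(premise="  Socrates is a man. ", hypothesis="Socrates is mortal.")
item = builder.build(builder.preprocess(raw))
print(item)
```

A unit test for such a builder would feed a handful of raw examples through both methods and compare the resulting records against expected values.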
Citation
This repository builds on and extends the DeepA2 Framework originally presented in:
```bibtex
@article{betz2021deepa2,
      title={DeepA2: A Modular Framework for Deep Argument Analysis with Pretrained Neural Text2Text Language Models},
      author={Gregor Betz and Kyle Richardson},
      year={2021},
      eprint={2110.01509},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```
Owner
- Name: DebateLab @ KIT
- Login: debatelab
- Kind: organization
- Website: https://debatelab.github.io/
- Repositories: 5
- Profile: https://github.com/debatelab
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Betz"
    given-names: "Gregor"
    orcid: "https://orcid.org/0000-0001-5802-5030"
    website: "https://www.gregorbetz.de/"
  - family-names: "Richardson"
    given-names: "Kyle"
    website: "https://www.krichardson.me/"
title: "DeepA2: A Modular Framework for Deep Argument Analysis with Pretrained Neural Text2Text Language Models"
version: 0.1.1
date-released: 2021-10-04
url: "https://arxiv.org/abs/2110.01509"
GitHub Events
Total
- Watch event: 3
Last Year
- Watch event: 3
Issues and Pull Requests
Last synced: about 1 year ago
All Time
- Total issues: 27
- Total pull requests: 20
- Average time to close issues: 14 days
- Average time to close pull requests: 35 minutes
- Total issue authors: 2
- Total pull request authors: 2
- Average comments per issue: 0.19
- Average comments per pull request: 0.0
- Merged pull requests: 20
- Bot issues: 0
- Bot pull requests: 1
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- ggbetz (24)
- yakazimir (3)
Pull Request Authors
- ggbetz (19)
- dependabot[bot] (1)
Packages
- Total packages: 1
- Total downloads: pypi 63 last-month
- Total dependent packages: 0
- Total dependent repositories: 1
- Total versions: 17
- Total maintainers: 1
pypi.org: deepa2
Cast NLP data as multiangular DeepA2 datasets and integrate these in training pipeline
- Homepage: https://github.com/debatelab/deepa2
- Documentation: https://deepa2.readthedocs.io/
- License: Apache-2.0
- Latest release: 0.1.16 (published over 3 years ago)
Dependencies
- 104 dependencies
- black ^22.1.0 develop
- coverage ^6.4.1 develop
- flake8 ^4.0.1 develop
- ipykernel ^6.7.0 develop
- ipython 7.31.1 develop
- matplotlib ^3.5.1 develop
- mypy ^0.931 develop
- pandas-stubs ^1.2.0 develop
- pylint ^2.12.2 develop
- pytest ^6.2.5 develop
- types-PyYAML ^6.0.4 develop
- types-requests ^2.27.8 develop
- Jinja2 ^3.0.3
- datasets ^1.18.0
- editdistance ^0.6.0
- networkx ^2.6.3
- numpy 1.21.5
- pandas 1.3.5
- pyarrow ^6.0.1
- python >=3.7.1,<3.11
- requests ^2.27.1
- sacrebleu ^2.1.0
- ttp ^0.8.4
- typer ^0.4.0
- actions/cache v2 composite
- actions/checkout v2 composite
- actions/setup-python v1 composite
- snok/install-poetry v1.3.0 composite
- actions/cache v2 composite
- actions/checkout v2 composite
- actions/setup-python v1 composite
- paambaati/codeclimate-action v3.0.0 composite
- snok/install-poetry v1.3.0 composite