https://github.com/awslabs/multi-domain-goal-oriented-dialogues-dataset
Data from the publication "Multi-Domain Goal-Oriented Dialogues (MultiDoGO): Strategies toward Curating and Annotating Large Scale Dialogue Data"
Science Score: 13.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ✓ DOI references: found 1 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (9.8%) to scientific vocabulary
Repository
Statistics
- Stars: 21
- Watchers: 5
- Forks: 3
- Open Issues: 3
- Releases: 0
Metadata Files
README.md
Data from "Multi-Domain Goal-Oriented Dialogues (MultiDoGO): Strategies toward Curating and Annotating Large Scale Dialogue Data"
Repository Structure
Under the top level ./data directory, you will find the following two sub-directories:
1. unannotated:
Unannotated human-to-human conversations from the airline, fastfood, finance, insurance, media, and software domains. Conversations are split by domain and given in TSV format with the columns "conversationId", "turnNumber", "utteranceId", "utterance", and "authorRole".
2. paper_splits:
Pre-processed training, development, and test splits for customer turns, used to obtain the intent classification and slot-labeling results in Table 7 of the paper. As in the paper, we partition these data by annotation granularity: sentence level (located at ./data/paper_splits/splits_annotated_at_sentence_level) or turn level (located at ./data/paper_splits/splits_annotated_at_turn_level). Under each annotation granularity subdirectory, we provide splits for each domain: airline, fastfood, finance, insurance, media, and software. The splits are named "train.tsv", "dev.tsv", and "test.tsv" and contain the following tab-separated columns: "conversationId", "turnNumber", "sentenceNumber" (sentence-level splits only), "utteranceId", "utterance", "slot-labels", and "intent". The labels in the slot-labels field are separated by spaces. When a single input has multiple intents, the intents are separated by the special token <div>.
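The column layout described above can be read with a few lines of Python. The following is a minimal sketch, not code shipped with the dataset: the function names read_split and parse_row are hypothetical, and it assumes that each split file starts with a header row carrying the column names listed above and that fields are not quoted.

```python
import csv
from pathlib import Path

# Special token the README names as the separator between multiple intents.
INTENT_SEPARATOR = "<div>"


def parse_row(row):
    """Convert one raw TSV row (a dict of strings) into a parsed record.

    Hypothetical helper; the keys mirror the column names in the README.
    """
    return {
        "conversationId": row["conversationId"],
        "turnNumber": row["turnNumber"],
        # Present only in sentence-level splits, so it may be missing.
        "sentenceNumber": row.get("sentenceNumber"),
        "utteranceId": row["utteranceId"],
        "utterance": row["utterance"],
        # Slot labels are separated by spaces.
        "slot_labels": row["slot-labels"].split(),
        # Multiple intents are joined with the <div> token.
        "intents": [
            intent.strip()
            for intent in row["intent"].split(INTENT_SEPARATOR)
            if intent.strip()
        ],
    }


def read_split(path):
    """Read one split file such as train.tsv into a list of parsed records.

    Assumption: the first line of the file is a header row with column names.
    """
    with Path(path).open(newline="", encoding="utf-8") as handle:
        # QUOTE_NONE keeps any quotation marks inside utterances as plain text.
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return [parse_row(row) for row in reader]
```

Calling read_split with the path to a "train.tsv", "dev.tsv", or "test.tsv" file returns one dictionary per row. Keeping the slot labels and intents as lists makes it straightforward to compare the number of slot labels against the tokens of the utterance or to count multi-intent inputs. If the files turn out to have no header row, the column names would need to be passed to csv.DictReader through its fieldnames argument instead.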
License
This project is licensed under the CDLA Permissive License. The terms are given in LICENSE.txt.
Reference
For reference, please cite our EMNLP 2019 paper: Multi-Domain Goal-Oriented Dialogues (MultiDoGO): Strategies toward Curating and Annotating Large Scale Dialogue Data (BibTeX below)
@inproceedings{peskov-etal-2019-multi,
title = "Multi-Domain Goal-Oriented Dialogues ({M}ulti{D}o{GO}): Strategies toward Curating and Annotating Large Scale Dialogue Data",
author = "Peskov, Denis and Clarke, Nancy and Krone, Jason and Fodor, Brigi and Zhang, Yi and Youssef, Adel and Diab, Mona",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/D19-1460",
doi = "10.18653/v1/D19-1460",
pages = "4526--4536",
}
Owner
- Name: Amazon Web Services - Labs
- Login: awslabs
- Kind: organization
- Location: Seattle, WA
- Website: http://amazon.com/aws/
- Repositories: 914
- Profile: https://github.com/awslabs
AWS Labs
Issues and Pull Requests
Last synced: about 2 years ago
All Time
- Total issues: 3
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 3
- Total pull request authors: 0
- Average comments per issue: 1.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- jpcorb20 (1)
- moyapchen (1)
- scottmackieverint (1)