https://github.com/awslabs/pptod

Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System (ACL 2022)

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.9%) to scientific vocabulary

Keywords

plugandplay pretrained-models task-oriented-dialogue
Last synced: 9 months ago

Repository

Basic Info
Statistics
  • Stars: 159
  • Watchers: 8
  • Forks: 29
  • Open Issues: 8
  • Releases: 0
Created over 4 years ago · Last pushed over 2 years ago
Metadata Files
Readme Contributing License Code of conduct

README.md

Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System

Authors: Yixuan Su, Lei Shu, Elman Mansimov, Arshit Gupta, Deng Cai, Yi-An Lai, and Yi Zhang

Code for our PPTOD paper: Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System

News:

  • [2022/02/24] PPTOD is accepted to the main conference of ACL 2022!
  • [2021/09/29] PPTOD is publicly released!

Introduction:

Pre-trained language models have recently been shown to benefit task-oriented dialogue (TOD) systems. Despite their success, existing methods often formulate this task as a cascaded generation problem, which can lead to error accumulation across different sub-tasks and greater data annotation overhead. In this study, we present PPTOD, a unified model that seamlessly supports both task-oriented dialogue understanding and response generation in a plug-and-play fashion. In addition, we introduce a new dialogue multi-task pre-training strategy that allows the model to learn the primary TOD task completion skills from heterogeneous dialogue corpora. We extensively test our model on three benchmark TOD tasks: end-to-end dialogue modelling, dialogue state tracking, and intent classification. Results show that PPTOD achieves new state-of-the-art results on all evaluated tasks in both full-training and low-resource scenarios. Furthermore, comparisons against previous SOTA methods show that the responses generated by PPTOD are more factually correct and semantically coherent as judged by human annotators.

Main Results:

The following table shows our models' performance on end-to-end dialogue modelling (Inform, Success, BLEU, and Combined Score) on MultiWOZ 2.0. It also shows the dialogue state tracking (DST) results on MultiWOZ 2.0 and the intent classification accuracy on Banking77.

| | Inform | Success | BLEU | Combined Score | DST Joint Accuracy | Intent Classification Accuracy |
| :-------------: | :-------------: | :-----: | :-----: | :-----: | :-----: | :-----: |
| PPTOD-small | 87.80 | 75.30 | 19.89 | 101.44 | 51.50 | 93.27 |
| PPTOD-base | 89.20 | 79.40 | 18.62 | 102.92 | 53.37 | 93.86 |
| PPTOD-large | 82.60 | 74.10 | 19.21 | 97.56 | 53.89 | 94.08 |
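The Combined Score column is consistent with the usual MultiWOZ convention, (Inform + Success) / 2 + BLEU. The sketch below recomputes it from the reported Inform, Success, and BLEU values; the function name `combined_score` is illustrative and is not part of the repository.

```python
def combined_score(inform: float, success: float, bleu: float) -> float:
    """MultiWOZ-style combined score: mean of Inform and Success, plus BLEU."""
    return 0.5 * (inform + success) + bleu


# (Inform, Success, BLEU) copied from the results table.
results = {
    "PPTOD-small": (87.80, 75.30, 19.89),
    "PPTOD-base": (89.20, 79.40, 18.62),
    "PPTOD-large": (82.60, 74.10, 19.21),
}

for name, (inform, success, bleu) in results.items():
    print(f"{name}: {combined_score(inform, success, bleu):.2f}")
```

The printed values match the Combined Score column (101.44, 102.92, and 97.56).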

Citation:

If you find our paper and resources useful, please kindly cite our paper:

```bibtex
@article{su2021multitask,
   author    = {Yixuan Su and
                Lei Shu and
                Elman Mansimov and
                Arshit Gupta and
                Deng Cai and
                Yi{-}An Lai and
                Yi Zhang},
   title     = {Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System},
   booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL)",
   publisher = "Association for Computational Linguistics",
   year      = {2022},
   url       = {https://arxiv.org/abs/2109.14739}
}
```

Example Usage:

In the following, we provide an example of how to use PPTOD to address different TOD tasks without fine-tuning on any downstream task! We assume you have downloaded the pptod-small checkpoint and placed it in the "./checkpoints/small/" directory (you can find instructions below).

```python
# load the pre-trained PPTOD-small
import torch
from transformers import T5Tokenizer
model_path = r'./checkpoints/small/'
tokenizer = T5Tokenizer.from_pretrained(model_path)
from E2E_TOD.modelling.T5Model import T5Gen_Model
from E2E_TOD.ontology import sos_eos_tokens
special_tokens = sos_eos_tokens
model = T5Gen_Model(model_path, tokenizer, special_tokens, dropout=0.0,
                    add_special_decoder_token=True, is_training=False)
model.eval()
```

```python
# prepare some pre-defined tokens and task-specific prompts
sos_context_token_id = tokenizer.convert_tokens_to_ids(['<sos_context>'])[0]
eos_context_token_id = tokenizer.convert_tokens_to_ids(['<eos_context>'])[0]
pad_token_id, sos_b_token_id, eos_b_token_id, sos_a_token_id, eos_a_token_id, \
sos_r_token_id, eos_r_token_id, sos_ic_token_id, eos_ic_token_id = \
    tokenizer.convert_tokens_to_ids(['<_PAD_>', '<sos_b>', '<eos_b>', '<sos_a>', '<eos_a>',
                                     '<sos_r>', '<eos_r>', '<sos_d>', '<eos_d>'])
bs_prefix_text = 'translate dialogue to belief state:'
bs_prefix_id = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(bs_prefix_text))
da_prefix_text = 'translate dialogue to dialogue action:'
da_prefix_id = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(da_prefix_text))
nlg_prefix_text = 'translate dialogue to system response:'
nlg_prefix_id = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(nlg_prefix_text))
ic_prefix_text = 'translate dialogue to user intent:'
ic_prefix_id = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(ic_prefix_text))
```

```python
# an example dialogue context
dialogue_context = "<sos_u> can i reserve a five star place for thursday night at 3:30 for 2 people <eos_u> <sos_r> i'm happy to assist you! what city are you dining in? <eos_r> <sos_u> seattle please. <eos_u>"
context_id = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(dialogue_context))
```

```python
# predict belief state
input_id = bs_prefix_id + [sos_context_token_id] + context_id + [eos_context_token_id]
input_id = torch.LongTensor(input_id).view(1, -1)
x = model.model.generate(input_ids=input_id, decoder_start_token_id=sos_b_token_id,
                         pad_token_id=pad_token_id, eos_token_id=eos_b_token_id, max_length=128)
print(model.tokenized_decode(x[0]))
# the predicted result is
# <sos_b> [restaurant] rating five star date thursday night start time 3:30 number of people 2 city seattle <eos_b>
```

```python
# predict dialogue act
input_id = da_prefix_id + [sos_context_token_id] + context_id + [eos_context_token_id]
input_id = torch.LongTensor(input_id).view(1, -1)
x = model.model.generate(input_ids=input_id, decoder_start_token_id=sos_a_token_id,
                         pad_token_id=pad_token_id, eos_token_id=eos_a_token_id, max_length=128)
print(model.tokenized_decode(x[0]))
# the predicted result is
# <sos_a> [restaurant] [inform] restaurant name rating [multiple_choice] restaurant name <eos_a>
```

```python
# predict system response
input_id = nlg_prefix_id + [sos_context_token_id] + context_id + [eos_context_token_id]
input_id = torch.LongTensor(input_id).view(1, -1)
x = model.model.generate(input_ids=input_id, decoder_start_token_id=sos_r_token_id,
                         pad_token_id=pad_token_id, eos_token_id=eos_r_token_id, max_length=128)
print(model.tokenized_decode(x[0]))
# the predicted result is
# <sos_r> ok, let me find some options for you. <eos_r>
```

```python
# predict user intent
input_id = ic_prefix_id + [sos_context_token_id] + context_id + [eos_context_token_id]
input_id = torch.LongTensor(input_id).view(1, -1)
x = model.model.generate(input_ids=input_id, decoder_start_token_id=sos_ic_token_id,
                         pad_token_id=pad_token_id, eos_token_id=eos_ic_token_id, max_length=128)
print(model.tokenized_decode(x[0]))
# the predicted result is
# <sos_d> [book_restaurant] <eos_d>
```
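The four predictions above differ only in the task prompt and in the decoder start and end tokens. A minimal sketch of that pattern as a lookup table follows. The prompts are copied from the example; the token spellings (`<sos_b>`, `<eos_d>`, and so on) are assumptions, and `TaskSpec`, `TASKS`, and `build_input_ids` are hypothetical helpers that do not exist in the repository.

```python
from typing import Dict, List, NamedTuple


class TaskSpec(NamedTuple):
    prefix: str       # natural-language prompt prepended to the context
    start_token: str  # decoder start token (assumed spelling)
    end_token: str    # token that stops generation (assumed spelling)


# Prompts are taken from the example; the token spellings are assumptions.
TASKS: Dict[str, TaskSpec] = {
    "belief_state": TaskSpec("translate dialogue to belief state:", "<sos_b>", "<eos_b>"),
    "dialogue_act": TaskSpec("translate dialogue to dialogue action:", "<sos_a>", "<eos_a>"),
    "response": TaskSpec("translate dialogue to system response:", "<sos_r>", "<eos_r>"),
    "intent": TaskSpec("translate dialogue to user intent:", "<sos_d>", "<eos_d>"),
}


def build_input_ids(prefix_ids: List[int], context_ids: List[int],
                    sos_context_id: int, eos_context_id: int) -> List[int]:
    """Concatenate the prompt ids with the context wrapped in its boundary ids."""
    return prefix_ids + [sos_context_id] + context_ids + [eos_context_id]


# Toy ids stand in for real tokenizer output.
print(build_input_ids([11, 12], [21, 22, 23], sos_context_id=1, eos_context_id=2))
```

With a real tokenizer, `prefix_ids` would come from `tokenizer.convert_tokens_to_ids(tokenizer.tokenize(spec.prefix))`, and the start and end tokens would be converted to ids the same way before being passed to `generate`.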

1. Environment Setup:

```yaml
pip3 install -r requirements.txt
python -m spacy download en_core_web_sm
```
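Because requirements.txt pins a few packages to exact versions (torch 1.6.0, torchvision 0.7.0, transformers 4.7.0, sacrebleu 1.4.10), it can help to confirm what is actually installed before running anything. The following is a sketch only: `find_mismatches` is a hypothetical helper, not part of the repository, and it assumes Python 3.8+ for `importlib.metadata`.

```python
from importlib import metadata

# Exact pins as listed in requirements.txt.
PINNED = {
    "torch": "1.6.0",
    "torchvision": "0.7.0",
    "transformers": "4.7.0",
    "sacrebleu": "1.4.10",
}


def find_mismatches(pinned, get_version=metadata.version):
    """Return {package: installed_version_or_None} for every pin that is not satisfied."""
    mismatches = {}
    for package, wanted in pinned.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None
        # Compare only the public version, ignoring local tags such as "+cu101".
        if installed is None or installed.split("+")[0] != wanted:
            mismatches[package] = installed
    return mismatches


if __name__ == "__main__":
    for package, installed in find_mismatches(PINNED).items():
        print(f"{package}: expected {PINNED[package]}, found {installed}")
```

An empty result means every pinned package is present at the expected version.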

2. PPTOD Checkpoints:

You can download checkpoints of PPTOD with different configurations here.

| PPTOD-small | PPTOD-base | PPTOD-large |
| :-------------: | :-------------: | :-----: |
| here | here | here |

To use PPTOD, you should download the checkpoint you want and unzip it in the ./checkpoints directory.

Alternatively, you can run the following commands to download the PPTOD checkpoints.

(1) Downloading Pre-trained PPTOD-small Checkpoint:

```yaml
cd checkpoints
chmod +x ./download_pptod_small.sh
./download_pptod_small.sh
```

(2) Downloading Pre-trained PPTOD-base Checkpoint:

```yaml
cd checkpoints
chmod +x ./download_pptod_base.sh
./download_pptod_base.sh
```

(3) Downloading Pre-trained PPTOD-large Checkpoint:

```yaml
cd checkpoints
chmod +x ./download_pptod_large.sh
./download_pptod_large.sh
```
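After downloading, a quick check that a checkpoint directory exists and is not empty can catch a failed or partial download early. This is a sketch under assumptions: only "./checkpoints/small/" is named in the usage example, so the `base` and `large` directory names are guesses, and `checkpoint_ready` is a hypothetical helper that does not inspect the files themselves.

```python
from pathlib import Path


def checkpoint_ready(root: str, size: str) -> bool:
    """True if <root>/<size> is a directory that contains at least one entry."""
    path = Path(root) / size
    return path.is_dir() and any(path.iterdir())


if __name__ == "__main__":
    # "small" matches the usage example; "base" and "large" are assumed names.
    for size in ("small", "base", "large"):
        status = "found" if checkpoint_ready("./checkpoints", size) else "missing"
        print(f"{size}: {status}")
```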

3. Data Preparation:

Detailed instructions for preparing the pre-training corpora and the data for the downstream TOD tasks are provided in the ./data folder.

4. Dialogue Multi-Task Pre-training:

To pre-train a PPTOD model from scratch, please refer to the details provided in the ./Pretraining directory.

5. Benchmark TOD Tasks:

(1) End-to-End Dialogue Modelling:

To perform End-to-End Dialogue Modelling using PPTOD, please refer to the details provided in the ./E2E_TOD directory.

(2) Dialogue State Tracking:

To perform Dialogue State Tracking using PPTOD, please refer to the details provided in the ./DST directory.

(3) Intent Classification:

To perform Intent Classification using PPTOD, please refer to the details provided in the ./IC directory.

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.

Owner

  • Name: Amazon Web Services - Labs
  • Login: awslabs
  • Kind: organization
  • Location: Seattle, WA

AWS Labs

GitHub Events

Total
  • Watch event: 2
  • Fork event: 1
Last Year
  • Watch event: 2
  • Fork event: 1

Issues and Pull Requests

Last synced: about 2 years ago

All Time
  • Total issues: 18
  • Total pull requests: 7
  • Average time to close issues: 15 days
  • Average time to close pull requests: 4 months
  • Total issue authors: 16
  • Total pull request authors: 4
  • Average comments per issue: 1.78
  • Average comments per pull request: 0.14
  • Merged pull requests: 3
  • Bot issues: 0
  • Bot pull requests: 3
Past Year
  • Issues: 2
  • Pull requests: 2
  • Average time to close issues: 9 days
  • Average time to close pull requests: 6 months
  • Issue authors: 2
  • Pull request authors: 1
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.5
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 2
Top Authors
Issue Authors
  • Monstarrr (2)
  • xiami2019 (2)
  • richlee123 (1)
  • zqwerty (1)
  • ShaneTian (1)
  • SuvodipDey (1)
  • Salierioo (1)
  • NLP-hua (1)
  • jianguoz (1)
  • anujatayal (1)
  • svjack (1)
  • ruleGreen (1)
  • pengdejia (1)
  • kingb12 (1)
  • Leezekun (1)
Pull Request Authors
  • dependabot[bot] (3)
  • dlwlgus53 (2)
  • sailik1991 (1)
  • yxuansu (1)
Top Labels
Issue Labels
Pull Request Labels
dependencies (3)

Dependencies

requirements.txt pypi
  • absl-py *
  • gdown *
  • nltk *
  • progressbar *
  • pytest *
  • pyyaml *
  • sacrebleu ==1.4.10
  • sentencepiece *
  • six *
  • sklearn *
  • spacy *
  • torch ==1.6.0
  • torchvision ==0.7.0
  • transformers ==4.7.0
  • wheel *