https://github.com/amir22010/mt-dnn
Multi-Task Deep Neural Networks for Natural Language Understanding
Science Score: 10.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links (links to: arxiv.org)
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity (low similarity, 11.7%, to scientific vocabulary)
Last synced: 6 months ago
Repository
Basic Info
- Host: GitHub
- Owner: Amir22010
- License: MIT
- Language: Python
- Default Branch: master
- Size: 73.2 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Fork of namisan/mt-dnn
Created over 6 years ago · Last pushed over 6 years ago
https://github.com/Amir22010/mt-dnn/blob/master/
**New Release**
MT-DNN with Knowledge Distillation code and model.

# Multi-Task Deep Neural Networks for Natural Language Understanding

This PyTorch package implements the Multi-Task Deep Neural Networks (MT-DNN) for Natural Language Understanding, as described in:

Xiaodong Liu\*, Pengcheng He\*, Weizhu Chen and Jianfeng Gao
Multi-Task Deep Neural Networks for Natural Language Understanding
[arXiv version](https://arxiv.org/abs/1901.11504)
\*: Equal contribution
Xiaodong Liu, Pengcheng He, Weizhu Chen and Jianfeng Gao
Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding
[arXiv version](https://arxiv.org/abs/1904.09482)
## Quickstart

### Setup Environment

#### Install via pip:

1. Install Python 3.6. Reference for download and installation: https://www.python.org/downloads/release/python-360/
2. Install the requirements:
   ```
   > pip install -r requirements.txt
   ```

#### Use Docker:

1. Pull the Docker image:
   ```
   > docker pull allenlao/pytorch-mt-dnn:v0.1
   ```
2. Run the container:
   ```
   > docker run -it --rm --runtime nvidia allenlao/pytorch-mt-dnn:v0.1 bash
   ```

Please refer to the following link if this is your first time using Docker: https://docs.docker.com/

### Train a toy MT-DNN model

1. Download the data:
   ```
   > sh download.sh
   ```
   Please refer to the GLUE benchmark page for the dataset: https://gluebenchmark.com/
2. Preprocess the data:
   ```
   > python prepro.py --bert_model bert-base-uncased --do_lower_case
   ```
3. Train:
   ```
   > python train.py
   ```

**Note that we ran experiments on 4 V100 GPUs for the base MT-DNN models. You may need to reduce the batch size on other GPUs.**
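As a rough illustration of what the `--do_lower_case` preprocessing flag implies, the sketch below lowercases and whitespace-normalizes a sentence pair before packing it into a JSON-style record. This is only a conceptual sketch: the function and field names here are hypothetical, and the real `prepro.py` additionally runs BERT WordPiece tokenization against the uncased vocabulary.

```python
# Illustrative sketch only: names below are NOT the actual prepro.py API.

def normalize_text(text: str, do_lower_case: bool = True) -> str:
    """Collapse runs of whitespace and optionally lowercase,
    as the uncased BERT models expect."""
    text = " ".join(text.split())
    return text.lower() if do_lower_case else text

def build_example(uid: int, premise: str, hypothesis: str, label: int) -> dict:
    """Pack one sentence-pair example into a JSON-style record
    (a simplified stand-in for the records prepro.py emits)."""
    return {
        "uid": uid,
        "premise": normalize_text(premise),
        "hypothesis": normalize_text(hypothesis),
        "label": label,
    }

example = build_example(0, "A man  is Playing a Guitar.", "Someone plays MUSIC.", 1)
print(example["premise"])  # a man is playing a guitar.
```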
### GLUE Result reproduce

1. MTL refinement: refine MT-DNN (shared layers), initialized with the pre-trained BERT model, via MTL using all GLUE tasks excluding WNLI to learn a new shared representation. **Note that we ran this experiment on 8 V100 GPUs (32G) with a batch size of 32.**
   + Preprocess the GLUE data via the aforementioned script
   + Training:
     ```
     > scripts/run_mt_dnn.sh
     ```
2. Fine-tuning: fine-tune MT-DNN on each of the GLUE tasks to get task-specific models. Here we provide two examples, STS-B and RTE; you can use similar scripts to fine-tune all the GLUE tasks.
   + Fine-tune on the STS-B task:
     ```
     > scripts/run_stsb.sh
     ```
     You should get about 90.5/90.4 on the STS-B dev set in terms of Pearson/Spearman correlation.
   + Fine-tune on the RTE task:
     ```
     > scripts/run_rte.sh
     ```
     You should get about 83.8 accuracy on the RTE dev set.

### SciTail & SNLI Result reproduce (Domain Adaptation)

1. Domain adaptation on SciTail:
   ```
   > scripts/scitail_domain_adaptation_bash.sh
   ```
2. Domain adaptation on SNLI:
   ```
   > scripts/snli_domain_adaptation_bash.sh
   ```

### Extract embeddings

1. Extract embeddings for a pair-text example:
   ```
   > python extractor.py --do_lower_case --finput input_examples/pair-input.txt --foutput input_examples/pair-output.json --bert_model bert-base-uncased --checkpoint mt_dnn_models/mt_dnn_base.pt
   ```
   Note that the two texts of a pair are separated by the special token `|||`. You may refer to `input_examples/pair-output.json` as an example.
2. Extract embeddings for a single-sentence example:
   ```
   > python extractor.py --do_lower_case --finput input_examples/single-input.txt --foutput input_examples/single-output.json --bert_model bert-base-uncased --checkpoint mt_dnn_models/mt_dnn_base.pt
   ```

### TODO

- [ ] Publish pretrained TensorFlow checkpoints.

## FAQ

### Did you share the pretrained MT-DNN models?

Yes, we released the pretrained shared embeddings via MTL, aligned to the BERT base/large models: `mt_dnn_base.pt` and `mt_dnn_large.pt`. To obtain similar models:

1.
run `sh scripts/run_mt_dnn.sh` and then pick the best checkpoint based on the average dev performance on MNLI/RTE;
2. strip the task-specific layers via `scripts/strip_model.py`.

### Why do SciTail/SNLI not enable SAN?

For the SciTail/SNLI tasks, the purpose is to test the generalization of the learned embeddings and how easily they adapt to a new domain, rather than complicated model structures, for a direct comparison with BERT. Thus, we use a linear projection in all **domain adaptation** settings.

### What is the difference between V1 and V2?

The difference is in the QNLI dataset. Please refer to the official GLUE homepage for more details. If you want to formulate QNLI as a pairwise ranking task, as in our paper, make sure that you use the old QNLI data, then run the prepro script with the flag:
```
> python prepro.py --old_glue
```
If you have issues accessing the old version of the data, please contact the GLUE team.

### Did you fine-tune single tasks for your GLUE leaderboard submission?

We can use the multi-task refinement model to run the prediction and produce a reasonable result. But to achieve a better result, fine-tuning on each task is required. It is worth noting that the arXiv paper is a little outdated and based on the old GLUE dataset. We will update the paper accordingly.

## Notes and Acknowledgments

BERT PyTorch is from: https://github.com/huggingface/pytorch-pretrained-BERT
BERT: https://github.com/google-research/bert
We also used some code from: https://github.com/kevinduh/san_mrc
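The FAQ's recipe for reproducing the shared models — pick the checkpoint with the best average MNLI/RTE dev score, then strip the task-specific layers — can be sketched as follows. The scores, checkpoint names, and helper function here are hypothetical; in the real workflow the scores come from the training logs and the stripping is done by `scripts/strip_model.py`.

```python
# Hypothetical sketch of the checkpoint-selection rule from the FAQ:
# choose the epoch whose mean MNLI/RTE dev score is highest.

def pick_best_checkpoint(dev_scores: dict) -> str:
    """Return the checkpoint name with the highest average MNLI/RTE dev score."""
    return max(
        dev_scores,
        key=lambda ckpt: (dev_scores[ckpt]["mnli"] + dev_scores[ckpt]["rte"]) / 2.0,
    )

# Made-up dev results for three training epochs.
dev_scores = {
    "model_epoch_3.pt": {"mnli": 84.1, "rte": 79.0},
    "model_epoch_4.pt": {"mnli": 84.6, "rte": 80.5},
    "model_epoch_5.pt": {"mnli": 84.4, "rte": 79.8},
}
print(pick_best_checkpoint(dev_scores))  # model_epoch_4.pt
```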
### How do I cite MT-DNN?

For now, please cite the [arXiv version](https://arxiv.org/abs/1901.11504):

```
@article{liu2019mt-dnn,
  title={Multi-Task Deep Neural Networks for Natural Language Understanding},
  author={Liu, Xiaodong and He, Pengcheng and Chen, Weizhu and Gao, Jianfeng},
  journal={arXiv preprint arXiv:1901.11504},
  year={2019}
}

@article{liu2019mt-dnn-kd,
  title={Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding},
  author={Liu, Xiaodong and He, Pengcheng and Chen, Weizhu and Gao, Jianfeng},
  journal={arXiv preprint arXiv:1904.09482},
  year={2019}
}
```

### Contact Information

For help or issues using MT-DNN, please submit a GitHub issue. For personal communication related to MT-DNN, please contact Xiaodong Liu (`xiaodl@microsoft.com`), Pengcheng He (`penhe@microsoft.com`), Weizhu Chen (`wzchen@microsoft.com`) or Jianfeng Gao (`jfgao@microsoft.com`).
Owner
- Name: Amir Khan
- Login: Amir22010
- Kind: user
- Location: India
- Repositories: 3
- Profile: https://github.com/Amir22010
Working on developing state-of-the-art AI solutions, mainly in the computer vision, chatbot, and NLP domains. Building awesome AI as a professional developer 😍.