https://github.com/dmlc/gluon-nlp - v0.10.0 Maintenance Release

This release includes the following fixes: - [BUGFIX] remove wd from squad (#1223) - Fix deprecation warnings due to invalid escape sequences. (#1219) - Fix layernormeps in BERTEncoder (#1214) - [BUGFIX] Fix vocab determinism in py35 (#1166) (#1167)

As we prepare for the NumPy-based GluonNLP development, we are making the following adjustments to the branch usage: - master (old) -> v0.x: this branch will be used for maintenance of GluonNLP 0.x versions. - numpy -> master: the new master branch will be used for GluonNLP 1.0 onward with NumPy-compatible interface, based on the upcoming MXNet 2.0.

- Python
Published by szha over 5 years ago

https://github.com/dmlc/gluon-nlp - v0.9.2: Bug Fix

This patch release includes the following bug fix: - [BUGFIX] remove wd from squad (#1223) - Fix deprecation warnings due to invalid escape sequences. (#1219) - Fix layernormeps in BERTEncoder (#1214)

- Python
Published by szha over 5 years ago

https://github.com/dmlc/gluon-nlp - v0.9.1: Bug Fix

This release includes the bug fix for https://github.com/dmlc/gluon-nlp/pull/1158 (#1167). It affects the determinism of the instantiated vocabulary object on the order of special tokens on Python 3.5. Users of Python 3.5 are strongly encouraged to upgrade to this version.

- Python
Published by szha almost 6 years ago

https://github.com/dmlc/gluon-nlp - v0.9.0: BERT Inference Time Cut by Half and 90% Scaling Efficiency for Distributed Training

News

GluonNLP was featured in EMNLP 2019 Hong Kong! Check out the code accompanying the tutorial.
"GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing" has been published in the Journal of Machine Learning Research.

Models and Scripts in v0.9

BERT

INT8 Quantization for BERT Sentence Classification and Question Answering (#1080)! Also Check out the blog post.

Enhancements to the pretraining script (#1121, #1099) and faster tokenizer for BERT (#921, #1024) as well as multi-GPU support for SQuAD fine-tuning (#1079).

Make BERT a HybridBlock (#877).

XLNet

The XLNet model introduced by Yang, Zhilin, et. al in "XLNet: Generalized Autoregressive Pretraining for Language Understanding". The model was converted from the original repository (#866).

GluonNLP further provides scripts for finetuning XLNet on the Glue (#995) and SQuAD datasets (#1130) that reproduce the authors results. Check out the usage.

DistilBERT

The DistilBERT model introduced by Sanh, Victor, et. al in "DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter" (#922).

Transformer

Add a separate Transformer inference script to make inference easy and make it convenient to analysis the performance of transformer inference (#852).

Korean BERT

Pre-trained Korean BERT is available as part of GluonNLP (#1057)

RoBERTa

GluonNLP now provides scripts for finetuning RoBERTa (#931).

GPT2

GPT2 is now a HybridBlock the model can be exported for running from other MXNet language bindings (#1010).

New Features

Add NamedTuple + Dict batchify (#959)
Add even_size option to split sampler (#1028)
Add length normalized metrics for machine translation tasks (#1095)
Add raw attention scores to the AttentionCell #951 (#964)
Add round_to feature to BERT & XLNet finetuning scripts (#1133)
Add stratified trainvalidsplit similar to sklearn.modelselection.traintest_split (#933)
Add SuperGlue dataset API (#858)
Add Multi Model Server deployment code example for developers (#1140)
Allow custom dropout, number of layers/units for BERT (#950)
Avoid race condition when downloading vocab (#1078)
Deprecate specifying Vocab padding, bos and eos_token as positional arguments (#945)
Fast multitensor adam optimizer (#1111)
Faster gradglobalnorm for clipping (#1115)
Hybridizable AWDRNN/StandardRNN (#911)
Padding seq length to multiple of 8 in BERT model (#909)
Scripts for producing the figures that explain the bucketing strategy (#908)
Split up Seq2SeqDecoder in Seq2SeqDecoder and Seq2SeqOneStepDecoder (#976)
Switch CI to Python 3.5 and declare Python 3.5 support (#1009)
Try to use the new None feature in MXNet + Drop support for MXNet 1.5 (#967)
Use fused gelu operator (#1082)
Use softmax with length, and interleaved matmul for BERT (#1136)
Documentation of Model Conversion Scripts at https://gluon-nlp.mxnet.io/v0.9.x/modelzoo/conversiontools/index.html (#922)

Bug Fixes and code cleanup

Add version checker to all scripts (#930)
Add version checker to all tutorials (#934)
Add 'packaging' to requirements (#1143)
Adjust code owner (#923)
Avoid using dict for attention cell parameter creation (#1050)
Bump version in preparation for 0.9 release (#987)
Change SimVerb3500 URL to aclweb hosted version (#979)
Correct propagation of error codes in GluonNLP-py3-master-gpu-doc (#971)
Corrected np.random.randint upper limit in data.stream.py (#935)
Declare Python version requirement in setup.py (#927)
Declare more optional dependencies (#958)
Declare pytest seed marker in pytest.ini (#940)
Disable HybridBeamSearch (#1021)
Drop LAMB optimizer from GluonNLP in favor of MXNet version (#1116)
Drop unused compatibility helpers and fix doc (#928)
Fix #905 (#906)
Fix a SQuAD 2.0 evaluation bug (#907)
Fix argument analogy-max-vocab-size (#904)
Fix broken multi-head attention cell (#878)
Fix bugs in BERT export script (#944)
Fix chnsenticorp dataset download link (#873)
Fix file sampler for BERT (#977)
Fix index.rst and gpu flag in machine translation (#952)
Fix log in finetune_squad.py (#1001)
Fix parameter sharing of WeightDropParameter (#1083)
Fix scripts/questionanswering/datapipeline.py requiring optional package (#1013)
Fix the weight tie and weight sharing for AWDRNN (#1087)
Fix training command in Language Modeling index.rst (#1100)
Fix version check in traingnmt.py and traintransformer.py (#1003)
Fix standard rnn weight sharing error (#1122)
Glue data preprocessing pipeline and bert & xlnet scripts (#1031)
Improve Vocab.repr if reservedtokens or unknowntoken is None (#989)
Improve readability (#975)
Improve test robustness (#960)
Improve the readability of the training script. This fix replaces magic numbers with the name (#1006)
Make EmbeddingCenterContextBatchify returned dtype robust to empty sentences (#954)
Modify the log average loss (#1103)
Move ICSL script out of BERT folder (#1131)
Move NER script out of bert folder (#1090)
Move ParallelBigRNN into nlp.model namespace (#1118)
Move getrnncell out of seq2seqencoderdecoder (#1073)
Mxnet version check (#1063)
Refactor BERT with new data preprocessing (#1124)
Remove NLTKMosesTokenizer in favor of SacreMosesTokenizer (#942)
Remove extra dropout in BERT/RoBERTa (#1022)
Remove outdated comment (#943)
Remove padding warning (#916)
Replace unicode comma with ascii comma (#1056)
Split up inheritance structure of TransformerEncoder and BERTEncoder (#988)
Support int32 for sampled blocks (#1106)
Switch batch jobs to use G4dn.2x instance (#1041)
TransformerXL LayerNorm eps and XLNet pretrained model config (#1005)
Unify BERT horovod and kvstore pre-training script (#889)
Update README.rst (#884)
Update data_api.rst (#893)
Update embedding script (#1046)
Update fp16_utils.py (#1037)
Update index.rst (#876)
Update index.rst (#891)
Update navbar install (#983)
Update numba dependency in setup.py (#941)
Update outdated contributor list (#963)
Update preparecleanenv.sh (#998)

Documentation

Add comment to BERT notebook (#1026)
Add missing docs for nlp.utils (#936)
Add more documentation to XLNet scripts (#985)
Add section for "Clone the master branch for development" (#1075)
Add to toc tree depth to enable multiple level menu (#1108)
Cite source of pretrained parameters for bert12768_12 (#915)
Doc fix for vocab.subwords (#885)
Enhance vocab not found err msg (#917)
Fix command line examples for text classification (#874)
Fix math formula in docs (#920)
More detailed doc for CorpusBPTTBatchify (#888)
Release checklist (#890)
Remove non-existent arguments for BERT and Transformer (#946)
Remove py3 usage from the doc (#1077)
Update installation guide with selectors (#966)
Update mxnet version in installation doc (#1072)
Update pre-trained model link (#1117)
Update Installation instructions for source (#1146)

Continuous Integration

Disable SimVerb test for 14 days (#953)
Disable horovod test temporarily (#1030)
Disable known bad mxnet nightly version (#997)
Enable integration tests on CPU (#957)
Enable testing warnings with pytest and update deprecated API invocations (#980)
Enable timestamp in CI (#925)
Enable type checks and inference with pytype (#1018)
Fix CI (#875)
Preserve stderr and stdout streams in doc CI stage for Cloudwatch (#882)
Remove skip_master feature (#1017)
Switch source of MXNet nightly build (#1058)
Test MXNet 1.6 pre-release as part of CI pipeline (#1023)
Update MXNet master version tested on CI (#1113)
Update numba (#1096)
Use Cuda 10.0 MXNet build (#991)

- Python
Published by leezu about 6 years ago

https://github.com/dmlc/gluon-nlp - v0.8.3: Minor Bug Fixes

Add int32 support for importance sampling (model.ISDense) and noise contrastive estimation (model.NCEDense).

- Python
Published by eric-haibin-lin about 6 years ago

https://github.com/dmlc/gluon-nlp - v0.8.2: Bug Fixes

This release covers a few fixes for the bugs reported: - Fixed argument passing in the bert/embedding.py script - Updated SimVerb3500 dataset URL to the aclweb hosted version - Removed multi-processing in DataLoader from in bert/pretraining_utils.py which potentially causes crash when horovod mpi is used for training - Before MXNet 1.6.0, Gluon Trainer assumes deterministic parameter creation order for distributed traiing. The attention cell for BERT and transformer has a non-deterministic parameter creation order in v0.8.1 and v0.8.0, which will cause divergence during distributed training. It is now fixed.

Note that since v0.8.2, the default branch of gluon-nlp github will be switched to the latest stable branch, instead of the master branch under development.

- Python
Published by eric-haibin-lin about 6 years ago

https://github.com/dmlc/gluon-nlp - v0.8.1

News

GluonNLP was featured in KDD 2019 Alaska! Check out our tutorial: From Shallow to Deep Language Representations: Pre-training, Fine-tuning, and Beyond.
GluonNLP 0.8.1 will no longer support Python 2. (#721, #838)
Interested in BERT int8 quantization for deployment? Check out the blog post here.

Models and Scripts

RoBERTa

The RoBERTa model introduced by Yinhan Liu, et. al in "RoBERTa: A Robustly Optimized BERT Pretraining Approach". The model checkpoints are converted from the original repository. Check out the usage here. (#870)

Transformer-XL

The Transformer-XL model introduced by Zihang Dai, et. al in "Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context". (#846)

Bug Fixes

Fixed hybridization for the BERT model (#877)
Change the variable model to bert_classifier (#828) thank you @LindenLiu
Revert "Add axis argument to squeeze()" (#857)
[BUGFIX] Remove incorrect vocab.padding_token requirement in CorpusBPTTBatchify
[BUGFIX] Fix Vocab with unknowntoken remapped to != 0 via tokento_idx arg (#862)
[BUGFIX] Fix AMP in finetune_classifier.py (#848)
[BUGFIX] fix broken multi-head attention cell (#878) @ZiyueHuang
[FIX] fix chnsenticorp dataset download link (#873)
fix the usage of pad in bert (#850)

Documentation

Clarify Bert does not require MXNet nightly anymore (#860)
[DOC] fix broken links (#833)
[DOC] Update BERT index.rst (#844)
[DOC] Add GluonCV/NLP archive (#823)
[DOC] add missing dataset document (#832)
[DOC] remove wrong tutorial header level (#826)
[DOC] Fix a typo in attention_cell's docstring (#841) thank you @shenfei
[DOC] Upgrade mxnet dependency to 1.5.0 and use Cuda 10.1 on CI (#842)
Remove Py2 icon from Readme. Add 3.7 (#856)
[DOC] Improve help message (#855) thank you @apeforest
Update index.rst (#853)
[DOC] Fix Machine Translation with Transformers example (#865)
update button style (#869)
[DOC] doc fix for vocab.subwords (#885) thank you @liusy182

Continuous Integration

[CI] Support py3-mastergpudoc CI run on arbitrary branches (#829)
Enforce AWS Batch jobName rules (#836)
dump linkcheck errors to comments (#827)
Enable Sphinx Autodoc typehints (#830)
[CI] Preserve stderr and stdout streams in doc CI stage for Cloudwatch

- Python
Published by eric-haibin-lin over 6 years ago

https://github.com/dmlc/gluon-nlp - v0.8.0

News

GluonNLP is featured in KDD 2019 Alaska! Check out our tutorial: From Shallow to Deep Language Representations: Pre-training, Fine-tuning, and Beyond.
GluonNLP 0.8.0 will no longer support Python 2. #721

Models

RoBERTa

RoBERTa is now available in GluonNLP BERT model zoo. #870

Transformer-XL

Transformer-XL is now available in GluonNLP language model zoo. #846

- Python
Published by szha over 6 years ago

https://github.com/dmlc/gluon-nlp - v0.7.1

News

GluonNLP will be featured in KDD 2019 Alaska! Check out our tutorial: From Shallow to Deep Language Representations: Pre-training, Fine-tuning, and Beyond.
GluonNLP was featured in JSALT 2019 in Montreal, 2019-6-14! Checkout https://jsalt19.mxnet.io.
This is the last release in GluonNLP that will officially support Python 2. #721

Models and Scripts

BERT

a BERT BASE model pre-trained on a large corpus including OpenWebText Corpus, BooksCorpus, and English Wikipedia, which has comparable performance with the BERT large model from Google. The test score on GLUE Benchmark is reported below. Also improved usability of the BERT pre-training script: on-the-fly training data generation, sentencepiece, horovod, etc. (#799, #687, #806, #669, #665). Thank you @davisliang @vanyacohen @Skylion007

| Source | GluonNLP | google-research/bert | google-research/bert | |-----------|-----------------------------------------|-----------------------------|-----------------------------| | Model | bert1276812 | bert1276812 | bert24102416 | | Dataset | `openwebtextbookcorpuswikienuncased|bookcorpuswikienuncased|bookcorpuswikienuncased` | | SST-2 | 95.3 | 93.5 | 94.9 | | RTE | 73.6 | 66.4 | 70.1 | | QQP | 72.3 | 71.2 | 72.1 | | SQuAD 1.1 | 91.0/84.4 | 88.5/80.8 | 90.9/84.1 | | STS-B | 87.5 | 85.8 | 86.5 | | MNLI-m/mm | 85.3/84.9 | 84.6/83.4 | 86.7/85.9 |

The SciBERT model introduced by Iz Beltagy and Arman Cohan and Kyle Lo in "SciBERT: Pretrained Contextualized Embeddings for Scientific Text". The model checkpoints are converted from the original repository from AllenAI with the following datasets (#735):
- scibert_scivocab_uncased
- scibert_scivocab_cased
- scibert_basevocab_uncased
- scibert_basevocab_cased
The BioBERT model introduced by Lee, Jinhyuk, et al. in "BioBERT: a pre-trained biomedical language representation model for biomedical text mining". The model checkpoints are converted from the original repository with the following datasets (#735):
- biobert_v1.0_pmc_cased
- biobert_v1.0_pubmed_cased
- biobert_v1.0_pubmed_pmc_cased
- biobert_v1.1_pubmed_cased
The ClinicalBERT model introduced by Kexin Huang and Jaan Altosaar and Rajesh Ranganath in "ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission". The model checkpoints are converted from the original repository with the clinicalbert_uncased dataset (#735)
The ERNIE model introduced by Sun, Yu, et al. in "ERNIE: Enhanced Representation through Knowledge Integration". You can get the model checkpoints converted from the original repository with model.get_model("ernie_12_768_12", "baidu_ernie_uncased") (#759) thanks @paperplanet
BERT fine-tuning script for named entity recognition on CoNLL2003 with test F1 92.2 (#612).
BERT fine-tuning script for Chinese XNLI dataset with 78.3% validation accuracy. (#759) thanks @paperplanet
BERT fine-tuning script for intent classification and slot labelling on ATIS (95.9 F1) and SNIPS (95.9 F1). (#817)

GPT-2

The GPT-2 language model introduced by Radford, Alec, et al. in "Language Models are Unsupervised Multitask Learners". The model checkpoints are converted from the original repository, with a script to generate text from GPT-2 model (gpt2_117m, gpt2_345m) trained on the openai_webtext dataset (#761).

ESIM

The ESIM model for text matching introduced by Chen, Qian, et al. in "Enhanced LSTM for Natural Language Inference". (#689)

Data

Natural language understanding with datasets from the GLUE benchmark: CoLA, SST-2, MRPC, STS-B, MNLI, QQP, QNLI, WNLI, RTE (#682)
Sentiment analysis datasets: CR, MPQA (#663)
Intent classification and slot labeling datasets: ATIS and SNIPS (#816)

New Features

[Feature] support save model / trainer states to S3 (#700)
[Feature] support load model/trainer states from s3 (#702)
[Feature] Add SentencePieceTokenizer for BERT (#669)
[FEATURE] Flexible vocabulary (#732)
[API] Moving MaskedSoftmaxCELoss and LabelSmoothing to model API (#754) thanks @ThomasDelteil
[Feature] add the List batchify function (#812) thanks @ThomasDelteil
[FEATURE] Add LAMB optimizer (#733)

Bug Fixes

[BUGFIX] Fixes for BERT embedding, pretraining scripts (#640) thanks @Deseaus
[BUGFIX] Update hash of wikicncased and wikimultilingualcased vocab (#655)
fix bert forward call parameter mismatch (#695) thanks @paperplanet
[BUGFIX] Fix mlm_loss reporting for eval dataset (#696)
Fix getrnn_cell (#648) thanks @MarisaKirisame
[BUGFIX] fix mrpc dataset idx (#708)
[bugfix] fix hybrid beam search sampler(#710)
[BUGFIX] [DOC] Update nlp.model.getmodel documentation and getmodel API (#734)
[BUGFIX] Fix handling of duplicate special tokens in Vocabulary (#749)
[BUGFIX] Fix TokenEmbedding serialization with emb[emb.unknown_token] != 0 (#763)
[BUGFIX] Fix glue test result serialization (#773)
[BUGFIX] Fix init bug for multilevel BiLMEncoder (#783) thanks @Ishitori

API Changes

[API] Dropping support for wikimultilingual and wikicn (#764)
[API] Remove getbertmodel from the public API list (#767)

Enhancements

[FEATURE] offer loadw2vbinary method to load w2v binary file (#620)
[Script] Add inference function for BERT classification (#639) thanks @TaoLv
[SCRIPT] - Add static BERT base export script (for use with MXNet Module API) (#672)
[Enhancement] One script to export bert for classification/regression/QA (#705)
[enhancement] refactor bert finetuning script (#692)
[Enhancement] only use the best model for inference for bert classification (#716)
[Dataset] redistribute conll2004 (#719)
[Enhancement] add periodic evaluation for BERT pre-training (#720)
[FEATURE]add XNLI task (#717)
[refactor] Refactor BERT script folder (#744)
[Enhancement] BERT pre-training data generation from sentencepiece vocab (#743)
[REFACTOR] Refactor TokenEmbedding to reduce number of places that initialize internals (#750)
[Refactor] Refactor BERT SQuAD inference code (#758)
[Enhancement] Fix dtype conversion, add sentencepiece support for SQuAD (#766)
[Dataset] Move MRPC dataset to API (#780)
[BiDAF-QANet] Common data processing logic for BiDAF and QANet (#739) thanks @Ishitori
[DATASET] add LCQMC, ChnSentiCorp dataset (#774) thanks @paperplanet
[Improvement] Implement parser evaluation in Python (#772)
[Enhancement] Add whole word masking for BERT (#770) thanks @basicv8vc
[Enhancement] Mix precision support for BERT finetuning (#793)
Generate BERT training samples in compressed format (#651)

Minor Fixes

Various documentation fixes: #635, #637, #647, #656, #664, #667, #670, #676, #678, #681, #698, #704, #731, #745, #762, #771, #746, #778, #800, #810, #807 #814 thanks @rongruosong @crcrpar @mrchypark @xwind-h
Fix BERT multiprocessing data creation bug which causes unnecessary dispatching to single worker (#649)
[BUGFIX] Update BERT test and pre-train script (#661)
update url for ws353 (#701)
bump up version (#742)
[DOC] Update textCNN results (#737)
padding value warning (#747)
[TUTORIAL][DOC] Tutorial Updates (#802) thanks @faramarzmunshi

Continuous Integration

skip failing tests in mxnet master (#685)
[CI] update nodes for CI (#686)
[CI] CI refactoring to speed up tests (#566)
[CI] fix codecov (#693)
use fixture for squad dataset tests (#699)
[CI] create zipped notebooks for link check (#712)
Fix test infrastructure for pytest > 4 and bump CI pytest version (#728)
[CI] set root in BERT tests (#738)
Fix conftest.py functionscopeseed (#748)
[CI] Fix links in contribute.rst (#752)
[CI] Update CI dependencies (#756)
Revert "[CI] Update CI dependencies (#756)" (#769)
[CI] AWS Batch serverless CI Pipeline for parallel notebook execution during website build step (#791)
[CI] Don't exit pipeline before displaying AWS Batch logfiles (#801)
[CI] Fix for "Don't exit pipeline before displaying AWS Batch logfile (#803)
add license checker (#804)
enable timeout (#813)
Fix website build on master branch (#819)

- Python
Published by eric-haibin-lin over 6 years ago

https://github.com/dmlc/gluon-nlp - v0.7.0

News

GluonNLP will be featured in KDD 2019 Alaska! Check out our tutorial: From Shallow to Deep Language Representations: Pre-training, Fine-tuning, and Beyond.
GluonNLP was featured in JSALT 2019 in Montreal, 2019-6-14! Checkout https://jsalt19.mxnet.io.

Models and Scripts

BERT

BERT model pre-trained on OpenWebText Corpus, BooksCorpus, and English Wikipedia. The test score on GLUE Benchmark is reported below. Also improved usability of the BERT pre-training script: on-the-fly training data generation, sentencepiece, horovod, etc. (#799, #687, #806, #669, #665). Thank you @davisliang

| Source | GluonNLP | google-research/bert | google-research/bert | |-----------|-----------------------------------------|-----------------------------|-----------------------------| | Model | bert1276812 | bert1276812 | bert24102416 | | Dataset | `openwebtextbookcorpuswikienuncased|bookcorpuswikienuncased|bookcorpuswikienuncased` | | SST-2 | 95.3 | 93.5 | 94.9 | | RTE | 73.6 | 66.4 | 70.1 | | QQP | 72.3 | 71.2 | 72.1 | | SQuAD 1.1 | 91.0/84.4 | 88.5/80.8 | 90.9/84.1 | | STS-B | 87.5 | 85.8 | 86.5 | | MNLI-m/mm | 85.3/84.9 | 84.6/83.4 | 86.7/85.9 |

The SciBERT model introduced by Iz Beltagy and Arman Cohan and Kyle Lo in "SciBERT: Pretrained Contextualized Embeddings for Scientific Text". The model checkpoints are converted from the original repository from AllenAI with the following datasets (#735):
- scibert_scivocab_uncased
- scibert_scivocab_cased
- scibert_basevocab_uncased
- scibert_basevocab_cased
The BioBERT model introduced by Lee, Jinhyuk, et al. in "BioBERT: a pre-trained biomedical language representation model for biomedical text mining". The model checkpoints are converted from the original repository with the following datasets (#735):
- biobert_v1.0_pmc_cased
- biobert_v1.0_pubmed_cased
- biobert_v1.0_pubmed_pmc_cased
- biobert_v1.1_pubmed_cased
The ClinicalBERT model introduced by Kexin Huang and Jaan Altosaar and Rajesh Ranganath in "ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission". The model checkpoints are converted from the original repository with the clinicalbert_uncased dataset (#735)
The ERNIE model introduced by Sun, Yu, et al. in "ERNIE: Enhanced Representation through Knowledge Integration". You can get the model checkpoints converted from the original repository with model.get_model("ernie_12_768_12", "baidu_ernie_uncased") (#759) thanks @paperplanet
BERT fine-tuning script for named entity recognition on CoNLL2003 with test F1 92.2 (#612).
BERT fine-tuning script for Chinese XNLI dataset with 78.3% validation accuracy. (#759) thanks @paperplanet
BERT fine-tuning script for intent classification and slot labelling on ATIS (95.9 F1) and SNIPS (95.9 F1). (#817)

GPT-2

The GPT-2 language model introduced by Radford, Alec, et al. in "Language Models are Unsupervised Multitask Learners". The model checkpoints are converted from the original repository, with a script to generate text from GPT-2 model (gpt2_117m, gpt2_345m) trained on the openai_webtext dataset (#761).

ESIM

The ESIM model for text matching introduced by Chen, Qian, et al. in "Enhanced LSTM for Natural Language Inference". (#689)

Data

Natural language understanding with datasets from the GLUE benchmark: CoLA, SST-2, MRPC, STS-B, MNLI, QQP, QNLI, WNLI, RTE (#682)
Sentiment analysis datasets: CR, MPQA (#663)
Intent classification and slot labeling datasets: ATIS and SNIPS (#816)

New Features

[Feature] support save model / trainer states to S3 (#700)
[Feature] support load model/trainer states from s3 (#702)
[Feature] Add SentencePieceTokenizer for BERT (#669)
[FEATURE] Flexible vocabulary (#732)
[API] Moving MaskedSoftmaxCELoss and LabelSmoothing to model API (#754) thanks @ThomasDelteil
[Feature] add the List batchify function (#812) thanks @ThomasDelteil
[FEATURE] Add LAMB optimizer (#733)

Bug Fixes

[BUGFIX] Fixes for BERT embedding, pretraining scripts (#640) thanks @Deseaus
[BUGFIX] Update hash of wikicncased and wikimultilingualcased vocab (#655)
fix bert forward call parameter mismatch (#695) thanks @paperplanet
[BUGFIX] Fix mlm_loss reporting for eval dataset (#696)
Fix getrnn_cell (#648) thanks @MarisaKirisame
[BUGFIX] fix mrpc dataset idx (#708)
[bugfix] fix hybrid beam search sampler(#710)
[BUGFIX] [DOC] Update nlp.model.getmodel documentation and getmodel API (#734)
[BUGFIX] Fix handling of duplicate special tokens in Vocabulary (#749)
[BUGFIX] Fix TokenEmbedding serialization with emb[emb.unknown_token] != 0 (#763)
[BUGFIX] Fix glue test result serialization (#773)
[BUGFIX] Fix init bug for multilevel BiLMEncoder (#783) thanks @Ishitori

API Changes

[API] Dropping support for wikimultilingual and wikicn (#764)
[API] Remove getbertmodel from the public API list (#767)

Enhancements

[FEATURE] offer loadw2vbinary method to load w2v binary file (#620)
[Script] Add inference function for BERT classification (#639) thanks @TaoLv
[SCRIPT] - Add static BERT base export script (for use with MXNet Module API) (#672)
[Enhancement] One script to export bert for classification/regression/QA (#705)
[enhancement] refactor bert finetuning script (#692)
[Enhancement] only use the best model for inference for bert classification (#716)
[Dataset] redistribute conll2004 (#719)
[Enhancement] add periodic evaluation for BERT pre-training (#720)
[FEATURE]add XNLI task (#717)
[refactor] Refactor BERT script folder (#744)
[Enhancement] BERT pre-training data generation from sentencepiece vocab (#743)
[REFACTOR] Refactor TokenEmbedding to reduce number of places that initialize internals (#750)
[Refactor] Refactor BERT SQuAD inference code (#758)
[Enhancement] Fix dtype conversion, add sentencepiece support for SQuAD (#766)
[Dataset] Move MRPC dataset to API (#780)
[BiDAF-QANet] Common data processing logic for BiDAF and QANet (#739) thanks @Ishitori
[DATASET] add LCQMC, ChnSentiCorp dataset (#774) thanks @paperplanet
[Improvement] Implement parser evaluation in Python (#772)
[Enhancement] Add whole word masking for BERT (#770) thanks @basicv8vc
[Enhancement] Mix precision support for BERT finetuning (#793)
Generate BERT training samples in compressed format (#651)

Minor Fixes

Various documentation fixes: #635, #637, #647, #656, #664, #667, #670, #676, #678, #681, #698, #704, #731, #745, #762, #771, #746, #778, #800, #810, #807 #814 thanks @rongruosong @crcrpar @mrchypark @xwind-h
Fix BERT multiprocessing data creation bug which causes unnecessary dispatching to single worker (#649)
[BUGFIX] Update BERT test and pre-train script (#661)
update url for ws353 (#701)
bump up version (#742)
[DOC] Update textCNN results (#737)
padding value warning (#747)
[TUTORIAL][DOC] Tutorial Updates (#802) thanks @faramarzmunshi

Continuous Integration

skip failing tests in mxnet master (#685)
[CI] update nodes for CI (#686)
[CI] CI refactoring to speed up tests (#566)
[CI] fix codecov (#693)
use fixture for squad dataset tests (#699)
[CI] create zipped notebooks for link check (#712)
Fix test infrastructure for pytest > 4 and bump CI pytest version (#728)
[CI] set root in BERT tests (#738)
Fix conftest.py functionscopeseed (#748)
[CI] Fix links in contribute.rst (#752)
[CI] Update CI dependencies (#756)
Revert "[CI] Update CI dependencies (#756)" (#769)
[CI] AWS Batch serverless CI Pipeline for parallel notebook execution during website build step (#791)
[CI] Don't exit pipeline before displaying AWS Batch logfiles (#801)
[CI] Fix for "Don't exit pipeline before displaying AWS Batch logfile (#803)
add license checker (#804)
enable timeout (#813)
Fix website build on master branch (#819)

- Python
Published by eric-haibin-lin over 6 years ago

https://github.com/dmlc/gluon-nlp - v0.6.0

News

Tutorial proposal for GluonNLP is accepted at EMNLP 2019, Hong Kong, and KDD 2019, Anchorage.

Models and Scripts

BERT pre-training on BooksCorpus and English Wikipedia with mixed precision and gradient accumulation on GPUs. We achieved the following fine-tuning results based on the produced checkpoint on validation sets(#482, #505, #489). Thank you @haven-jeon
- | Dataset | MRPC | SQuAD 1.1 | SST-2 | MNLI-mm | |:----------:|:--------------:|:--------------:|:-----------:|:-------:| | Score | 87.99% | 80.99/88.60 | 93% | 83.6% |
BERT fine-tuning on various sentence classification datasets with checkpoints converted from the official repository(#600, #571, #481). Thank you @kenjewu @haven-jeon
- | Dataset | MRPC | RTE | SST-2 | MNLI-m/mm | |:---------:|:--------------:|:--------------:|:--------------:|:--------------:| | Score | 88.7% | 70.8% | 93% | 84.55%, 84.66% |
BERT fine-tuning on question answering datasets with checkpoints converted from the official repository(#493). Thank you @fiercex
- | Dataset | SQuAD 1.1 | SQuAD 1.1 | SQuAD 2.0 | |:---------:|:---------------:|:---------------:|:-------------:| | Model | bert1276812| bert24102416 |bert241024_16| | F1/EM | 88.53/80.98 | 90.97/84.05 | 77.96/81.02 |
BERT model convertion scripts for checkpoints from the original tensorflow repository, and more converted models(#456, #461, #449). Thank you @fiercex:
- Multilingual Wikipedia (cased, BERT Base)
- Chinese Wikipedia (cased, BERT Base)
- Books Corpus & English Wikipedia (uncased, BERT Large)
Scripts and command line interface for BERT embedding of raw sentences(#587, #618). Thank you @imgarylai
Scripts for exporting BERT model for deployment (#624)

New Features

[API] Add BERTVocab (#509) thanks @kenjewu
[API] Add Transforms for BERT (#526) thanks @kenjewu
[API] add data parallel for transformer (#387)
[FEATURE] Add squad2.0 Dataset (#551) thanks @fiercex
[FEATURE] Add NumpyDataset (#498)
[FEATURE] Add TruncNorm initializer for BERT (#548) thanks @Ishitori
[FEATURE] Add split sampler for distributed training (#494)
[FEATURE] Custom metric for masked accuracy (#503)
[FEATURE] Support custom sampler in SimpleDatasetStream (#507)
[FEATURE] clip gradient norm by parameter (#470)

Bug Fixes

[BUGFIX] Fix Data Preprocessing for Translation Data (#568)
[FIX] fix parameter clip (#527)
[FIX] Fix divergence of the training of transformer (#543)
[FIX] Fix documentation and a bug in NCE Block (#558)
[FIX] Fix hashing single ngrams in NGramHashes (#450)
[FIX] Fix weight dying in BERTModel.decoder for BERT pre-training (#500)
[BUGFIX] Modifying the FastText Classification training for accurate mean pooling (#529) thanks @sravanbabuiitm

API Changes

[API] BERT return intermediate encodings per layer (#606) thanks @Ishitori
[API] Better handle case when backoff is not possible in TokenEmbedding (#459)
[FIX] Rename wikicn/wikimultilingual to wikicncased/wikimultilingualuncased (#594) thanks @kenjewu
[FIX] Update default value of BERTAdam epsilon to 1e-6 (#601)
[FIX] Fix BERT decoder API for masked language model prediction (#501)
[FIX] Remove bias correction term in BERTAdam (#499)

Enhancements

[BUGFIX] use glove.840B.300d for NLI experiments (#567)
[API] Add debug option for parallel (#584)
[FEATURE] Skip dropout layer in Transformer when rate=0 (#597) thanks @TaoLv
[FEATURE] update sharded loader (#468)
[FIX] Update BERTLayerNorm Implementation (#485)
[TUTORIAL] Use FixedBucketSampler in BERT tutorial for better performance (#506) thanks @Ishitori
[API] Add Bert tokenizer to transforms.py (#464) thanks @fiercex
[FEATURE] Add data parallel to big rnn lm script (#564)

Minor Fixes

Various documentation fixes: #484, #613, #614, #438, #448, #550, #563, #611, #605, #440, #554, #445, #556, #603, #483, #576, #610, #547, #458, #574, #510, #447, #465, #436, #622, #583 thanks @anuragsarkar97 @brettkoonce
[FIX] fix repeated unzipping in squad dataset (#553)
[FIX] web fixes (#453)
[FIX] Remove unused argument in fasttextwordngram.py (#486) thanks @kurtjanssensai
[FIX] Remove unused code (#528)
[FIX] Remove unused code in text_classification script (#442)
[MISC] Bump up version (#454)
[BUGFIX] fix pylint error (#549)
[FIX] Simplify the data preprocessing code for the sentiment analysis script (#462)
[FEATURE] BERT doc fixes and script usability enhancements (#444)
[FIX] Fix Py2 compatibility of machine_translation/dataprocessor.py (#541) thanks @ymjiang
[BUGFIX] Fix GluonNLP MXNet dependency (#555)
[BUGFIX] Fix Weight Drop and Test (#546)
[CI] Add version upper bound to doc.yml (#467)
[CI] speed up tests (#582)
[CI] upgrade mxnet to 1.4.0 (#617)
[FIX] Revert an unintended change (#525)
[BUGFIX] update paths and imports in bert scripts (#634)

- Python
Published by eric-haibin-lin almost 7 years ago

https://github.com/dmlc/gluon-nlp - v0.5.0

Highlights

Featured in AWS re:invent 2018

Models

BERT
- The Bidirectional Encoder Representations from Transformers model as introduced by Devlin, Jacob, et al. "Bert: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018) (#409). Model parameters are converted from the original model checkpoints from Google research, including:
- BERT BASE model trained on
  - Book Corpus & English Wikipedia (cased)
  - Book Corpus & English Wikipedia (uncased)
  - multilingual Wikipedia (uncased)
- BERT LARGE model trained on Book Corpus & English Wikipedia (uncased)
ELMo
- The Embeddings from Language Models as introduced by Peters, Matthew E., et al. "Deep contextualized word representations." arXiv preprint arXiv:1802.05365 (2018) (#227, #428). Model parameters are converted from the original model checkpoints in AllenNLP, including the small, medium, original models trained on 1 billion words dataset, and the original model trained on 5.5B tokens consisting of Wikipedia & monolingual news crawl data from WMT 2008-2012.
Word Embedding
- The GloVe model as introduced by Pennington, Jeffrey, Richard Socher, and Christopher Manning. "Glove: Global vectors for word representation." Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). 2014 (#359).
Natural Language Inference
- The Decomposable Attention Model as introduced by Parikh, Ankur P., et al. "A decomposable attention model for natural language inference." arXiv preprint arXiv:1606.01933 (2016). (#404). On the SNLI test set, it achieves 84.6% accuracy (without intra-sentence attention) and 84.4% accuracy (with intra-sentence attention). Thank you @linmx0130 @hhexiy!
Dependency Parsing
- The Deep Biaffine Attention Dependency Parser as introduced by Dozat, Timothy, and Christopher D. Manning. "Deep biaffine attention for neural dependency parsing." arXiv preprint arXiv:1611.01734 (2016). (#408). It achieved 96% UAS on the Penn Treebank dataset. Thank you @hankcs!
Text Classification
- The Text CNN model as introduced by Kim, Yoon. "Convolutional neural networks for sentence classification." arXiv preprint arXiv:1408.5882 (2014). (#391) Thank you @xiaotinghe!

New Tutorials

ELMo
- A tutorial on generating contextualized representation with the pre-trained ELMo model, as introduced by Peters, Matthew E., et al. "Deep contextualized word representations." arXiv preprint arXiv:1802.05365 (2018) (#227, #428).
BERT
- A tutorial on fine-tuning the BERT model for sentence pair classification, as introduced by Devlin, Jacob, et al. "Bert: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018) (#437)

New Datasets

Sentiment Analysis
- MR, a movie-review data set of 10,662 sentences labeled with respect to their overall sentiment polarity (positive or negative). (#391)
- SST_1, an extension of the MR data set with fine-grained labels (#391)
- SST_2, an extension of the MR data set with binary sentiment polarity labels (#391)
- SUBJ, a subjectivity data set for sentiment analysis (#391)
- TREC, a movie-review data set of 10,000 sentences labeled with respect to their subjectivity status (subjective or objective). (#391)

API Updates

Changed Vocab constructor from staticmethod to classmethod to handle inheritance (#386)
Added Transformer Encoder APIs (#409)
Added pre-trained ELMo model to model.get_model API (#227)
Added pre-trained BERT model to model.get_model API (#409)
Added unknown_lookup setter to TokenEmbedding (#429)
Added dtype support to EmbeddingCenterContextBatchify (#416)
Propagated exceptions from PrefetchingStream (#406)
Added sentencepiece tokenizer detokenizer (#380)
Added CSR format for variable length data in embedding training (#384)

Fixes & Small Changes

Included output of nlp.embedding.list_sources() in API docs (#421)
Supported symlinks in examples and scripts (#403)
Fixed weight tying in GNMT and Transformer (#413)
Simplified transformer notebook (#400)
Fixed LazyTransformDataStream prefetching (#397)
Adopted src/gluonnlp folder layout (#390)
Fixed text8 archive file name for downloads from S3 (#388) Thanks @bkktimber!
Fixed ppl reporting for training on multi gpu in the language model notebook (#365). Thanks @ThomasDelteil!
Fixed a spelling mistake in QA script. (#379) Thanks @qyhfbqz!

- Python
Published by eric-haibin-lin over 7 years ago

https://github.com/dmlc/gluon-nlp - v0.4.1

Highlights

Hands-on Tutorial at KDD 2018 (website) (code)

Models

Language Model
- The Large Scale Word Language Model as introduced by Jozefowicz, Rafal, et al. “Exploring the limits of language modeling”. arXiv preprint arXiv:1602.02410 (2016) achieved test PPL 43.62 on GBW dataset (#179 #270 #277 #278 #286 #294)
- The NT-ASGD based Language Model as introduced by Merity, S., et al. “Regularizing and optimizing LSTM language models”. ICLR 2018 achieved test PPL 65.62 on WikiText-2 dataset (#170)
Document Classification
- The Classification Model as introduced by Joulin, Armand, et al. “Bag of tricks for efficient text classification” achieved validation accuracy validation accuracy 98 on Yelp review dataset (#258 #297)
Question Answering
- The QANet as introduced by Jozefowicz, Rafal, et al. “ QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension”. ICLR 2018 achieved F1 score 79.5 on SQuAD 1.1 dataset (#339) (coming soon to master branch)

New Tutorials

Machine Translation
- The Google NMT as introduced by Wu, Yonghui, et al. “Google's neural machine translation system: Bridging the gap between human and machine translation”. arXiv preprint arXiv:1609.08144 (2016) is introduced as part of the gluonnlp tutorial (#261)
- The Transformer based Machine Translation by Vaswani, Ashish, et al. “Attention is all you need.” Advances in Neural Information Processing Systems. 2017 is introduced as part of the gluonnlp tutorial (#279)
Sentence Embedding
- A Structured Self-attentive Sentence Embedding (#366) by Z. Lin, M. Feng, C. Santos, M. Yu, B. Xiang, B. Zhou, Y. Bengio, "A Structured Self-attentive Sentence Embedding" ICLR 2017 is introduced in gluonnlp tutorial (#366)

New Datasets

Word Embedding
- Wikipedia (#218)
- Fil9 dataset(#363)
- FastText crawl-300d-2M-subword(#336), wiki-news-300d-1M-subword(#368), cc.en.300(#373)

API updates

Added dataloader that allows multi-shard sampling (#237 #280 #285)
Simplified DataStream, added DatasetStream, refactored and extended PrefetchingStream (#235)
Unified BPTT batchify for dataset and stream (#246)
Added symbolic beam search (#233)
Added SequenceSampler (#272)
Refactored Transform APIs (#282)
Reorganized index of the repo and model zoo page (#357)

Fixes & Small Changes

Fixed module name in batchify.py example (#239)
Improved imports structure (#248)
Added test for nmt scripts (#234)
Speeded up batchify.Pad (#249)
Fixed LanguageModelDataset.bptt_batchify (#243)
Fixed weight drop and add tests (#268)
Fixed relative links that pypi doesn't handle (#293)
Updated notebook build logic (#309)
Added community link (#313)
Enabled run tests in parallel (#317)
Enabled word embedding scripts tests (#321)

See all commits

- Python
Published by cgraywang over 7 years ago

https://github.com/dmlc/gluon-nlp - v0.3.3

GluonNLP v0.3 contains many exciting new features. (depends on MXNet 1.3.0b20180725)

Models

Language Models
- The Cache Language Model as introduced by Grave, E., et al. “Improving neural language models with a continuous cache”. ICLR 2017 is introduced as part of gluonnlp.model.train (#110)
- The Activation Regularizer and Temporal Activation Regularizer as introduced by Merity, S., et al. "Regularizing and optimizing LSTM language models". ICLR 2018 is introduced as part of gluonnlp.loss (#110)
Machine Translation
- The Transformer Model as introduced by Vaswani, Ashish, et al. "Attention is all you need." Advances in Neural Information Processing Systems. 2017* is introduced as part of the gluonnlp nmt scripts (#133)
Word embeddings
- Trainable word embedding models are introduced as part of gluonnlp.model.train (#136)
- Word2Vec by Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems (pp. 3111-3119).
- FastText models by Bojanowski, P., Grave, E., Joulin, A., & Mikolov, T. (2017). Enriching Word Vectors with Subword Information. Transactions of the Association for Computational Linguistics, 5, 135-146.

New Datasets

Machine Translation
- WMT2014BPE (#135) (#177) (#180)
Question Answering
- Stanford Question Answering Dataset (SQuAD) Rajpurkar, P., Zhang, J., Lopyrev, K., & Liang, P. (2016). SQuAD: 100,000+ Questions for Machine Comprehension of Text. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (pp. 2383-2392). (#113)
Word Embeddings
- Text8 (#136)

API changes

The download directory for datasets and other artifacts can now be specified via the MXNET_HOME environment variable. (#106)
TokenEmbedding class now exposes the Inverse Vocab as well (#123)
SortedSampler now supports useaveragelength option (#135)
Add more strategies for bucket creation (#145)
Add tokenizer to bleu (#154)
Add Convolutional Encoder and Highway Layer (#129) (#186)
Add plain text of translation data. (#158)
Use Sherlock Holmes dataset instead of PTB for language model notebook (#174)
Add classes JiebaToknizer and NLTKStanfordSegmenter for Chinese Word Segmentation (#164)
Allow toggling output and prompt in documentation website (#184)
Add shape assertion statements for better user experience to some attention cells (#201)
Add support for computation of word embeddings for unknown words in TokenEmbedding class (#185)
Distribute subword vectors for pretrained fastText embeddings enabling embeddings for unknown words (#185)

Fixes & Small Changes

fixed bptt_batchify sometimes returned an invalid last batch (#120)
Fixed wrong PPL calculation in word language model script for multi-GPU (#150)
Fix split compound words and wmt16 results (#151)
Adapt pretrained word embeddings example notebook for nd.topk change in mxnet 1.3 (#153)
Fix beam search script (#175)
Fix small bugs in parser (#183)
TokenEmbedding: Skip lines with invalid bytes instead of crashing (#188)
Fix overly large memory use in TokenEmbedding serialization/deserialization if some tokens are overly large (eg. 50k characters) (#187)
Remove duplicates in WordSim353 when combining segments (#192)

See all commits

- Python
Published by leezu over 7 years ago

https://github.com/dmlc/gluon-nlp -

Features

GluonNLP provides its users with easy access to

State of the art models
Pre-trained word embeddings
Many public datasets for different tasks
Examples friendly to users that are new to the task
Reproducible training scripts

Models

Gluon NLP Toolkit supplies model definitions for common NLP tasks. These can be adapted for the users requirements or taken as blueprint for new developments. All of these are implemented using Gluon Blocks allowing easy reuse as plug-and-play neural network building blocks.

Language Models
- Standard RNN language model
- AWD language model by salesforce
Attention Cells
Beam Search
- Beam Search Sampler
- Beam Search Scorer

Data

Gluon NLP Toolkit provides tools for building efficient data pipelines for NLP tasks by defining a Dataset class interface and utilities for transforming them. Several datasets are included by default and will be automatically downloaded when used.

Language modeling with WikiText
- WikiText is a popular language modeling dataset from Salesforce. It is a collection of over 100 million tokens extracted from the set of verified Good and Featured articles on Wikipedia.
Sentiment Analysis with IMDB
- IMDB: IMDB is a popular dataset for binary sentiment classification. It provides a set of 25,000 highly polar movie reviews for training, 25,000 for testing, and additional unlabeled data.
CoNLL datasets
- These datasets include data for the shared tasks, such as part-of-speech (POS) tagging, chunking, named entity recognition (NER), semantic role labeling (SRL), etc.
- We provide built in support for CoNLL 2000 – 2002, 2004, as well as the Universal Dependencies dataset which is used in the 2017 and 2018 competitions.
Word embedding evaluation datasets
- There are a number of commonly used datasets for intrinsic evaluation for word embeddings. We provide commonly used datasets for the similarity and analogy evaluation tasks.

Gluon NLP further ships with common datasets data transformation functions, dataset samplers to determine how to iterate through datasets as well as functions to generate data batches.

A complete and up-to-date list of supplied datasets and utilities is available in the API documentation.

Other features

Examples and scripts

The Gluon NLP toolkit also provides scripts that use the functionality of the toolkit for various tasks

Word Embedding Evaluation
Beam Search Generator
Word language modeling
Sentiment Analysis through Fine-tuning, w/ Bucketing
Machine Translation

- Python
Published by leezu almost 8 years ago

Recent Releases of https://github.com/dmlc/gluon-nlp

https://github.com/dmlc/gluon-nlp - v0.10.0 Maintenance Release

https://github.com/dmlc/gluon-nlp - v0.9.2: Bug Fix

https://github.com/dmlc/gluon-nlp - v0.9.1: Bug Fix

https://github.com/dmlc/gluon-nlp - v0.9.0: BERT Inference Time Cut by Half and 90% Scaling Efficiency for Distributed Training

News

Models and Scripts in v0.9

BERT

XLNet

DistilBERT

Transformer

Korean BERT

RoBERTa

GPT2

New Features

Bug Fixes and code cleanup

Documentation

Continuous Integration

https://github.com/dmlc/gluon-nlp - v0.8.3: Minor Bug Fixes

https://github.com/dmlc/gluon-nlp - v0.8.2: Bug Fixes

https://github.com/dmlc/gluon-nlp - v0.8.1

News

Models and Scripts

RoBERTa

Transformer-XL

Bug Fixes

Documentation

Continuous Integration

https://github.com/dmlc/gluon-nlp - v0.8.0

News

Models

RoBERTa

Transformer-XL

https://github.com/dmlc/gluon-nlp - v0.7.1

News

Models and Scripts

BERT

GPT-2

ESIM

Data

New Features

Bug Fixes

API Changes

Enhancements

Minor Fixes

Continuous Integration

https://github.com/dmlc/gluon-nlp - v0.7.0

News

Models and Scripts

BERT

GPT-2

ESIM

Data

New Features

Bug Fixes

API Changes

Enhancements

Minor Fixes

Continuous Integration

https://github.com/dmlc/gluon-nlp - v0.6.0

News

Models and Scripts

New Features

Bug Fixes

API Changes

Enhancements

Minor Fixes

https://github.com/dmlc/gluon-nlp - v0.5.0

Highlights

Models

New Tutorials

New Datasets

API Updates

Fixes & Small Changes

https://github.com/dmlc/gluon-nlp - v0.4.1

Highlights

Models

New Tutorials

New Datasets

API updates