https://github.com/awslabs/mlm-scoring
Python library & examples for Masked Language Model Scoring (ACL 2020)
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links (links to: arxiv.org)
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (8.4%) to scientific vocabulary
Keywords
Repository
Basic Info
- Host: GitHub
- Owner: awslabs
- License: apache-2.0
- Language: Python
- Default Branch: master
- Homepage: https://www.aclweb.org/anthology/2020.acl-main.240/
- Size: 22.9 MB
Statistics
- Stars: 342
- Watchers: 14
- Forks: 60
- Open Issues: 11
- Releases: 0
Topics
Metadata Files
README.md
Masked Language Model Scoring
This package uses masked LMs like BERT, RoBERTa, and XLM to score sentences and rescore n-best lists via pseudo-log-likelihood (PLL) scores, which are computed by masking individual words. We also support autoregressive LMs like GPT-2. Example uses include:
- Speech Recognition: rescoring an ESPnet LAS model (LibriSpeech)
- Machine Translation: rescoring a Transformer NMT model (IWSLT'15 en-vi)
- Linguistic Acceptability: unsupervised ranking within linguistic minimal pairs (BLiMP)
Paper: Julian Salazar, Davis Liang, Toan Q. Nguyen, Katrin Kirchhoff. "Masked Language Model Scoring", ACL 2020.
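For intuition, PLL masks one position at a time and sums the log-probabilities the model assigns to the true tokens. A minimal standalone sketch with 🤗 Transformers, illustrative only and not this package's (batched, multi-GPU) implementation:

```python
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
model = BertForMaskedLM.from_pretrained("bert-base-cased").eval()

def pll(sentence: str) -> float:
    """Pseudo-log-likelihood: mask each token in turn and sum the
    log-probabilities of the true tokens (illustrative, unoptimized)."""
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for i in range(1, len(ids) - 1):  # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return total

print(pll("Hello world!"))  # should land close to the scorer outputs below
```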

Installation
Python 3.6+ is required. Clone this repository and install:
```bash
pip install -e .
pip install torch mxnet-cu102mkl  # Replace w/ your CUDA version; mxnet-mkl if CPU only.
```
Some models are provided via GluonNLP and others via 🤗 Transformers, so for now both MXNet and PyTorch are required. You can then import the library directly:
```python
from mlm.scorers import MLMScorer, MLMScorerPT, LMScorer
from mlm.models import get_pretrained
import mxnet as mx

ctxs = [mx.cpu()]  # or, e.g., [mx.gpu(0), mx.gpu(1)]

# MXNet MLMs (use names from mlm.models.SUPPORTED_MLMS)
model, vocab, tokenizer = get_pretrained(ctxs, 'bert-base-en-cased')
scorer = MLMScorer(model, vocab, tokenizer, ctxs)
print(scorer.score_sentences(["Hello world!"]))
# >> [-12.410664200782776]
print(scorer.score_sentences(["Hello world!"], per_token=True))
# >> [[None, -6.126736640930176, -5.501412391662598, -0.7825151681900024, None]]

# EXPERIMENTAL: PyTorch MLMs (use names from https://huggingface.co/transformers/pretrained_models.html)
model, vocab, tokenizer = get_pretrained(ctxs, 'bert-base-cased')
scorer = MLMScorerPT(model, vocab, tokenizer, ctxs)
print(scorer.score_sentences(["Hello world!"]))
# >> [-12.411025047302246]
print(scorer.score_sentences(["Hello world!"], per_token=True))
# >> [[None, -6.126738548278809, -5.501765727996826, -0.782496988773346, None]]

# MXNet LMs (use names from mlm.models.SUPPORTED_LMS)
model, vocab, tokenizer = get_pretrained(ctxs, 'gpt2-117m-en-cased')
scorer = LMScorer(model, vocab, tokenizer, ctxs)
print(scorer.score_sentences(["Hello world!"]))
# >> [-15.995375633239746]
print(scorer.score_sentences(["Hello world!"], per_token=True))
# >> [[-8.293947219848633, -6.387561798095703, -1.3138668537139893]]
```

(MXNet and PyTorch interfaces will be unified soon!)
Scoring
Run `mlm score --help` to see supported models, etc. See `examples/demo/format.json` for the file format. For inputs, the "score" field is optional; outputs will add "score" fields containing PLL scores.
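For illustration only, such a file maps utterance IDs to hypotheses carrying text and optional scores. The key names below are hypothetical; the authoritative schema is `examples/demo/format.json` in the repository:

```python
import json

# HYPOTHETICAL sketch of an n-best input file for `mlm score`; consult
# examples/demo/format.json for the actual schema and key names.
nbest = {
    "utt-001": {
        "hyp-1": {"text": "hello world", "score": -3.2},  # "score" optional on input
        "hyp-2": {"text": "hello word"},                   # PLL score added on output
    },
}
print(json.dumps(nbest, indent=2))
```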
There are three score types, depending on the model:
- Pseudo-log-likelihood score (PLL): BERT, RoBERTa, multilingual BERT, XLM, ALBERT, DistilBERT
- Maskless PLL score: same models as above (add `--no-mask`)
- Log-probability score: GPT-2
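In the paper's notation, the PLL of a sentence $W$ sums conditional log-probabilities with each token masked in turn, while an autoregressive LM like GPT-2 conditions only on the left context:

```latex
\mathrm{PLL}(W) = \sum_{t=1}^{|W|} \log P_{\mathrm{MLM}}(w_t \mid W_{\setminus t}),
\qquad
\log P(W) = \sum_{t=1}^{|W|} \log P_{\mathrm{LM}}(w_t \mid W_{<t}).
```

The maskless variant computes the same sum in a single pass, using a model finetuned so that no actual masking is needed (see Maskless finetuning below).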
We score hypotheses for 3 utterances of LibriSpeech dev-other on GPU 0 using BERT base (uncased):
```bash
mlm score \
    --mode hyp \
    --model bert-base-en-uncased \
    --max-utts 3 \
    --gpus 0 \
    examples/asr-librispeech-espnet/data/dev-other.am.json \
    > examples/demo/dev-other-3.lm.json
```
Rescoring
One can rescore n-best lists via log-linear interpolation; a minimal sketch of the combination follows the example below. Run `mlm rescore --help` to see all options. The first input is a file with the original scores; the second is the score file produced by `mlm score`.
We rescore the acoustic scores (from `dev-other.am.json`) using BERT's scores (from the previous section), under different LM weights:
```bash
for weight in 0 0.5 ; do
    echo "lambda=${weight}"
    mlm rescore \
        --model bert-base-en-uncased \
        --weight ${weight} \
        examples/asr-librispeech-espnet/data/dev-other.am.json \
        examples/demo/dev-other-3.lm.json \
        > examples/demo/dev-other-3.lambda-${weight}.json
done
```
The original WER is 12.2% while the rescored WER is 8.5%.
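The combination itself is a one-liner. A minimal sketch, assuming the convention combined = (1 − λ)·acoustic + λ·LM (check the package source for the exact convention behind `--weight`; the hypothesis scores below are made up):

```python
# Log-linear interpolation of acoustic and (pseudo-)LM scores.
# ASSUMPTION: combined = (1 - weight) * am + weight * lm.
def rescore(am_score: float, lm_score: float, weight: float) -> float:
    return (1.0 - weight) * am_score + weight * lm_score

# Hypothetical per-hypothesis (acoustic, LM) scores; with weight=0 the
# acoustic ranking is unchanged, with weight=0.5 the LM can flip it.
hyps = {"hello world": (-4.1, -12.4), "hello word": (-3.9, -18.7)}
for weight in (0.0, 0.5):
    best = max(hyps, key=lambda h: rescore(*hyps[h], weight))
    print(f"lambda={weight}: {best}")
```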
Maskless finetuning
One can finetune masked LMs to give usable PLL scores without masking. See LibriSpeech maskless finetuning.
Development
Run `pip install -e .[dev]` to install extra testing packages. Then:
- To run unit tests and coverage, run `pytest --cov=src/mlm` in the root directory.
Owner
- Name: Amazon Web Services - Labs
- Login: awslabs
- Kind: organization
- Location: Seattle, WA
- Website: http://amazon.com/aws/
- Repositories: 914
- Profile: https://github.com/awslabs
GitHub Events
Total
- Watch event: 10
- Fork event: 2
Last Year
- Watch event: 10
- Fork event: 2
Issues and Pull Requests
Last synced: almost 2 years ago
All Time
- Total issues: 21
- Total pull requests: 3
- Average time to close issues: about 1 month
- Average time to close pull requests: 2 days
- Total issue authors: 18
- Total pull request authors: 3
- Average comments per issue: 1.48
- Average comments per pull request: 1.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: 18 days
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- gerardb7 (2)
- mfelice (2)
- Tanesan (2)
- pstroe (1)
- ajd12342 (1)
- BarahFazili (1)
- david-waterworth (1)
- trangtv57 (1)
- orenpapers (1)
- ksoky (1)
- yuchenlin (1)
- dsorato (1)
- VP007-py (1)
- aflah02 (1)
- sb1992 (1)
Pull Request Authors
- ju-resplande (1)
- zolastro (1)
- ruanchaves (1)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- gluonnlp *
- mosestokenizer *
- regex *
- sacrebleu *
- transformers *