Releases | Open Source Science

Skip init during model build (way faster building)
Enable quantization of LoRA layers
Enable 4bit quantization from bitsandbytes (NF4 / FP4)
Enable "some" bnb.optim Optimizers for benchmarking purpose
Refactor model state_dict loading to enable pseudo lazy loading with move on GPU as it loads
Enable Gradient checkpointing for FFN, MHA, LoRA modules
Make FFN bias optional (same as QKV): llama, mpt, redpajama, openllama converters changed accordingly. convertv2_v3 sets add_qkvbias=True, add_ffnbias=True. load_checkpoint: if w1bias is detected in the checkpoint, then add_ffnbias=True
Add Multi Query attention
Add Parallel Residual attention
Add Falcon 7B converter

- Python
Published by vince62s about 3 years ago

opennmt-py - OpenNMT-py v3.1.3

Step-by-step tutorial for Vicuna replication (thanks Lina)
MosaicML MPT7B converter and support (Alibi embeddings)
Open Llama converter / Redpajama converter
Switch GCLD3 to Fasttext thanks ArtanieTheOne
fix coverage attention in beam decoding
fix ct2 keys for "Llama / MPT7B based" OpenNMT-py models

- Python
Published by vince62s about 3 years ago

opennmt-py - OpenNMT-py v3.1.2

fixes: transforms (normalize, clean, inlinetags)
Llama support (rotary embeddings, RMSNorm, Silu activation)
8bit loading for specific layers (along with LoRa for other layers)
subword learner added to build_vocab
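
For readers unfamiliar with two of the Llama building blocks named in this release, the following is a minimal pure-Python sketch of RMSNorm and the SiLU activation. The function names and the `eps` default are illustrative assumptions; this is not OpenNMT-py's implementation, which works on tensors.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Scale x by the reciprocal of its root mean square, then by a learned weight."""
    mean_square = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_square + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


hidden = [3.0, -4.0]
normed = rms_norm(hidden, weight=[1.0, 1.0])  # unit weights: pure normalization
activated = [silu(v) for v in normed]
```

Unlike LayerNorm, RMSNorm does not subtract the mean, so only the scale of the vector changes, not its centering.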

- Python
Published by vince62s about 3 years ago

opennmt-py - OpenNMT-py v3.1.1

fix major bug in 3.1.0 introduced with LoRa (3.1.0 not available)

- Python
Published by vince62s about 3 years ago

opennmt-py - OpenNMT-py v3.1.0

updated docs with Sphinx 6.4
Restore source features to v3 (thanks @anderleich)
add inline tags transform (thanks @panosk)
add docify transform to allow doc-level training / inference
fix NLLB training (decoder_start_token)
New! LoRa adapters to finetune big models (e.g. NLLB 3.3B)
various bug fixes

- Python
Published by vince62s about 3 years ago

opennmt-py - OpenNMT-py v3.0.4

override_opts to override the checkpoint's options when training from a checkpoint
normalize transform based on (Sacre)Moses scripts
uppercase transform for adhoc data augmentation
suffix transform
Fuzzy match transform
WMT17 detailed example
NLLB-200 (from Meta/FB) models support (after conversion)
various bug fixes

- Python
Published by vince62s over 3 years ago

opennmt-py - OpenNMT-py v3.0.3

fix loss normalization when using accum or nb GPU > 1
use native CrossEntropyLoss with Label Smoothing. reported loss/ppl impacted by LS
fix long-time coverage loss bug thanks Sanghyuk-Choi
fix detok at scoring / fix tokenization Subword_nmt + Sentencepiece
various small bugs fixed
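
As a rough illustration of why label smoothing changes the reported loss and perplexity, here is a minimal sketch in plain Python. It assumes the smoothing convention of PyTorch's native `CrossEntropyLoss` (the true class keeps `1 - smoothing` and the smoothing mass is spread uniformly over all classes); the function name and the example probabilities are made up for illustration.

```python
import math


def smoothed_cross_entropy(log_probs, target, smoothing=0.1):
    """Cross-entropy of one prediction against a smoothed target distribution."""
    n = len(log_probs)
    loss = 0.0
    for i, lp in enumerate(log_probs):
        # Uniform share for every class, plus the remaining mass on the target.
        q = smoothing / n + ((1.0 - smoothing) if i == target else 0.0)
        loss -= q * lp
    return loss


# A confident, correct prediction over a 4-word vocabulary (hypothetical values).
log_probs = [math.log(p) for p in (0.97, 0.01, 0.01, 0.01)]
plain = smoothed_cross_entropy(log_probs, target=0, smoothing=0.0)
smoothed = smoothed_cross_entropy(log_probs, target=0, smoothing=0.1)
plain_ppl, smoothed_ppl = math.exp(plain), math.exp(smoothed)
```

With smoothing enabled, even a very confident correct prediction is charged for the probability it withholds from the other classes, so the reported loss and perplexity come out higher than the unsmoothed values for the same model.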

- Python
Published by vince62s over 3 years ago

opennmt-py - OpenNMT-py v3.0.2

3.0.2 (2022-12-07)

pyonmttok.Vocab is now picklable. dataloader switched to spawn. (MacOS/Windows compatible)
fix scoring with specific metrics (BLEU, TER)
fix tensorboard logging
fix dedup in batch iterator (only for TRAIN, was happening at inference also)
Change: tgt_prefix renamed to tgt_file_prefix
New: tgt_prefix / src_prefix used for "prefix" Transform (onmt/transforms/misc.py)
New: process transforms of buckets in batches (vs per example) / faster

- Python
Published by vince62s over 3 years ago

opennmt-py - OpenNMT-py v3.0.1

fix dynamic scoring
reinstate apex.amp level O1/O2 for benchmarking
New: LM distillation for NMT training
New: bucket_size ramp-up to avoid slow start
fix special tokens order
remove Library and add link to Yasmin's tutorial

- Python
Published by vince62s over 3 years ago

opennmt-py - OpenNMT-py v3.0.0

v3.0 !

Removed completely torchtext. Use Vocab object of pyonmttok instead
Dataloading changed accordingly with the use of pytorch Dataloader (num_workers)
queue_size / pool_factor no longer needed. bucket_size optimal value > 64K
options renamed: rnn_size => hidden_size (enc/dec_rnn_size => enc/dec_hid_size)
new tools/convertv2_v3.py to upgrade v2 models.pt
inference with length_penalty=avg is now the default
add_qkvbias (default false, but true for old model)
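
To make the effect of the `length_penalty=avg` default concrete, here is a small sketch. It assumes that "avg" means ranking hypotheses by their cumulated log-probability divided by their length; the function name and the toy hypotheses are illustrative, not taken from the library.

```python
def rerank_by_average(hypotheses):
    """Sort (tokens, sum_log_prob) pairs by log-probability per token, best first."""
    return sorted(hypotheses, key=lambda h: h[1] / len(h[0]), reverse=True)


hyps = [
    (["yes"], -1.2),                      # short: -1.2 per token
    (["yes", "it", "is", "fine"], -2.0),  # longer: -0.5 per token
]
best_by_sum = max(hyps, key=lambda h: h[1])[0]
best_by_avg = rerank_by_average(hyps)[0][0]
```

Ranking by the raw sum favors the one-token hypothesis because every extra token adds a negative term; averaging removes that bias toward short outputs.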

- Python
Published by vince62s over 3 years ago

opennmt-py - OpenNMT-py v2.3.0

New features

BLEU/TER (& custom) scoring during training and validation (#2198)
LM related tools (#2197)
Allow encoder/decoder freezing (#2176)
Dynamic data loading for inference (#2145)
Sentence-level scores at inference (#2196)
MBR and oracle reranking scoring tools (#2196)

Fixes and improvements

Updated beam exit condition (#2190)
Improve scores reporting (#2191)
Fix dropout scheduling (#2194)
Better catch CUDA ooms when training (#2195)
Fix source features support in inference and REST server (#2109)
Make REST server more flexible with dictionaries (#2104)
Fix target prefixing in LM decoding (#2099)

- Python
Published by francoishernandez almost 4 years ago

opennmt-py - OpenNMT-py v2.2.0

New features

Support source features (thanks @anderleich !)

Fixes and improvements

Adaptations to relax torch version
Customizable transform statistics (#2059)
Adapt release code for ctranslate2 2.0

- Python
Published by francoishernandez almost 5 years ago

opennmt-py - OpenNMT-py v2.1.2

Fixes and improvements

Fix update_vocab for LM (#2056)

- Python
Published by francoishernandez about 5 years ago

opennmt-py - OpenNMT-py v2.1.1

Fixes and improvements

Fix potential deadlock (b1a4615)
Add more CT2 conversion checks (e4ab06c)

- Python
Published by francoishernandez about 5 years ago

opennmt-py - OpenNMT-py v2.1.0

New features

Allow vocab update when training from a checkpoint (cec3cc8, 2f70dfc)

Fixes and improvements

Various transforms related bug fixes
Fix beam warning and buffers reuse
Handle invalid lines in vocab file gracefully

- Python
Published by francoishernandez about 5 years ago

opennmt-py - OpenNMT-py v2.0.1

Fixes and improvements

Support embedding layer for larger vocabularies with GGNN (e8065b7)
Reorganize some inference options (9fb5f30)

- Python
Published by francoishernandez over 5 years ago

opennmt-py - OpenNMT-py v2.0.0

First official release for the OpenNMT-py major update to 2.0!

New features

Language Model (GPT-2 style) training and inference
Nucleus (top-p) sampling decoding
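
For reference, nucleus (top-p) sampling can be sketched in a few lines of plain Python. This is a generic illustration of the technique under assumed names (`nucleus_filter`, `sample_top_p`), not the decoding code shipped in this release.

```python
import random


def nucleus_filter(probs, top_p):
    """Keep the smallest set of most probable tokens whose total mass reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Renormalize so the kept probabilities sum to one.
    return {i: probs[i] / mass for i in kept}


def sample_top_p(probs, top_p, rng=random):
    """Draw one token id from the renormalized nucleus."""
    nucleus = nucleus_filter(probs, top_p)
    ids = list(nucleus)
    return rng.choices(ids, weights=[nucleus[i] for i in ids], k=1)[0]
```

With `probs = [0.5, 0.3, 0.15, 0.05]` and `top_p = 0.75`, only the first two tokens survive the filter, so the low-probability tail can never be sampled.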

Fixes and improvements

Fix some BART default values

- Python
Published by francoishernandez over 5 years ago

opennmt-py - OpenNMT-py v2.0.0rc2

Fixes and improvements

Parallelize onmt_build_vocab (422d824)
Some fixes to the on-the-fly transforms
Some CTranslate2 related updates
Some fixes to the docs

This will be the first release to be automatically deployed via GitHub Actions.

- Python
Published by francoishernandez over 5 years ago

opennmt-py - OpenNMT-py v2.0.0rc1

This is the first release candidate for the OpenNMT-py major update to 2.0.0!

The major idea behind this release is the (almost) complete makeover of the data loading pipeline. A new 'dynamic' paradigm is introduced, allowing on-the-fly transforms to be applied to the data.

This has a few advantages, amongst which:

remove or drastically reduce the preprocessing required to train a model;
increase and simplify the possibilities of data augmentation and manipulation through on-the-fly transforms.

These transforms can be specific tokenization methods, filters, noising, or any custom transform users may want to implement. Custom transform implementation is quite straightforward thanks to the existing base class and example implementations.
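
The idea of chaining on-the-fly transforms can be sketched as follows. Everything here is a self-contained stand-in: the `Transform` base class, its `apply` signature, the two example transforms and the `src`/`tgt` dictionary layout are assumptions made for illustration, not the library's actual classes.

```python
class Transform:
    """Stand-in base class for this sketch (not the library's own class)."""

    def apply(self, example):
        raise NotImplementedError


class LengthFilter(Transform):
    """Filter: drop examples whose source or target exceeds max_len tokens."""

    def __init__(self, max_len):
        self.max_len = max_len

    def apply(self, example):
        too_long = max(len(example["src"]), len(example["tgt"])) > self.max_len
        return None if too_long else example


class Uppercase(Transform):
    """Augmentation: upper-case the tokens on both sides of the example."""

    def apply(self, example):
        return {side: [tok.upper() for tok in toks] for side, toks in example.items()}


def apply_pipeline(examples, transforms):
    """Lazily run each example through the transforms, skipping dropped ones."""
    for example in examples:
        for transform in transforms:
            example = transform.apply(example)
            if example is None:  # a filter rejected this example
                break
        else:
            yield example
```

Because the pipeline is a generator, examples are transformed as they are consumed, which is what removes the need for a separate preprocessing pass in this sketch.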

You can check out how to use this new data loading pipeline in the updated docs and examples.

All the readily available transforms are described here.

Performance

Given sufficient CPU resources according to GPU computing power, most of the transforms should not slow the training down. (Note: for now, one producer process per GPU is spawned -- meaning you would ideally need 2N CPU threads for N GPUs).

Breaking changes

A few features are dropped, at least for now:

audio, image and video inputs;
source word features.

Some very old checkpoints with previous fields and vocab structure are also incompatible with this new version.

For any user who still needs some of these features, the previous codebase will be retained as legacy in a separate branch. It will no longer receive extensive development from the core team, but PRs may still be accepted.

- Python
Published by francoishernandez over 5 years ago

opennmt-py - OpenNMT-py v1.2.0

Fixes and improvements

Support pytorch 1.6 (e813f4d, eaaae6a)
Support official torch 1.6 AMP for mixed precision training (2ac1ed0)
Flag to override batch_size_multiple in FP16 mode, useful in some memory constrained setups (23e5018)
Pass a dict and allow custom options in preprocess/postprocess functions of REST server (41f0c02, 8ec54d2)
Allow different tokenization for source and target in REST server (bb2d045, 4659170)
Various bug fixes

New features

Gated Graph Sequence Neural Networks encoder (11e8d0), thanks @SteveKommrusch
Decoding with a target prefix (95aeefb, 0e143ff, 91ab592), thanks @Zenglinxiao

- Python
Published by francoishernandez almost 6 years ago

opennmt-py - OpenNMT-py v1.1.1

Fixes and improvements

Fix backcompatibility when no 'corpus_id' field (c313c28)

- Python
Published by francoishernandez about 6 years ago

opennmt-py - OpenNMT-py v1.1.0

New features

Support CTranslate2 models in REST server (91d5d57)
Extend support for custom preprocessing/postprocessing function in REST server by using return dictionaries (d14613d, 9619ac3, 92a7ba5)
Experimental: BART-like source noising (5940dcf)

Fixes and improvements

Add options to CTranslate2 release (e442f3f)
Fix dataset shard order (458fc48)
Rotate only the server logs, not training (189583a)
Fix alignment error with empty prediction (91287eb)

- Python
Published by francoishernandez about 6 years ago

opennmt-py - OpenNMT-py v1.0.2

Fixes and improvements

Enable CTranslate2 conversion of Transformers with relative position (db11135)
Adapt -replace_unk to use with learned alignments if they exist (7625b53)

- Python
Published by francoishernandez over 6 years ago

opennmt-py - OpenNMT-py v1.0.1

Fixes and improvements

Ctranslate2 conversion handled in release script (1b50e0c)
Use attention_dropout properly in MHA (f5c9cd4)
Update apex FP16_Optimizer path (d3e2268)
Some REST server optimizations
Fix and add some docs

- Python
Published by francoishernandez over 6 years ago

opennmt-py - OpenNMT-py v1.0.0

New features

Implementation of "Jointly Learning to Align & Translate with Transformer" (@Zenglinxiao)

Fixes and improvements

Add nbest support to REST server (@Zenglinxiao)
Merge greedy and beam search codepaths (@Zenglinxiao)
Fix "block ngram repeats" (@KaijuML, @pltrdy)
Small fixes, some more docs

- Python
Published by francoishernandez over 6 years ago

opennmt-py - OpenNMT-py v1.0.0.rc1

We have now reached some good stability of the code base.

This is the 1.0.0 release candidate.

Fix Apex / FP16 training (Apex new API is buggy)
Multithread preprocessing way faster (Thanks François Hernandez)
Pip Installation v1.0.0.rc1 (thanks Paul Tardy)

Enjoy and feel free to report issues.

- Python
Published by vince62s over 6 years ago

opennmt-py - OpenNMT-py v0.9.2

Switch to Pytorch 1.2
Pre/post processing on the translation server (useful for Chinese) Thanks @Zenglinxiao
option to remove the FFN layer in AAN + AAN optimization (faster)
Coverage loss (per Abisee paper 2017) implementation Thanks @pltrdy
Video Captioning task: Thanks @flauted !
Token batch at inference
Small fixes and add-ons

- Python
Published by vince62s almost 7 years ago

opennmt-py - OpenNMT-py v0.9.1

New mechanism for MultiGPU training "1 batch producer / multi batch consumers" resulting in big memory saving when handling huge datasets thanks @pltrdy @francoishernandez
New APEX AMP (mixed precision) API thanks @francoishernandez NB: you need to reinstall Nvidia/Apex
Option to overwrite shards when preprocessing
Small fixes and add-ons

- Python
Published by vince62s about 7 years ago

opennmt-py - OpenNMT-py v0.9.0

Updated Travis to Pytorch 1.1

Faster vocab building when processing shards (no reloading) thanks @francoishernandez
New data weighting feature (thanks @francoishernandez); see the FAQ doc for more information
New dropout scheduler, with the same logic as accum_count / accum_steps (see opts.py)
fix Gold Scores
small fixes and add-ons.

Unrelated, but new website online ! thanks @guillaumekln

Enjoy !

- Python
Published by vince62s about 7 years ago

opennmt-py - OpenNMT-py v0.8.2

Update documentation and Library example (thanks @flauted @elisemicho )
Revamp args
Bug fixes, save moving average in FP32 (thanks @francoishernandez )
Allow FP32 inference for FP16 models

- Python
Published by vince62s over 7 years ago

opennmt-py - OpenNMT-py v0.8.1

Mostly bug fixes.

- Python
Published by vince62s over 7 years ago

opennmt-py - OpenNMT-py v0.8.0

Many fixes and code cleaning thanks @flauted, @guillaumekln

Datasets code refactor (thanks @flauted) you need to re-preprocess datasets

New features

FP16 support: experimental, using Apex. Checkpoints may break in a future version.
Continuous exponential moving average (thanks @francoishernandez, and Marian)
Relative positions encoding (thanks @francoishernandez, and Google T2T)
Deprecate the old beam search; fast batched beam search supports all options

- Python
Published by vince62s over 7 years ago

opennmt-py - OpenNMT-py v0.7.2

Multi level text fields for better handling of embeddings. thanks @flauted

code cleaning and bug fixing thanks @bpopeters @guillaumekln @pltrdy

NB: you cannot train on 0.7.2 with preprocessed data on a prior version, you need to re-preprocess.

- Python
Published by vince62s over 7 years ago

opennmt-py - OpenNMT-py v0.7.1

Many fixes and code refactoring thanks @bpopeters, @flauted, @guillaumekln

New features

Random sampling (thanks @daphnei)
Enable sharding for huge files at translation

- Python
Published by vince62s over 7 years ago

opennmt-py - OpenNMT-py v0.7.0

Many fixes and code refactoring thanks @benopeters
Migrated to Pytorch 1.0

- Python
Published by vince62s over 7 years ago

opennmt-py - OpenNMT-py v0.6.0

Mostly fixes and code improvements.

New: yml config files. See the config folder

- Python
Published by vince62s over 7 years ago

opennmt-py - OpenNMT-py v0.5.0

Ability to reset the optimizer when using -train_from

-reset_optim = ['none', 'all', 'states', 'keep_states']
none: default behavior, as before
all: reset the optimizer (warning: steps start at zero again)
states: reset only the states, keep all other parameters from the checkpoint
keep_states: keep the current states from the checkpoint, but allow parameters to be changed (learning_rate for instance)

Bug fixes. Tested with Pytorch 1.0RC works fine.

- Python
Published by vince62s over 7 years ago

opennmt-py - OpenNMT-py v0.4.1

fix preprocess filenames introduced by new sharding.

- Python
Published by vince62s over 7 years ago

opennmt-py - OpenNMT-py v0.4

Fixed Speech2Text training (thanks Yuntian)

Removed -max_shard_size, replaced by -shard_size = number of examples in a shard.

Default value = 1M which works fine in most Text dataset cases. (will avoid Ram OOM in most cases)
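
Sharding by a fixed number of examples can be pictured with the short sketch below. It is a generic illustration under an assumed helper name (`iter_shards`), not the preprocessing code itself.

```python
from itertools import islice


def iter_shards(examples, shard_size):
    """Yield lists of at most shard_size examples, reading the input lazily."""
    it = iter(examples)
    while True:
        shard = list(islice(it, shard_size))
        if not shard:
            return
        yield shard
```

Only one shard is materialized at a time, so peak memory depends on the shard size rather than on the size of the whole dataset, which is how a bounded shard helps avoid running out of RAM.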

- Python
Published by vince62s over 7 years ago

opennmt-py - OpenNMT-py v0.3

Now requires Pytorch 0.4.1

Multi-node Multi-GPU with Torch Distributed

New options are:
-master_ip: IP address of the master node
-master_port: port number of the master node
-world_size: total number of processes to be run (total GPUs across all nodes)
-gpu_ranks: list of indices of processes across all nodes
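
As a worked example of how the total process count and the per-node rank lists relate, the sketch below assumes every node has the same number of GPUs and that ranks are assigned contiguously, node by node. The helper name is hypothetical.

```python
def gpu_ranks_per_node(num_nodes, gpus_per_node):
    """Return (world_size, ranks), where ranks[n] lists the global ranks on node n."""
    world_size = num_nodes * gpus_per_node
    ranks = [
        list(range(node * gpus_per_node, (node + 1) * gpus_per_node))
        for node in range(num_nodes)
    ]
    return world_size, ranks


world_size, ranks = gpu_ranks_per_node(num_nodes=2, gpus_per_node=4)
```

For two nodes with four GPUs each, this gives a world size of 8, with ranks 0 to 3 on the first node and 4 to 7 on the second.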

-gpuid is deprecated

See examples in https://github.com/OpenNMT/OpenNMT-py/blob/master/docs/source/FAQ.md

Fixes to img2text now working

New sharding based on number of examples

Fixes to avoid 0.4.1 deprecated functions.

- Python
Published by vince62s over 7 years ago

opennmt-py - OpenNMT-py v0.2.1

Fixes and improvements

First compatibility steps with Pytorch 0.4.1 (non breaking)
Fix TranslationServer (when various request try to load the same model at the same time)
Fix StopIteration error (python 3.7)

New features

Ensemble at inference (thanks @Waino) see FAQ

- Python
Published by vince62s almost 8 years ago

opennmt-py - Last Pytorch 0.4.0 version

New in this release:

Multi-GPU based on torch distributed (acknowledgement to Fairseq)
Change from Epoch to Step (see opts.py)
Average Attention Network (AAN) for the Transformer (thanks @francoishernandez)
New fast beam search (see -fast in translate.py) (thanks @guillaumekln)
Sparse attention / sparsemax (thanks to @bpopeters)

and many fixes.

This is the last version with PyTorch 0.4.0. The next PyTorch version, 0.4.1, includes breaking changes.

- Python
Published by vince62s almost 8 years ago

opennmt-py - Pytorch 0.3 Last Release

- Python
Published by srush about 8 years ago
