Recent Releases of thinc
thinc - v8.3.6: Support Python 3.13
This release adds support for Python 3.13. To do this, we now require Pydantic >= 2.0 and have updated compilation to use Cython 3.0. This required an update to the blis package that's not binary compatible, but thinc itself should not have any binary backwards compatibility issues.
- Python
Published by github-actions[bot] about 1 year ago
thinc - v8.3.4: Update Blis pin to revert to known-good v0.7
Previous releases have used releases of our blis package that vendor newer releases of the upstream blis library. Unfortunately these newer releases have had intermittent crashes on Windows that we haven't been able to track down.
I've therefore released a v1.2 of the blis package that goes back to the known-good v0.7 release of the vendored blis code, which we were using before. This release updates the version pin to use it.
It took a surprisingly long time to get v0.7 of blis to compile, due to conflicts on Windows. I regret the delay.
- Python
Published by github-actions[bot] over 1 year ago
thinc - v8.3.3: Fix Blis crashes, widen numpy pin
- Update blis pin to v1.1. This updates the vendored blis code to 1.1, which should fix crashes from the previously vendored v0.9 code on Windows.
- Widen numpy pin, allowing versions across v1 and v2. Previously I had thought that if I build against numpy v2, I couldn't also have v1 as a runtime dependency. This is actually incorrect, so we can widen the numpy pin.
- Set a flag when loading PyTorch models to make loading them safer.
- Python
Published by github-actions[bot] over 1 year ago
thinc - v8.3.2: Fix regression to torch training, update ARM dependency
- Fix regression to torch training introduced in v8.3.1
- Restore MacOS ARM wheels, which were missing from previous builds
- Fix compatibility with thinc-apple-ops
- Python
Published by github-actions[bot] over 1 year ago
thinc - v8.3.1: Fix torch deprecation warning
torch.cuda.amp is deprecated (PyTorch 2.4). This PR updates shims pytorch.py to use torch.amp.autocast instead of torch.cuda.amp.autocast.
Thanks to @Atlogit for the patch.
- Python
Published by github-actions[bot] over 1 year ago
thinc - v9.1.1: Restore wheels for MacOS ARM 64
Previously we used a complicated build process that used self-hosted runners to build wheels for platforms GitHub Actions did not support. GitHub Actions has been adding support for ARM recently, so we've simplified the CI process to rely on it exclusively.
This release adds back support for MacOS ARM64 wheels that were missing from the previous release. Linux ARM wheels are still pending, as Linux ARM architectures are currently only supported for private repos. Cross-compilation with QEMU is possible in theory, but in practice the build timed out after several hours.
- Python
Published by github-actions[bot] over 1 year ago
thinc - v9.1.0: Depend on numpy 2.0.0
Numpy is a build dependency of Thinc, and numpy 2.0 is not binary compatible with numpy 1.0 (fair enough). This means we can't have a version that's compatible across numpy v1 and numpy v2.
This release updates v9 by pinning to numpy 2.0, and builds against it. No other changes are made, so that we have paired versions that only differ in their dependencies.
- Python
Published by github-actions[bot] over 1 year ago
thinc - v8.3.0: Depend on numpy 2.0
Numpy is a build dependency of Thinc, and numpy 2.0 is not binary compatible with numpy 1.0 (fair enough). This means we can't have a version that's compatible across numpy v1 and numpy v2.
This release updates the pins to numpy 2.0 and builds against it. No other changes are made, so that we have paired versions that only differ in their dependencies.
- Python
Published by github-actions[bot] almost 2 years ago
thinc - v8.2.5: Restrict numpy pin to <2.0.0
Numpy v2.0 isn't binary compatible with v1 (understandably). We build against numpy so we need to restrict the pin.
- Python
Published by honnibal almost 2 years ago
thinc - v8.2.4: Relaxing `nbconvert` and `typing_extensions` upper pins
✨ New features and improvements
- Bump `nbconvert` pin
- Bump `typing_extensions` pin for Python 3.7
- Updates to the test suite
👥 Contributors
@honnibal, @ines, @svlandeg
- Python
Published by svlandeg almost 2 years ago
thinc - v9.0.0: better learning rate schedules, integration of thinc-apple-ops
The main new feature of Thinc v9 is the support for learning rate schedules that can take the training dynamics into account. For example, the new `plateau.v1` schedule scales the learning rate when no progress has been found after a given number of evaluation steps. Another visible change is that `AppleOps` is now part of Thinc, so it is not necessary anymore to install `thinc-apple-ops` to use the AMX units on Apple Silicon.
✨ New features and improvements
- Learning rate schedules can now take the training step as well as an arbitrary set of keyword arguments. This makes it possible to pass information such as the parameter name and last evaluation score to determine the learning rate (#804).
- Added the `plateau.v1` schedule (#842). This schedule scales the learning rate if training was found to be stagnant for a given period.
- The functionality of `thinc-apple-ops` is integrated into Thinc (#927). Starting with this version of Thinc, it is not necessary anymore to install `thinc-apple-ops`.
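The plateau behaviour described above can be illustrated in plain Python. This is a minimal sketch of the idea only, not Thinc's implementation: the class `PlateauSketch`, its `record_evaluation` method and the parameter names `base_rate`, `max_patience` and `scale` are hypothetical and chosen for illustration.

```python
from typing import Optional


class PlateauSketch:
    """Hypothetical schedule: shrink the rate when evaluation scores stagnate."""

    def __init__(self, base_rate: float, max_patience: int, scale: float) -> None:
        self.base_rate = base_rate
        self.max_patience = max_patience  # stagnant evaluations tolerated
        self.scale = scale                # multiplier applied per plateau
        self.best_score: Optional[float] = None
        self.stagnant_evals = 0
        self.num_plateaus = 0

    def record_evaluation(self, score: float) -> None:
        # An improvement resets the stagnation counter.
        if self.best_score is None or score > self.best_score:
            self.best_score = score
            self.stagnant_evals = 0
            return
        self.stagnant_evals += 1
        # Too many evaluations without progress: register a plateau.
        if self.stagnant_evals >= self.max_patience:
            self.num_plateaus += 1
            self.stagnant_evals = 0

    def __call__(self, step: int) -> float:
        # The step is accepted for interface parity but unused in this sketch.
        return self.base_rate * self.scale ** self.num_plateaus


schedule = PlateauSketch(base_rate=0.001, max_patience=2, scale=0.5)
for score in [0.70, 0.72, 0.72, 0.71]:
    schedule.record_evaluation(score)
current_rate = schedule(step=400)
```

After two evaluations without improvement, the sketch halves the rate once; further plateaus would compound the scaling.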
🔴 Bug fixes
- Fix the use of thread-local storage (#917).
⚠️ Backwards incompatibilities
- Thinc v9.0.0 only supports Python 3.9 and later.
- Schedules are not generators anymore, but implementations of the `Schedule` class (#804).
- `thinc.backends.linalg` has been removed (#742). The same functionality is provided by implementations in BLAS that are better tested and more performant.
- `thinc.extra.search` has been removed (#743). The beam search functionality in this module was strongly coupled to the spaCy transition parser and has therefore moved to spaCy in v4.
👥 Contributors
@adrianeboyd, @danieldk, @honnibal, @ines, @kadarakos, @shadeMe, @svlandeg
- Python
Published by danieldk about 2 years ago
thinc - v8.2.3: Fix CuPy compatibility and fix strings2arrays for sequences of inequal length
🔴 Bug fixes
- Make strings2arrays work again for sequences of inequal length (#918).
- Fix `cupy.cublas` import (#921).
👥 Contributors
@danieldk, @honnibal, @ines, @svlandeg
- Python
Published by danieldk over 2 years ago
thinc - v8.2.2: Parametric attention with key transformation
✨ New features and improvements
Add the ParametricAttention_v2 layer, which adds support for key transformations (#913).
👥 Contributors
@danieldk, @honnibal, @ines, @svlandeg
- Python
Published by danieldk over 2 years ago
thinc - v8.2.1: Support Python 3.12
✨ New features and improvements
Updates and binary wheels for Python 3.12.
👥 Contributors
@adrianeboyd, @honnibal, @ines, @svlandeg
- Python
Published by adrianeboyd over 2 years ago
thinc - v8.2.0: Disable automatic MXNet and TensorFlow imports
✨ New features and improvements
To improve loading times and reduce conflicts, MXNet and TensorFlow are no longer imported automatically (#890).
⚠️ Backwards incompatibilities
MXNet and TensorFlow support needs to be enabled explicitly. Previously, MXNet and TensorFlow were imported automatically if they were available in the current environment.
To enable MXNet:
```python
from thinc.api import enable_mxnet

enable_mxnet()
```
To enable TensorFlow:
```python
from thinc.api import enable_tensorflow

enable_tensorflow()
```
With spaCy CLI commands you can provide this custom code using `-c code.py`. For training use `spacy train -c code.py` and to package your code with your pipeline use `spacy package -c code.py`.
Future deprecation warning: built-in MXNet and TensorFlow support will be removed in Thinc v9. If you need MXNet or TensorFlow support in the future, you can transition to using a custom copy of the current MXNetWrapper or TensorFlowWrapper in your package or project.
👥 Contributors
@adrianeboyd, @danieldk, @honnibal, @ines, @svlandeg
- Python
Published by adrianeboyd almost 3 years ago
thinc - v8.1.12: Support zero-length batches and hidden sizes in reductions
🔴 Bug fixes
- Support zero-length batches and hidden sizes in `reduce_{max,mean,sum}` (#882).
- Preserve values with dtype for `NumpyOps/CupyOps.asarray` (#897).
👥 Contributors
@adrianeboyd, @danieldk, @honnibal, @ines, @svlandeg
- Python
Published by adrianeboyd almost 3 years ago
thinc - v8.1.11: Support Pydantic v2, update package setup
✨ New features and improvements
- Update NumPy build constraints for NumPy v1.25 (#885).
- Switch from `distutils` to `setuptools`/`sysconfig` (#888).
- Allow Pydantic v2 using transitional v1 support (#891).
📖 Documentation and examples
- Fix typo in example code (#879).
👥 Contributors
@adrianeboyd, @Ankush-Chander, @danieldk, @honnibal, @ines, @svlandeg
- Python
Published by adrianeboyd almost 3 years ago
thinc - v8.1.10: Lazy loading for CuPy kernels and additional CuPy and MPS improvements
✨ New features and improvements
- Implement `pad` as a CUDA kernel (#860).
- Avoid h2d - d2h roundtrip when using `unflatten` (#861).
- Improve exception when CuPy/PyTorch MPS is not installed (#863).
- Lazily load custom `cupy` kernels (#870).
🔴 Bug fixes
- Initially load TorchScript models on CPU for MPS devices (#864).
👥 Contributors
@adrianeboyd, @danieldk, @honnibal, @ines, @shadeMe, @svlandeg
- Python
Published by adrianeboyd about 3 years ago
thinc - v8.1.9: Type fixes
🔴 Bug fixes
- Fix type signature of `Model.begin_update` (#858).
👥 Contributors
@danieldk, @honnibal, @ines
- Python
Published by danieldk about 3 years ago
thinc - v8.1.8: New faster mapping layer and bug fixes for resizeable layer
✨ New features and improvements
- Add `premap_ids.v1` layer for mapping from ints to ints (#815).
- Update to mypy 1.0.x (#848).
🔴 Bug fixes
- Make resizable layer work with textcat and transformers (#820).
📖 Documentation
- Update website including `Dockerfile` (#843, #844, #845).
👥 Contributors
@adrianeboyd, @danieldk, @essenmitsosse, @honnibal, @ines, @kadarakos, @patjouk, @polm, @svlandeg
- Python
Published by adrianeboyd about 3 years ago
thinc - v8.1.7: Updated layers and extended requirements
✨ New features and improvements
- Add `with_flatten.v2` layer with symmetric input/output types (#821).
- Extend to `typing_extensions` v4.4.x for Python 3.6 and 3.7 (#833).
📖 Documentation
- Update Gatsby for thinc.ai (#827).
👥 Contributors
@adrianeboyd, @albertvillanova, @danieldk, @essenmitsosse, @honnibal, @ines, @shadchin, @shadeMe, @svlandeg
- Python
Published by adrianeboyd over 3 years ago
thinc - v8.1.6: New and updated layers, bug fixes and more
✨ New features and improvements
- Update to mypy 0.990 (#801).
- Extend to wasabi v1.1 (#813).
- Add `SparseLinear.v2`, to fix indexing issues (#754).
- Add `TorchScriptWrapper_v1` (#802).
- Add callbacks to facilitate lazy-loading models in `PyTorchShim` (#796).
- Make all layer defaults serializable (#808).
🔴 Bug fixes
- Add missing `packaging` requirement (#799).
- Correct sequence length error messages for `reduce_first/last` (#807).
- Update `CupyOps.asarray` to always copy cupy arrays to the current device (#812).
- Fix types for sequences passed to `Ops.asarray*` (#819).
👥 Contributors
@adrianeboyd, @danieldk, @frobnitzem, @honnibal, @ines, @richardpaulhudson, @ryndaniels, @shadeMe, @svlandeg
- Python
Published by adrianeboyd over 3 years ago
thinc - v8.1.5: Updates for Python 3.11
✨ New features and improvements
- Updates and binary wheels for Python 3.11 (#793).
- Make `__all__` static to support type checking (#780).
👥 Contributors
@adrianeboyd, @honnibal, @ines, @rmitsch
- Python
Published by adrianeboyd over 3 years ago
thinc - v7.4.6: Updates for Python 3.10 and 3.11
✨ New features and improvements
- Updates for Python 3.10 and 3.11 (#791):
  - Update vendored `wrapt` to v1.14.1.
  - Update dev requirements.
  - Add wheels for Python 3.10 and 3.11.
👥 Contributors
@adrianeboyd, @honnibal, @ines
- Python
Published by adrianeboyd over 3 years ago
thinc - v8.1.4: Type fixes
🔴 Bug fixes
- Fix issue #785: Revert change to return type for `Ops.alloc` from #779.
👥 Contributors
@adrianeboyd, @honnibal, @ines, @svlandeg
- Python
Published by adrianeboyd over 3 years ago
thinc - v8.1.3: Updates for pydantic and mypy
✨ New features and improvements
- Extend pydantic support to v1.10.x (#778).
- Support mypy 0.98x, drop mypy support for Python 3.6 (#776).
🔴 Bug fixes
- Fix issue #775: Fix `fix_random_seed` entry point in `setup.cfg`.
👥 Contributors
@adrianeboyd, @honnibal, @ines, @pawamoy, @svlandeg
- Python
Published by adrianeboyd over 3 years ago
thinc - v8.1.2: Update blis support and CuPy extras
✨ New features and improvements
- Update CuPy extras to add `cuda116`, `cuda117`, `cuda11x` and `cuda-autodetect`, which uses the new `cupy-wheel` package (#740).
- Add a pytest-randomly entry point for `fix_random_seed` (#748).
🔴 Bug fixes
- Fix issue #772: Restrict supported `blis` versions to `~=0.7.8` to avoid bugs in BLIS 0.9.0.
👥 Contributors
@adrianeboyd, @honnibal, @ines, @rmitsch, @svlandeg, @willfrey
- Python
Published by adrianeboyd over 3 years ago
thinc - v8.1.1: Use confection, new layers and bugfixes
✨ New features and improvements
- Use confection for configurations (#745).
- Add the Dish activation function and layer (#719).
- Add the `with_signpost_interval` layer to support layer profiling with macOS Instruments (#711).
- Add `remap_ids.v2` layer which allows more types of inputs (#726).
- Extend BLIS support to version 0.9.x (#736).
- Improve performance when gradient scaling is used (#746).
- Improve MaxOut performance by unrolling `argmax` in `maxout` (#702).
🔴 Bug fixes
- Fix issue #720: Improve type inference by replacing `FloatsType` in `Ops` by a `TypeVar`.
- Fix issue #739: Fix typing of `Ops.asarrayDf` methods.
- Fix issue #757: Improve compatibility with supported Tensorflow versions.
👥 Contributors
@adrianeboyd, @cclauss, @danieldk, @honnibal, @ines, @kadarakos, @polm, @rmitsch, @shadeMe
- Python
Published by danieldk over 3 years ago
thinc - v8.1.0: Updated types and many Ops improvements
✨ New features and improvements
- Added support for mypy 0.950 and pydantic v1.9.0, added bound types throughout layers and ops (#599).
- Made all `NumpyOps` CPU kernels generic (#627).
- Made all custom CUDA kernels generic (#603).
- Added bounds checks for `NumpyOps` (#618).
- Fixed out-of-bounds writes in `NumpyOps` and `CupyOps` (#664).
- Reduced unnecessary zero-init allocations (#632).
- Fixed reductions when applied to zero-length sequences (#637).
- Added `NumpyOps.cblas` to get a table of C BLAS functions (#643, #700).
- Improved type-casting in `NumpyOps.asarray` (#656).
- Simplified `CupyOps.asarray` (#661).
- Fixed `Model.copy()` for layers used more than once (#659).
- Fixed potential race in `Shim` (#677).
- Convert numpy arrays using dlpack in `xp2tensorflow` and `xp2torch` when possible (#686).
- Improved speed of `HashEmbed` by avoiding large temporary arrays (#696).
- Added `Ops.reduce_last` and `Ops.reduce_first` (#710).
- Numerous test suite improvements.
- Experimental: Add support for Metal Performance Shaders with PyTorch nightlies (#685).
🔴 Bug fixes
- Fix issue #707: Fix label smoothing threshold for `to_categorical`.
⚠️ Backwards incompatibilities
- In most cases the typing updates allow many casts and ignores to be removed, but types may also need minor modifications following the updates for mypy and pydantic.
- `get_array_module` now returns `None` for non-numpy/cupy array input rather than returning `numpy` by default.
- The `prefer_gpu` and `require_gpu` functions no longer set the default PyTorch `torch.Tensor` type to `torch.cuda.FloatTensor`. This means that wrapped PyTorch models cannot assume that Tensors are allocated on a CUDA GPU after calling these functions. For example:

```python
# Before Thinc v8.1.0, this Tensor would be allocated on the GPU after
# {prefer,require}_gpu. Now it will be allocated as a CPU tensor by default.
token_mask = torch.arange(max_seq_len)

# To ensure correct allocation, specify the device where the Tensor should be allocated.
# `input` refers to the input of the model.
token_mask = torch.arange(max_seq_len, device=input.device)
```
This change brings Thinc's behavior in line with how device memory allocation is normally handled in PyTorch.
👥 Contributors
@adrianeboyd, @danieldk, @honnibal, @ines, @kadarakos, @koaning, @richardpaulhudson, @shadeMe, @svlandeg
- Python
Published by adrianeboyd almost 4 years ago
thinc - v8.0.17: Extended requirements, test suite fixes
✨ New features and improvements
- Extend support for `typing_extensions` up to v4.1.x (for Python 3.7 and earlier).
- Various fixes in the test suite.
👥 Contributors
@adrianeboyd, @danieldk, @honnibal, @ines, @shadeMe
- Python
Published by adrianeboyd almost 4 years ago
thinc - v8.0.16: Bug fixes
✨ New features and improvements
- Make `Ops.asarray` implementations more robust.
🔴 Bug fixes
- Fix issue #624: Support CPU inference for models trained with gradient scaling.
- Fix issue #633: Fix invalid indexing in `Beam` when no states have valid transitions.
- Fix issue #639: Improve PyTorch `Tensor` handling in `CupyOps.asarray`.
- Fix issue #649: Clamp inputs in `Ops.sigmoid` to prevent overflow.
- Fix issue #651: Fix type safety issue with model ID assignment.
- Fix issue #653: Correctly handle Tensorflow GPU tensors in tests.
- Fix issue #660: Make `is_torch_array` work without PyTorch installed.
- Fix issue #664: Fix out-of-bounds writes in `CupyOps.adam` and `NumpyOps.adam`.
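To illustrate why clamping the input of a sigmoid prevents overflow (issue #649 above), here is a minimal plain-Python sketch. It is not Thinc's code: the function name `clamped_sigmoid` and the clamp limit of 20.0 are assumptions made for this example.

```python
import math


def clamped_sigmoid(x: float, limit: float = 20.0) -> float:
    """Logistic sigmoid with the input clamped to [-limit, limit].

    The limit value is an assumption for illustration only.
    """
    # Without clamping, math.exp(-x) raises OverflowError for very
    # negative x (for example x = -1000.0).
    x = max(-limit, min(limit, x))
    return 1.0 / (1.0 + math.exp(-x))


tiny = clamped_sigmoid(-1000.0)   # saturates close to 0 instead of overflowing
middle = clamped_sigmoid(0.0)
large = clamped_sigmoid(1000.0)   # saturates close to 1
```

The clamp trades a negligible loss of precision in the saturated tails for a function that is defined for every finite input.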
⚠️ Backwards incompatibilities
- The `init` implementations for layers no longer return `Model`.
📖 Documentation and examples
- Add notebook demonstrating Bloom embeddings.
- Fix LSTM benchmark example.
- Update installation instructions.
👥 Contributors
@adrianeboyd, @danieldk, @honnibal, @ines, @kadarakos, @koaning, @notplus, @richardpaulhudson, @shadeMe
- Python
Published by danieldk about 4 years ago
thinc - v8.0.15: Fix compatibility with older PyTorch versions
🔴 Bug fixes
- Fix issue #610: Improve compatibility with PyTorch versions before v1.9.0.
👥 Contributors
@adrianeboyd, @danieldk
- Python
Published by danieldk about 4 years ago
thinc - v8.0.14: New activation functions, bug fixes and more
✨ New features and improvements
- Add new activation functions: `ClippedLinear.v1`, `Gelu.v1`, `HardSigmoid.v1`, `HardSwish.v1`, `HardSwishMobilenet.v1`, `HardTanh.v1`, `ReluK.v1`, and `Swish.v1`.
- Automatically set the GPU allocator to PyTorch when PyTorch models are loaded through `PyTorchWrapper` on GPU to avoid memory contention between CuPy and PyTorch.
- Support big endian platforms through `thinc-bigendian-ops` and consistently serialize model data with little endian byte order.
- Add `Softmax.v2` with support for softmax with temperature and optional normalization.
- Add `CategoricalCrossentropy.v3` and `SequenceCategoricalCrossentropy.v3` with support for label smoothing.
- Speed up `CupyOps.maxout` by exploiting GPU parallelism better.
- Support sequence lengths in the `NumpyOps.seq2col` and `CupyOps.seq2col` implementations of `Ops.seq2col` to determine padding.
- Improve performance of `Ragged`.
- Support `Ragged` arrays in `expand_window.v1`.
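For readers unfamiliar with softmax with temperature, the following plain-Python sketch shows the general technique. It does not reproduce the `Softmax.v2` layer; the function name `softmax_with_temperature` and its signature are assumptions for illustration.

```python
import math
from typing import List


def softmax_with_temperature(scores: List[float], temperature: float = 1.0) -> List[float]:
    """Divide the scores by a temperature before normalizing them."""
    if temperature <= 0.0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in scores]
    # Subtract the maximum for numerical stability; the result is unchanged.
    highest = max(scaled)
    exps = [math.exp(value - highest) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


sharp = softmax_with_temperature([1.0, 2.0, 3.0], temperature=1.0)
flat = softmax_with_temperature([1.0, 2.0, 3.0], temperature=10.0)
```

Temperatures above 1 flatten the distribution, while temperatures below 1 concentrate probability mass on the highest score.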
🔴 Bug fixes
- Fix issue #552: Do not backpropagate `Inf`/`NaN` out of PyTorch layers when using mixed-precision training.
- Fix issue #578: Correctly cast the threshold argument of `CupyOps.mish` and correct an equation in `Ops.backprop_mish`.
- Fix issue #587: Correct invariant checks in `CategoricalCrossentropy.get_grad`.
- Fix issue #592: Update `murmurhash` requirement.
- Fix issue #594: Do not sort positional arguments in `Config`.
⚠️ Backwards incompatibilities
- The `out` keyword argument of `Ops.mish` and `Ops.backprop_mish` is replaced by `inplace` for consistency with other activations.
📖 Documentation and examples
- Update example Jupyter notebooks for the current Thinc version.
👥 Contributors
@adrianeboyd, @andrewsi-z, @danieldk, @honnibal, @ines, @Jette16, @kadarakos, @kianmeng, @polm, @svlandeg, @thatbudakguy
- Python
Published by danieldk about 4 years ago
thinc - v8.0.12: Bug fixes for set_ops and use_ops
🔴 Bug fixes
- Fix issue #553: Switch torch tensor type with `set_ops` and `use_ops`.
- Fix issue #554: Always restore original ops after `use_ops`.
👥 Contributors
@adrianeboyd, @danieldk, @ryndaniels, @svlandeg
- Python
Published by adrianeboyd over 4 years ago
thinc - v8.0.11: Improved GPU training time
✨ New features and improvements
- Reduce GPU training time by up to ~25% by using cuBLAS to compute Frobenius norms in gradient clipping.
- Give preference to `AppleOps` (if available) when calling `get_ops("cpu")`.
- Support missing values in `CategoricalCrossEntropy` when the labels are integers.
- Provide the option to run `model.walk` with depth-first traversal.
- Wrap `forward`/`init` callbacks of a `Model` in `with_debug` and `with_nvtx_range` to facilitate recursively instrumenting models.
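The role of the Frobenius norm in gradient clipping can be sketched in plain Python. This shows only the arithmetic, not the cuBLAS-based implementation mentioned above; the function name `clip_gradient` and the `max_norm` parameter are assumptions for this example.

```python
import math
from typing import List

Matrix = List[List[float]]


def clip_gradient(gradient: Matrix, max_norm: float) -> Matrix:
    """Rescale a gradient matrix so its Frobenius norm is at most max_norm."""
    # Frobenius norm: square root of the sum of all squared entries.
    norm = math.sqrt(sum(value * value for row in gradient for value in row))
    if norm <= max_norm:
        return gradient
    factor = max_norm / norm
    return [[value * factor for value in row] for row in gradient]


clipped = clip_gradient([[3.0, 4.0]], max_norm=1.0)    # norm 5.0, rescaled
untouched = clip_gradient([[0.3, 0.4]], max_norm=1.0)  # norm 0.5, kept as is
```

Computing the norm is a reduction over every gradient entry, which is why delegating it to an optimized BLAS routine on the GPU can matter for training time.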
🔴 Bug fixes
- Fix issue #537: Fix `replace_node` on nodes with indirect node refs.
👥 Contributors
@adrianeboyd, @danieldk, @honnibal, @ines, @svlandeg
- Python
Published by svlandeg over 4 years ago
thinc - v8.0.10: Bug fix for get_array_ops
🔴 Bug fixes
- Fix issue #533: Fix `get_array_ops` for numpy arrays.
👥 Contributors
@adrianeboyd
- Python
Published by adrianeboyd over 4 years ago
thinc - v8.0.9: Support for NVTX ranges and mypy plugin fixes
✨ New features and improvements
- Add `ops` registry.
- Enable config overrides to add new keys.
- Allow newer releases of `nbconvert` and `nbformat`.
- Layer for marking NVTX ranges.
- Support mixed-precision training in the PyTorch shim (experimental).
🔴 Bug fixes
- Fix issue #521: Fix `numpy_ops` `gemm` output.
- Fix issue #525: Fix `mypy` plugin crash on variadic arguments.
👥 Contributors
@adrianeboyd, @connorbrinton, @danieldk, @honnibal, @ines, @svlandeg
- Python
Published by ines over 4 years ago
thinc - v8.0.8: CategoricalCrossentropy allows negated values
✨ New features and improvements
- Allow negated values in CategoricalCrossentropy
- Python
Published by svlandeg almost 5 years ago
thinc - v8.0.7: Bug fixes for n-grams and typing
🔴 Bug fixes
- Fix issue #512: Include final n-gram in `NumpyOps.ngrams`.
- Fix issue #516: Update initializers for typing in numpy 1.21+.
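The n-gram fix above concerns a classic off-by-one: a sequence of length `L` contains `L - n + 1` n-grams, and the last one is easy to drop. The sketch below shows the counting in plain Python. It is not `NumpyOps.ngrams` (this sketch returns tuples for readability, which is an assumption about nothing in Thinc); the function name `ngrams_sketch` is hypothetical.

```python
from typing import List, Sequence, Tuple


def ngrams_sketch(n: int, keys: Sequence[int]) -> List[Tuple[int, ...]]:
    """Return every contiguous n-gram of keys, including the final one."""
    # range(len(keys) - n + 1) covers the last valid start position;
    # using len(keys) - n instead would silently drop the final n-gram.
    # A non-positive count yields an empty range, so short inputs are safe.
    return [tuple(keys[i : i + n]) for i in range(len(keys) - n + 1)]


bigrams = ngrams_sketch(2, [1, 2, 3, 4])
too_short = ngrams_sketch(3, [1, 2])
```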
- Python
Published by adrianeboyd almost 5 years ago
thinc - v8.0.6: Bug fix for backprop_reduce_max GPU kernel
🔴 Bug fixes
- Fix `backprop_reduce_max` GPU kernel.
- Python
Published by adrianeboyd almost 5 years ago
thinc - v8.0.5: Updates for torch v1.9.0
✨ New features and improvements
- Update to support torch v1.9.0.
- Python
Published by adrianeboyd almost 5 years ago
thinc - v8.0.4: New tuplify and resizable layers, and some bug fixes
✨ New features and improvements
- Add `tuplify` layer.
- More generic implementation of the `concatenate` layer.
- Add `resizable` layer.
- Introduce `force` parameter for `model.set_dim()`.
- Improve UX when setting the GPU allocator.
🔴 Bug fixes
- Fix issue #492: Fix backpropagation in `with_getitem`.
- Fix issue #494: Resolve forward refs issue with Pydantic.
- Fix issue #496: Avoid Pydantic versions with security vulnerabilities.
👥 Contributors
@adrianeboyd, @honnibal, @ines, @kludex, @polm, @svlandeg, @thomashacker
- Python
Published by svlandeg almost 5 years ago
thinc - v8.0.3: Bug fixes for config overrides and expand_window
🔴 Bug fixes
- Fix issue #486: Fix `expand_window` for empty docs on GPU
- Fix issue #487: Require `catalogue>=2.0.3` due to performance regressions related to `importlib-metadata`
- Fix issue #488: Fix config override & interpolate interaction
- Python
Published by adrianeboyd about 5 years ago
thinc - v8.0.2: New map_list layer, bug fixes for saving to Pathy paths and more
✨ New features and improvements
- Add `map_list` layer (#472)
🔴 Bug fixes
- Fix issue #465: Fix saving models to Pathy paths
- Fix issue #466: Avoid initializing with Y if X is set
- Fix issue #470: Reset torch tensor type in `require_cpu`
- Fix issue #484: Ensure consistency of nO dim for BiLSTM
- Python
Published by adrianeboyd about 5 years ago
thinc - v8.0.1: Bug fixes for list2padded and LayerNorm
🔴 Bug fixes
- Fix issue #464: Fix list2padded op
- Add `nO` to `LayerNorm`
- Python
Published by adrianeboyd about 5 years ago
thinc - v8.0.0: Full rewrite, compose models using any framework such as PyTorch or TensorFlow, built-in type checking, config system and more
🔮 This version of Thinc has been rewritten from the ground up and will be used to power the upcoming spaCy v3.0. The new Thinc v8.0 is a lightweight deep learning library that offers an elegant, type-checked, functional-programming API for composing models, with support for layers defined in other frameworks such as PyTorch, TensorFlow or MXNet. You can use Thinc as an interface layer, a standalone toolkit or a flexible way to develop new models. For more details, see the documentation.
✨ New features and improvements
- Use any framework: Switch between PyTorch, TensorFlow and MXNet models without changing your application, or even create mutant hybrids using zero-copy array interchange.
- Type checking: Develop faster and catch bugs sooner with sophisticated type checking. Trying to pass a 1-dimensional array into a model that expects two dimensions? That’s a type error. Your editor can pick it up as the code leaves your fingers.
- Config system: Configuration is a major pain for ML. Thinc lets you describe trees of objects with references to your own functions, so you can stop passing around blobs of settings. It's simple, clean, and it works for both research and production.
- Super lightweight: Small and easy to install with very few required dependencies, available on pip and conda for Linux, macOS and Windows. Simple source with a consistent API.
- Concise functional-programming approach to model definition using composition rather than inheritance.
- First-class support for variable-length sequences: multiple built-in sequence representations and your layers can use any object.
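The "composition rather than inheritance" approach can be sketched in a few lines of plain Python. This is an illustration of the idea, not Thinc's API: the helpers `scale`, `relu` and `chain` below are hypothetical stand-ins, and the sketch assumes a design in which every layer is a function that returns its output together with a callback for the backward pass.

```python
from typing import Callable, List, Tuple

Vector = List[float]
Backprop = Callable[[Vector], Vector]
Layer = Callable[[Vector], Tuple[Vector, Backprop]]


def scale(factor: float) -> Layer:
    """Layer that multiplies every input by a constant factor."""
    def forward(X: Vector) -> Tuple[Vector, Backprop]:
        def backprop(dY: Vector) -> Vector:
            return [factor * d for d in dY]
        return [factor * x for x in X], backprop
    return forward


def relu() -> Layer:
    """Layer that zeroes negative inputs and their gradients."""
    def forward(X: Vector) -> Tuple[Vector, Backprop]:
        mask = [x > 0 for x in X]
        def backprop(dY: Vector) -> Vector:
            return [d if keep else 0.0 for d, keep in zip(dY, mask)]
        return [x if keep else 0.0 for x, keep in zip(X, mask)], backprop
    return forward


def chain(*layers: Layer) -> Layer:
    """Compose layers; the combined backprop runs callbacks in reverse."""
    def forward(X: Vector) -> Tuple[Vector, Backprop]:
        callbacks: List[Backprop] = []
        for layer in layers:
            X, callback = layer(X)
            callbacks.append(callback)
        def backprop(dY: Vector) -> Vector:
            for callback in reversed(callbacks):
                dY = callback(dY)
            return dY
        return X, backprop
    return forward


model = chain(scale(2.0), relu())
Y, backprop = model([1.0, -3.0])
dX = backprop([1.0, 1.0])
```

Because a composed model is itself just another layer, combinators nest freely without any class hierarchy.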
- Python
Published by ines over 5 years ago
thinc - v7.4.5: Fix numpy compatibility in binary wheels
🔴 Bug fixes
- Fix `numpy` compatibility in binary wheel releases.
- Fix `cupy-cuda111` extra requirement.
- Python
Published by adrianeboyd over 5 years ago
thinc - v7.4.4: Update for cupy v8 and update package setup
🔴 Bug fixes
- Update for compatibility with `cupy` v8.
- Remove f-strings from `PyTorchWrapper`.
- Remove detailed `numpy` build constraints from `pyproject.toml`.
- Update Cython extension setup.
- Python
Published by adrianeboyd over 5 years ago
thinc - v7.4.3: Fix memory leak in Beam and random seed in ParametricAttention
✨ New features and improvements
- Add `seed` argument to `ParametricAttention`.
- Dynamically include `numpy` headers and add `numpy` build constraints.
- Update tests to support `hypothesis` v5.
🔴 Bug fixes
- Fix memory leak in `Beam`.
- Python
Published by adrianeboyd over 5 years ago
thinc - v7.4.2: Update compatible cupy versions and for python 3.9
🔴 Bug fixes
- Restrict compatible `cupy` versions to `<8.0.0`.
- Update setup for python 3.9.
- Python
Published by adrianeboyd over 5 years ago
thinc - v7.4.1: Fix OOV vectors bug
🔴 Bug fixes
- Use 0-vector for OOV in `StaticVectors` to fix similarity bug in spaCy
- Fix murmurhash on platforms where long type was not 64 bit
- Python
Published by honnibal about 6 years ago
thinc - v7.3.1: Relax dependency requirements
🔴 Bug fixes
- Relax version range of `plac` to match spaCy.
- Python
Published by ines over 6 years ago
thinc - v7.3.0: Mish activation and experimental optimizers
✨ New features and improvements
- Add Mish activation. Use via the `thinc.v2v.Mish` layer, which computes `f(X) = mish(W @ X + b)`. CUDA and Cython kernels are included to make the activation efficient.
- Add experimental support for RAdam to the optimizer. Enable it by setting the keyword argument `use_radam` to `True`. In preliminary testing, it's a small change that's worth enabling.
- Add experimental support for Lookahead to the optimizer. Enable it by setting the keyword argument `lookahead_k` to a positive integer. In preliminary testing, it helps if you're not using parameter averaging, but with averaging it's a bit worse.
- Add experimental support for LARS to the optimizer. Enable it by setting `use_lars` to `True`. In preliminary testing, this hasn't worked well at all – possibly our implementation is broken.
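As a reference for the formula `f(X) = mish(W @ X + b)`, here is a plain-Python sketch of the Mish activation, using its standard definition `mish(x) = x * tanh(softplus(x))`. The helper names `softplus`, `mish` and `mish_layer` are chosen for this example and are not the Thinc kernels.

```python
import math
from typing import List


def softplus(x: float) -> float:
    # Numerically stable form of log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    return x * math.tanh(softplus(x))


def mish_layer(W: List[List[float]], X: List[float], b: List[float]) -> List[float]:
    """Apply mish to the affine transform W @ X + b, one output per row of W."""
    return [
        mish(sum(w * x for w, x in zip(row, X)) + bias)
        for row, bias in zip(W, b)
    ]


outputs = mish_layer(W=[[1.0, 0.0], [0.0, 1.0]], X=[1.0, -1.0], b=[0.0, 0.0])
```

Unlike ReLU, Mish is smooth and lets small negative values through, so the second output above is slightly below zero rather than exactly zero.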
🙏 Acknowledgements
Big thanks to @digantamisra98 for the Mish activation, especially the extensive experiments and simple gradient calculation. We expect to be using the activation in the next round of spaCy models.
Gratitude to the fast.ai community for their crowd-sourced experiments, and especially to users @LessW2020, @MGrankin and others for their optimizer implementations, which we referenced heavily when implementing the optimizers for Thinc. More importantly, it's super helpful to have a community filtering the deluge of papers for techniques that work on a few different datasets. This thread on optimization research was particularly helpful.
- Python
Published by ines over 6 years ago
thinc - v7.2.0: Simpler GPU install and bug fixes
✨ New features and improvements
- Ditch `thinc_gpu_ops` for simpler GPU install.
- Improve GPU support and PyTorch wrapper.
🔴 Bug fixes
- Fix issue #47: Fix `ExtractWindow` `nW>=2`.
- Fix issue #51: Ditch `thinc_gpu_ops` for simpler GPU install.
- Fix issue #88: Fix Quora URL in datasets.
- Fix issue #115: Fix compilation on cygwin.
👥 Contributors
Thanks to @rupsaijna and @KoichiYasuoka for the pull requests!
- Python
Published by ines over 6 years ago
thinc - v7.1.1: Support preshed v3.0.0
✨ New features and improvements
- Allow support for `preshed` v3.0.0, which includes some bug fixes when items are deleted from the table, and also features Bloom filters.
- Use `collections.abc` when possible and avoid deprecation warning.
👥 Contributors
Thanks to @hervenicol for the pull request!
- Python
Published by honnibal over 6 years ago
thinc - v7.1.0: Support other CPUs, read-only arrays
✨ New features and improvements
- Support read-only numpy arrays, by specifying `const` in Cython memory-view types. Read-only arrays are helpful for shared-memory multiprocessing, e.g. from Apache Arrow's Plasma object store.
- Update to `cython-blis` v0.4, which supports non-x86_64 CPU architectures. For wide (but slow) support, you can specify the environment variable `BLIS_ARCH=generic` before installing.
- Python
Published by honnibal almost 7 years ago
thinc - v7.0.8: Fix version for PyPi
🔴 Bug fixes
- Fix version number for PyPi.
- Python
Published by ines almost 7 years ago
thinc - v7.0.7: Avoid allocating a negative shape for ngrams
🔴 Bug fixes
- Avoid allocating a negative shape for ngrams.
👥 Contributors
Thanks to @svlandeg for the pull request!
- Python
Published by ines almost 7 years ago
thinc - v7.0.6: Fix LinearModel regression
🔴 Bug fixes
- Fix regression in `LinearModel` class introduced in v7.0.5.
- Python
Published by ines almost 7 years ago
thinc - v7.0.5: Bug fixes for pickle, threading, unflatten and consistency
🔴 Bug fixes
- Fix issue #98: Fix syntax error in `CPickle` import.
- Fix issue #102: Fix bug that could make `HashEmbed` results inconsistent across runs.
- Fix issue #104: Fix unflatten padding when last element is empty.
- Fix issue #97: Pickling error on `LinearModel`.
- Fix issue with creating `Model` instances in child threads with operator overloading.
👥 Contributors
Thanks to @giannisdaras, @simonhkswan, @chssch and @svlandeg for the pull requests and contributions.
- Python
Published by ines almost 7 years ago
thinc - v7.0.4: Don't require thinc_gpu_ops
🔴 Bug fixes
- Don't require `thinc_gpu_ops`.
- Python
Published by ines almost 7 years ago
thinc - v7.0.3: Fix pruning in beam search
🔴 Bug fixes
- Fix incorrect calculation of `min_density` in `thinc.search.Beam` class. Previously the beam was pruned based on the raw logit scores, instead of normalized probabilities.
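The difference between pruning on raw logits and pruning on normalized probabilities can be shown with a small plain-Python sketch. This is one plausible reading of a density threshold, not the `Beam` implementation: the function `prune_by_density` and the rule "keep candidates whose probability is at least `min_density`" are assumptions for illustration.

```python
import math
from typing import List


def prune_by_density(logits: List[float], min_density: float) -> List[int]:
    """Return indices of candidates whose softmax probability passes the threshold."""
    # Normalize the logits first, so the threshold is compared against
    # probabilities that sum to 1 rather than unbounded raw scores.
    highest = max(logits)
    exps = [math.exp(value - highest) for value in logits]
    total = sum(exps)
    probabilities = [value / total for value in exps]
    return [i for i, p in enumerate(probabilities) if p >= min_density]


logits = [5.0, 4.0, 0.5]
kept = prune_by_density(logits, min_density=0.1)
# Comparing the raw logits against the same threshold keeps every candidate.
kept_raw = [i for i, value in enumerate(logits) if value >= 0.1]
```

Here the third candidate has a probability below 1%, so it is pruned once scores are normalized, whereas its raw logit of 0.5 would have passed the same threshold.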
- Python
Published by honnibal about 7 years ago
thinc - v7.0.2: Fix regression in linear model class
🔴 Bug fixes
- Fix regression in `thinc.linear.LinearModel` class.
- Python
Published by ines over 7 years ago
thinc - v7.0.1: Fix import errors
🔴 Bug fixes
- Fix import errors introduced when dropping dependencies in v7.0.0.
- Python
Published by ines over 7 years ago
thinc - v7.0.0: Overhaul package dependencies
⚠️ Backwards incompatibilities
- Thinc v7.0 drops support for Python 2.7 on Windows. Python 2.7 remains supported on Linux and OSX. Support could be restored in future. We're currently unable to build our new dependency, `blis`, for Windows on Python 2.7. If you can assist with this, please let us know.
✨ New features and improvements
- Use `blis` for matrix multiplication. Previous versions delegated matrix multiplication to platform-specific libraries via numpy. This led to inconsistent results, especially around multi-threading. We now provide a standalone package, with the Blis linear algebra routines. Importantly, we've built Blis to be single-threaded. This makes it much easier to do efficient inference, as the library will no longer spawn threads underneath you.
- Use `srsly` for serialization. We now provide a single package with forks of our preferred serialisation libraries – specifically, `msgpack`, `ujson` and `cloudpickle`. This allows us to provide a single binary wheel for these dependencies, and to maintain better control of our dependency tree, preventing breakages.
- Update versions of `cymem`, `preshed` and `murmurhash`. Thinc is compiled against our memory pool and hash table libraries, `cymem` and `preshed`. Changing these build-time dependencies requires Thinc to be recompiled. This is one reason the major version number needed to be incremented for this release.
- Python
Published by honnibal over 7 years ago
thinc - v6.12.1: Fix messagepack pin
🔴 Bug fixes
- Fix issue explosion/spaCy#2995: Pin `msgpack` to version `<0.6.0`, to avoid the low message-length limit introduced in v0.6.0, which breaks spaCy. We will relax the pin once spaCy is updated to set the `max_xx_len` argument to `msgpack.dumps()`.
- Python
Published by honnibal over 7 years ago
thinc - v6.12.0: Wheels and separate GPU ops
✨ New features and improvements
- Update dependencies to be able to provide binary wheels.
- Move GPU ops to separate package, `thinc_gpu_ops`.
- Support pip specifiers for GPU installation, e.g. `pip install thinc[cuda92]`.
🔴 Bug fixes
- Update `murmurhash` pin to accept newer version.
- Python
Published by ines over 7 years ago
thinc - v6.10.3: Python 3.7 support and dependency updates
✨ New features and improvements
- Update `cytoolz` version pin to make Thinc compatible with Python 3.7.
- Only install old `pathlib` backport on Python 2 (see #69).
- Use `msgpack` instead of `msgpack-python`.
- Drop `termcolor` dependency.
- Python
Published by ines almost 8 years ago
thinc - v6.11.2: Improve GPU installation
✨ New features and improvements
You can now require GPU capability using the pip "extras" syntax. Thinc also now expects CUDA to be installed at /usr/local/cuda by default. If you've installed it elsewhere, you can specify the location with the CUDA_HOME environment variable. Once Thinc is able to find CUDA, you can tell pip to install Thinc with cupy, as follows:
- `thinc[cuda]`: Install cupy from source (compatible with a range of cuda versions)
- `thinc[cuda80]`: Install the cupy-cuda80 wheel
- `thinc[cuda90]`: Install the cupy-cuda90 wheel
- `thinc[cuda91]`: Install the cupy-cuda91 wheel
If you're installing Thinc from a local wheel file, the syntax for adding an "extras" specifier is a bit unintuitive. The trick is to make the file path into a URL, so you can use an #egg clause, as follows:
```bash
pip install file://path/to/wheel#egg=thinc[cuda]
```
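If you'd rather build that requirement string programmatically, the following is a minimal sketch using only the standard library. The helper name `wheel_requirement` and the wheel path are made-up placeholders, not part of Thinc or pip.

```python
from pathlib import Path


def wheel_requirement(wheel_path, extra):
    """Build a pip requirement string for a local wheel plus an extra."""
    # as_uri() needs an absolute path, so resolve it first.
    url = Path(wheel_path).resolve().as_uri()
    return "{}#egg=thinc[{}]".format(url, extra)


# The wheel path below is a placeholder for wherever your wheel lives.
requirement = wheel_requirement("dist/thinc.whl", "cuda")
```

The resulting string can be passed to `pip install` in place of the hand-written URL.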
- Python
Published by ines about 8 years ago
thinc - 6.11.1: Support direct linkage to BLAS libraries
✨ New features and improvements
- Thinc now vendorizes OpenBLAS's `cblas_sgemm` function, and delegates matrix multiplications to it by default. The provided function is single-threaded, making it easy to call Thinc from multiple processes. The default sgemm function can be overridden using the `THINC_BLAS` environment variable (see below).
- `thinc.neural.util.get_ops` now understands device integers, e.g. `0` for GPU 0, as well as strings like `"cpu"` and `"cupy"`.
- Update `StaticVectors` model, to make use of spaCy v2.0's `Vectors` class.
- New `.gemm()` method on NumpyOps and CupyOps classes, allowing matrix and vector multiplication to be handled with a simple function. Example usage:
Customizing the matrix multiplication backend
Previous versions of Thinc have relied on numpy for matrix multiplications. When numpy is installed via wheel using pip (the default), numpy will usually be linked against a suboptimal matrix multiplication kernel. This made it difficult to ensure that Thinc was well optimized for the target machine.
To fix this, Thinc now provides its own matrix multiplications, by bundling the source code for OpenBLAS's sgemm kernel within the library. To change the default BLAS library, you can specify an environment variable, giving the location of the shared library you want to link against:
```bash
# Build Thinc from source so it links against the chosen BLAS library
THINC_BLAS=/opt/openblas/lib/libopenblas.so pip install thinc --no-cache-dir --no-binary thinc
export LD_LIBRARY_PATH=/opt/openblas/lib
# On OSX:
export DYLD_LIBRARY_PATH=/opt/openblas/lib
```
If you want to link against the Intel MKL instead of OpenBLAS, the easiest way is to install Miniconda. For instance, if you installed Miniconda to `/opt/miniconda`, the command to install Thinc linked against MKL would be:
```bash
# Build Thinc from source so it links against MKL
THINC_BLAS=/opt/miniconda/numpy-mkl/lib/libmkl_rt.so pip install thinc --no-cache-dir --no-binary thinc
export LD_LIBRARY_PATH=/opt/miniconda/numpy-mkl/lib
# On OSX:
export DYLD_LIBRARY_PATH=/opt/miniconda/numpy-mkl/lib
```
If the library file ends in a .a extension, it is linked statically; if it ends in .so, it's linked dynamically. If you use dynamic linking, make sure the library's directory is on your LD_LIBRARY_PATH at runtime.
🔴 Bug fixes
- Fix pickle support for `FeatureExtracter` class.
- Fix unicode error in Quora dataset loader.
- Fix batch normalization bugs. Now supports batch "renormalization" correctly.
- Models now reliably distinguish predict vs. train modes, using the convention `drop=None`. Previously, layers such as `BatchNorm` relied on having their `predict()` method called, which didn't work when they were called by layers which didn't implement a `predict()` method. We now set `drop=None` to make this more reliable.
- Fix bug that caused incorrect data types to be produced by `FeatureExtracter`.
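The `drop=None` convention can be sketched as follows. The classes are hypothetical toys written for illustration, assuming a `begin_update(X, drop)` method that returns an output and a backprop callback; they are not Thinc's `BatchNorm` or any real combinator.

```python
import random


class ToyDropout:
    """Hypothetical layer: the mode is read from `drop`, not from which method ran."""

    def begin_update(self, X, drop=0.0):
        if drop is None:
            # Predict mode: pass the input through unchanged.
            return list(X), None
        # Train mode (assumes 0 <= drop < 1): zero each value with
        # probability `drop` and rescale the survivors.
        mask = [0.0 if random.random() < drop else 1.0 / (1.0 - drop) for _ in X]
        Y = [x * m for x, m in zip(X, mask)]

        def backprop(dY):
            return [d * m for d, m in zip(dY, mask)]

        return Y, backprop


class Wrapper:
    """Hypothetical combinator with no predict() method of its own."""

    def __init__(self, layer):
        self.layer = layer

    def begin_update(self, X, drop=0.0):
        # Forwarding `drop` is enough to keep the child in the right mode.
        return self.layer.begin_update(X, drop=drop)


model = Wrapper(ToyDropout())
prediction, _ = model.begin_update([1.0, 2.0, 3.0], drop=None)
```

Because the mode travels with the `drop` argument, the wrapped layer behaves correctly even though the wrapper never calls a `predict()` method.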
👥 Contributors
Thanks to @dvsrepo, @justindujardin, @alephmelo and @darkdreamingdan for the pull requests and contributions.
- Python
Published by honnibal about 8 years ago
thinc - v6.10.2: Efficiency improvements and bug fixes
✨ New features and improvements
- Improve GPU utilisation for attention layer.
- Improve efficiency of Maxout layer on CPU.
🔴 Bug fixes
- Bug fix to `foreach` combinator, useful for hierarchical models.
- Bug fix to batch normalization.
📖 Documentation and examples
- Update `imdb_cnn` text classification example.
- Python
Published by ines over 8 years ago
thinc - v6.10.1: Fix GPU install and minor memory leak
🔴 Bug fixes
- Fix installation with CUDA 9.
- Fix minor memory leak in beam search.
- Fix dataset readers.
- Python
Published by ines over 8 years ago
thinc - v6.10.0: CPU efficiency improvements, refactoring
✨ Major features and improvements
- Provisional CUDA 9 support. CUDA 9 removes a compilation flag we require for CUDA 8. As a temporary workaround, you can build on CUDA 9 by setting the environment variable `CUDA9=1`. For example:
```bash
CUDA9=1 pip install thinc==6.10.0
```
- Improve efficiency of `NumpyOps.scatter_add`, when the indices only have a single dimension. This function was previously a bottleneck for spaCy.
- Remove redundant copies in backpropagation of maxout non-linearity.
- Call floating-point versions of `sqrt`, `exp` and `tanh` functions.
- Remove calls to `tensordot`, instead reshaping to make 2d dot calls.
- Improve efficiency of Adam optimizer on CPU.
- Eliminate redundant code in `thinc.optimizers`. There's now a single `Optimizer` class. For backwards compatibility, `SGD` and `Adam` functions are used to create optimizers with the Adam recipe or vanilla SGD recipe.
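For readers unfamiliar with scatter-add, here is a minimal pure-Python sketch of what the operation computes when the indices are one-dimensional. It is a stand-in for illustration, not Thinc's implementation, and the argument order is an assumption.

```python
def scatter_add(out, ids, inputs):
    """Add inputs[i] into out[ids[i]], accumulating over repeated ids."""
    for row, values in zip(ids, inputs):
        for col, value in enumerate(values):
            out[row][col] += value
    return out


table = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
# Row 1 appears twice, so both updates accumulate into it.
scatter_add(table, [1, 1, 2], [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
```

The accumulation over repeated indices is what distinguishes this from a plain indexed assignment, where the last write would win.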
👥 Contributors
Thanks to @RaananHadar for the pull request!
- Python
Published by honnibal over 8 years ago
thinc - v6.9.0: Reorganize layers, bug fix to Layer Normalization
✨ Major features and improvements
- Add new namespace modules `thinc.v2v`, `thinc.i2v`, `thinc.t2t`, `thinc.t2v` that group layer implementations by input and output type. `v` indicates vector, `i` indicates integer ID, `t` indicates tensor. The input type refers to the logical unit, i.e. what constitutes a sample.
🔴 Bug fixes
- Fix bug in layer normalization. The bug fix means that models trained with Thinc 6.8 are incompatible with Thinc 6.9. For convenience, a backwards compatibility flag has been added, which can be set with `thinc.neural._classes.layernorm.set_compat_six_eight`. This flag is off by default.
- Python
Published by honnibal over 8 years ago
thinc - v6.8.2: Fix packaging of gpu_ops
🔴 Bug fixes
- Fix incorrect packaging of `thinc.neural.gpu_ops`, introduced in v6.8.1.
- Fix bad data type in `thinc.extra.search.MaxViolation`, which caused segfaults on some platforms.
- Python
Published by honnibal over 8 years ago
thinc - v6.8.1: Fix Windows support
✨ Major features and improvements
- Add new `foreach` layer combinator, which maps a layer across elements of a sequence.
- Add support for `predict` methods to more layers, for use during decoding.
- Improve correctness of batch normalization. Previously, some layers would force batch normalization to run in training mode, even during prediction. This led to decreased accuracy in some situations.
- Improved efficiency of `Maxout` layer.
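The idea behind a `foreach`-style combinator can be sketched in a few lines. Both classes below are hypothetical toys, assuming layers expose `begin_update(X, drop)` and return an output plus a backprop callback; this is not the library's implementation.

```python
class Double:
    """Hypothetical weightless layer: multiplies every value by two."""

    def begin_update(self, X, drop=0.0):
        def backprop(dY):
            return [2.0 * d for d in dY]

        return [2.0 * x for x in X], backprop


class ForEach:
    """Hypothetical combinator: apply `layer` to each element of a sequence."""

    def __init__(self, layer):
        self.layer = layer

    def begin_update(self, seqs, drop=0.0):
        outputs, callbacks = [], []
        for seq in seqs:
            Y, backprop = self.layer.begin_update(seq, drop=drop)
            outputs.append(Y)
            callbacks.append(backprop)

        def backprop_all(d_outputs):
            # Route each gradient to the callback of the element it came from.
            return [bp(dY) for bp, dY in zip(callbacks, d_outputs)]

        return outputs, backprop_all


model = ForEach(Double())
outputs, backprop = model.begin_update([[1.0, 2.0], [3.0]])
gradients = backprop([[1.0, 1.0], [0.5]])
```

Each element keeps its own backprop callback, which is what makes the pattern usable for hierarchical inputs of varying length.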
🔴 Bug fixes
- Fix compiler flags for MSVC
- Remove unnecessary Chainer dependency. Now depends on Chainer's `cupy` package.
- Fix LSTM layer.
- Small bug fixes to beam search
- Python
Published by honnibal over 8 years ago
thinc - v6.8.0: SELU layer, attention, improved GPU/CPU compatibility
✨ Major features and improvements
- Add SELU layer, from Klambauer et al. (2017).
- Add parametric soft attention layer, as in Yang et al. (2016).
- New higher-order function `uniqued`, which wraps layers giving them a per-batch cache.
- Improve batch normalization, by tracking activation moving averages.
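A per-batch cache of this kind can be sketched as a plain higher-order function. This toy version works on lists of hashable IDs and ignores backpropagation; the real `uniqued` wraps layers, so treat the names and signature here as assumptions.

```python
def uniqued(fn):
    """Wrap `fn` so it runs once per distinct input within a batch."""

    def wrapped(batch):
        cache = {}  # rebuilt on every call, so it only lives for one batch
        results = []
        for key in batch:
            if key not in cache:
                cache[key] = fn(key)
            results.append(cache[key])
        return results

    return wrapped


calls = []


def expensive_lookup(word_id):
    calls.append(word_id)  # record each real computation
    return word_id * 10


lookup = uniqued(expensive_lookup)
outputs = lookup([3, 7, 3, 3, 7])
```

Here `calls` ends up holding only the distinct IDs, while `outputs` still has one result per input position.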
🔴 Bug fixes
- Fix GPU usage in pooling operations.
- Add optimized code for extracting ngram features.
- Improve CPU/GPU compatibility.
- Improve compatibility of `LinearModel` class.
👥 Contributors
Thanks to @tammoippen for the pull request!
- Python
Published by ines almost 9 years ago
thinc - v6.7.3: Fix convolution on GPU
🔴 Bug fixes
- Convolution is now computed the same on CPU and GPU.
- Python
Published by ines almost 9 years ago
thinc - v6.7.2: Bug fixes to serialization
🔴 Bug fixes
- Make order of dicts stable when serializing model.
- Python
Published by ines almost 9 years ago
thinc - v6.7.1: Improve serialization
✨ Major features and improvements
- Temporarily revert change to CuPy.
- Improve efficiency of Adam optimizer.
- Python
Published by ines almost 9 years ago
thinc - v6.7.0: Fixes to serialization, hash embeddings and flatten ops
✨ Major features and improvements
- Add `Model.to_bytes()` and `Model.from_bytes()` methods, to support serialization that's compatible between Python versions.
- Remove code depending on Chainer, and instead depend explicitly on the new `cupy` subpackage, for simpler GPU installation.
- Improve accuracy for HashEmbed table, by using 4 conditionally independent keys.
- Support padding in `flatten` and `with_flatten` ops.
- Use the same hash function on both CPU and GPU, for model compatibility.
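To show how several hash keys per feature can reduce the damage from collisions, here is a minimal sketch of a hashed embedding lookup. It uses SHA-256 from the standard library as a stand-in hash and sums the selected rows; the hash function, the seeding scheme and the helper names are all assumptions made for illustration, not Thinc's actual code.

```python
import hashlib


def hash_rows(feature_id, n_rows, n_keys=4):
    """Map one feature ID to `n_keys` table rows, one per seeded hash."""
    rows = []
    for seed in range(n_keys):
        data = "{}:{}".format(seed, feature_id).encode("utf8")
        digest = hashlib.sha256(data).digest()
        rows.append(int.from_bytes(digest[:8], "little") % n_rows)
    return rows


def hash_embed(feature_id, table):
    """Sum the rows selected by each key to form the feature's vector."""
    width = len(table[0])
    vector = [0.0] * width
    for row in hash_rows(feature_id, len(table)):
        for i in range(width):
            vector[i] += table[row][i]
    return vector


# Toy table: 5 rows of width 2, far fewer rows than possible feature IDs.
table = [[float(r), float(r) * 0.5] for r in range(5)]
vector = hash_embed(123456789, table)
```

Two features that collide on one key are unlikely to collide on all four, so their summed vectors usually remain distinct even in a small table.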
🔴 Bug fixes
- `HashEmbed` now returns correct results for arrays of length not divisible by 16.
- Provide `.cu` source files in the source distribution.
- Remove unnecessary allocations from the CPU maxout op.
- Fix issue #27: Remove Python2-specific code from `setup.py`.
- Python
Published by honnibal almost 9 years ago
thinc - v6.6.0: Improved GPU usage and examples
✨ Major features and improvements
- Add GPU kernels for max and mean pool using variable-length sequences.
- `thinc.api.FeatureExtractor`, for getting features from spaCy `Doc` objects.
🔴 Bug fixes
- Improve multi-device handling
- `thinc.api.add` now accepts a variable number of layers.
- Improve `Residual` class.
⚠️ Backwards incompatibilities
- Some of the example code may be out of date.
📖 Documentation and examples
- Add reader for WikiNER corpora.
- Add example for Twitter NER.
- Add Siamese network example.
- Python
Published by ines about 9 years ago
thinc - v6.5.1: Improved linear class and Windows fix
✨ Major features and improvements
- Add hash kernel linear class.
🔴 Bug fixes
- Fix issue #22: Remove `random_bytes` method from `Ops`.
- Fix `termcolor` dependency.
📖 Documentation and examples
- Add IMDB to datasets.
- Add linear BOW example, using hash kernel.
👥 Contributors
Thanks to @rolando and @ogrisel for the pull requests!
- Python
Published by ines about 9 years ago
thinc - v6.5.0: Supervised similarity, fancier embedding and improvements to linear model
✨ Major features and improvements
- Improve GPU support.
- Add classes for siamese neural network architectures for supervised similarity.
- Add `HashEmbed` class, an embedding layer which uses the hashing trick to support a larger vocabulary in a shorter table.
- Add support for distinct feature columns in the `Embed` class.
🔴 Bug fixes
- Fix model averaging for linear model.
- Fix `resume_training()` method for linear model.
- Fix L1 penalty for linear model.
📖 Documentation and examples
- Add supervised similarity example for Quora, StackExchange and SNLI data.
- Python
Published by honnibal about 9 years ago
thinc - v6.3.0: Efficiency improvements, argument checking and error messaging
✨ Major features and improvements
- NEW: Add `thinc.check` module to specify argument constraints for functions and methods.
- NEW: Add `thinc.exceptions` module with custom exception messaging.
- Add LSUV initialisation.
- Add averaged parameters, for reduced hyper-parameter sensitivity.
- Improve efficiency of maxout, window extraction and dropout.
📋 Tests
- Reorganise and improve tests.
- Reach 100% coverage over the entire package.
- Python
Published by honnibal over 9 years ago
thinc - v6.2.0: Improve API and introduce overloaded operators
✨ Major features and improvements
- NEW: `Model` now has `define_operators()` classmethod to overload operators for a given block.
- Add `chain()`, `clone()` and `concatenate()` functions for use with overloaded operators.
- Add `describe` module which provides class decorators for defining new layers.
- Allow layers to calculate input and output sizes based on training data.
Together, these features allow very concise model definitions:
```python
with Model.define_operators({'**': clone, '>>': chain}):
    model = BatchNorm(ReLu(width)) ** depth >> Softmax()
```
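The mechanism behind block-scoped operator overloading can be sketched with a context manager that installs operator bindings on the class and restores the previous ones on exit. Everything below is a toy written for illustration: this `Model`, `chain` and `clone` only wrap plain functions and are not Thinc's classes.

```python
from contextlib import contextmanager


class Model:
    """Toy stand-in for a layer; it just wraps a function."""

    _operators = {}

    def __init__(self, fn):
        self.fn = fn

    def __call__(self, x):
        return self.fn(x)

    @classmethod
    @contextmanager
    def define_operators(cls, operators):
        # Install the bindings for the duration of the block, then restore.
        previous = cls._operators
        cls._operators = dict(operators)
        try:
            yield
        finally:
            cls._operators = previous

    def __rshift__(self, other):
        return self._operators[">>"](self, other)

    def __pow__(self, other):
        return self._operators["**"](self, other)


def chain(first, second):
    """Feed the output of `first` into `second`."""
    return Model(lambda x: second(first(x)))


def clone(layer, n):
    """Apply `layer` n times in a row (this toy reuses one layer, it copies nothing)."""
    result = layer
    for _ in range(n - 1):
        result = chain(result, layer)
    return result


add_one = Model(lambda x: x + 1)
double = Model(lambda x: x * 2)

with Model.define_operators({"**": clone, ">>": chain}):
    model = add_one ** 3 >> double
```

Since `**` binds more tightly than `>>` in Python, the expression reads as "three copies of `add_one`, then `double`". Outside the `with` block the operators are unbound again, so accidental use raises an error instead of silently doing something else.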
⚠️ Backwards incompatibilities
- Major revisions to previously undocumented neural network APIs (see above).
📋 Tests
- Reorganise and improve tests for neural network functions.
- Reach 100% coverage over the current neural network classes.
- Python
Published by honnibal over 9 years ago
thinc - v6.1.3: More neural network functions and training continuation
✨ Major features and improvements
- NEW: Add several useful higher-order functions, including `@layerize` and `@metalayerize` decorators to turn functions into weightless layers.
- NEW: Add batch normalization layer.
- NEW: Add residual layer using pre-activation approach.
- Simplify model setup and initialization.
- Add `ELU` layer.
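As a rough illustration of turning a function into a weightless layer, the sketch below wraps a function that returns an output and a backprop callback. The decorator shown is a hypothetical reimplementation under that assumption, not the library's `@layerize`.

```python
def layerize(fn):
    """Wrap `fn(X, drop)` as a layer object with no weights."""

    class FunctionLayer:
        def begin_update(self, X, drop=0.0):
            return fn(X, drop)

        def __call__(self, X):
            # Prediction only needs the output, not the callback.
            Y, _ = fn(X, None)
            return Y

    return FunctionLayer()


@layerize
def relu(X, drop=0.0):
    Y = [x if x > 0 else 0.0 for x in X]

    def backprop(dY):
        # Gradients flow only through the positive inputs.
        return [d if x > 0 else 0.0 for d, x in zip(dY, X)]

    return Y, backprop
```

After decoration, `relu` is an object with `begin_update()` rather than a bare function, so it can be composed with other layers that follow the same interface.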
🔴 Bug fixes
- The `AveragedPerceptron` class can now continue training after model loading. Previously, the weights were zeroed for each feature as soon as it was updated. This affected spaCy users, especially those adding new classes to the named entity recognizer.
📖 Documentation and examples
- Add CNN tagger example.
- Python
Published by honnibal over 9 years ago
thinc - v6.0.0: Add thinc.neural for NLP-oriented deep learning
✨ Major features and improvements
- NEW: Add `thinc.neural` to develop neural networks for spaCy.
- Introduce support for Affine, Maxout, ReLu and Softmax vector-to-vector layers.
- Introduce support for efficient static word embedding layer with projection matrix and per-word-type memoisation.
- Introduce support for efficient word vector convolution layer, which also supports per-word-type memoisation.
- Introduce support for `MeanPooling`, `MaxPooling` and `MinPooling`. Add `MultiPooling` layer for concatenative pooling.
- Introduce support for annealed dropout training.
- Introduce support for classical momentum, Adam and Eve optimisers.
- Introduce support for averaged parameters for each optimiser.
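Concatenative pooling can be sketched in plain Python: each pooling function reduces a sequence of word vectors to one vector, and the results are joined end to end. The function names are illustrative and do not mirror the classes listed above.

```python
def mean_pool(rows):
    """Average each dimension over the sequence."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def max_pool(rows):
    """Take the largest value in each dimension."""
    return [max(col) for col in zip(*rows)]


def min_pool(rows):
    """Take the smallest value in each dimension."""
    return [min(col) for col in zip(*rows)]


def multi_pool(rows, poolers):
    """Concatenate the outputs of several pooling functions."""
    pooled = []
    for pool in poolers:
        pooled.extend(pool(rows))
    return pooled


word_vectors = [[1.0, 4.0], [3.0, 0.0]]
sentence_vector = multi_pool(word_vectors, [mean_pool, max_pool, min_pool])
```

With three poolers over width-2 vectors, the output has width 6, regardless of how many words the sequence contains.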
⚠️ Backwards incompatibilities
The Example class now holds a pointer to its ExampleC struct, where previously it held the struct value. This introduces a small backwards incompatibility in spaCy.
- Python
Published by honnibal over 9 years ago