Recent Releases of thinc

thinc - v8.3.6: Support Python 3.13

This release adds support for Python 3.13. To do this, we now require Pydantic >= 2.0 and have updated compilation to use Cython 3.0. This required an update to the blis package that's not binary compatible, but thinc itself should not have any binary backwards compatibility issues.

- Python
Published by github-actions[bot] about 1 year ago

thinc - v8.3.4: Update Blis pin to revert to known-good v0.7

Previous releases have used releases of our blis package that vendor newer releases of the upstream blis library. Unfortunately these newer releases have had intermittent crashes on Windows that we haven't been able to track down.

I've therefore released a v1.2 of the blis package that goes back to the known-good v0.7 release of the vendored blis code, which we were using before. This release updates the version pin to use it.

It took a surprisingly long time to get v0.7 of blis to compile, due to conflicts on Windows. I regret the delay.

- Python
Published by github-actions[bot] over 1 year ago

thinc - v8.3.3: Fix Blis crashes, widen numpy pin

  • Update blis pin to v1.1. This updates the vendored blis code to 1.1, which should fix crashes from the previously vendored v0.9 code on Windows.
  • Widen numpy pin, allowing versions across v1 and v2. Previously I had thought that if I build against numpy v2, I couldn't also have v1 as a runtime dependency. This is actually incorrect, so we can widen the numpy pin.
  • Set a flag when loading PyTorch models to make loading safer.

- Python
Published by github-actions[bot] over 1 year ago

thinc - v8.3.2: Fix regression to torch training, update ARM dependency

  • Fix regression to torch training introduced in v8.3.1
  • Restore MacOS ARM wheels, which were missing from previous builds
  • Fix compatibility with thinc-apple-ops

- Python
Published by github-actions[bot] over 1 year ago

thinc - v8.3.1: Fix torch deprecation warning

torch.cuda.amp is deprecated as of PyTorch 2.4. This PR updates shims/pytorch.py to use torch.amp.autocast instead of torch.cuda.amp.autocast.
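
As a rough illustration of this kind of change (not the actual shim code), the helper below prefers the device-agnostic torch.amp.autocast and falls back to the older torch.cuda.amp.autocast when the newer entry point is unavailable. The name autocast_context is hypothetical, and PyTorch is imported lazily so the snippet can be loaded without it.

```python
def autocast_context(device_type: str = "cuda", enabled: bool = True):
    """Return a mixed-precision autocast context manager.

    Hypothetical helper for illustration; not part of Thinc or PyTorch.
    """
    import torch  # imported lazily; calling this function requires PyTorch

    amp = getattr(torch, "amp", None)
    if amp is not None and hasattr(amp, "autocast"):
        # Newer, device-agnostic API: the device type is passed explicitly.
        return amp.autocast(device_type, enabled=enabled)
    # Older CUDA-specific API, deprecated as of PyTorch 2.4.
    return torch.cuda.amp.autocast(enabled=enabled)
```

A forward pass would then be wrapped as `with autocast_context("cuda"): ...`.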

Thanks to @Atlogit for the patch.

- Python
Published by github-actions[bot] over 1 year ago

thinc - v9.1.1: Restore wheels for MacOS ARM 64

Previously we used a complicated build process that used self-hosted runners to build wheels for platforms Github Actions did not support. Github Actions has been adding support for ARM recently, so we've simplified the CI process to rely on it exclusively.

This release adds back support for MacOS ARM64 wheels that were missing from the previous release. Linux ARM wheels are still pending, as Linux ARM architectures are currently only supported for private repos. Cross-compilation with QEMU is possible in theory, but in practice the build timed out after several hours.

- Python
Published by github-actions[bot] over 1 year ago

thinc - v9.1.0: Depend on numpy 2.0.0

Numpy is a build dependency of Thinc, and numpy 2.0 is not binary compatible with numpy 1.0 (fair enough). This means we can't have a version that's compatible across numpy v1 and numpy v2.

This release updates v9 by pinning to numpy 2.0, and builds against it. No other changes are made, so that we have paired versions that only differ in their dependencies.

- Python
Published by github-actions[bot] over 1 year ago

thinc - v8.3.0: Depend on numpy 2.0

Numpy is a build dependency of Thinc, and numpy 2.0 is not binary compatible with numpy 1.0 (fair enough). This means we can't have a version that's compatible across numpy v1 and numpy v2.

This release updates the pins to numpy 2.0 and builds against it. No other changes are made, so that we have paired versions that only differ in their dependencies.

- Python
Published by github-actions[bot] almost 2 years ago

thinc - v8.2.5: Restrict numpy pin to <2.0.0

Numpy v2.0 isn't binary compatible with v1 (understandably). We build against numpy so we need to restrict the pin.

- Python
Published by honnibal almost 2 years ago

thinc - v8.2.4: Relaxing `nbconvert` and `typing_extensions` upper pins

✨ New features and improvements

  • Bump nbconvert pin
  • Bump typing_extensions pin for Python 3.7
  • Updates to the test suite

👥 Contributors

@honnibal, @ines, @svlandeg

- Python
Published by svlandeg almost 2 years ago

thinc - v9.0.0: better learning rate schedules, integration of thinc-apple-ops

The main new feature of Thinc v9 is the support for learning rate schedules that can take the training dynamics into account. For example, the new plateau.v1 schedule scales the learning rate when no progress has been found after a given number of evaluation steps. Another visible change is that AppleOps is now part of Thinc, so it is not necessary anymore to install thinc-apple-ops to use the AMX units on Apple Silicon.

✨ New features and improvements

  • Learning rate schedules can now take the training step as well as an arbitrary set of keyword arguments. This makes it possible to pass information such as the parameter name and last evaluation score to determine the learning rate (#804).
  • Added the plateau.v1 schedule (#842). This schedule scales the learning rate if training was found to be stagnant for a given period.
  • The functionality of thinc-apple-ops is integrated into Thinc (#927). Starting with this version of Thinc, it is not necessary anymore to install thinc-apple-ops.
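
To make the idea of a schedule that reacts to training dynamics concrete, here is a minimal pure-Python sketch of a plateau-style schedule. It is not Thinc's Schedule class or its plateau.v1 implementation: the class name PlateauSketch, its fields, and the convention of passing last_score as a (step, score) tuple are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PlateauSketch:
    """Hypothetical schedule: scale the rate each time training stagnates."""

    base_rate: float
    scale: float
    max_patience: int
    best_score: Optional[float] = None
    last_seen_step: int = -1
    patience: int = 0
    num_plateaus: int = 0

    def __call__(
        self,
        step: int,
        *,
        last_score: Optional[Tuple[int, float]] = None,
        **kwargs,
    ) -> float:
        # Extra keyword arguments (e.g. a parameter name) are accepted and ignored.
        if last_score is not None:
            score_step, score = last_score
            # Only count each evaluation once, even if it is passed repeatedly.
            if score_step > self.last_seen_step:
                self.last_seen_step = score_step
                if self.best_score is None or score > self.best_score:
                    self.best_score = score
                    self.patience = 0
                else:
                    self.patience += 1
                    if self.patience >= self.max_patience:
                        self.num_plateaus += 1
                        self.patience = 0
        return self.base_rate * self.scale ** self.num_plateaus
```

Because the object is called with the step and keyword arguments rather than iterated like a generator, the caller decides which training information the schedule gets to see.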

🔴 Bug fixes

  • Fix the use of thread-local storage (#917).

⚠️ Backwards incompatibilities

  • Thinc v9.0.0 only supports Python 3.9 and later.
  • Schedules are not generators anymore, but implementations of the Schedule class (#804).
  • thinc.backends.linalg has been removed (#742). The same functionality is provided by implementations in BLAS that are better tested and more performant.
  • thinc.extra.search has been removed (#743). The beam search functionality in this module was strongly coupled to the spaCy transition parser and has therefore moved to spaCy in v4.

👥 Contributors

@adrianeboyd, @danieldk, @honnibal, @ines, @kadarakos, @shadeMe, @svlandeg

- Python
Published by danieldk about 2 years ago

thinc - v8.2.3: Fix CuPy compatibility and fix strings2arrays for sequences of unequal length

🔴 Bug fixes

  • Make strings2arrays work again for sequences of unequal length (#918).
  • Fix cupy.cublas import (#921).

👥 Contributors

@danieldk, @honnibal, @ines, @svlandeg

- Python
Published by danieldk over 2 years ago

thinc - v8.2.2: Parametric attention with key transformation

✨ New features and improvements

Add the ParametricAttention_v2 layer, which adds support for key transformations (#913).

👥 Contributors

@danieldk, @honnibal, @ines, @svlandeg

- Python
Published by danieldk over 2 years ago

thinc - v8.2.1: Support Python 3.12

✨ New features and improvements

Updates and binary wheels for Python 3.12.

👥 Contributors

@adrianeboyd, @honnibal, @ines, @svlandeg

- Python
Published by adrianeboyd over 2 years ago

thinc - v8.2.0: Disable automatic MXNet and TensorFlow imports

✨ New features and improvements

To improve loading times and reduce conflicts, MXNet and TensorFlow are no longer imported automatically (#890).

⚠️ Backwards incompatibilities

MXNet and TensorFlow support needs to be enabled explicitly. Previously, MXNet and TensorFlow were imported automatically if they were available in the current environment.

To enable MXNet:

```python
from thinc.api import enable_mxnet

enable_mxnet()
```

To enable TensorFlow:

```python
from thinc.api import enable_tensorflow

enable_tensorflow()
```

With spaCy CLI commands you can provide this custom code using -c code.py. For training use spacy train -c code.py and to package your code with your pipeline use spacy package -c code.py.

Future deprecation warning: built-in MXNet and TensorFlow support will be removed in Thinc v9. If you need MXNet or TensorFlow support in the future, you can transition to using a custom copy of the current MXNetWrapper or TensorFlowWrapper in your package or project.

👥 Contributors

@adrianeboyd, @danieldk, @honnibal, @ines, @svlandeg

- Python
Published by adrianeboyd almost 3 years ago

thinc - v8.1.12: Support zero-length batches and hidden sizes in reductions

🔴 Bug fixes

  • Support zero-length batches and hidden sizes in reduce_{max,mean,sum} (#882).
  • Preserve values with dtype for NumpyOps/CupyOps.asarray (#897).

👥 Contributors

@adrianeboyd, @danieldk, @honnibal, @ines, @svlandeg

- Python
Published by adrianeboyd almost 3 years ago

thinc - v8.1.11: Support Pydantic v2, update package setup

✨ New features and improvements

  • Update NumPy build constraints for NumPy v1.25 (#885).
  • Switch from distutils to setuptools/sysconfig (#888).
  • Allow Pydantic v2 using transitional v1 support (#891).

📖 Documentation and examples

  • Fix typo in example code (#879).

👥 Contributors

@adrianeboyd, @Ankush-Chander, @danieldk, @honnibal, @ines, @svlandeg

- Python
Published by adrianeboyd almost 3 years ago

thinc - v8.1.10: Lazy loading for CuPy kernels and additional CuPy and MPS improvements

✨ New features and improvements

  • Implement pad as a CUDA kernel (#860).
  • Avoid h2d - d2h roundtrip when using unflatten (#861).
  • Improve exception when CuPy/PyTorch MPS is not installed (#863).
  • Lazily load custom cupy kernels (#870).

🔴 Bug fixes

  • Initially load TorchScript models on CPU for MPS devices (#864).

👥 Contributors

@adrianeboyd, @danieldk, @honnibal, @ines, @shadeMe, @svlandeg

- Python
Published by adrianeboyd about 3 years ago

thinc - v8.1.9: Type fixes

🔴 Bug fixes

  • Fix type signature of Model.begin_update (#858).

👥 Contributors

@danieldk, @honnibal, @ines

- Python
Published by danieldk about 3 years ago

thinc - v8.1.8: New faster mapping layer and bug fixes for resizeable layer

✨ New features and improvements

  • Add premap_ids.v1 layer for mapping from ints to ints (#815).
  • Update to mypy 1.0.x (#848).

🔴 Bug fixes

  • Make resizable layer work with textcat and transformers (#820).

📖 Documentation

  • Update website including Dockerfile (#843, #844, #845).

👥 Contributors

@adrianeboyd, @danieldk, @essenmitsosse, @honnibal, @ines, @kadarakos, @patjouk, @polm, @svlandeg

- Python
Published by adrianeboyd about 3 years ago

thinc - v8.1.7: Updated layers and extended requirements

✨ New features and improvements

  • Add with_flatten.v2 layer with symmetric input/output types (#821).
  • Extend to typing_extensions v4.4.x for Python 3.6 and 3.7 (#833).

📖 Documentation

👥 Contributors

@adrianeboyd, @albertvillanova, @danieldk, @essenmitsosse, @honnibal, @ines, @shadchin, @shadeMe, @svlandeg

- Python
Published by adrianeboyd over 3 years ago

thinc - v8.1.6: New and updated layers, bug fixes and more

✨ New features and improvements

  • Update to mypy 0.990 (#801).
  • Extend to wasabi v1.1 (#813).
  • Add SparseLinear.v2, to fix indexing issues (#754).
  • Add TorchScriptWrapper_v1 (#802).
  • Add callbacks to facilitate lazy-loading models in PyTorchShim (#796).
  • Make all layer defaults serializable (#808).

🔴 Bug fixes

  • Add missing packaging requirement (#799).
  • Correct sequence length error messages for reduce_first/last (#807).
  • Update CupyOps.asarray to always copy cupy arrays to the current device (#812).
  • Fix types for sequences passed to Ops.asarray* (#819).

👥 Contributors

@adrianeboyd, @danieldk, @frobnitzem, @honnibal, @ines, @richardpaulhudson, @ryndaniels, @shadeMe, @svlandeg

- Python
Published by adrianeboyd over 3 years ago

thinc - v8.1.5: Updates for Python 3.11

✨ New features and improvements

  • Updates and binary wheels for Python 3.11 (#793).
  • Make __all__ static to support type checking (#780).

👥 Contributors

@adrianeboyd, @honnibal, @ines, @rmitsch

- Python
Published by adrianeboyd over 3 years ago

thinc - v7.4.6: Updates for Python 3.10 and 3.11

✨ New features and improvements

  • Updates for Python 3.10 and 3.11 (#791):
    • Update vendored wrapt to v1.14.1.
    • Update dev requirements.
    • Add wheels for Python 3.10 and 3.11.

👥 Contributors

@adrianeboyd, @honnibal, @ines

- Python
Published by adrianeboyd over 3 years ago

thinc - v8.1.4: Type fixes

🔴 Bug fixes

  • Fix issue #785: Revert change to return type for Ops.alloc from #779.

👥 Contributors

@adrianeboyd, @honnibal, @ines, @svlandeg

- Python
Published by adrianeboyd over 3 years ago

thinc - v8.1.3: Updates for pydantic and mypy

✨ New features and improvements

  • Extend pydantic support to v1.10.x (#778).
  • Support mypy 0.98x, drop mypy support for Python 3.6 (#776).

🔴 Bug fixes

  • Fix issue #775: Fix fix_random_seed entry point in setup.cfg.

👥 Contributors

@adrianeboyd, @honnibal, @ines, @pawamoy, @svlandeg

- Python
Published by adrianeboyd over 3 years ago

thinc - v8.1.2: Update blis support and CuPy extras

✨ New features and improvements

  • Update CuPy extras to add cuda116, cuda117, cuda11x and cuda-autodetect, which uses the new cupy-wheel package (#740).
  • Add a pytest-randomly entry point for fix_random_seed (#748).

🔴 Bug fixes

  • Fix issue #772: Restrict supported blis versions to ~=0.7.8 to avoid bugs in BLIS 0.9.0.

👥 Contributors

@adrianeboyd, @honnibal, @ines, @rmitsch, @svlandeg, @willfrey

- Python
Published by adrianeboyd over 3 years ago

thinc - v8.1.1: Use confection, new layers and bugfixes

✨ New features and improvements

  • Use confection for configurations (#745).
  • Add the Dish activation function and layer (#719).
  • Add the with_signpost_interval layer to support layer profiling with macOS Instruments (#711).
  • Add remap_ids.v2 layer which allows more types of inputs (#726).
  • Extend BLIS support to version 0.9.x (#736).
  • Improve performance when gradient scaling is used (#746).
  • Improve MaxOut performance by unrolling argmax in maxout (#702).

🔴 Bug fixes

  • Fix issue #720: Improve type inference by replacing FloatsType in Ops by a TypeVar.
  • Fix issue #739: Fix typing of Ops.asarrayDf methods.
  • Fix issue #757: Improve compatibility with supported Tensorflow versions.

👥 Contributors

@adrianeboyd, @cclauss, @danieldk, @honnibal, @ines, @kadarakos, @polm, @rmitsch, @shadeMe

- Python
Published by danieldk over 3 years ago

thinc - v8.1.0: Updated types and many Ops improvements

✨ New features and improvements

  • Added support for mypy 0.950 and pydantic v1.9.0, added bound types throughout layers and ops (#599).
  • Made all NumpyOps CPU kernels generic (#627).
  • Made all custom CUDA kernels generic (#603).
  • Added bounds checks for NumpyOps (#618).
  • Fixed out-of-bounds writes in NumpyOps and CupyOps (#664).
  • Reduced unnecessary zero-init allocations (#632).
  • Fixed reductions when applied to zero-length sequences (#637).
  • Added NumpyOps.cblas to get a table of C BLAS functions (#643, #700).
  • Improved type-casting in NumpyOps.asarray (#656).
  • Simplified CupyOps.asarray (#661).
  • Fixed Model.copy() for layers used more than once (#659).
  • Fixed potential race in Shim (#677).
  • Convert numpy arrays using dlpack in xp2tensorflow and xp2torch when possible (#686).
  • Improved speed of HashEmbed by avoiding large temporary arrays (#696).
  • Added Ops.reduce_last and Ops.reduce_first (#710).
  • Numerous test suite improvements.
  • Experimental: Add support for Metal Performance Shaders with PyTorch nightlies (#685).

🔴 Bug fixes

  • Fix issue #707: Fix label smoothing threshold for to_categorical.

⚠️ Backwards incompatibilities

  • In most cases the typing updates allow many casts and ignores to be removed, but types may also need minor modifications following the updates for mypy and pydantic.
  • get_array_module now returns None for non-numpy/cupy array input rather than returning numpy by default.
  • The prefer_gpu and require_gpu functions no longer set the default PyTorch torch.Tensor type to torch.cuda.FloatTensor. This means that wrapped PyTorch models cannot assume that Tensors are allocated on a CUDA GPU after calling these functions. For example:

```
# Before Thinc v8.1.0, this Tensor would be allocated on the GPU after
# {prefer,require}_gpu. Now it will be allocated as a CPU tensor by default.
token_mask = torch.arange(max_seq_len)

# To ensure correct allocation, specify the device where the Tensor should be allocated.
# `input` refers to the input of the model.
token_mask = torch.arange(max_seq_len, device=input.device)
```

This change brings Thinc's behavior in line with how device memory allocation is normally handled in PyTorch.

👥 Contributors

@adrianeboyd, @danieldk, @honnibal, @ines, @kadarakos, @koaning, @richardpaulhudson, @shadeMe, @svlandeg

- Python
Published by adrianeboyd almost 4 years ago

thinc - v8.0.17: Extended requirements, test suite fixes

✨ New features and improvements

  • Extend support for typing_extensions up to v4.1.x (for Python 3.7 and earlier).
  • Various fixes in the test suite.

👥 Contributors

@adrianeboyd, @danieldk, @honnibal, @ines, @shadeMe

- Python
Published by adrianeboyd almost 4 years ago

thinc - v8.0.16: Bug fixes

✨ New features and improvements

🔴 Bug fixes

  • Fix issue #624: Support CPU inference for models trained with gradient scaling.
  • Fix issue #633: Fix invalid indexing in Beam when no states have valid transitions.
  • Fix issue #639: Improve PyTorch Tensor handling in CupyOps.asarray.
  • Fix issue #649: Clamp inputs in Ops.sigmoid to prevent overflow.
  • Fix issue #651: Fix type safety issue with model ID assignment.
  • Fix issue #653: Correctly handle Tensorflow GPU tensors in tests.
  • Fix issue #660: Make is_torch_array work without PyTorch installed.
  • Fix issue #664: Fix out-of-bounds writes in CupyOps.adam and NumpyOps.adam.
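
The sigmoid fix in #649 can be illustrated with a scalar sketch. This is not the Ops.sigmoid implementation, which works on arrays; the function name and the clamping limit of 20 are assumptions chosen for the example.

```python
import math


def clamped_sigmoid(x: float, limit: float = 20.0) -> float:
    """Logistic sigmoid with the input clamped to [-limit, limit].

    Hypothetical scalar sketch; the limit value is an assumption.
    """
    # Clamp first so math.exp never receives a huge argument.
    x = max(-limit, min(limit, x))
    return 1.0 / (1.0 + math.exp(-x))
```

Without the clamp, a very negative input makes math.exp(-x) raise OverflowError in pure Python; array libraries typically emit overflow warnings or produce inf instead.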

⚠️ Backwards incompatibilities

  • The init implementations for layers no longer return Model.

📖 Documentation and examples

👥 Contributors

@adrianeboyd, @danieldk, @honnibal, @ines, @kadarakos, @koaning, @notplus, @richardpaulhudson, @shadeMe

- Python
Published by danieldk about 4 years ago

thinc - v8.0.15: Fix compatibility with older PyTorch versions

🔴 Bug fixes

  • Fix issue #610: Improve compatibility with PyTorch versions before v1.9.0.

👥 Contributors

@adrianeboyd, @danieldk

- Python
Published by danieldk about 4 years ago

thinc - v8.0.14: New activation functions, bug fixes and more

✨ New features and improvements

🔴 Bug fixes

  • Fix issue #552: Do not backpropagate Inf/NaN out of PyTorch layers when using mixed-precision training.
  • Fix issue #578: Correctly cast the threshold argument of CupyOps.mish and correct an equation in Ops.backprop_mish.
  • Fix issue #587: Correct invariant checks in CategoricalCrossentropy.get_grad.
  • Fix issue #592: Update murmurhash requirement.
  • Fix issue #594: Do not sort positional arguments in Config.

⚠️ Backwards incompatibilities

  • The out keyword argument of Ops.mish and Ops.backprop_mish is replaced by inplace for consistency with other activations.

📖 Documentation and examples

👥 Contributors

@adrianeboyd, @andrewsi-z, @danieldk, @honnibal, @ines, @Jette16, @kadarakos, @kianmeng, @polm, @svlandeg, @thatbudakguy

- Python
Published by danieldk about 4 years ago

thinc - v8.0.12: Bug fixes for set_ops and use_ops

🔴 Bug fixes

  • Fix issue #553: Switch torch tensor type with set_ops and use_ops.
  • Fix issue #554: Always restore original ops after use_ops.

👥 Contributors

@adrianeboyd, @danieldk, @ryndaniels, @svlandeg

- Python
Published by adrianeboyd over 4 years ago

thinc - v8.0.11: Improved GPU training time

✨ New features and improvements

  • Speed up GPU training time by up to ~25% by using cuBLAS for computing Frobenius norms in gradient clipping.
  • Give preference to AppleOps (if available) when calling get_ops("cpu").
  • Support missing values in CategoricalCrossEntropy when the labels are integers.
  • Provide the option to run model.walk with depth-first traversal.
  • Wrap forward/init callbacks of a Model in with_debug and with_nvtx_range to facilitate recursively instrumenting models.
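
For readers unfamiliar with where the Frobenius norm enters gradient clipping, the arithmetic can be sketched in plain Python. This shows only the computation, not Thinc's optimizer code or the cuBLAS call; the function name and the list-based gradient are assumptions.

```python
import math
from typing import List


def clip_gradient(gradient: List[float], threshold: float) -> List[float]:
    """Rescale a flattened gradient so its norm does not exceed threshold.

    Hypothetical sketch of norm-based clipping.
    """
    # Frobenius (L2) norm of the flattened gradient.
    norm = math.sqrt(sum(g * g for g in gradient))
    if norm <= threshold:
        return list(gradient)
    scale = threshold / norm
    return [g * scale for g in gradient]
```

The norm is computed for every clipped parameter on every update, which is why a faster norm routine can show up in overall training time.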

🔴 Bug fixes

  • Fix issue #537: Fix replace_node on nodes with indirect node refs.

👥 Contributors

@adrianeboyd, @danieldk, @honnibal, @ines, @svlandeg

- Python
Published by svlandeg over 4 years ago

thinc - v8.0.10: Bug fix for get_array_ops

🔴 Bug fixes

  • Fix issue #533: Fix get_array_ops for numpy arrays.

👥 Contributors

@adrianeboyd

- Python
Published by adrianeboyd over 4 years ago

thinc - v8.0.9: Support for NVTX ranges and mypy plugin fixes

✨ New features and improvements

  • Add ops registry.
  • Enable config overrides to add new keys.
  • Allow newer releases of nbconvert and nbformat.
  • Layer for marking NVTX ranges.
  • Support mixed-precision training in the PyTorch shim (experimental).

🔴 Bug fixes

  • Fix issue #521: Fix numpy_ops gemm output.
  • Fix issue #525: Fix mypy plugin crash on variadic arguments.

👥 Contributors

@adrianeboyd, @connorbrinton, @danieldk, @honnibal, @ines, @svlandeg

- Python
Published by ines over 4 years ago

thinc - v8.0.8: CategoricalCrossentropy allows negated values

✨ New features and improvements

  • Allow negated values in CategoricalCrossentropy

- Python
Published by svlandeg almost 5 years ago

thinc - v8.0.7: Bug fixes for n-grams and typing

🔴 Bug fixes

  • Fix issue #512: Include final n-gram in NumpyOps.ngrams.
  • Fix issue #516: Update initializers for typing in numpy 1.21+.
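
The n-gram fix in #512 is an off-by-one in the window count: a sequence of length L has L - n + 1 windows of size n. A minimal sketch of that count follows; it returns plain tuples and is not the NumpyOps.ngrams implementation, and the function name is hypothetical.

```python
from typing import List, Sequence, Tuple


def ngrams_sketch(n: int, keys: Sequence[int]) -> List[Tuple[int, ...]]:
    """Return every contiguous window of n keys, including the final one."""
    # max(0, ...) keeps the count non-negative when the input is shorter than n.
    count = max(0, len(keys) - n + 1)
    return [tuple(keys[i : i + n]) for i in range(count)]
```

Using len(keys) - n as the count would drop the last window, which is the kind of error the fix describes.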

- Python
Published by adrianeboyd almost 5 years ago

thinc - v8.0.6: Bug fix for backprop_reduce_max GPU kernel

🔴 Bug fixes

  • Fix backprop_reduce_max GPU kernel.

- Python
Published by adrianeboyd almost 5 years ago

thinc - v8.0.5: Updates for torch v1.9.0

✨ New features and improvements

  • Update to support torch v1.9.0.

- Python
Published by adrianeboyd almost 5 years ago

thinc - v8.0.4: New tuplify and resizable layers, and some bug fixes

✨ New features and improvements

  • Add tuplify layer.
  • More generic implementation of the concatenate layer.
  • Add resizable layer.
  • Introduce force parameter for model.set_dim().
  • Improve UX when setting the GPU allocator.

🔴 Bug fixes

  • Fix issue #492: Fix backpropagation in with_getitem.
  • Fix issue #494: Resolve forward refs issue with Pydantic.
  • Fix issue #496: Avoid Pydantic versions with security vulnerabilities.

👥 Contributors

@adrianeboyd, @honnibal, @ines, @kludex, @polm, @svlandeg, @thomashacker

- Python
Published by svlandeg almost 5 years ago

thinc - v8.0.3: Bug fixes for config overrides and expand_window

🔴 Bug fixes

  • Fix issue #486: Fix expand_window for empty docs on GPU
  • Fix issue #487: Require catalogue>=2.0.3 due to performance regressions related to importlib-metadata
  • Fix issue #488: Fix config override & interpolate interaction

- Python
Published by adrianeboyd about 5 years ago

thinc - v8.0.2: New map_list layer, bug fixes for saving to Pathy paths and more

✨ New features and improvements

  • Add map_list layer (#472)

🔴 Bug fixes

  • Fix issue #465: Fix saving models to Pathy paths
  • Fix issue #466: Avoid initializing with Y if X is set
  • Fix issue #470: Reset torch tensor type in require_cpu
  • Fix issue #484: Ensure consistency of nO dim for BiLSTM

- Python
Published by adrianeboyd about 5 years ago

thinc - v8.0.1: Bug fixes for list2padded and LayerNorm

🔴 Bug fixes

  • Fix issue #464: Fix list2padded op
  • Add nO to LayerNorm

- Python
Published by adrianeboyd about 5 years ago

thinc - v8.0.0: Full rewrite, compose models using any framework such as PyTorch or TensorFlow, built-in type checking, config system and more

🔮 This version of Thinc has been rewritten from the ground up and will be used to power the upcoming spaCy v3.0. The new Thinc v8.0 is a lightweight deep learning library that offers an elegant, type-checked, functional-programming API for composing models, with support for layers defined in other frameworks such as PyTorch, TensorFlow or MXNet. You can use Thinc as an interface layer, a standalone toolkit or a flexible way to develop new models. For more details, see the documentation.

✨ New features and improvements

  • Use any framework: Switch between PyTorch, TensorFlow and MXNet models without changing your application, or even create mutant hybrids using zero-copy array interchange.
  • Type checking: Develop faster and catch bugs sooner with sophisticated type checking. Trying to pass a 1-dimensional array into a model that expects two dimensions? That’s a type error. Your editor can pick it up as the code leaves your fingers.
  • Config system: Configuration is a major pain for ML. Thinc lets you describe trees of objects with references to your own functions, so you can stop passing around blobs of settings. It's simple, clean, and it works for both research and production.
  • Super lightweight: Small and easy to install with very few required dependencies, available on pip and conda for Linux, macOS and Windows. Simple source with a consistent API.
  • Concise functional-programming approach to model definition using composition rather than inheritance.
  • First-class support for variable-length sequences: multiple built-in sequence representations and your layers can use any object.

- Python
Published by ines over 5 years ago

thinc - v7.4.5: Fix numpy compatibility in binary wheels

🔴 Bug fixes

  • Fix numpy compatibility in binary wheel releases.
  • Fix cupy-cuda111 extra requirement.

- Python
Published by adrianeboyd over 5 years ago

thinc - v7.4.4: Update for cupy v8 and update package setup

🔴 Bug fixes

  • Update for compatibility with cupy v8.
  • Remove f-strings from PyTorchWrapper.
  • Remove detailed numpy build constraints from pyproject.toml.
  • Update Cython extension setup.

- Python
Published by adrianeboyd over 5 years ago

thinc - v7.4.3: Fix memory leak in Beam and random seed in ParametricAttention

✨ New features and improvements

  • Add seed argument to ParametricAttention.
  • Dynamically include numpy headers and add numpy build constraints.
  • Update tests to support hypothesis v5.

🔴 Bug fixes

  • Fix memory leak in Beam.

- Python
Published by adrianeboyd over 5 years ago

thinc - v7.4.2: Update compatible cupy versions and for python 3.9

🔴 Bug fixes

  • Restrict compatible cupy versions to <8.0.0.
  • Update setup for python 3.9.

- Python
Published by adrianeboyd over 5 years ago

thinc - v7.4.1: Fix OOV vectors bug

🔴 Bug fixes

  • Use 0-vector for OOV in StaticVectors to fix similarity bug in spaCy
  • Fix murmurhash on platforms where long type was not 64 bit

- Python
Published by honnibal about 6 years ago

thinc - v7.3.1: Relax dependency requirements

🔴 Bug fixes

  • Relax version range of plac to match spaCy.

- Python
Published by ines over 6 years ago

thinc - v7.3.0: Mish activation and experimental optimizers

✨ New features and improvements

  • Add Mish activation. Use via the thinc.v2v.Mish layer, which computes f(X) = mish(W @ X + b). CUDA and Cython kernels are included to make the activation efficient.
  • Add experimental support for RAdam to the optimizer. Enable it by setting the keyword argument use_radam to True. In preliminary testing, it's a small change that's worth enabling.
  • Add experimental support for Lookahead to the optimizer. Enable it by setting the keyword argument lookahead_k to a positive integer. In preliminary testing, it helps if you're not using parameter averaging, but with averaging it's a bit worse.
  • Add experimental support for LARS to the optimizer. Enable it by setting use_lars to True. In preliminary testing, this hasn't worked well at all – possibly our implementation is broken.
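
For reference, Mish is defined as mish(x) = x * tanh(softplus(x)), with softplus(x) = log(1 + exp(x)). The scalar sketch below shows that formula only; it is not the Cython or CUDA kernel, and the overflow threshold of 20 is an assumption.

```python
import math


def softplus(x: float, threshold: float = 20.0) -> float:
    """log(1 + exp(x)), skipping exp for large x to avoid overflow."""
    # Beyond the (assumed) threshold, log(1 + exp(x)) is numerically equal to x.
    if x > threshold:
        return x
    return math.log1p(math.exp(x))


def mish(x: float) -> float:
    """Mish activation for a single value: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))
```

In the layer described above, this function would be applied elementwise to the affine output W @ X + b.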

🙏 Acknowledgements

Big thanks to @digantamisra98 for the Mish activation, especially the extensive experiments and simple gradient calculation. We expect to be using the activation in the next round of spaCy models.

Gratitude to the fast.ai community for their crowd-sourced experiments, and especially to users @LessW2020, @MGrankin and others for their optimizer implementations, which we referenced heavily when implementing the optimizers for Thinc. More importantly, it's super helpful to have a community filtering the deluge of papers for techniques that work on a few different datasets. This thread on optimization research was particularly helpful.

- Python
Published by ines over 6 years ago

thinc - v7.2.0: Simpler GPU install and bug fixes

✨ New features and improvements

  • Ditch thinc_gpu_ops for simpler GPU install.
  • Improve GPU support and PyTorch wrapper.

🔴 Bug fixes

  • Fix issue #47: Fix ExtractWindow nW>=2.
  • Fix issue #51: Ditch thinc_gpu_ops for simpler GPU install.
  • Fix issue #88: Fix Quora URL in datasets.
  • Fix issue #115: Fix compilation on cygwin.

👥 Contributors

Thanks to @rupsaijna and @KoichiYasuoka for the pull requests!

- Python
Published by ines over 6 years ago

thinc - v7.1.1: Support preshed v3.0.0

✨ New features and improvements

  • Allow support for preshed v3.0.0, which includes some bug fixes when items are deleted from the table, and also features Bloom filters.
  • Use collections.abc when possible and avoid deprecation warning.

👥 Contributors

Thanks to @hervenicol for the pull request!

- Python
Published by honnibal over 6 years ago

thinc - v7.1.0: Support other CPUs, read-only arrays

✨ New features and improvements

  • Support read-only numpy arrays, by specifying const in Cython memory-view types. Read-only arrays are helpful for shared-memory multiprocessing, e.g. from Apache Arrow's Plasma object store.

  • Update to cython-blis v0.4, which supports non-x86_64 CPU architectures. For wide (but slow) support, you can specify the environment variable `BLIS_ARCH=generic` before installing.

- Python
Published by honnibal almost 7 years ago

thinc - v7.0.8: Fix version for PyPI

🔴 Bug fixes

  • Fix version number for PyPI.

- Python
Published by ines almost 7 years ago

thinc - v7.0.7: Avoid allocating a negative shape for ngrams

🔴 Bug fixes

  • Avoid allocating a negative shape for ngrams.

👥 Contributors

Thanks to @svlandeg for the pull request!

- Python
Published by ines almost 7 years ago

thinc - v7.0.6: Fix LinearModel regression

🔴 Bug fixes

  • Fix regression in LinearModel class introduced in v7.0.5.

- Python
Published by ines almost 7 years ago

thinc - v7.0.5: Bug fixes for pickle, threading, unflatten and consistency

🔴 Bug fixes

  • Fix issue #98: Fix syntax error in CPickle import.
  • Fix issue #102: Fix bug that could make HashEmbed results inconsistent across runs.
  • Fix issue #104: Fix unflatten padding when last element is empty.
  • Fix issue #97: Pickling error on LinearModel.
  • Fix issue with creating Model instances in child threads with operator overloading.

👥 Contributors

Thanks to @giannisdaras, @simonhkswan, @chssch and @svlandeg for the pull requests and contributions.

- Python
Published by ines almost 7 years ago

thinc - v7.0.4: Don't require thinc_gpu_ops

🔴 Bug fixes

  • Don't require thinc_gpu_ops.

- Python
Published by ines almost 7 years ago

thinc - v7.0.3: Fix pruning in beam search

🔴 Bug fixes

  • Fix incorrect calculation of min_density in thinc.search.Beam class. Previously the beam was pruned based on the raw logit scores, instead of normalized probabilities.
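
The difference between pruning on raw logits and on normalized probabilities can be sketched as follows. The exact pruning rule in the Beam class is not given here, so the rule below (keep candidates whose probability is at least min_density times the best candidate's probability) is an assumption, as is the function name.

```python
import math
from typing import List


def prune_by_density(logits: List[float], min_density: float) -> List[int]:
    """Return indices of candidates that survive density-based pruning.

    Hypothetical sketch; the cutoff rule is an assumption.
    """
    # Softmax with max-subtraction for numerical stability.
    top = max(logits)
    exps = [math.exp(score - top) for score in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Compare normalized probabilities, not raw logit scores.
    cutoff = min_density * max(probs)
    return [i for i, p in enumerate(probs) if p >= cutoff]
```

Raw logits can be negative or arbitrarily shifted, so a ratio-based cutoff only behaves consistently after normalization.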

- Python
Published by honnibal about 7 years ago

thinc - v7.0.2: Fix regression in linear model class

🔴 Bug fixes

  • Fix regression in thinc.linear.LinearModel class.

- Python
Published by ines over 7 years ago

thinc - v7.0.1: Fix import errors

🔴 Bug fixes

  • Fix import errors introduced when dropping dependencies in v7.0.0.

- Python
Published by ines over 7 years ago

thinc - v7.0.0: Overhaul package dependencies

⚠️ Backwards incompatibilities

  • Thinc v7.0 drops support for Python 2.7 on Windows. Python 2.7 remains supported on Linux and OSX. Support could be restored in future. We're currently unable to build our new dependency, blis, for Windows on Python 2.7. If you can assist with this, please let us know.

✨ New features and improvements

  • Use blis for matrix multiplication. Previous versions delegated matrix multiplication to platform-specific libraries via numpy. This led to inconsistent results, especially around multi-threading. We now provide a standalone package, with the Blis linear algebra routines. Importantly, we've built Blis to be single-threaded. This makes it much easier to do efficient inference, as the library will no longer spawn threads underneath you.

  • Use srsly for serialization. We now provide a single package with forks of our preferred serialisation libraries – specifically, msgpack, ujson and cloudpickle. This allows us to provide a single binary wheel for these dependencies, and to maintain better control of our dependency tree, preventing breakages.

  • Update versions of cymem, preshed and murmurhash. Thinc is compiled against our memory pool and hash table libraries, cymem and preshed. Changing these build-time dependencies requires Thinc to be recompiled. This is one reason the major version number needed to be incremented for this release.

- Python
Published by honnibal over 7 years ago

thinc - v6.12.1: Fix messagepack pin

🔴 Bug fixes

  • Fix issue explosion/spaCy#2995: Pin msgpack to version <0.6.0, to avoid the low message-length limit introduced in v0.6.0, which breaks spaCy. We will relax the pin once spaCy is updated to set the max_xx_len argument to msgpack.dumps().

- Python
Published by honnibal over 7 years ago

thinc - v6.12.0: Wheels and separate GPU ops

✨ New features and improvements

  • Update dependencies to be able to provide binary wheels.
  • Move GPU ops to separate package, thinc_gpu_ops.
  • Support pip specifiers for GPU installation, e.g. pip install thinc[cuda92].

🔴 Bug fixes

  • Update murmurhash pin to accept newer version.

- Python
Published by ines over 7 years ago

thinc - v6.10.3: Python 3.7 support and dependency updates

✨ New features and improvements

  • Update cytoolz version pin to make Thinc compatible with Python 3.7.
  • Only install old pathlib backport on Python 2 (see #69).
  • Use msgpack instead of msgpack-python.
  • Drop termcolor dependency.

- Python
Published by ines almost 8 years ago

thinc - v6.11.2: Improve GPU installation

✨ New features and improvements

You can now require GPU capability using the pip "extras" syntax. Thinc also now expects CUDA to be installed at /usr/local/cuda by default. If you've installed it elsewhere, you can specify the location with the CUDA_HOME environment variable. Once Thinc is able to find CUDA, you can tell pip to install Thinc with cupy, as follows:

  • thinc[cuda]: Install cupy from source (compatible with a range of CUDA versions)
  • thinc[cuda80]: Install the cupy-cuda80 wheel
  • thinc[cuda90]: Install the cupy-cuda90 wheel
  • thinc[cuda91]: Install the cupy-cuda91 wheel

If you're installing Thinc from a local wheel file, the syntax for adding an "extras" specifier is a bit unintuitive. The trick is to make the file path into a URL, so you can use an #egg clause, as follows:

```bash
pip install file://path/to/wheel#egg=thinc[cuda]
```

- Python
Published by ines about 8 years ago

thinc - 6.11.1: Support direct linkage to BLAS libraries

✨ New features and improvements

  • Thinc now vendors OpenBLAS's cblas_sgemm function and delegates matrix multiplications to it by default. The provided function is single-threaded, making it easy to call Thinc from multiple processes. The default sgemm function can be overridden using the THINC_BLAS environment variable (see below).
  • thinc.neural.util.get_ops now understands device integers, e.g. 0 for GPU 0, as well as strings like "cpu" and "cupy".
  • Update StaticVectors model, to make use of spaCy v2.0's Vectors class.
  • New .gemm() method on NumpyOps and CupyOps classes, allowing matrix and vector multiplication to be handled with a simple function.

Customizing the matrix multiplication backend

Previous versions of Thinc have relied on numpy for matrix multiplications. When numpy is installed via wheel using pip (the default), numpy will usually be linked against a suboptimal matrix multiplication kernel. This made it difficult to ensure that Thinc was well optimized for the target machine.

To fix this, Thinc now provides its own matrix multiplications, by bundling the source code for OpenBLAS's sgemm kernel within the library. To change the default BLAS library, you can specify an environment variable, giving the location of the shared library you want to link against:

```bash
THINC_BLAS=/opt/openblas/lib/libopenblas.so pip install thinc --no-cache-dir --no-binary thinc
export LD_LIBRARY_PATH=/opt/openblas/lib
# On OSX:
export DYLD_LIBRARY_PATH=/opt/openblas/lib
```

If you want to link against the Intel MKL instead of OpenBLAS, the easiest way is to install Miniconda. For instance, if you installed Miniconda to /opt/miniconda, the command to install Thinc linked against MKL would be:

```bash
THINC_BLAS=/opt/miniconda/numpy-mkl/lib/libmkl_rt.so pip install thinc --no-cache-dir --no-binary thinc
export LD_LIBRARY_PATH=/opt/miniconda/numpy-mkl/lib
# On OSX:
export DYLD_LIBRARY_PATH=/opt/miniconda/numpy-mkl/lib
```

If the library file ends in a .a extension, it is linked statically; if it ends in .so, it is linked dynamically. If you use dynamic linking, make sure the library's directory is on your LD_LIBRARY_PATH at runtime.

🔴 Bug fixes

  • Fix pickle support for FeatureExtracter class.
  • Fix unicode error in Quora dataset loader.
  • Fix batch normalization bugs. Now supports batch "renormalization" correctly.
  • Models now reliably distinguish predict vs. train modes, using the convention drop=None. Previously, layers such as BatchNorm relied on having their predict() method called, which didn't work when they were called by layers that didn't implement a predict() method. We now set drop=None to make this more reliable.
  • Fix bug that caused incorrect data types to be produced by FeatureExtracter.
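
The drop=None convention can be illustrated with a toy layer. This is a sketch under stated assumptions, not Thinc's BatchNorm: `ToyNorm`, `wrapper`, and the momentum value are hypothetical names and settings chosen for the example. The point is that a wrapping layer only needs to pass `drop` through, and does not need a predict() method of its own.

```python
class ToyNorm:
    """Hypothetical mean-centering layer (not Thinc's BatchNorm)."""

    def __init__(self, momentum=0.9):
        self.running_mean = 0.0
        self.momentum = momentum

    def __call__(self, batch, drop=0.0):
        if drop is None:
            # Predict mode: use the stored statistic and leave it unchanged.
            mean = self.running_mean
        else:
            # Train mode: use the batch statistic and update the running one.
            mean = sum(batch) / len(batch)
            self.running_mean = (
                self.momentum * self.running_mean + (1 - self.momentum) * mean
            )
        return [x - mean for x in batch]


def wrapper(layer, batch, drop=0.0):
    """A wrapping layer just forwards `drop`; it needs no predict() method."""
    return layer(batch, drop=drop)


norm = ToyNorm()
train_out = wrapper(norm, [1.0, 3.0], drop=0.2)    # updates running_mean
predict_out = wrapper(norm, [1.0, 3.0], drop=None)  # reads running_mean only
```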

👥 Contributors

Thanks to @dvsrepo, @justindujardin, @alephmelo and @darkdreamingdan for the pull requests and contributions.

- Python
Published by honnibal about 8 years ago

thinc - v6.10.2: Efficiency improvements and bug fixes

✨ New features and improvements

  • Improve GPU utilisation for attention layer.
  • Improve efficiency of Maxout layer on CPU.

🔴 Bug fixes

  • Bug fix to foreach combinator, useful for hierarchical models.
  • Bug fix to batch normalization.

📖 Documentation and examples

  • Update imdb_cnn text classification example.

- Python
Published by ines over 8 years ago

thinc - v6.10.1: Fix GPU install and minor memory leak

🔴 Bug fixes

  • Fix installation with CUDA 9.
  • Fix minor memory leak in beam search.
  • Fix dataset readers.

- Python
Published by ines over 8 years ago

thinc - v6.10.0: CPU efficiency improvements, refactoring

✨ Major features and improvements

  • Provisional CUDA 9 support. CUDA 9 removes a compilation flag we require for CUDA 8. As a temporary workaround, you can build on CUDA 9 by setting the environment variable CUDA9=1. For example:

```bash
CUDA9=1 pip install thinc==6.10.0
```

  • Improve efficiency of NumpyOps.scatter_add, when the indices only have a single dimension. This function was previously a bottleneck for spaCy.
  • Remove redundant copies in backpropagation of maxout non-linearity.
  • Call floating-point versions of sqrt, exp and tanh functions.
  • Remove calls to tensordot, instead reshaping to make 2d dot calls.
  • Improve efficiency of Adam optimizer on CPU.
  • Eliminate redundant code in thinc.optimizers. There's now a single Optimizer class. For backwards compatibility, SGD and Adam functions are used to create optimizers with the vanilla SGD recipe or the Adam recipe.
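
The "single Optimizer class plus recipe functions" design might look roughly like the sketch below. It is an assumption-laden illustration rather than the code in thinc.optimizers: the constructor arguments, the per-key state dictionaries, and the default hyper-parameters are all hypothetical. Setting both decay rates to zero reduces the update to vanilla SGD.

```python
import math


class Optimizer:
    """One update rule; with b1 == b2 == 0 it is plain SGD (hypothetical API)."""

    def __init__(self, learn_rate, b1=0.0, b2=0.0, eps=1e-8):
        self.learn_rate = learn_rate
        self.b1, self.b2, self.eps = b1, b2, eps
        self.mom1, self.mom2, self.steps = {}, {}, {}

    def __call__(self, key, weights, gradient):
        if self.b1 == 0.0 and self.b2 == 0.0:
            # Vanilla SGD: step against the gradient.
            return [w - self.learn_rate * g for w, g in zip(weights, gradient)]
        # Adam: keep per-parameter first and second moment estimates.
        t = self.steps[key] = self.steps.get(key, 0) + 1
        m = self.mom1.setdefault(key, [0.0] * len(weights))
        v = self.mom2.setdefault(key, [0.0] * len(weights))
        updated = []
        for i, (w, g) in enumerate(zip(weights, gradient)):
            m[i] = self.b1 * m[i] + (1 - self.b1) * g
            v[i] = self.b2 * v[i] + (1 - self.b2) * g * g
            m_hat = m[i] / (1 - self.b1 ** t)  # bias correction
            v_hat = v[i] / (1 - self.b2 ** t)
            step = self.learn_rate * m_hat / (math.sqrt(v_hat) + self.eps)
            updated.append(w - step)
        return updated


def SGD(learn_rate):
    """Recipe function: an Optimizer configured as vanilla SGD."""
    return Optimizer(learn_rate)


def Adam(learn_rate, b1=0.9, b2=0.999):
    """Recipe function: an Optimizer configured with the Adam recipe."""
    return Optimizer(learn_rate, b1=b1, b2=b2)
```

Keeping the recipes as thin functions means existing call sites that use SGD(...) or Adam(...) keep working while the update logic lives in one place.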

👥 Contributors

Thanks to @RaananHadar for the pull request!

- Python
Published by honnibal over 8 years ago

thinc - v6.9.0: Reorganize layers, bug fix to Layer Normalization

✨ Major features and improvements

  • Add new namespace modules thinc.v2v, thinc.i2v, thinc.t2t and thinc.t2v, which group layer implementations by input and output type: v indicates vector, i indicates integer ID and t indicates tensor. The input type refers to the logical unit, i.e. what constitutes a sample.

🔴 Bug fixes

  • Fix bug in layer normalization. The bug fix means that models trained with Thinc 6.8 are incompatible with Thinc 6.9. For convenience, a backwards compatibility flag has been added, which can be set with thinc.neural._classes.layernorm.set_compat_six_eight. This flag is off by default.

- Python
Published by honnibal over 8 years ago

thinc - v6.8.2: Fix packaging of gpu_ops

🔴 Bug fixes

  • Fix incorrect packaging of thinc.neural.gpu_ops, introduced in v6.8.1.
  • Fix bad data type in thinc.extra.search.MaxViolation, which caused segfaults on some platforms.

- Python
Published by honnibal over 8 years ago

thinc - v6.8.1: Fix Windows support

✨ Major features and improvements

  • Add new foreach layer combinator, which maps a layer across elements of a sequence.
  • Add support for predict methods to more layers, for use during decoding.
  • Improve correctness of batch normalization. Previously, some layers would force batch normalization to run in training mode, even during prediction. This led to decreased accuracy in some situations.
  • Improved efficiency of Maxout layer.
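
As a rough illustration of what a foreach-style combinator does, the sketch below maps a layer over the items of a list and collects the per-item backprop callbacks. It is not Thinc's implementation: the convention that a layer returns an `(output, backprop)` pair, and the toy `sum_layer`, are assumptions made for the example.

```python
def foreach(layer):
    """Wrap `layer` so it is applied to every item of a list (sketch)."""

    def forward(items):
        outputs, callbacks = [], []
        for item in items:
            output, backprop = layer(item)
            outputs.append(output)
            callbacks.append(backprop)

        def backward(gradients):
            # Route each gradient to the callback of the matching item.
            return [bp(grad) for bp, grad in zip(callbacks, gradients)]

        return outputs, backward

    return forward


def sum_layer(values):
    """Toy layer: sums a list; the gradient is broadcast to each input."""

    def backprop(grad):
        return [grad] * len(values)

    return sum(values), backprop


outputs, backward = foreach(sum_layer)([[1, 2], [3, 4, 5]])
input_grads = backward([1.0, 0.5])
```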

🔴 Bug fixes

  • Fix compiler flags for MSVC.
  • Remove unnecessary Chainer dependency. Now depends on Chainer's cupy package.
  • Fix LSTM layer.
  • Small bug fixes to beam search.

- Python
Published by honnibal over 8 years ago

thinc - v6.8.0: SELU layer, attention, improved GPU/CPU compatibility

✨ Major features and improvements

  • Add SELU layer, from Klambauer et al. (2017).
  • Add parametric soft attention layer, as in Yang et al. (2016).
  • New higher-order function uniqued, which wraps layers giving them a per-batch cache.
  • Improve batch normalization, by tracking activation moving averages.
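
The idea behind a per-batch cache can be sketched as below: the wrapped layer runs once per distinct input in the batch, and the results are copied back to every position. This is only an illustration of the concept, not the uniqued function itself; it assumes hashable inputs and a layer that is a plain function of one item.

```python
def uniqued(layer):
    """Wrap `layer` so repeated inputs in a batch are computed once (sketch)."""

    def forward(batch):
        cache = {}  # rebuilt on every call, so it only lives for one batch
        for key in batch:
            if key not in cache:
                cache[key] = layer(key)
        return [cache[key] for key in batch]

    return forward


calls = []


def expensive(word_id):
    """Toy layer that records how often it is actually run."""
    calls.append(word_id)
    return word_id * 10


result = uniqued(expensive)([1, 2, 1, 1, 2])
```

This pays off when batches contain many repeated items, as batches of word IDs typically do.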

🔴 Bug fixes

  • Fix GPU usage in pooling operations.
  • Add optimized code for extracting ngram features.
  • Improve CPU/GPU compatibility.
  • Improve compatibility of LinearModel class.

👥 Contributors

Thanks to @tammoippen for the pull request!

- Python
Published by ines almost 9 years ago

thinc - v6.7.3: Fix convolution on GPU

🔴 Bug fixes

  • Convolution is now computed the same on CPU and GPU.

- Python
Published by ines almost 9 years ago

thinc - v6.7.2: Bug fixes to serialization

🔴 Bug fixes

  • Make order of dicts stable when serializing model.

- Python
Published by ines almost 9 years ago

thinc - v6.7.1: Improve serialization

✨ Major features and improvements

  • Temporarily revert change to CuPy.
  • Improve efficiency of Adam optimizer.

- Python
Published by ines almost 9 years ago

thinc - v6.7.0: Fixes to serialization, hash embeddings and flatten ops

✨ Major features and improvements

  • Add Model.to_bytes() and Model.from_bytes() methods, to support serialization that's compatible between Python versions.
  • Remove code depending on Chainer, and instead depend explicitly on the new cupy subpackage, for simpler GPU installation.
  • Improve accuracy for HashEmbed table, by using 4 conditionally independent keys.
  • Support padding in flatten and with_flatten ops.
  • Use the same hash function on both CPU and GPU, for model compatibility.
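
The hashing trick with several keys per item can be sketched as follows. This is a hedged illustration, not the HashEmbed implementation: `hashlib.blake2b` stands in for whatever hash function the library uses, and summing the selected rows is an assumption. Each item is hashed with four different seeds, so two items only receive the same vector if they collide on all four rows.

```python
import hashlib


def hash_embed(key, table, n_keys=4):
    """Embed `key` by summing `n_keys` hashed rows of `table` (sketch)."""
    n_rows, width = len(table), len(table[0])
    vector = [0.0] * width
    for seed in range(n_keys):
        # A different seed per lookup gives (approximately) independent rows.
        data = f"{seed}:{key}".encode("utf8")
        digest = hashlib.blake2b(data, digest_size=8).digest()
        row = int.from_bytes(digest, "little") % n_rows
        for i in range(width):
            vector[i] += table[row][i]
    return vector


# A small table can serve an unbounded vocabulary.
table = [[float(r), float(r) / 2] for r in range(16)]
apple = hash_embed("apple", table)
```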

🔴 Bug fixes

  • HashEmbed now returns correct results for arrays of length not divisible by 16.
  • Provide .cu source files in the source distribution.
  • Remove unnecessary allocations from the CPU maxout op.
  • Fix issue #27: Remove Python2-specific code from setup.py.

- Python
Published by honnibal almost 9 years ago

thinc - v6.6.0: Improved GPU usage and examples

✨ Major features and improvements

  • Add GPU kernels for max and mean pool using variable-length sequences.
  • Add thinc.api.FeatureExtracter, for getting features from spaCy Doc objects.

🔴 Bug fixes

  • Improve multi-device handling.
  • thinc.api.add now accepts a variable number of layers.
  • Improve Residual class.

- Python
Published by ines about 9 years ago

thinc - v6.5.1: Improved linear class and Windows fix

✨ Major features and improvements

  • Add hash kernel linear class.

🔴 Bug fixes

  • Fix issue #22: Remove random_bytes method from Ops.
  • Fix termcolor dependency.

👥 Contributors

Thanks to @rolando and @ogrisel for the pull requests!

- Python
Published by ines about 9 years ago

thinc - v6.5.0: Supervised similarity, fancier embedding and improvements to linear model

✨ Major features and improvements

  • Improve GPU support.
  • Add classes for siamese neural network architectures for supervised similarity.
  • Add HashEmbed class, an embedding layer which uses the hashing trick to support a larger vocabulary in a shorter table.
  • Add support for distinct feature columns in the Embed class.

🔴 Bug fixes

  • Fix model averaging for linear model.
  • Fix resume_training() method for linear model.
  • Fix L1 penalty for linear model.

- Python
Published by honnibal about 9 years ago

thinc - v6.3.0: Efficiency improvements, argument checking and error messaging

✨ Major features and improvements

  • NEW: Add thinc.check module to specify argument constraints for functions and methods.
  • NEW: Add thinc.exceptions module with custom exception messaging.
  • Add LSUV initialisation.
  • Add averaged parameters, for reduced hyper-parameter sensitivity.
  • Improve efficiency of maxout, window extraction and dropout.
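
Parameter averaging can be sketched as keeping a running mean of the weights alongside the live values, and using the mean at prediction time. The class below is a hypothetical illustration using a plain arithmetic mean; the averaging scheme Thinc actually uses may weight recent updates differently.

```python
class AveragedWeights:
    """Track live weights plus their running mean over updates (sketch)."""

    def __init__(self, size):
        self.weights = [0.0] * size
        self.average = [0.0] * size
        self.n_updates = 0

    def update(self, new_weights):
        self.weights = list(new_weights)
        self.n_updates += 1
        # Incremental mean: avg += (w - avg) / t
        self.average = [
            avg + (w - avg) / self.n_updates
            for avg, w in zip(self.average, self.weights)
        ]


params = AveragedWeights(size=1)
params.update([1.0])
params.update([3.0])
```

The averaged values change more slowly than the live ones, which is why the result is less sensitive to the exact learning rate and stopping point.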

📋 Tests

  • Reorganise and improve tests.
  • Reach 100% coverage over the entire package.

- Python
Published by honnibal over 9 years ago

thinc - v6.2.0: Improve API and introduce overloaded operators

✨ Major features and improvements

  • NEW: Model now has define_operators() classmethod to overload operators for a given block.
  • Add chain(), clone() and concatenate() functions for use with overloaded operators.
  • Add describe module which provides class decorators for defining new layers.
  • Allow layers to calculate input and output sizes based on training data.

Together, these features allow very concise model definitions:

```python
with Model.define_operators({'**': clone, '>>': chain}):
    model = BatchNorm(ReLu(width)) ** depth >> Softmax()
```
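
One way such block-scoped operator overloading can work is sketched below, using a class-level operator table that a context manager swaps in and out. This is a self-contained toy, not Thinc's Model class: the `name` attribute and the simplified `chain` and `clone` helpers are hypothetical stand-ins that only record how layers were combined.

```python
from contextlib import contextmanager


class Model:
    """Toy model whose operators are only defined inside a `with` block."""

    _operators = {}

    def __init__(self, name):
        self.name = name

    @classmethod
    @contextmanager
    def define_operators(cls, operators):
        previous = cls._operators
        cls._operators = {**previous, **operators}
        try:
            yield
        finally:
            cls._operators = previous  # restore on exit, even after errors

    def _apply(self, symbol, other):
        if symbol not in self._operators:
            raise TypeError(f"Undefined operator: {symbol}")
        return self._operators[symbol](self, other)

    def __rshift__(self, other):
        return self._apply(">>", other)

    def __pow__(self, other):
        return self._apply("**", other)


def chain(first, second):
    """Stand-in for composing two layers; records the composition by name."""
    return Model(f"{first.name}>>{second.name}")


def clone(layer, n):
    """Stand-in for stacking `n` copies of a layer."""
    result = layer
    for _ in range(n - 1):
        result = chain(result, Model(layer.name))
    return result


with Model.define_operators({"**": clone, ">>": chain}):
    model = Model("relu") ** 2 >> Model("softmax")
```

Python's precedence rules apply as usual, so `**` binds more tightly than `>>` and the cloned block is built before it is chained with the output layer.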

⚠️ Backwards incompatibilities

  • Major revisions to previously undocumented neural network APIs (see above).

📋 Tests

  • Reorganise and improve tests for neural network functions.
  • Reach 100% coverage over the current neural network classes.

- Python
Published by honnibal over 9 years ago

thinc - v6.1.3: More neural network functions and training continuation

✨ Major features and improvements

  • NEW: Add several useful higher-order functions, including @layerize and @metalayerize decorators to turn functions into weightless layers.
  • NEW: Add batch normalization layer.
  • NEW: Add residual layer using pre-activation approach.
  • Simplify model setup and initialization.
  • Add ELU layer.
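
A decorator that turns a function into a weightless layer could look roughly like this. It is a sketch of the pattern rather than the @layerize decorator itself: the `FunctionLayer` class, the `begin_update` signature, and the convention that the wrapped function returns an `(output, backprop)` pair are assumptions made for the example.

```python
def layerize(func):
    """Wrap a function returning (output, backprop) as a layer object (sketch)."""

    class FunctionLayer:
        weights = ()  # nothing to learn, nothing to serialize

        def begin_update(self, inputs, drop=0.0):
            return func(inputs)

        def __call__(self, inputs):
            output, _ = func(inputs)
            return output

    layer = FunctionLayer()
    layer.name = func.__name__
    return layer


@layerize
def double(values):
    """Toy transformation with a hand-written gradient."""

    def backprop(gradients):
        return [2 * g for g in gradients]

    return [2 * v for v in values], backprop


output, backprop = double.begin_update([1, 2])
```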

🔴 Bug fixes

  • The AveragedPerceptron class can now continue training after model loading. Previously, the weights were zeroed for each feature as soon as it was updated. This affected spaCy users, especially those adding new classes to the named entity recognizer.

- Python
Published by honnibal over 9 years ago

thinc - v6.0.0: Add thinc.neural for NLP-oriented deep learning

✨ Major features and improvements

  • NEW: Add thinc.neural to develop neural networks for spaCy.
  • Introduce support for Affine, Maxout, ReLu and Softmax vector-to-vector layers.
  • Introduce support for efficient static word embedding layer with projection matrix and per-word-type memoisation.
  • Introduce support for efficient word vector convolution layer, which also supports per-word-type memoisation.
  • Introduce support for MeanPooling, MaxPooling and MinPooling. Add MultiPooling layer for concatenative pooling.
  • Introduce support for annealed dropout training.
  • Introduce support for classical momentum, Adam and Eve optimisers.
  • Introduce support for averaged parameters for each optimiser.
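
The pooling operations and their concatenation can be illustrated in a few lines. The functions below are a plain-Python sketch with hypothetical names, not the MeanPooling, MaxPooling, MinPooling or MultiPooling classes: each pool reduces a variable-length list of equal-width vectors to one vector, and the multi-pool concatenates the results.

```python
def mean_pool(vectors):
    """Average each column over the sequence."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def max_pool(vectors):
    """Take the largest value in each column."""
    return [max(column) for column in zip(*vectors)]


def min_pool(vectors):
    """Take the smallest value in each column."""
    return [min(column) for column in zip(*vectors)]


def multi_pool(vectors, pools=(mean_pool, max_pool, min_pool)):
    """Concatenate the outputs of several pooling functions."""
    pooled = []
    for pool in pools:
        pooled.extend(pool(vectors))
    return pooled


sequence = [[1.0, 4.0], [3.0, 2.0]]  # two word vectors of width 2
summary = multi_pool(sequence)       # width 2 * 3 = 6
```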

⚠️ Backwards incompatibilities

The Example class now holds a pointer to its ExampleC struct, where previously it held the struct value. This introduces a small backwards incompatibility in spaCy.

- Python
Published by honnibal over 9 years ago