Recent Releases of pytorch-widedeep

pytorch-widedeep - mps backend support and more rec models

  • Added support for MPS backend
  • Added a series of models to the rec module: DCN, DCNv2, GDCN, AutoInt, AutoIntPlus
  • Added a DIN preprocessor
  • Reviewed the docs
  • Reviewed the examples
  • Other (minor and not so minor) fixes
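To give a feel for one of the models named in this release, here is a minimal sketch of a single cross layer in the style of the original DCN formulation, x_next = x0 * (x_l · w) + b + x_l. It is plain Python and does not use the library; the function name `cross_layer` and the toy numbers are hypothetical and say nothing about how the rec module implements it.

```python
def cross_layer(x0, xl, w, b):
    """One DCN-style cross layer on plain lists (illustrative only).

    x0: the original input vector
    xl: the output of the previous cross layer (x0 for the first layer)
    w, b: weight and bias vectors with the same length as x0
    """
    # Scalar projection of the current layer's input onto w.
    scalar = sum(xi * wi for xi, wi in zip(xl, w))
    # Explicit feature crossing with x0, plus bias and a residual connection.
    return [x0_i * scalar + b_i + xl_i for x0_i, b_i, xl_i in zip(x0, b, xl)]


x0 = [1.0, 2.0]
out = cross_layer(x0, x0, w=[0.5, 0.25], b=[0.0, 0.1])
```

Stacking several such layers increases the degree of the feature interactions the network can represent.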

Scientific Software - Peer-reviewed - Python
Published by jrzaurin about 1 year ago

pytorch-widedeep - The `rec` module

  • After a number of issues and questions on Slack about recommendation algorithms in the library, I decided to include a rec module that initially contains a small number of recommendation algorithms. These are:

  • Factorisation Machines (FM) and DeepFM

  • Field Aware Factorisation Machines (FFM) and DeepFFM

  • Extreme Deep Factorisation Machines (xDeepFM)

  • Deep Interest Networks (DIN)

We will add more in the near future.

  • In addition, some bugs were fixed (https://github.com/jrzaurin/pytorch-widedeep/issues/232 and https://github.com/jrzaurin/pytorch-widedeep/issues/233)
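As a rough illustration of the Factorisation Machines family listed above, the second-order interaction term can be computed in linear time instead of pair by pair. The sketch below is plain Python and independent of the library; the function names and toy values are hypothetical and do not reflect the rec module's API.

```python
from itertools import combinations


def fm_pairwise_naive(x, v):
    """Sum over i < j of <v_i, v_j> * x_i * x_j, computed pair by pair."""
    total = 0.0
    for i, j in combinations(range(len(x)), 2):
        dot = sum(a * b for a, b in zip(v[i], v[j]))
        total += dot * x[i] * x[j]
    return total


def fm_pairwise_fast(x, v):
    """Same quantity via 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]."""
    n_factors = len(v[0])
    total = 0.0
    for f in range(n_factors):
        s = sum(v[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(len(x)))
        total += s * s - s_sq
    return 0.5 * total


# Toy feature vector and one 2-dimensional latent vector per feature.
x = [1.0, 0.0, 2.0, 1.0]
v = [[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4], [0.5, 0.0]]
```

Both functions return the same value for any input; the second avoids the quadratic loop over feature pairs.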

Published by jrzaurin over 1 year ago

pytorch-widedeep - Multiple tabular components

  1. Added support for multiple tabular models for different columns (in addition to the multiple text and image columns supported in previous versions)
  2. Removed support for FDS and LDS
  3. Carries over the possibility of saving the optimiser, which was added in version 1.6.2 (short-lived and never published)

Published by jrzaurin over 1 year ago

pytorch-widedeep - Patch to limit numpy to version lower than 2.0

This is a quick patch that pins numpy to >=1.21.6, <2.0.0

Otherwise, it is exactly the same as 1.6.0
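The version range above can be checked with a few lines of plain Python. This is only a sketch: the helper names are hypothetical, and the parser assumes purely numeric release segments (a pre-release string such as a release candidate would need a proper version parser).

```python
def parse_version(version):
    """Turn '1.26.4' into (1, 26, 4); assumes purely numeric segments."""
    return tuple(int(part) for part in version.split(".")[:3])


def numpy_version_supported(version, lower="1.21.6", upper="2.0.0"):
    """True when lower <= version < upper, mirroring '>=1.21.6, <2.0.0'."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)
```

For example, `numpy_version_supported("1.26.4")` is true, while `numpy_version_supported("2.0.0")` is false.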

Published by jrzaurin over 1 year ago

pytorch-widedeep - Huggingface integration, multi-text and image column support and multi target loss functions

What's Changed

  • Huggingface integration by @jrzaurin in https://github.com/jrzaurin/pytorch-widedeep/pull/209
  • Multi text and image column support by @jrzaurin in https://github.com/jrzaurin/pytorch-widedeep/pull/215
  • Support for multi target loss functions by @jrzaurin in https://github.com/jrzaurin/pytorch-widedeep/pull/215
  • The README has been almost completely rewritten, with drawings of 7 possible architectures (where the boxes/components can be any of the models in the library) and fully runnable examples with a toy dataset that anyone can use as a starting point.

Full Changelog: https://github.com/jrzaurin/pytorch-widedeep/compare/v1.5.1...v1.6.0

Published by jrzaurin over 1 year ago

pytorch-widedeep - Model Attributes named correctly

Mostly fixed issue #204

Published by jrzaurin over 1 year ago

pytorch-widedeep - Embedding Methods for Numerical Features

Added two new embedding methods for numerical features, described in "On Embeddings for Numerical Features in Tabular Deep Learning", and adjusted all models and functionalities accordingly
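The release note does not name the two methods. One technique discussed in the cited paper is piecewise linear encoding, so the sketch below illustrates that idea in plain Python, on the assumption that it is representative of what was added. The function name and bin edges are hypothetical, and this simplified version clips every component to [0, 1].

```python
def piecewise_linear_encode(x, edges):
    """Encode a scalar as one value per bin (simplified, clipped variant).

    Bins fully below x encode as 1.0, bins fully above x as 0.0, and the
    bin containing x encodes how far into that bin x falls.
    """
    encoded = []
    for left, right in zip(edges[:-1], edges[1:]):
        fraction = (x - left) / (right - left)
        encoded.append(min(1.0, max(0.0, fraction)))
    return encoded


# Three bins: [0, 10), [10, 20), [20, 40]
encoded = piecewise_linear_encode(15.0, edges=[0.0, 10.0, 20.0, 40.0])
```

Here the value 15 fills the first bin completely, the second bin halfway, and the third not at all.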

Published by jrzaurin almost 2 years ago

pytorch-widedeep - The `load_from_folder` module

This release mainly adds support for large datasets via the load_from_folder module.

This module is inspired by the ImageFolder class in the torchvision library but adapted to the needs of our library. See the docs for details.

Published by jrzaurin about 2 years ago

pytorch-widedeep - Flash and Linear Attention mechanisms added to the TabTransformer

  1. Added Flash Attention
  2. Added Linear Attention
  3. Revisited and polished the docs

Published by jrzaurin over 2 years ago

pytorch-widedeep - pytorch-widedeep in the context of recsys

  1. Added example scripts and notebooks on how to use the library in the context of recommendation systems using this notebook as example. This is a response to issue #133
  2. Used the opportunity to add the movielens 100k dataset to the library, so that now it can be imported from the datasets module
  3. Added a simple (not pre-trained) transformer model to the text component
  4. Added citation file
  5. Fixed a bug regarding the padding index not being 1 when using the fastai transforms

Published by jrzaurin over 2 years ago

pytorch-widedeep - Feature Importance via attention weights

  • Added new functionality to access feature importance via attention weights for all DL models for tabular data except the TabPerceiver. This functionality is accessed via the feature_importance attribute in the trainer (computed during training with a sample of observations) and at predict time via the explain method.
  • Fixed all restore-weights capabilities in all forms of training. Such capabilities are present in two callbacks, the EarlyStopping and the ModelCheckpoint callbacks. Prior to this release there was a bug and the weights were not restored.
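The restore-weights behaviour described above can be sketched generically as follows. This is not the library's EarlyStopping callback: the class, its parameters and the dictionary standing in for model weights are all hypothetical, and only illustrate the idea of keeping a copy of the best weights and handing it back when training stops.

```python
import copy


class EarlyStoppingSketch:
    """Tracks the best validation loss and a copy of the matching weights."""

    def __init__(self, patience=2, restore_best_weights=True):
        self.patience = patience
        self.restore_best_weights = restore_best_weights
        self.best_loss = float("inf")
        self.best_weights = None
        self.wait = 0

    def step(self, val_loss, weights):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            # Deep copy so later updates do not overwrite the snapshot.
            self.best_weights = copy.deepcopy(weights)
            self.wait = 0
            return False
        self.wait += 1
        return self.wait >= self.patience

    def final_weights(self, current_weights):
        """Return the best snapshot if restoring is enabled."""
        if self.restore_best_weights and self.best_weights is not None:
            return self.best_weights
        return current_weights


def run(losses, patience=2):
    """Toy training loop: the 'weights' simply record the epoch index."""
    stopper = EarlyStoppingSketch(patience=patience)
    weights = {"w": 0}
    for epoch, loss in enumerate(losses):
        weights["w"] = epoch
        if stopper.step(loss, weights):
            break
    return stopper.final_weights(weights)
```

With the losses `[1.0, 0.5, 0.6, 0.7, 0.4]` and a patience of 2, training stops after the fourth epoch and the weights from the second epoch (the best one seen) are returned.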

Published by jrzaurin over 2 years ago

pytorch-widedeep - pytorch-widedeep: A flexible package for multimodal-deep-learning

Published by jrzaurin over 2 years ago

pytorch-widedeep - HuggingFace model example and fixed bug related to the option of adding a FC head

  1. Fixed a bug related to the option of adding a FC head on top of the "backbone" models
  2. Added a notebook to illustrate how one could use a Huggingface model along with any other model in the library

Published by jrzaurin almost 3 years ago

pytorch-widedeep - Fixed the implementation of the Additive Attention

Simple minor release fixing the implementation of the additive attention (see #110)

Published by jrzaurin about 3 years ago

pytorch-widedeep - Self-Supervised Pre-Training for Tabular models

There are a number of changes and new features in this release, here is a summary:

  1. Refactored the code related to the 3 forms of training in the library:

    • Supervised Training (via the Trainer class)
    • Self-Supervised Pre-Training: we have implemented two methods or routines for self-supervised pre-training. These are:
      • Encoder-Decoder Pre-Training (via the EncoderDecoderTrainer class): this is inspired by the TabNet paper
      • Contrastive-Denoising Pre-Training (via the ContrastiveDenoisingTrainer class): this is inspired by the SAINT paper
    • Bayesian or Probabilistic Training (via the BayesianTrainer class): this is inspired by the paper Weight Uncertainty in Neural Networks

    Just as a reminder, the current deep learning models for tabular data available in the library are:

    • Wide
    • TabMlp
    • TabResNet
    • TabNet
    • TabTransformer
    • FTTransformer
    • SAINT
    • TabFastformer
    • TabPerceiver
    • BayesianWide
    • BayesianTabMlp

  2. The text-related component now has 3 available models, all based on RNNs. There are reasons for that, although integration with the Huggingface Transformers library is the next step in the development of the library. The 3 models available are:

    • BasicRNN
    • AttentiveRNN
    • StackedAttentiveRNN

    The last two are based on Hierarchical Attention Networks for Document Classification. See the docs for details

  3. The image related component is now fully integrated with the latest torchvision release, with a new Multi-Weight Support API. Currently, the model variants supported by our library are:

    • resnet
    • shufflenet
    • resnext
    • wide_resnet
    • regnet
    • densenet
    • mobilenet
    • mnasnet
    • efficientnet
    • squeezenet

Published by jrzaurin over 3 years ago

pytorch-widedeep - Move docs to mkdocs

Simply updated all the documentation

Published by jrzaurin over 3 years ago

pytorch-widedeep - Probabilistic Models and Label/Feature Distribution smoothing

This release fixes some minor bugs but mainly brings a couple of new functionalities:

  1. New experimental Attentive models, namely: ContextAttentionMLP and SelfAttentionMLP.
  2. 2 Probabilistic models based on Bayes by Backprop (BBP) as described in Weight Uncertainty in Neural Networks, namely: BayesianTabMlp and BayesianWide.
  3. Label and Feature Distribution Smoothing (LDS and FDS) for Deep Imbalanced Regression (DIR) as described in Delving into Deep Imbalanced Regression
  4. Better integration with torchvision for the deepimage component of a WideDeep model
  5. 3 Available models for the deeptext component of a WideDeep model. Namely: BasicRNN, AttentiveRNN and StackedAttentiveRNN
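To make item 3 above more concrete, here is a minimal plain-Python sketch of the idea behind Label Distribution Smoothing: bin the continuous labels, smooth the bin counts with a kernel, and weight each sample by the inverse of the smoothed density of its bin. The function names, bin settings and Gaussian kernel are assumptions made for illustration and do not mirror the library's implementation.

```python
import math


def gaussian_kernel(size=5, sigma=1.0):
    """Symmetric, normalised 1-D Gaussian kernel of odd length."""
    half = size // 2
    weights = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def lds_weights(labels, n_bins, low, high, kernel):
    """Per-sample weights: inverse of the kernel-smoothed label density."""
    width = (high - low) / n_bins
    bin_of = [min(int((y - low) / width), n_bins - 1) for y in labels]

    # Empirical label histogram.
    counts = [0] * n_bins
    for b in bin_of:
        counts[b] += 1

    # Smooth the histogram with the kernel (zero padding at the edges).
    half = len(kernel) // 2
    smoothed = []
    for b in range(n_bins):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = b + k - half
            if 0 <= idx < n_bins:
                acc += w * counts[idx]
        smoothed.append(acc)

    return [1.0 / smoothed[b] for b in bin_of]


# Eight samples in a frequent label region, two in a rare one.
labels = [0.15] * 8 + [0.95] * 2
weights = lds_weights(labels, n_bins=10, low=0.0, high=1.0, kernel=gaussian_kernel())
```

Samples from the rare label region receive larger weights than those from the frequent one, which is what lets the loss pay more attention to under-represented targets.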

Published by jrzaurin almost 4 years ago

pytorch-widedeep - Transformers without categorical data

This minor release simply fixes issue #53, related to the fact that SAINT, the FT-Transformer and the TabFastFormer failed when the input data had no categorical columns

Published by jrzaurin about 4 years ago

pytorch-widedeep - v1.0.9: The TabFormer Family Grows

Functionalities:

  • Added a new functionality called Tab2Vec that, given a trained model and a fitted Tabular Preprocessor, returns an input dataframe transformed into embeddings

TabFormers: Increased the TabFormer (Transformers for Tabular Data) family

  • Added a proper implementation of the FT-Transformer with Linear Attention (as introduced in the Linformer paper)
  • Added a TabFastFormer model, an adaptation of the FastFormer for Tabular Data
  • Added a TabPerceiver model, an adaptation of the Perceiver for Tabular Data

Docs

  • Refined the docs to make them clearer and fixed a few typos

Published by jrzaurin over 4 years ago

pytorch-widedeep - SAINT and the FT-Transformer

The two main additions to the library are:

  • SAINT
  • The FT-Transformer

In addition:

  • New DataLoader for imbalanced datasets. See here.
  • Integration with torchmetrics.
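One common way to handle an imbalanced dataset at load time is to sample observations with probability inversely proportional to their class frequency. The sketch below shows that generic idea in plain Python; it is an assumption about the kind of technique involved, not a description of the library's DataLoader, and all names are hypothetical.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each observation by 1 / (number of observations in its class)."""
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def sample_balanced_indices(labels, n_samples, seed=0):
    """Draw indices with replacement so classes are roughly equally likely."""
    rng = random.Random(seed)
    weights = inverse_frequency_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=n_samples)


# 90 majority-class observations and 10 minority-class observations.
labels = [0] * 90 + [1] * 10
indices = sample_balanced_indices(labels, n_samples=2000)
minority_share = sum(labels[i] for i in indices) / len(indices)
```

Although only 10% of the observations belong to the minority class, roughly half of the sampled indices point to it.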

Published by jrzaurin over 4 years ago

pytorch-widedeep - Tabnet and v1 ready

This release represents a major step forward for the library in terms of functionalities and flexibility:

  1. Ported TabNet from the fantastic implementation of the guys at dreamquark-ai.
  2. Callbacks are now more flexible and save more information.
  3. The save method in the Trainer is more flexible and transparent.
  4. The library has been extensively tested via experiments against LightGBM (see here)

Published by jrzaurin over 4 years ago

pytorch-widedeep - v0.4.8: WideDeep with the TabTransformer

This release represents an almost-complete refactor of the previous version, and I consider the code in this version well tested and production-ready. The main reason why this release is not v1 is that I want to use it with a few more datasets, but at the same time I want the version to be public to see if others use it. Also, I want the changes between the last beta and v1 to be not too significant.

This version is not backwards compatible (at all).

These are some of the structural changes:

  • Building the model and training the model are now completely decoupled
  • Added the TabTransformer as a potential deeptabular component
  • Renamed many of the parameters so that they are consistent between models
  • Added the possibility of customising almost every single component: model component, losses, metrics and callbacks
  • Added R2 metrics for regression problems
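For reference, the R2 (coefficient of determination) metric mentioned in the last item can be written in a few lines. This is a plain-Python sketch of the standard formula, not the library's metric class; the function name is hypothetical and the sketch assumes the targets are not all identical.

```python
def r2_score(y_true, y_pred):
    """R2 = 1 - SS_res / SS_tot (assumes y_true is not constant)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

A perfect prediction gives 1.0, and always predicting the mean of the targets gives 0.0.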

Published by jrzaurin almost 5 years ago

pytorch-widedeep - v0.4.7: individual components can run independently and image treatment replicates that of Pytorch

The treatment of the image datasets in WideDeepDataset replicates that of Pytorch. In particular this source code:

if isinstance(pic, np.ndarray):
    # handle numpy array
    if pic.ndim == 2:
        pic = pic[:, :, None]

In addition, I have added the possibility of using each of the model components in isolation and independently. That is, one can now use the wide, deepdense (either DeepDense or DeepDenseResnet), deeptext and deepimage components independently.

Published by jrzaurin about 5 years ago

pytorch-widedeep - v0.4.6: Added `DeepDenseResnet` and increased code coverage

As suggested in issue #26, I have added the possibility of the deepdense component, which receives the embeddings from categorical columns and the continuous columns, being a series of Dense ResNet blocks. This is all available via the class DeepDenseResnet and is used exactly as before:

```python
deepdense = DeepDenseResnet(...)

model = WideDeep(wide=wide, deepdense=deepdense)
```

In addition, code coverage has increased to 91%

Published by jrzaurin over 5 years ago

pytorch-widedeep - v0.4.5: Faster, memory efficient Wide component

Version 0.4.5 includes a new implementation of the Wide Linear component via an Embedding layer. Previous versions implemented this component using a Linear layer that received one-hot encoded features. For large datasets, this was slow and not memory efficient (see #18). Therefore, we decided to replace that implementation with an Embedding layer that receives label-encoded features. Note that although the two implementations are equivalent, the latter is indeed faster and, moreover, significantly more memory efficient.
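The equivalence between the two implementations can be seen in a small plain-Python sketch: multiplying a one-hot row by a weight vector gives the same result as looking up and summing the weights at the active indices. The function names and numbers below are hypothetical and only illustrate the arithmetic, not the library's Wide class.

```python
def to_one_hot(active_indices, vocab_size):
    """Dense 0/1 row with ones at the label-encoded positions."""
    row = [0.0] * vocab_size
    for i in active_indices:
        row[i] = 1.0
    return row


def wide_one_hot(one_hot_row, weights, bias):
    """Linear layer on one-hot input: touches every entry of the row."""
    return sum(x * w for x, w in zip(one_hot_row, weights)) + bias


def wide_embedding(active_indices, weights, bias):
    """Embedding-style lookup: touches only the active indices."""
    return sum(weights[i] for i in active_indices) + bias


weights = [0.5, -1.0, 2.0, 0.25, 1.5]
bias = 0.1
active = [1, 4]  # label-encoded values of two categorical features

dense_out = wide_one_hot(to_one_hot(active, len(weights)), weights, bias)
lookup_out = wide_embedding(active, weights, bias)
```

The lookup version never materialises the one-hot row, which is where the memory and speed savings come from when the number of distinct feature values is large.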

Also note that the printed loss in the case of Regression is no longer RMSE but MSE. This is done for consistency with the metrics saved in the History callback.
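As a quick reminder of how the two quantities relate, RMSE is simply the square root of MSE, so the printed values differ but rank models identically. The helper names below are hypothetical.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))
```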

NOTE: this does not change a thing in terms of how one would use the package. pytorch-widedeep can be used in the exact same way as previous versions. However, since the model components have changed, models generated with previous versions are not compatible with this version.

Published by jrzaurin over 5 years ago

pytorch-widedeep - v0.4.2: Added more metrics

Added Precision, Recall, FBetaScore and Fscore.

Metrics available are: Accuracy, Precision, Recall, FBetaScore and Fscore
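For readers less familiar with these metrics, the sketch below computes precision, recall and the F-beta score for binary labels in plain Python. It illustrates the standard definitions only; the function name is hypothetical and it is not the library's metric implementation.

```python
def precision_recall_fbeta(y_true, y_pred, beta=1.0):
    """Binary precision, recall and F-beta from 0/1 labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0

    # beta > 1 favours recall, beta < 1 favours precision, beta = 1 is the F1 score.
    denom = beta ** 2 * precision + recall
    fbeta = (1 + beta ** 2) * precision * recall / denom if denom else 0.0
    return precision, recall, fbeta


precision, recall, f1 = precision_recall_fbeta([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
```

In this toy case there are two true positives, one false positive and one false negative, so all three values equal 2/3.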

Published by jrzaurin over 5 years ago

pytorch-widedeep - v0.4.1. Added Docs

Added Documentation. Improved code quality and fixed a bug related to the Focal Loss

Published by jrzaurin over 5 years ago

pytorch-widedeep -

Published by jrzaurin almost 6 years ago