rapidae

Explore, compare and develop autoencoder models with a back-end agnostic framework

https://github.com/nahuelcostacortez/rapidae

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org, sciencedirect.com, science.org, ieee.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.0%) to scientific vocabulary

Keywords

autoencoder benchmarking jax keras3 pytorch recurrent-autoencoder reproducibility tensorflow vae variational-autoencoder
Last synced: 6 months ago

Repository


Basic Info
Statistics
  • Stars: 16
  • Watchers: 2
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 2 years ago · Last pushed over 1 year ago
Metadata Files
Readme License Citation

README.md

Rapidae: Python Library for Rapid Creation and Experimentation of Autoencoders


🔗 Documentation | 🔗 PyPI Package

Description 📕

Rapidae is a Python library specialized in simplifying the creation and experimentation of autoencoder models. With a focus on ease of use, this library allows users to explore and develop autoencoder models in an efficient and straightforward manner.

I decided to develop this library to optimize my research workflow and provide a comprehensive resource for educators and learners exploring autoencoders.

As a researcher, I often found myself spending time on repetitive tasks, such as creating project structures or replicating baseline models. (I've lost count of how many times I've gone through the Keras VAE tutorial just to copy the model as a baseline for other experiments.)

As an educator, despite recognizing numerous fantastic online resources, I felt the need for a place where the features I consider important for teaching these models are consolidated: explanation, implementation, and versatility across different backends. The latter is particularly crucial, considering that PyTorch practitioners may find it tedious to switch to TensorFlow, and vice versa. Because it builds on the recently released Keras 3, Rapidae lets users focus on model creation rather than backend specifics.

In summary, this library is designed to be simple enough for educational purposes, yet robust for researchers to concentrate on developing their models and conducting benchmark experiments in a unified environment.

[!NOTE] Shout out to Pythae, which provides an excellent library for experimenting with VAEs. If you're looking for a quick way to implement autoencoders for image applications, Pythae is probably your best option. Rapidae differs from Pythae in the following ways:

  • It is built on Keras 3, allowing you to experiment with and provide your implementations in PyTorch, TensorFlow, or JAX.
  • The image models implemented in Rapidae are primarily designed for educational purposes.
  • Rapidae is intended to serve as a benchmarking library for models implemented in the sequential/time-series domain, as these are widely dispersed across various fields.

🚨Call for contributions🚨

If you want to add your model to the package or collaborate on its development, feel free to shoot me a message at costanahuel@uniovi.es or just open an issue or a pull request. I'll be happy to collaborate with you.


Main features

  • Ease of Use: Rapidae has been designed to make the process of creating and experimenting with autoencoders as simple as possible. Users can create and train autoencoder models with just a few lines of code.

  • Backend versatility: Rapidae relies on Keras 3, which is backend agnostic, so you can switch freely between TensorFlow, PyTorch, and JAX.

  • Customization: Easily customize model architecture, loss functions, and training parameters to suit your specific use case.

  • Experimentation: Conduct experiments with different hyperparameters and configurations to optimize the performance of your models.

Overview

Rapidae is structured as follows:

  • data: This module contains everything related to the acquisition and preprocessing of datasets.

  • models: This is the core module of the library. It includes the base architectures on which new ones can be created, several predefined architectures and a list of predefined default encoders and decoders.

  • pipelines: Pipelines are designed to perform a specific task or set of tasks such as data preprocessing or model training.

  • evaluate: Its main functionality is the evaluation of model performance. It also includes a utils tool for various tasks: latent space visualization, reconstructions, evaluation, etc.

Installation

The library has been tested with Python versions >=3.10 and <3.12, so we recommend first creating a virtual environment with a suitable Python version. Here's an example with conda:

conda create -n rapidae python=3.10

Then, just activate the environment with conda activate rapidae and install the library.

[!NOTE] If you are using Google Colab, you are good to go (i.e. you do not need to create an environment). The library is fully compatible with Colab's default environment.
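
If you are unsure whether an existing interpreter falls inside the tested range, a short check like the one below can help. This is only a minimal sketch: the helper name `python_supported` is hypothetical and is not part of Rapidae, and the bounds are simply the ones stated above.

```python
import sys

# Version range reported as tested in this README: >=3.10, <3.12
MIN_VERSION = (3, 10)
MAX_VERSION_EXCLUSIVE = (3, 12)


def python_supported(version_info=None):
    """Return True if (major, minor) falls inside the tested range.

    Hypothetical helper for illustration; not part of the Rapidae API.
    """
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION_EXCLUSIVE


if __name__ == "__main__":
    current = sys.version.split()[0]
    print(f"Python {current} in tested range: {python_supported()}")
```

Running the script inside the freshly created conda environment should report that the interpreter is in range.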

With Pip

To install the latest stable release of this library run the following:

pip install rapidae

Note that you will also need to install a backend framework (TensorFlow, PyTorch, or JAX), following its official installation guidelines.

[!IMPORTANT] If you install TensorFlow, you should reinstall Keras 3 afterwards via pip install --upgrade keras. This is a temporary step while TensorFlow is pinned to Keras 2, and will no longer be necessary after TensorFlow 2.16. The cause is that tensorflow==2.15 will overwrite your Keras installation with keras==2.15.
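
To confirm that the TensorFlow installation did not leave Keras 2 in place, you can inspect the installed package version. The sketch below uses only the standard library; the function names `installed_keras_version` and `is_keras3` are hypothetical helpers, not part of Rapidae or Keras.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_keras_version():
    """Return the installed Keras version string, or None if Keras is absent."""
    try:
        return version("keras")
    except PackageNotFoundError:
        return None


def is_keras3(version_string):
    """Return True if the major version number is 3 or higher."""
    major = int(version_string.split(".")[0])
    return major >= 3


if __name__ == "__main__":
    found = installed_keras_version()
    if found is None:
        print("Keras is not installed")
    elif is_keras3(found):
        print(f"Keras {found} is installed")
    else:
        # Matches the reinstall step described in the note above
        print(f"Keras {found} found; run: pip install --upgrade keras")
```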

From source code

You can also clone the repo to have full access to all the code. Some features may not yet be available in the published stable version, so this is the best way to stay up to date with the latest changes.

git clone https://github.com/NahuelCostaCortez/rapidae
cd rapidae

Then you only have to install the requirements:

pip install -r requirements.txt

Available Models

Below is the list of the models currently implemented in the library.

| Models | Training example | Paper | Official Implementation |
|:---:|:---:|:---:|:---:|
| Autoencoder (AE) | Open In Colab | link | |
| Beta Variational Autoencoder (BetaVAE) | Open In Colab | link | |
| Contractive Autoencoder | Open In Colab | link | |
| Denoising Autoencoder | Open In Colab | link | link |
| Hierarchical Variational Autoencoder (HVAE) | SOON | link | link |
| ICFormer | SOON | link | link |
| interval-valued Variational Autoencoder (iVAE) | IN PROGRESS | | |
| Recurrent Variational AutoEncoder (RVAE) | Open In Colab | link | link |
| Recurrent Variational Encoder (RVE) | Open In Colab | link | link |
| Sparse Autoencoder | Open In Colab | link | |
| Time VAE | Open In Colab | | link |
| Variational Autoencoder (VAE) | Open In Colab | link | link |
| Vector Quantised-Variational AutoEncoder (VQ-VAE) | Open In Colab | link | link |

Usage

Here is a simple tutorial covering the most relevant aspects of the library. In addition, the examples folder contains a series of notebooks for each model and for particular use cases.

You can also use a web interface made with Streamlit where you can load datasets, configure models and hyperparameters, train, and evaluate the results. Check the web interface notebook.

Custom models and loss functions

You can provide your own autoencoder architecture. Here's an example for defining a custom encoder and a custom decoder:

```
from rapidae.models.base import BaseEncoder, BaseDecoder
from keras.layers import Dense


class CustomEncoder(BaseEncoder):
    def __init__(self, input_dim, latent_dim, **kwargs):
        # You can add more arguments, but at least these are required
        BaseEncoder.__init__(self, input_dim=input_dim, latent_dim=latent_dim)

        self.layer_1 = Dense(300)
        self.layer_2 = Dense(150)
        self.layer_3 = Dense(self.latent_dim)

    def call(self, x):
        x = self.layer_1(x)
        x = self.layer_2(x)
        x = self.layer_3(x)
        return x
```

```
class CustomDecoder(BaseDecoder):
    def __init__(self, input_dim, latent_dim, **kwargs):
        # You can add more arguments, but at least these are required
        BaseDecoder.__init__(self, input_dim=input_dim, latent_dim=latent_dim)

        self.layer_1 = Dense(self.latent_dim)
        self.layer_2 = Dense(self.input_dim)

    def call(self, x):
        x = self.layer_1(x)
        x = self.layer_2(x)
        return x
```

You can also provide a custom model. This is especially useful if you want to implement your own loss function.

```
from rapidae.models.base import BaseAE
from keras.ops import mean
from keras.losses import mean_squared_error


class CustomModel(BaseAE):
    def __init__(self, input_dim, latent_dim, encoder, decoder):
        # If you are adding your model to the source code, there is no need to
        # specify the encoder and decoder: just place them in the same directory
        # as the model and the BaseAE constructor will initialize them
        BaseAE.__init__(
            self,
            input_dim=input_dim,
            latent_dim=latent_dim,
            encoder=encoder,
            decoder=decoder,
        )

    def call(self, x):
        # IMPLEMENT FORWARD PASS
        x = self.encoder(x)
        x = self.decoder(x)

        return x

    def compute_loss(self, x=None, y=None, y_pred=None, sample_weight=None):
        '''
        Computes the loss of the model.
        x: input data
        y: target data
        y_pred: predicted data (output of call)
        sample_weight: Optional array of the same length as x, containing weights to apply to the model's loss for each sample
        '''
        # IMPLEMENT LOSS FUNCTION
        loss = mean(mean_squared_error(x, y_pred))

        return loss
```
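
To make the loss expression concrete, the arithmetic behind `mean(mean_squared_error(x, y_pred))` can be reproduced in plain Python. This is an illustrative sketch, assuming 2-D inputs of shape (batch, features): Keras' `mean_squared_error` averages the squared differences over the last axis, and the outer `mean` then averages over the batch. The function names below are hypothetical stand-ins, not Rapidae or Keras functions.

```python
def mean_squared_error_per_sample(x, y_pred):
    """Average squared difference over the feature axis, one value per sample."""
    return [
        sum((a - b) ** 2 for a, b in zip(row_x, row_y)) / len(row_x)
        for row_x, row_y in zip(x, y_pred)
    ]


def reconstruction_loss(x, y_pred):
    """Average the per-sample errors over the batch to obtain a scalar loss."""
    per_sample = mean_squared_error_per_sample(x, y_pred)
    return sum(per_sample) / len(per_sample)


# Two samples with two features each
x = [[1.0, 2.0], [3.0, 4.0]]
y_pred = [[1.0, 4.0], [2.0, 4.0]]

print(mean_squared_error_per_sample(x, y_pred))  # [2.0, 0.5]
print(reconstruction_loss(x, y_pred))  # 1.25
```

The first sample contributes (0² + 2²) / 2 = 2.0 and the second (1² + 0²) / 2 = 0.5, so the scalar loss is their mean, 1.25.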

Switching backends

Since Rapidae uses Keras 3, you can easily switch among TensorFlow, PyTorch and JAX (TensorFlow is selected by default).

You can export the environment variable KERAS_BACKEND or you can edit your local config file at ~/.keras/keras.json to configure your backend. Available backend options are: "jax", "tensorflow", "torch". Example:

export KERAS_BACKEND="torch"

In a notebook, you can do:

import os
os.environ["KERAS_BACKEND"] = "torch"
import keras
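
For the config-file route, the `backend` key in `~/.keras/keras.json` can also be updated programmatically. The following is a minimal sketch, assuming the file is plain JSON with a top-level `"backend"` entry; the helper name `set_backend_in_config` is hypothetical and not part of Keras or Rapidae.

```python
import json
from pathlib import Path

# Backend names listed in this README
VALID_BACKENDS = ("jax", "tensorflow", "torch")


def set_backend_in_config(backend, config_path=None):
    """Write `backend` into a Keras JSON config file, keeping other keys intact.

    Hypothetical helper for illustration. Defaults to ~/.keras/keras.json.
    """
    if backend not in VALID_BACKENDS:
        raise ValueError(f"Unknown backend {backend!r}; expected one of {VALID_BACKENDS}")

    if config_path is None:
        config_path = Path.home() / ".keras" / "keras.json"
    config_path = Path(config_path)

    # Start from the existing settings, if the file is already there
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text())

    config["backend"] = backend
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=4))
    return config
```

Calling `set_backend_in_config("torch")` would update the default config file. As with the environment variable, the change has to happen before `keras` is imported, since the backend is chosen at import time.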

Experiment tracking with wandb

If you want to add experiment tracking to Rapidae models, you can create a Wandb callback and pass it to the TrainingPipeline as follows (this also applies to other experiment tracking frameworks):

```
wandb_cb = WandbCallback()

wandb_cb.setup(
    training_config=your_training_config,
    model_config=your_model_config,
    project_name="your_wandb_project",
    entity_name="your_wandb_entity",
)

pipeline = TrainingPipeline(name="your_pipeline_name", model=model, callbacks=[wandb_cb])
```

Documentation

Check out the full documentation for detailed information on installation, usage, examples and recipes: 🔗 Documentation Link

All documentation source and configuration files are located inside the docs directory.

Dealing with issues

If you experience any issues while running the code, or want to request new features or models, please open an issue on GitHub.

Citation

If you find this work useful or incorporate it into your research, please consider citing it 🙏🏻.

@software{Costa_Rapidae,
  author = {Costa, Nahuel},
  license = {Apache-2.0},
  title = {{Rapidae}},
  url = {https://github.com/NahuelCostaCortez/rapidae}
}

Owner

  • Name: Nahuel Costa Cortez
  • Login: NahuelCostaCortez
  • Kind: user
  • Location: Gijón, Asturias, España
  • Company: University of Oviedo

PhD candidate in Artificial Intelligence.

Citation (CITATION.cff)

cff-version: 1.2.0
title: Rapidae
message: 'If you use this software, please cite it as below.'
type: software
authors:
  - given-names: Nahuel
    family-names: Costa
    email: costanahuel@uniovi.es
    orcid: 'https://orcid.org/0000-0002-9189-2192'
repository-code: 'https://github.com/NahuelCostaCortez/rapidae'
url: 'https://rapidae.readthedocs.io/en/latest/'
abstract: >-
  Explore, compare and develop autoencoder models with a
  back-end agnostic framework
license: Apache-2.0

GitHub Events

Total
  • Watch event: 9
Last Year
  • Watch event: 9

Committers

Last synced: about 2 years ago

All Time
  • Total Commits: 108
  • Total Committers: 3
  • Avg Commits per committer: 36.0
  • Development Distribution Score (DDS): 0.065
Past Year
  • Commits: 108
  • Committers: 3
  • Avg Commits per committer: 36.0
  • Development Distribution Score (DDS): 0.065
Top Committers
Name Email Commits
Lucas Perez p****s@u****s 101
nahuel n****a@g****m 5
LucasP 7****8 2
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 14 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 3
  • Total maintainers: 1
pypi.org: rapidae

Rapidae: Python Library for Rapid Creation and Experimentation of Autoencoders

  • Versions: 3
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 14 Last month
Rankings
Dependent packages count: 10.0%
Average: 38.0%
Dependent repos count: 66.1%
Maintainers (1)
Last synced: 6 months ago

Dependencies

docs/requirements.txt pypi
  • mkdocs ==1.5.3
  • mkdocs-gen-files ==0.5.0
  • mkdocs-literate-nav ==0.6.1
  • mkdocs-material ==9.5.4
  • mkdocstrings ==0.24.0
  • mkdocstrings-python ==1.8.0
  • pymdown-extensions ==10.7
pyproject.toml pypi
  • colorlog *
  • keras >=3.0.1
  • matplotlib *
  • numpy *
  • pandas *
  • requests *
  • scikit-learn *
requirements.txt pypi
  • colorlog ==6.7.0
  • keras ==3.0.1
  • matplotlib ==3.5.1
  • numpy ==1.26.2
  • pandas ==2.1.3
  • requests ==2.25.1
  • scikit-learn ==1.3.2