rapidae

Explore, compare and develop autoencoder models with a back-end agnostic framework

https://github.com/nahuelcostacortez/rapidae

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 1 DOI reference(s) in README
✓ Academic publication links: Links to arxiv.org, sciencedirect.com, science.org, ieee.org
○ Committers with academic emails
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (15.0%) to scientific vocabulary

Keywords

autoencoder benchmarking jax keras3 pytorch recurrent-autoencoder reproducibility tensorflow vae variational-autoencoder

Last synced: 10 months ago

Repository

Explore, compare and develop autoencoder models with a back-end agnostic framework

Basic Info

Host: GitHub
Owner: NahuelCostaCortez
License: apache-2.0
Language: Python
Default Branch: main
Homepage: https://rapidae.readthedocs.io/en/latest/
Size: 16.6 MB

Statistics

Stars: 16
Watchers: 2
Forks: 0
Open Issues: 0
Releases: 0

Topics

autoencoder benchmarking jax keras3 pytorch recurrent-autoencoder reproducibility tensorflow vae variational-autoencoder

Created almost 3 years ago · Last pushed about 2 years ago

Metadata Files

Readme License Citation

Rapidae: Python Library for Rapid Creation and Experimentation of Autoencoders

🔗 Documentation | 🔗 PyPI Package

Description 📕

Rapidae is a Python library specialized in simplifying the creation and experimentation of autoencoder models. With a focus on ease of use, this library allows users to explore and develop autoencoder models in an efficient and straightforward manner.

I decided to develop this library to optimize my research workflow and provide a comprehensive resource for educators and learners exploring autoencoders.

As a researcher, I often found myself spending time on repetitive tasks, such as creating project structures or replicating baseline models. (I've lost count of how many times I've gone through the Keras VAE tutorial just to copy the model as a baseline for other experiments.)

As an educator, despite recognizing numerous fantastic online resources, I felt the need for a place where the features I consider important for teaching these models are consolidated: explanation, implementation, and versatility across different backends. The latter is particularly crucial, considering that PyTorch practitioners may find it tedious to switch to TensorFlow, and vice versa. Because it builds on the recently released Keras 3, Rapidae lets users focus on model creation rather than backend specifics.

In summary, this library is designed to be simple enough for educational purposes, yet robust for researchers to concentrate on developing their models and conducting benchmark experiments in a unified environment.

[!NOTE] Shout out to Pythae, which provides an excellent library for experimenting with VAEs. If you're looking for a quick way to implement autoencoders for image applications, Pythae is probably your best option. Rapidae differs from Pythae in the following ways:
- It is built on Keras 3, allowing you to experiment with and provide your implementations in either PyTorch, TensorFlow, or JAX.
- The image models implemented in Rapidae are primarily designed for educational purposes.
- Rapidae is intended to serve as a benchmarking library for models implemented in the sequential/time-series domain, as these are widely dispersed across various fields.

🚨Call for contributions🚨

If you want to add your model to the package or collaborate on its development, feel free to send me a message at costanahuel@uniovi.es or just open an issue or a pull request. I'll be happy to collaborate with you.

Main features

Ease of Use: Rapidae has been designed to make the process of creating and experimenting with autoencoders as simple as possible. Users can create and train autoencoder models with just a few lines of code.
Backend versatility: Rapidae relies on Keras 3.0, which is backend agnostic, so you can switch freely between TensorFlow, PyTorch and JAX.
Customization: Easily customize model architecture, loss functions, and training parameters to suit your specific use case.
Experimentation: Conduct experiments with different hyperparameters and configurations to optimize the performance of your models.

Overview

Rapidae is structured as follows:

data: This module contains everything related to the acquisition and preprocessing of datasets.
models: This is the core module of the library. It includes the base architectures on which new ones can be created, several predefined architectures and a list of predefined default encoders and decoders.
pipelines: Pipelines are designed to perform a specific task or set of tasks such as data preprocessing or model training.
evaluate: Its main functionality is the evaluation of model performance. It also includes a utils tool for various tasks: latent space visualization, reconstructions, evaluation, etc.

Installation

The library has been tested with Python versions >=3.10, <3.12, so we recommend first creating a virtual environment with a suitable Python version. Here's an example with conda:

conda create -n rapidae python=3.10

Then, just activate the environment with conda activate rapidae and install the library.
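As a quick sanity check before installing, the interpreter version can be compared against the tested range quoted above. The snippet below is only a sketch: the helper name `is_supported_python` and the two constants are made up for illustration and are not part of Rapidae.

```python
import sys

# Tested range quoted in the installation notes: >=3.10, <3.12.
MIN_VERSION = (3, 10)
MAX_VERSION_EXCLUSIVE = (3, 12)


def is_supported_python(version_info=None):
    """Return True if the (major, minor) pair falls inside the tested range.

    `is_supported_python` is a hypothetical helper, not a Rapidae function.
    """
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION_EXCLUSIVE


if not is_supported_python():
    print("Warning: this Python version is outside the tested range.")
```

Running it inside the freshly activated environment confirms that conda picked up the intended interpreter.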

[!NOTE] If you are using Google Colab, you are good to go (i.e. you do not need to create an environment). The library is fully compatible with Colab's default environment.

With Pip

To install the latest stable release of this library run the following:

pip install rapidae

Note that you will also need to install a backend framework. Here are the official installation guidelines:

[!IMPORTANT] If you install TensorFlow, you should reinstall Keras 3 afterwards via pip install --upgrade keras. This is a temporary step while TensorFlow is pinned to Keras 2, and will no longer be necessary after TensorFlow 2.16. The cause is that tensorflow==2.15 will overwrite your Keras installation with keras==2.15.
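To see whether the reinstall step described above is needed, you can inspect the installed Keras version from Python. This is a minimal sketch using only the standard library; `needs_keras_upgrade` and `check_installed_keras` are illustrative names, not part of Rapidae or Keras, and the check simply assumes that any major version below 3 calls for the upgrade.

```python
from importlib.metadata import PackageNotFoundError, version


def needs_keras_upgrade(installed_version):
    """Return True when the version string has a major version below 3."""
    major = int(installed_version.split(".")[0])
    return major < 3


def check_installed_keras():
    """Describe whether the installed Keras needs `pip install --upgrade keras`."""
    try:
        installed = version("keras")
    except PackageNotFoundError:
        return "keras is not installed"
    if needs_keras_upgrade(installed):
        return f"keras {installed} found: run `pip install --upgrade keras`"
    return f"keras {installed} found: no action needed"


print(check_installed_keras())
```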

From source code

You can also clone the repo to have full access to all the code. Some features may not yet be available in the published stable version, so this is the best way to stay up to date with the latest changes.

git clone https://github.com/NahuelCostaCortez/rapidae
cd rapidae

Then you only have to install the requirements:

pip install -r requirements.txt

Available Models

Below is the list of the models currently implemented in the library.

Usage

Here you have a simple tutorial with the most relevant aspects of the library. In addition, in the examples folder, you will find a series of notebooks for each model and with particular use cases.

You can also use a web interface made with Streamlit where you can load datasets, configure models and hyperparameters, train, and evaluate the results. Check the web interface notebook.

Custom models and loss functions

You can provide your own autoencoder architecture. Here's an example for defining a custom encoder and a custom decoder:

```
from rapidae.models.base import BaseEncoder, BaseDecoder
from keras.layers import Dense


class CustomEncoder(BaseEncoder):
    def __init__(self, input_dim, latent_dim, **kwargs):
        # You can add more arguments, but at least these are required
        BaseEncoder.__init__(self, input_dim=input_dim, latent_dim=latent_dim)

        self.layer_1 = Dense(300)
        self.layer_2 = Dense(150)
        self.layer_3 = Dense(self.latent_dim)

    def call(self, x):
        x = self.layer_1(x)
        x = self.layer_2(x)
        x = self.layer_3(x)
        return x
```

```
class CustomDecoder(BaseDecoder):
    def __init__(self, input_dim, latent_dim, **kwargs):
        # You can add more arguments, but at least these are required
        BaseDecoder.__init__(self, input_dim=input_dim, latent_dim=latent_dim)

        self.layer_1 = Dense(self.latent_dim)
        self.layer_2 = Dense(self.input_dim)

    def call(self, x):
        x = self.layer_1(x)
        x = self.layer_2(x)
        return x
```

You can also provide a custom model. This is especially useful if you want to implement your own loss function.

```
from rapidae.models.base import BaseAE
from keras.ops import mean
from keras.losses import mean_squared_error


class CustomModel(BaseAE):
    def __init__(self, input_dim, latent_dim, encoder, decoder):
        # If you are adding your model to the source code, there is no need to
        # specify the encoder and decoder: just place them in the same directory
        # as the model and the BaseAE constructor will initialize them.
        BaseAE.__init__(
            self,
            input_dim=input_dim,
            latent_dim=latent_dim,
            encoder=encoder,
            decoder=decoder,
        )

    def call(self, x):
        # IMPLEMENT FORWARD PASS
        x = self.encoder(x)
        x = self.decoder(x)

        return x

    def compute_loss(self, x=None, y=None, y_pred=None, sample_weight=None):
        '''
        Computes the loss of the model.
        x: input data
        y: target data
        y_pred: predicted data (output of call)
        sample_weight: Optional array of the same length as x, containing weights to apply to the model's loss for each sample
        '''
        # IMPLEMENT LOSS FUNCTION
        loss = mean(mean_squared_error(x, y_pred))

        return loss
```
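To make the reconstruction loss in the custom model concrete, here is a plain-Python stand-in for `mean(mean_squared_error(x, y_pred))` on a tiny 2D batch. It is only an illustration of the arithmetic, assuming inputs shaped (samples, features): the functions below are rewritten by hand for this sketch and are not the Keras implementations.

```python
def mean_squared_error(y_true, y_pred):
    """Per-sample MSE: average squared difference over the feature axis."""
    return [
        sum((t - p) ** 2 for t, p in zip(true_row, pred_row)) / len(true_row)
        for true_row, pred_row in zip(y_true, y_pred)
    ]


def reconstruction_loss(x, y_pred):
    """Average the per-sample errors over the batch to get a scalar loss."""
    per_sample = mean_squared_error(x, y_pred)
    return sum(per_sample) / len(per_sample)


# Two samples with two features each.
x = [[0.0, 1.0], [1.0, 1.0]]
y_pred = [[0.0, 0.0], [0.5, 0.5]]

# Per-sample errors are 0.5 and 0.25, so the batch loss is 0.375.
print(reconstruction_loss(x, y_pred))
```

The inner function reduces over features and the outer one over samples, which is why the result is a single number regardless of batch size.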

Switching backends

Since Rapidae uses Keras 3, you can easily switch among TensorFlow, PyTorch and JAX (TensorFlow is the default).

You can export the environment variable KERAS_BACKEND or you can edit your local config file at ~/.keras/keras.json to configure your backend. Available backend options are: "jax", "tensorflow", "torch". Example:

export KERAS_BACKEND="torch"

In a notebook, you can do:

import os
os.environ["KERAS_BACKEND"] = "torch"
import keras
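The config-file route can also be scripted. The sketch below rewrites the `"backend"` entry of a Keras JSON config while leaving any other keys untouched. `set_keras_backend` is a hypothetical helper written for this example, not a Keras or Rapidae function, and the default path is the `~/.keras/keras.json` location mentioned above.

```python
import json
from pathlib import Path

VALID_BACKENDS = ("jax", "tensorflow", "torch")


def set_keras_backend(backend, config_path=None):
    """Write `backend` into a Keras JSON config, keeping any other keys."""
    if backend not in VALID_BACKENDS:
        raise ValueError(f"backend must be one of {VALID_BACKENDS}")
    if config_path is None:
        path = Path.home() / ".keras" / "keras.json"
    else:
        path = Path(config_path)
    # Start from the existing config if there is one, otherwise from scratch.
    config = json.loads(path.read_text()) if path.exists() else {}
    config["backend"] = backend
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=4))
    return config


# Example (modifies your local config file, so it is left commented out):
# set_keras_backend("torch")
```

As with the environment variable, the change should be made before `keras` is imported, since the backend is selected at import time.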

Experiment tracking with wandb

If you want to add experiment tracking to rapidae models you can just create a Wandb callback and pass it to the TrainingPipeline as follows (this also applies to other experiment tracking frameworks):

```
wandb_cb = WandbCallback()

wandb_cb.setup(
    training_config=your_training_config,
    model_config=your_model_config,
    project_name="your_wandb_project",
    entity_name="your_wandb_entity",
)

pipeline = TrainingPipeline(name="your_pipeline_name", model=model, callbacks=[wandb_cb])
```

Documentation

Check out the full documentation for detailed information on installation, usage, examples and recipes: 🔗 Documentation Link

All documentation source and configuration files are located inside the docs directory.

Dealing with issues

If you experience any issues while running the code, or would like to request new features or models, please open an issue on GitHub.

Citation

If you find this work useful or incorporate it into your research, please consider citing it 🙏🏻.

@software{Costa_Rapidae,
  author = {Costa, Nahuel},
  license = {Apache-2.0},
  title = {{Rapidae}},
  url = {https://github.com/NahuelCostaCortez/rapidae}
}

Owner

Name: Nahuel Costa Cortez
Login: NahuelCostaCortez
Kind: user
Location: Gijón, Asturias, España
Company: University of Oviedo

Website: www.nahuelcosta.com
Twitter: nahucostacortez
Repositories: 2
Profile: https://github.com/NahuelCostaCortez

PhD candidate in Artificial Intelligence.

Citation (CITATION.cff)

cff-version: 1.2.0
title: Rapidae
message: 'If you use this software, please cite it as below.'
type: software
authors:
  - given-names: Nahuel
    family-names: Costa
    email: costanahuel@uniovi.es
    orcid: 'https://orcid.org/0000-0002-9189-2192'
repository-code: 'https://github.com/NahuelCostaCortez/rapidae'
url: 'https://rapidae.readthedocs.io/en/latest/'
abstract: >-
  Explore, compare and develop autoencoder models with a
  back-end agnostic framework
license: Apache-2.0

GitHub Events

Total

Watch event: 9

Last Year

Watch event: 9

Committers

Last synced: over 2 years ago

All Time

Total Commits: 108
Total Committers: 3
Avg Commits per committer: 36.0
Development Distribution Score (DDS): 0.065

Past Year

Commits: 108
Committers: 3
Avg Commits per committer: 36.0
Development Distribution Score (DDS): 0.065

Top Committers

Name	Email	Commits
Lucas Perez	p**s@u**s	101
nahuel	n**a@g**m	5
LucasP	7****8	2

Committer Domains (Top 20 + Academic)

uniovi.es: 1

Issues and Pull Requests

Last synced: about 1 year ago

All Time

Total issues: 0
Total pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Total issue authors: 0
Total pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Packages

Total packages: 1
Total downloads:
- pypi 14 last-month

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 3
Total maintainers: 1

pypi.org: rapidae

Rapidae: Python Library for Rapid Creation and Experimentation of Autoencoders

Homepage: https://github.com/NahuelCostaCortez/rapidae
Documentation: https://rapidae.readthedocs.io/
License: Apache Software License
Latest release: 0.0.4
published about 2 years ago

Versions: 3
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 14 Last month

Rankings

Dependent packages count: 10.0%

Average: 38.0%

Dependent repos count: 66.1%

Maintainers (1)

nahuelcosta

Last synced: 10 months ago

Dependencies

docs/requirements.txt pypi

mkdocs ==1.5.3
mkdocs-gen-files ==0.5.0
mkdocs-literate-nav ==0.6.1
mkdocs-material ==9.5.4
mkdocstrings ==0.24.0
mkdocstrings-python ==1.8.0
pymdown-extensions ==10.7

pyproject.toml pypi

colorlog *
keras >=3.0.1
matplotlib *
numpy *
pandas *
requests *
scikit-learn *

requirements.txt pypi

colorlog ==6.7.0
keras ==3.0.1
matplotlib ==3.5.1
numpy ==1.26.2
pandas ==2.1.3
requests ==2.25.1
scikit-learn ==1.3.2
