https://github.com/amacaluso/ssb-vae

Self-Supervised Bernoulli Autoencoders for Semi-Supervised Hashing: we investigate the robustness of hashing methods based on variational autoencoders to the lack of supervision, focusing on two semi-supervised approaches currently in use. In addition, we propose a novel supervision approach in which the model uses its own predictions of the label distribution to implement the pairwise objective. Compared to the best baseline, this procedure yields similar performance in fully-supervised settings but significantly improves results when labelled data is scarce.


Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.5%) to scientific vocabulary

Keywords

binary-variational-autoencoder deep-learning dimension-reduction neural-networks variational-autoencoder
Last synced: 5 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: amacaluso
  • License: apache-2.0
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 35 MB
Statistics
  • Stars: 4
  • Watchers: 2
  • Forks: 3
  • Open Issues: 0
  • Releases: 0
Topics
binary-variational-autoencoder deep-learning dimension-reduction neural-networks variational-autoencoder
Created over 5 years ago · Last pushed almost 5 years ago

https://github.com/amacaluso/SSB-VAE/blob/master/

# SSB-VAE: Self-Supervised Bernoulli Autoencoders for Semi-Supervised Hashing

This repository contains the code to reproduce the results presented in the paper 
[*Self-Supervised Bernoulli Autoencoders for Semi-Supervised Hashing*](https://arxiv.org/abs/2007.08799).

# Description

We investigate the robustness of hashing methods based on variational autoencoders 
to the lack of supervision, focusing on two semi-supervised approaches currently in use. 
In addition, we propose a novel supervision approach in which the model uses 
its own predictions of the label distribution to implement the pairwise objective. Compared to the best 
baseline, this procedure yields similar performance in 
fully-supervised settings but significantly improves results when labelled data is scarce.
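
To make the self-supervision idea concrete, here is a minimal NumPy sketch (all names are hypothetical and this is not the paper's exact loss): the model's own predicted label distributions decide which pairs of codes are pulled together and which are pushed apart.

```python
import numpy as np

def self_supervised_pairwise_loss(codes, pred_probs, margin=1.0):
    """Contrastive-style pairwise objective driven by the model's own
    label predictions. A generic sketch, not the paper's exact loss.

    codes:      (n, l) relaxed binary codes produced by the encoder
    pred_probs: (n, c) predicted label distributions (e.g. softmax outputs)
    """
    codes = np.asarray(codes, dtype=float)
    pseudo = np.asarray(pred_probs).argmax(axis=1)   # self-predicted labels
    # L1 distance between every pair of (relaxed) binary codes
    d = np.abs(codes[:, None, :] - codes[None, :, :]).sum(axis=2)
    same = (pseudo[:, None] == pseudo[None, :]).astype(float)
    # pull together pairs predicted to share a label, push apart the rest
    pair_loss = same * d + (1.0 - same) * np.maximum(0.0, margin - d)
    n = len(codes)
    return pair_loss[~np.eye(n, dtype=bool)].mean()  # average over off-diagonal pairs
```

Because the pseudo-labels come from the model itself, this pairwise term needs no labelled pairs at training time.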


# Usage

The code is organised in four different scripts, one per dataset. 
Specifically, the script *test_model_[**data**].py* considers the dataset **data** and takes 
the following parameters as input:


- *M* is the index of the model considered. In particular, we compare three semi-supervised
 methods based on variational autoencoders: *(M=1)* **VDSH-S** is a variational autoencoder 
 proposed in [[1]](#1) that employs Gaussian latent variables, unsupervised learning and pointwise supervision; 
 *(M=2)* **PHS-GS** is a variational autoencoder proposed in [[2]](#2) that assumes Bernoulli latent variables, 
 unsupervised learning, and both pointwise and pairwise supervision; 
 and *(M=3)* **SSB-VAE** is our proposed method based on Bernoulli latent variables, unsupervised learning, pointwise 
 supervision and self-supervision.

- *p* is the level (percentage) of supervision used when training the autoencoder with a semi-supervised approach.
- *a*, *b* and *g* are the hyperparameters associated with the different components of the semi-supervised
 loss. In particular, *a* is the coefficient of the pointwise component, *g* is associated with the pairwise component, 
 and *b* is the weight of the KL divergence in the unsupervised loss.
- *r* is the number of experiments to perform for a given set of parameters. This is used to compute an average performance
over multiple initialisations of the same neural network. Note that the results reported in the paper are 
computed by averaging *r=5* experiments.
- *l* is the size of the latent sub-space generated by the encoder, which also corresponds to the number of bits of 
the generated hash codes.
- *o* is the file where the results are stored.
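
As a rough illustration of the interface described above (the actual flag names in the repository's scripts may differ), the parameters could be parsed as follows:

```python
import argparse

# Hypothetical reconstruction of the command-line interface described above;
# the real test_model_[data].py scripts may name their flags differently.
parser = argparse.ArgumentParser(description="test_model_[data].py sketch")
parser.add_argument("-M", type=int, choices=[1, 2, 3],
                    help="model index: 1=VDSH-S, 2=PHS-GS, 3=SSB-VAE")
parser.add_argument("-p", type=float, default=1.0,
                    help="level (fraction) of supervision")
parser.add_argument("-a", type=float, help="weight of the pointwise component")
parser.add_argument("-b", type=float, help="weight of the KL divergence")
parser.add_argument("-g", type=float, help="weight of the pairwise component")
parser.add_argument("-r", type=int, default=5,
                    help="number of repetitions to average over")
parser.add_argument("-l", type=int, help="latent size = hash code length in bits")
parser.add_argument("-o", type=str, help="output file for the results")

# Example invocation: SSB-VAE, 50% supervision, 16-bit codes
args = parser.parse_args(["-M", "3", "-p", "0.5", "-l", "16", "-o", "out.csv"])
```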

The script utils.py imports the needed packages and contains all of the custom routines for performance evaluation.

The script base_networks.py contains the custom routines to define all the components of a neural network.

The script supervised_BAE.py defines the three types of autoencoder (*VDSH, PHS, SSB-VAE*).
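
Whatever the encoder, Bernoulli-style latents are typically turned into hash codes by thresholding the latent probabilities; a generic sketch (not the repository's exact routine) is:

```python
import numpy as np

def to_hash_codes(bernoulli_probs, threshold=0.5):
    """Binarize encoder outputs (Bernoulli probabilities) into hash bits.
    Generic sketch; the repository's own binarization may differ."""
    return (np.asarray(bernoulli_probs) >= threshold).astype(np.uint8)

def hamming_distance(c1, c2):
    """Number of differing bits between two hash codes."""
    return int(np.count_nonzero(c1 != c2))
```

Retrieval then amounts to ranking stored codes by Hamming distance to the query's code.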

The *.sh files allow one to run all the experiments reported in the paper. In particular, 
 test_all_[**data**]-[**n**]bits.sh computes *r* repetitions of the predictions of the three methods (*VDSH, PHS, SSB-VAE*), 
 given a dataset (**data**) and a number of bits **n**, for supervision levels *p = 0.1, 0.2, ... , 0.9, 1.0*.

The script post_processing.py collects all the results produced by the *.sh files and computes the
 tables reported in the paper.
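
Conceptually, this post-processing amounts to averaging a metric over the per-run CSV files; a stdlib-only sketch (the column names and file layout are assumptions, not the repository's actual format) could be:

```python
import csv
from collections import defaultdict

def average_results(paths):
    """Average a hypothetical 'score' column per (method, p) across CSV files.
    Column names and layout are assumptions, not the repo's actual format."""
    sums, counts = defaultdict(float), defaultdict(int)
    for path in paths:
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                key = (row["method"], float(row["p"]))
                sums[key] += float(row["score"])
                counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}
```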


## Requirements

Python 3.7

Tensorflow 2.1

## Execution

In order to obtain the results reported in the paper, it is necessary to execute all the *.sh files as follows:
```
# run all *.sh files
./test_all_20news-16bits.sh
./test_all_20news-32bits.sh
./test_all_snippets-16bits.sh
./test_all_snippets-32bits.sh
./test_all_TMC-16bits.sh
./test_all_TMC-32bits.sh
./test_all_cifar-16bits.sh
./test_all_cifar-32bits.sh

```

At the end of the computation, the CSV files containing the results are generated according to the *-o*
parameter. Finally, the script post_processing.py collects all the CSV files and saves a new CSV with the same format 
 as the two tables reported in the paper.

## References
[1] S. Chaidaroon and Y. Fang. Variational deep semantic hashing for text documents. Proc. SIGIR. 2017, pp. 75–84.

[2] S. Z. Dadaneh et al. Pairwise Supervised Hashing with Bernoulli Variational Auto-Encoder and Self-Control Gradient Estimator. Proc. UAI. 2020.

Owner

  • Name: Antonio Macaluso
  • Login: amacaluso
  • Kind: user
  • Location: Saarbrücken, Germany
  • Company: German Research Center for Artificial Intelligence

Senior Researcher in Quantum Artificial Intelligence | PhD in Computer Science and Engineering

GitHub Events

Issues and Pull Requests

Last synced: 5 months ago

All Time
  • Total issues: 1
  • Total pull requests: 0
  • Average time to close issues: 25 days
  • Average time to close pull requests: N/A
  • Total issue authors: 1
  • Total pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • ShahrzadZolghadr (1)