https://github.com/chris-santiago/vime
Reproducing the VIME framework for self- and semi-supervised learning in the tabular domain.
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ✓ Committers with academic emails: 1 of 1 committers (100.0%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (12.4%) to scientific vocabulary
Keywords
hydra
pytorch
pytorch-lightning
self-supervised-learning
semi-supervised-learning
taskfile
Last synced: 5 months ago
Repository
Basic Info
Statistics
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Topics
hydra
pytorch
pytorch-lightning
self-supervised-learning
semi-supervised-learning
taskfile
Created over 2 years ago · Last pushed over 2 years ago
https://github.com/chris-santiago/vime/blob/master/
# VIME - PyTorch

This repo reproduces the VIME framework for self- and semi-supervised learning in the tabular domain.

*Authors: Jinsung Yoon, Yao Zhang, James Jordon, Mihaela van der Schaar*

*Reference: Jinsung Yoon, Yao Zhang, James Jordon, Mihaela van der Schaar, "VIME: Extending the Success of Self- and Semi-supervised Learning to Tabular Domain," Neural Information Processing Systems (NeurIPS), 2020.*

Original paper: https://proceedings.neurips.cc/paper/2020/hash/7d97667a3e056acab9aaf653807b4a03-Abstract.html

Original repo: https://github.com/jsyoon0823/VIME/tree/master

---------

## About

### Initial Implementation

This initial implementation follows the VIME self-supervised framework to train an encoder on unlabeled MNIST data, which is then used to train a semi-supervised MLP on a much smaller portion of labeled MNIST data. The final model is tested against the standard MNIST test set.

*Block diagram of the proposed self-supervised learning framework on tabular data. Credit: Yoon et al.*

The final model used only 10% of the MNIST training set (n=6,000) as labeled data for semi-supervised learning and reached 93% classification accuracy on the test set. None of the hyperparameters were optimized for this initial work.

*Block diagram of the proposed semi-supervised learning framework on tabular data. Credit: Yoon et al.*

The full configuration is listed in `outputs/vime-encoder/train_self/2023-05-26/10-09-22/.hydra/config.yaml` for the self-supervised encoder and in `outputs/vime-learner/train_semi/2023-05-26/10-32-51/.hydra/config.yaml` for the semi-supervised learner.

## Install

Clone this repository, create a new Conda environment, and install the package:

```bash
git clone https://github.com/chris-santiago/vime.git
cd vime
conda env create -f environment.yml
pip install -e .
```

## Use

### Prerequisites

#### Task

This project uses [Task](https://taskfile.dev/) as a task runner. Though the underlying Python commands can be executed without it, we recommend [installing Task](https://taskfile.dev/installation/) for ease of use. Details are located in `Taskfile.yml`.

#### Current commands

```bash
> task -l
task: Available tasks for this project:
* check-config:    Check Hydra configuration
* train-multi:     Launch multiple training jobs
* train-self:      Train the VIME encoder module
* train-semi:      Train the VIME semi-SL module
* wandb:           Login to Weights & Biases
```

#### PDM

This project was built using [this cookiecutter](https://github.com/chris-santiago/cookie) and is set up to use [PDM](https://pdm.fming.dev/latest/) for dependency management, though PDM is not required for package installation.

#### Hydra

This project uses [Hydra](https://hydra.cc/docs/intro/) for managing configuration and CLI arguments. See `vime/conf` for full configuration details.

#### Weights and Biases

This project is set up to log experiment results with [Weights and Biases](https://wandb.ai/). It expects an API key within a `.env` file in the root directory:

```toml
WANDB_KEY=
```

Users can configure different logger(s) within the `conf/trainer/default.yaml` file.

### Training

- Run `task train-self` to train the self-supervised encoder. Once complete, check the `outputs/vime-encoder/train_self/../checkpoints` directory for the path to the saved checkpoint.
- Copy and paste this checkpoint path into the semi-supervised model config, located at `conf/model/learner.yaml`, under the `nn.encoder_ckpt` key.
- Run `task train-semi` to train the semi-supervised learner.

All results will populate their respective output directories:

```
outputs
├── vime-encoder
│   └── train_self
│       └── 2023-05-26
│           └── 10-09-22
│               ├── .hydra
│               ├── checkpoints
│               └── wandb
└── vime-learner
    └── train_semi
        └── 2023-05-26
            └── 10-32-51
                ├── .hydra
                ├── checkpoints
                └── wandb
```

## Documentation

Documentation is hosted on GitHub Pages: [https://chris-santiago.github.io/vime/](https://chris-santiago.github.io/vime/)
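The self-supervised stage described above trains the encoder on a pretext task: corrupt a fraction of each sample's features, then predict which entries were corrupted (mask estimation) and recover the originals (feature estimation). As a rough illustration, here is a minimal NumPy sketch of the corruption step as described in the VIME paper; it is not code from this repo, and the names `pretext_corrupt` and `p_mask` are our own.

```python
import numpy as np

def pretext_corrupt(x, p_mask, rng):
    """VIME-style corruption: each entry of x is independently replaced,
    with probability p_mask, by a value drawn from that feature's
    empirical marginal (approximated here by a within-column shuffle).
    Returns the corrupted matrix x_tilde and the mask label
    m = 1[x != x_tilde] that the mask-estimation head learns to predict."""
    n, d = x.shape
    mask = rng.binomial(1, p_mask, size=(n, d))
    # Sample from each feature's marginal by permuting the column values.
    x_bar = np.stack([rng.permutation(x[:, j]) for j in range(d)], axis=1)
    x_tilde = mask * x_bar + (1 - mask) * x
    m_label = (x != x_tilde).astype(np.float32)
    return x_tilde, m_label

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 5))
x_tilde, m_label = pretext_corrupt(x, p_mask=0.3, rng=rng)
```

The encoder would then be trained so that two heads on its output jointly minimize a mask-estimation loss against `m_label` and a feature-reconstruction loss against the original `x`.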
Owner
- Name: Chris Santiago
- Login: chris-santiago
- Kind: user
- Repositories: 64
- Profile: https://github.com/chris-santiago
Committers
Last synced: over 1 year ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| chris-santiago | c****o@g****u | 37 |
Committer Domains (Top 20 + Academic)
gatech.edu: 1
Issues and Pull Requests
Last synced: 11 months ago
All Time
- Total issues: 0
- Total pull requests: 5
- Average time to close issues: N/A
- Average time to close pull requests: less than a minute
- Total issue authors: 0
- Total pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 5
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- chris-santiago (5)