https://github.com/catalyst-team/segmentation
Catalyst.Segmentation
Science Score: 10.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links (links to: arxiv.org)
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity (low similarity: 10.0%)
Repository
Catalyst.Segmentation
Basic Info
- Host: GitHub
- Owner: catalyst-team
- License: apache-2.0
- Language: Python
- Default Branch: master
- Homepage: https://github.com/catalyst-team/catalyst
- Size: 196 KB
Statistics
- Stars: 28
- Watchers: 5
- Forks: 10
- Open Issues: 6
- Releases: 0
Metadata Files
README.md
PyTorch framework for Deep Learning research and development.
It was developed with a focus on reproducibility,
fast experimentation, and reuse of code and ideas,
so that you can research and develop something new
rather than write yet another regular train loop.
Break the cycle - use Catalyst!
Project manifest. Part of the PyTorch Ecosystem. Part of the Catalyst Ecosystem:
- Alchemy - experiment logging & visualization
- Catalyst - accelerated Deep Learning research and development
- Reaction - convenient Deep Learning model serving
Catalyst.Segmentation

Note: this repo uses the advanced Catalyst Config API and may be a bit out of date right now. Please use Catalyst's minimal examples section as a starting point for up-to-date use cases.
You will learn how to build an image segmentation pipeline with transfer learning using the Catalyst framework.
Goals
- Install requirements
- Prepare data
- Run: raw data → production-ready model
- Get results
- Customize own pipeline
1. Install requirements
Using local environment:

```bash
pip install -r requirements/requirements.txt
```
Using docker:

This builds a docker image named `catalyst-segmentation` with all the necessary libraries:

```bash
make docker-build
```
2. Get Dataset
Try on open datasets
You can use one of the open datasets:

```bash
export DATASET="isbi"
rm -rf data/
mkdir -p data

if [[ "$DATASET" == "isbi" ]]; then
    # binary segmentation
    # http://brainiac2.mit.edu/isbi_challenge/
    download-gdrive 1uyPb9WI0t2qMKIqOjFKMv1EtfQ5FAVEI isbi_cleared_191107.tar.gz
    tar -xf isbi_cleared_191107.tar.gz &>/dev/null
    mv isbi_cleared_191107 ./data/origin
elif [[ "$DATASET" == "voc2012" ]]; then
    # semantic segmentation
    # http://host.robots.ox.ac.uk/pascal/VOC/voc2012/
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
    tar -xf VOCtrainval_11-May-2012.tar &>/dev/null
    mkdir -p ./data/origin/images/; mv VOCdevkit/VOC2012/JPEGImages/* $_
    mkdir -p ./data/origin/raw_masks; mv VOCdevkit/VOC2012/SegmentationClass/* $_
fi
```
Use your own dataset
Prepare your dataset
#### Data structure

Make sure that the final data folder has the required structure (a quick validation sketch follows at the end of this section):

```bash
/path/to/your_dataset/
    images/
        image_1
        image_2
        ...
        image_N
    raw_masks/
        mask_1
        mask_2
        ...
        mask_N
```

#### Data location

* The easiest way is to move your data:
  ```bash
  mv /path/to/your_dataset/* /catalyst.segmentation/data/origin
  ```
  This way you can run the pipeline with default settings.
* If you prefer to leave the data in `/path/to/your_dataset/`:
  * In a local environment, either:
    * Link the directory:
      ```bash
      ln -s /path/to/your_dataset $(pwd)/data/origin
      ```
    * Or just set the path to your dataset via `DATADIR=/path/to/your_dataset` when you start the pipeline.
  * Using docker, set:
    ```bash
    -v /path/to/your_dataset:/data \  # instead of the default $(pwd)/data/origin:/data
    ```
    in the script below to start the pipeline.
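Before launching the pipeline, a small script like the sketch below can verify that every image has a matching mask. The path and the pairing-by-filename-stem convention are assumptions based on the structure above, not part of the original pipeline.

```python
# Hypothetical sanity check for the dataset layout described above.
# Assumes images and masks are paired by filename stem (an assumption,
# not something the pipeline itself enforces this way).
from pathlib import Path

def check_dataset(root: str) -> None:
    root_path = Path(root)
    images = {p.stem for p in (root_path / "images").iterdir() if p.is_file()}
    masks = {p.stem for p in (root_path / "raw_masks").iterdir() if p.is_file()}
    missing_masks = images - masks
    missing_images = masks - images
    if missing_masks:
        print(f"images without masks: {sorted(missing_masks)[:5]} ...")
    if missing_images:
        print(f"masks without images: {sorted(missing_images)[:5]} ...")
    print(f"{len(images & masks)} image/mask pairs found")

check_dataset("./data/origin")
```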
3. Segmentation pipeline
Fast&Furious: raw data → production-ready model
The pipeline will automatically guide you from raw data to the production-ready model.
We will initialize a Unet model with a pre-trained ResNet-18 encoder. In this pipeline, the model is trained sequentially in two stages.
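For illustration only, here is a minimal plain-PyTorch sketch of that two-stage transfer-learning idea (freeze the pre-trained encoder and train the head first, then fine-tune everything). The actual pipeline is driven by the YAML configs and Catalyst's Config API; all names below are illustrative.

```python
# Minimal sketch of two-stage transfer learning in plain PyTorch.
# Illustrative only: the real pipeline configures stages via YAML,
# and the model here stands in for a Unet with a ResNet-18 encoder.
import torch
import torch.nn as nn
from torchvision.models import resnet18

encoder = resnet18(pretrained=True)          # pre-trained encoder
head = nn.Conv2d(512, 1, kernel_size=1)      # stand-in for the decoder/head

# Stage 1: freeze the encoder, train only the head.
for param in encoder.parameters():
    param.requires_grad = False
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
# ... run a few epochs of training here ...

# Stage 2: unfreeze everything and fine-tune the whole network.
for param in encoder.parameters():
    param.requires_grad = True
optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(head.parameters()), lr=1e-4
)
# ... continue training ...
```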
Binary segmentation pipeline
#### Run in local environment:

```bash
CUDA_VISIBLE_DEVICES=0 \
CUDNN_BENCHMARK="True" \
CUDNN_DETERMINISTIC="True" \
WORKDIR=./logs \
DATADIR=./data/origin \
IMAGE_SIZE=256 \
CONFIG_TEMPLATE=./configs/templates/binary.yml \
NUM_WORKERS=4 \
BATCH_SIZE=256 \
bash ./bin/catalyst-binary-segmentation-pipeline.sh
```

#### Run in docker:

```bash
export LOGDIR=$(pwd)/logs
docker run -it --rm --shm-size 8G --runtime=nvidia \
   -v $(pwd):/workspace/ \
   -v $LOGDIR:/logdir/ \
   -v $(pwd)/data/origin:/data \
   -e "CUDA_VISIBLE_DEVICES=0" \
   -e "USE_WANDB=1" \
   -e "LOGDIR=/logdir" \
   -e "CUDNN_BENCHMARK='True'" \
   -e "CUDNN_DETERMINISTIC='True'" \
   -e "WORKDIR=/logdir" \
   -e "DATADIR=/data" \
   -e "IMAGE_SIZE=256" \
   -e "CONFIG_TEMPLATE=./configs/templates/binary.yml" \
   -e "NUM_WORKERS=4" \
   -e "BATCH_SIZE=256" \
   catalyst-segmentation ./bin/catalyst-binary-segmentation-pipeline.sh
```
Semantic segmentation pipeline
#### Run in local environment:

```bash
CUDA_VISIBLE_DEVICES=0 \
CUDNN_BENCHMARK="True" \
CUDNN_DETERMINISTIC="True" \
WORKDIR=./logs \
DATADIR=./data/origin \
IMAGE_SIZE=256 \
CONFIG_TEMPLATE=./configs/templates/semantic.yml \
NUM_WORKERS=4 \
BATCH_SIZE=256 \
bash ./bin/catalyst-semantic-segmentation-pipeline.sh
```

#### Run in docker:

```bash
export LOGDIR=$(pwd)/logs
docker run -it --rm --shm-size 8G --runtime=nvidia \
   -v $(pwd):/workspace/ \
   -v $LOGDIR:/logdir/ \
   -v $(pwd)/data/origin:/data \
   -e "CUDA_VISIBLE_DEVICES=0" \
   -e "USE_WANDB=1" \
   -e "LOGDIR=/logdir" \
   -e "CUDNN_BENCHMARK='True'" \
   -e "CUDNN_DETERMINISTIC='True'" \
   -e "WORKDIR=/logdir" \
   -e "DATADIR=/data" \
   -e "IMAGE_SIZE=256" \
   -e "CONFIG_TEMPLATE=./configs/templates/semantic.yml" \
   -e "NUM_WORKERS=4" \
   -e "BATCH_SIZE=256" \
   catalyst-segmentation ./bin/catalyst-semantic-segmentation-pipeline.sh
```
Once the pipeline is running, you don't have to do anything else; all that remains is to wait for the best model!
Visualizations
You can use a W&B account for visualization right after `pip install wandb`:

```
wandb: (1) Create a W&B account
wandb: (2) Use an existing W&B account
wandb: (3) Don't visualize my results
```
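For reference, logging custom metrics to W&B outside the pipeline looks roughly like the sketch below; the project name and metric values are hypothetical (the pipeline itself enables W&B via the `USE_WANDB=1` environment variable shown above).

```python
# Hypothetical standalone W&B logging example; the pipeline enables
# W&B itself via USE_WANDB=1, so this is only for reference.
import wandb

wandb.init(project="catalyst-segmentation-demo")  # hypothetical project name
for epoch in range(3):
    wandb.log({"epoch": epoch, "iou": 0.5 + 0.1 * epoch})  # dummy metric values
wandb.finish()
```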

Tensorboard can also be used for visualization:

```bash
tensorboard --logdir=/catalyst.segmentation/logs
```

4. Results
All experiment results can be found locally in `WORKDIR`, by default `catalyst.segmentation/logs`. The results of a single experiment, for instance `catalyst.segmentation/logs/logdir-191107-094627-2f31d790`, contain:

checkpoints
- The directory contains all checkpoints: the best one, the last one, and the checkpoints of all stages.
- `best.pth` and `last.pth` can also be found in the corresponding experiment in your W&B account.

configs
- The directory contains the experiment's configs for reproducibility.

logs
- The directory contains all logs of the experiment.
- Metrics and logs can also be displayed in the corresponding experiment in your W&B account.

code
- The directory contains the code the experiment was run with. This is necessary for complete reproducibility.
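If you want to reuse a trained model outside the pipeline, loading a checkpoint might look like the sketch below. The key names inside the checkpoint dict are an assumption (Catalyst checkpoints are torch-serialized dicts, but key names may differ across versions).

```python
# Hypothetical checkpoint inspection; key names inside the dict are
# an assumption and may differ between Catalyst versions.
import torch

checkpoint = torch.load(
    "logs/logdir-191107-094627-2f31d790/checkpoints/best.pth",
    map_location="cpu",
)
print(checkpoint.keys())  # inspect what the checkpoint actually contains
state_dict = checkpoint.get("model_state_dict", checkpoint)  # assumed key
# model.load_state_dict(state_dict)  # with a compatible model instance
```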
5. Customize own pipeline
For your future experiments, the framework provides powerful configs that let you tune the whole segmentation pipeline in a controlled and reproducible way.
Configure your experiments
* Common settings for training stages and model parameters can be found in `catalyst.segmentation/configs/_common.yml`.
  * `model_params`: detailed configuration of models, including:
    * the model itself, for instance `ResnetUnet`
    * a detailed architecture description
    * whether to use a pretrained model
  * `stages`: you can configure training or inference in several stages with different hyperparameters. In our example:
    * optimizer params
    * first learn the head(s), then train the whole network
* The `CONFIG_TEMPLATE` with the other experiment hyperparameters, such as `data_params`, is here: `catalyst.segmentation/configs/templates/binary.yml`. The config allows you to define:
  * `data_params`: path, batch size, number of workers and so on
  * `callbacks_params`: callbacks are used to execute code during training, for example to compute metrics or save checkpoints. Catalyst provides a wide variety of helpful callbacks, and you can also use custom ones.

You can find many more options for configuring experiments in the [catalyst documentation](https://catalyst-team.github.io/catalyst/).
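As a quick way to experiment with a template before launching a run, one can load it with PyYAML and override a field programmatically, as in the hedged sketch below; the key paths shown (`stages`, `data_params`, `batch_size`) mirror the description above but should be checked against the actual template.

```python
# Hypothetical sketch: load a config template and override one field.
# The exact nesting of keys is an assumption; inspect the real template
# at configs/templates/binary.yml before relying on these paths.
import yaml

with open("configs/templates/binary.yml") as f:
    config = yaml.safe_load(f)

print(config.keys())  # e.g. model_params, stages, ...

# Override a data_params field, assuming it lives under "stages".
config.setdefault("stages", {}).setdefault("data_params", {})["batch_size"] = 64

with open("configs/my_experiment.yml", "w") as f:
    yaml.safe_dump(config, f)
```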
Owner
- Name: Catalyst-Team
- Login: catalyst-team
- Kind: organization
- Location: World
- Repositories: 23
- Profile: https://github.com/catalyst-team
Committers
Last synced: 8 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| dependabot-preview[bot] | 2****] | 13 |
| Yauheni Kachan | 1****i | 12 |
| Sergey Kolesnikov | s****r@g****m | 11 |
| Yauheni Kachan | i****e@g****m | 2 |
| Evgeny Semyonov | l****b@y****u | 1 |
Issues and Pull Requests
Last synced: 8 months ago
All Time
- Total issues: 6
- Total pull requests: 61
- Average time to close issues: about 1 month
- Average time to close pull requests: 10 days
- Total issue authors: 5
- Total pull request authors: 3
- Average comments per issue: 0.33
- Average comments per pull request: 0.51
- Merged pull requests: 26
- Bot issues: 1
- Bot pull requests: 44
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- BrunoKrinski (2)
- Servando1990 (1)
- ghost (1)
- Scitator (1)
- dependabot-preview[bot] (1)
Pull Request Authors
- dependabot-preview[bot] (44)
- bagxi (16)
- lightforever (1)