https://github.com/agamiko/neural-based-data-augmentation
Improving generalization via style transfer-based data augmentation: Novel regularization method
Science Score: 23.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: links to arxiv.org, scholar.google, sciencedirect.com, ieee.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (12.0%) to scientific vocabulary
Repository
Improving generalization via style transfer-based data augmentation: Novel regularization method
Basic Info
- Host: GitHub
- Owner: AgaMiko
- Default Branch: master
- Size: 3.11 MB
Statistics
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Improving generalization via style transfer-based data augmentation: Novel regularization method

Introduction
Deep learning algorithms are currently considered state-of-the-art in many classification tasks, yet weak generalization remains a common, widely reported, and still unresolved problem.
This paper focuses on data augmentation. In our method, new images are synthesized with neural style transfer (NST),
and the generated images are then used to train a convolutional neural network (CNN) in order to improve
its generalization ability.
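To make the style-transfer step more concrete, here is a minimal sketch of the per-layer style loss from A Neural Algorithm of Artistic Style, which compares Gram matrices of feature maps. It is an illustration only, not the code used to build this database: the toy feature values and the names `gram_matrix`, `style_loss`, `benign_features` and `malignant_features` are assumptions, and in practice the features would come from a pretrained CNN rather than hand-written lists.

```python
def gram_matrix(features):
    """features: list of C channels, each a flat list of M activations."""
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in features]
            for fi in features]


def style_loss(generated, style):
    """Squared difference of Gram matrices, scaled by 1 / (4 * C^2 * M^2)."""
    c, m = len(generated), len(generated[0])
    g, a = gram_matrix(generated), gram_matrix(style)
    sq = sum((g[i][j] - a[i][j]) ** 2 for i in range(c) for j in range(c))
    return sq / (4.0 * c ** 2 * m ** 2)


# Toy feature maps: 2 channels, 3 spatial positions each (made-up numbers).
benign_features = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
malignant_features = [[2.0, 1.0, 0.0], [1.0, 1.0, 0.0]]

print(style_loss(benign_features, malignant_features))
print(style_loss(malignant_features, malignant_features))  # identical styles
```

Minimizing this loss with respect to the generated image, together with a content loss, pushes the image toward the texture statistics of the style image while keeping the layout of the content image.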
The main contributions of this paper are:
* Proposing neural style transfer for data augmentation (ST-DA). The approach is demonstrated on a skin lesion case study, in which a benign skin lesion is transformed into a malignant one, and is tested with a dataset enrichment evaluation;
* Incorporating unlabeled, synthesized data into training by adding pseudo-labels generated by another CNN;
* Limiting the problem of noisy pseudo-labels in the synthetic images used for CNN training by using only real images in the validation and test sets;
* Evaluating with Deep Taylor Decomposition whether the artificially generated data enriches the training dataset;
* Showing that the ST-DA method significantly improves the performance and repeatability of training for deep neural networks.
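The pseudo-labeling and data-split ideas from the list above can be sketched as follows. This is a hedged illustration under stated assumptions, not the authors' pipeline: `pseudo_label`, `build_splits`, `toy_scorer`, the 0.5 threshold and the split fractions are all hypothetical, and `toy_scorer` merely stands in for the auxiliary CNN that would score each synthetic image.

```python
import random


def pseudo_label(images, predict_malignant_prob, threshold=0.5):
    """Attach a hard pseudo-label to each unlabeled synthetic image.

    predict_malignant_prob is a stand-in for the auxiliary CNN: any callable
    returning a probability in [0, 1] for one image.
    """
    return [(img, "malignant" if predict_malignant_prob(img) >= threshold
             else "benign") for img in images]


def build_splits(real, synthetic, val_frac=0.2, test_frac=0.2, seed=0):
    """Hold out validation and test sets from real images only; synthetic,
    pseudo-labeled images are added to the training set alone."""
    real = list(real)
    random.Random(seed).shuffle(real)
    n_val = int(len(real) * val_frac)
    n_test = int(len(real) * test_frac)
    val, test = real[:n_val], real[n_val:n_val + n_test]
    train = real[n_val + n_test:] + list(synthetic)
    return train, val, test


def toy_scorer(name):
    # Hypothetical scorer: odd-numbered images get a high malignancy score.
    return 0.9 if int(name.split("_")[1]) % 2 else 0.1


real = [(f"real_{i}", "benign" if i % 2 else "malignant") for i in range(10)]
synthetic = pseudo_label([f"synth_{i}" for i in range(4)], toy_scorer)
train, val, test = build_splits(real, synthetic)
print(len(train), len(val), len(test))
```

Because the validation and test sets never contain synthetic images, any noise in the pseudo-labels can affect training but not the reported evaluation.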
ST-DA
How-to
A short, friendly how-to tutorial will be available here soon.
Details
The results and details of the method will soon be available in the original paper here: soon
In the meantime, you can check our previous papers about data augmentation:
* Data augmentation for improving deep learning in image classification problem, 2018
* Style transfer-based image synthesis as an efficient regularization technique in deep learning, 2019
Database
Download
The database contains 248 489 unlabeled, generated dermoscopic images of skin lesions (224x224 px).
* A few full-size examples can be found here
* The database will be available for download here (soon)
If you use this database please star the repository and cite the following paper (soon):
"Improving generalization via style transfer-based data augmentation: Novel regularization method", by Agnieszka Mikołajczyk, Michał Grochowski, Arkadiusz Kwasigroch
Sources
The database was generated using the following sources:
- Image generation:
- Style transfer original paper: A Neural Algorithm of Artistic Style is the first paper that presented Neural Style Transfer.
- Style transfer implementation: Implementation of Neural Style Transfer & Neural Doodles from the paper A Neural Algorithm of Artistic Style in Keras 2.0+
- Explainability method:
- Deep Taylor decomposition: DeepTaylor computes for each neuron a root point that is close to the input but whose output value is 0, and uses this difference to estimate the attribution of each neuron recursively.
- Repository: the iNNvestigate library contains implementations of SmoothGrad, DeConvNet, Guided BackProp, PatternNet, DeepTaylor, PatternAttribution, IntegratedGradients and DeepLIFT.
- Source database:
- ISIC Archive: The ISIC Archive contains over 23k images of skin lesions, labeled as 'benign' or 'malignant'. Those images were used to generate our database.
- ISIC Archive Downloader: A script to download the ISIC Archive of lesion images
- Previous papers about data augmentation:
- Similar projects:
- Generating skin lesions with GANs - Beating Melanoma with Deep Learning: letting the data speak
- Other:
- VGG8: Selected Technical Issues of Deep Neural Networks for Image Classification Purposes presents the details of the VGG8 architecture.
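As a small illustration of the Deep Taylor decomposition entry in the list above, the sketch below redistributes the relevance of a single neuron to its inputs with the z+ rule, one of the propagation rules derived in the Deep Taylor framework for layers with nonnegative inputs (for example, after a ReLU). It is an assumption-laden toy, not the iNNvestigate implementation: the function name `zplus_relevance` and the numbers are made up for the example.

```python
def zplus_relevance(x, w, relevance_out):
    """Redistribute one neuron's relevance to its inputs with the z+ rule.

    x: nonnegative input activations, w: the neuron's weights,
    relevance_out: relevance assigned to the neuron's output.
    """
    z = [xi * max(wi, 0.0) for xi, wi in zip(x, w)]  # positive contributions
    total = sum(z)
    if total == 0.0:
        return [0.0 for _ in x]  # nothing to redistribute
    return [zi / total * relevance_out for zi in z]


x = [1.0, 2.0, 0.5]   # toy activations
w = [0.5, -1.0, 2.0]  # toy weights; the negative weight receives no relevance
r = zplus_relevance(x, w, relevance_out=3.0)
print(r, sum(r))
```

Applying such a rule layer by layer, from the output back to the pixels, yields a heatmap whose total relevance matches the relevance at the output, up to floating-point error.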
Owner
- Name: Agnieszka Mikołajczyk
- Login: AgaMiko
- Kind: user
- Location: Gdańsk
- Company: Gdansk University of Technology/ Voicelab.ai
- Website: https://amikolajczyk.netlify.com/
- Twitter: AgnMikolajczyk
- Repositories: 26
- Profile: https://github.com/AgaMiko
Machine Learning Scientist & Enthusiast🤖 https://twitter.com/AgnMikolajczyk LN: https://www.linkedin.com/in/agnieszkamikolajczyk/
Issues and Pull Requests
Last synced: 11 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0