tfaip - a Generic and Powerful Research Framework for Deep Learning based on Tensorflow

https://github.com/planet-ai-gmbh/tfaip

Science Score: 93.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in JOSS metadata
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Scientific Fields

Artificial Intelligence and Machine Learning (Computer Science) - 78% confidence
Last synced: 4 months ago

Repository

Python-based research framework for developing, organizing, and deploying Deep Learning models powered by Tensorflow.

Basic Info
  • Host: GitHub
  • Owner: Planet-AI-GmbH
  • License: gpl-3.0
  • Language: Python
  • Default Branch: master
  • Homepage: https://tfaip.readthedocs.io
  • Size: 2.77 MB
Statistics
  • Stars: 12
  • Watchers: 3
  • Forks: 3
  • Open Issues: 0
  • Releases: 10
Archived
Created about 5 years ago · Last pushed over 3 years ago
Metadata Files
Readme License

README.md

tfaip - A Generic and Powerful Research Framework for Deep Learning based on Tensorflow

tfaip is a Python-based research framework for developing, organizing, and deploying Deep Learning models powered by Tensorflow. It enables implementing both simple and complex scenarios that are structured and highly configurable via parameters that can be modified directly from the command line (read the docs). For example, the tutorial.full scenario for learning MNIST allows modifying the graph during training as well as other hyper-parameters such as the optimizer:

```bash
export PYTHONPATH=$PWD  # set the PYTHONPATH so that the examples dir is found

# Change the graph
tfaip-train examples.tutorial.full --model.graph MLP --model.graph.nodes 200 100 50 --model.graph.activation relu
tfaip-train examples.tutorial.full --model.graph MLP --model.graph.nodes 200 100 50 --model.graph.activation tanh
tfaip-train examples.tutorial.full --model.graph CNN --model.graph.filters 40 20 --model.graph.dense 100

# Change the optimizer
tfaip-train examples.tutorial.full --trainer.optimizer RMSprop --trainer.optimizer.beta1 0.01 --trainer.optimizer.clipglobalnorm 1

# ...
```

A trained model can then easily be integrated into a workflow to predict on provided data:

```python
predictor = TutorialScenario.create_predictor("PATH_TO_TRAINED_MODEL", PredictorParams())
for sample in predictor.predict(data):
    print(sample.outputs)
```

In practice, tfaip follows the rules of object orientation, i.e., the code for a scenario (e.g., image classification (MNIST), text recognition, NLP, etc.) is organized by implementing classes. By default, each Scenario must implement Model and Data. See here for the complete code to run the above example for MNIST, and see here for the minimal setup.

Setup

To set up tfaip, create a virtual Python (at least 3.7) environment and install the tfaip pip package:

```bash
virtualenv -p python3 venv
source venv/bin/activate
pip install tfaip
pip install tfaip[devel]  # to install additional development/test requirements
```

Have a look at the wiki for further setup instructions.

Run the Tutorial

After the setup succeeded, launch a training of the tutorial, which is an implementation of the common MNIST scenario:

```bash
export PYTHONPATH=$PWD  # set the PYTHONPATH so that the examples dir is found
tfaip-train examples.tutorial.full

# If you have a GPU, select it by specifying its ID
tfaip-train examples.tutorial.full --device.gpus 0
```

Next Steps

Start reading the Minimum Tutorial, and optionally have a look at the Full Tutorial to see more features. The docs provide a full description of tfaip.

To set up a new custom scenario, copy the general template and implement the abstract methods. Consider renaming the classes! Launch the training by providing the path or package-name of the new scenario which must be located in the PYTHONPATH!

Features of tfaip

tfaip provides various features that allow designing generic scenarios with maximum flexibility and high performance.

Code design

  • Fully Object-Oriented: Implement classes and abstract functions or overwrite any function to extend, adapt, or modify its default functionality.
  • Typing support: tfaip is fully typed, which simplifies working with an IDE (e.g., use PyCharm!).
  • Using Python's dataclasses module to set up parameters, which are automatically converted to command-line parameters by our paiargparse package.
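
The idea can be sketched with the standard library alone. The following illustration uses stdlib argparse in place of paiargparse, and the `ModelParams` fields are hypothetical, chosen only to mirror the flags shown in the tutorial commands above:

```python
from dataclasses import dataclass, fields
import argparse

@dataclass
class ModelParams:
    # hypothetical parameters, for illustration only
    nodes: int = 100
    activation: str = "relu"

def parser_from_dataclass(cls, prefix="model"):
    """Build an argparse parser whose flags mirror the dataclass fields."""
    parser = argparse.ArgumentParser()
    for f in fields(cls):
        # expose each field as --<prefix>.<name>, keeping its type and default
        parser.add_argument(f"--{prefix}.{f.name}", type=f.type,
                            default=f.default, dest=f.name)
    return parser

parser = parser_from_dataclass(ModelParams)
args = parser.parse_args(["--model.nodes", "200"])
params = ModelParams(**vars(args))
print(params)  # ModelParams(nodes=200, activation='relu')
```

paiargparse additionally handles nested parameter dataclasses (e.g., `--model.graph.nodes`), which this flat sketch omits.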

Data-Pipeline

Every scenario requires the setup of a data-pipeline to read and transform data. tfaip makes it easy to implement and modify even complex pipelines by defining multiple DataProcessors, each of which usually implements a small operation mapping an input sample to an output sample. E.g., one DataProcessor loads the data (input=filename, output=image), another one applies normalization rules, yet another one applies data augmentation, etc. The great advantage of this setup is that the data processors run in Python and can automatically be parallelized by tfaip for a speed-up by setting run_parallel=True.
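
As an illustration of the concept only (this is not the real tfaip DataProcessor API), a chain of small sample-to-sample operations might look like:

```python
# Each "processor" is a small callable mapping one sample dict to the next.
def load(sample):
    # stand-in for loading: turn a filename into pixel data
    return {"filename": sample["filename"], "image": [0, 128, 255]}

def normalize(sample):
    # scale pixel values to [0, 1]
    sample["image"] = [p / 255.0 for p in sample["image"]]
    return sample

def run_pipeline(samples, processors):
    """Apply every processor to every sample, in order."""
    for sample in samples:
        for proc in processors:
            sample = proc(sample)
        yield sample

out = list(run_pipeline([{"filename": "img0.png"}], [load, normalize]))
print(out[0]["image"])  # first pixel 0.0, last pixel 1.0
```

Because each step is an independent per-sample function, a framework can transparently distribute samples across workers, which is what run_parallel=True enables in tfaip.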

Deep-Learning-Features

Since tfaip is based on Tensorflow, its full API is available for designing models, graphs, and even data pipelines. Furthermore, tfaip supports additional common techniques for improving the performance of a Deep-Learning model out of the box:

  • Warm-starting (i.e., loading a pretrained model)
  • EMA-weights
  • Early-Stopping
  • Weight-Decay
  • various optimizers and learning-rate schedules
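
As an example of one of these techniques, EMA weights keep an exponentially smoothed shadow copy of the trainable weights that is typically used for evaluation. A minimal pure-Python sketch of the update rule (not tfaip's actual implementation):

```python
def ema_update(shadow, weights, decay=0.9):
    """shadow <- decay * shadow + (1 - decay) * weights, element-wise."""
    return [decay * s + (1 - decay) * w for s, w in zip(shadow, weights)]

# after each training step, fold the current weights into the shadow copy
shadow = [0.0, 0.0]
for step_weights in ([1.0, 2.0], [1.2, 1.8], [0.8, 2.2]):
    shadow = ema_update(shadow, step_weights)
print(shadow)  # smoothed weights, less noisy than any single step
```

The shadow weights lag behind the raw weights but fluctuate less, which often yields slightly better validation metrics.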

Contributing

We highly encourage users to contribute own scenarios and improvements of tfaip. Please read the contribution guidelines.

Benchmarks

All timings were obtained on an Intel Core i7 (10th Gen) CPU.

MNIST

The following table compares the MNIST tutorial of Keras to the Minimum Tutorial. The Keras code was adapted to use the same network architecture and hyperparameter settings (batch size of 16, 10 epochs of training).

| Code  | Time per Epoch | Train Acc | Val Acc | Best Val Acc |
|:------|---------------:|----------:|--------:|-------------:|
| Keras | 16 s           | 99.65%    | 98.24%  | 98.60%       |
| tfaip | 18 s           | 99.76%    | 98.66%  | 98.66%       |

tfaip and Keras reach comparable accuracies, as is to be expected since the actual code for training the graph is fundamentally identical. tfaip is, however, a bit slower due to some overhead in the input pipeline and additional functionality (e.g., benchmarks or automatic tracking of the best model). This overhead is negligible for almost any real-world scenario because, with a clearly larger network architecture, the computation times for inference and backpropagation become the bottleneck.

Data Pipeline

Integrating pure-Python operations (e.g., numpy) into a tf.data.Dataset to apply high-level preprocessing is slow by default, since tf.data.Dataset.map in combination with tf.py_function does not run in parallel and is therefore blocked by Python's GIL. tfaip circumvents this issue by providing an (optional) parallelizable input pipeline. The following table shows the time in seconds for two different tasks:

  • PYTHON: applying some pure python functions on the data
  • NUMPY: applying several numpy operations on the data

| Mode           | Task   | Threads 1 | Threads 2 | Threads 4 | Threads 6 |
|:---------------|:-------|----------:|----------:|----------:|----------:|
| tf.py_function | PYTHON |     23.47 |     22.78 |     24.38 |     25.76 |
| tfaip          | PYTHON |     26.68 |     14.48 |      8.11 |      8.13 |
| tf.py_function | NUMPY  |    104.10 |     82.78 |     76.33 |     77.56 |
| tfaip          | NUMPY  |     97.07 |     56.93 |     43.78 |     42.73 |

The PYTHON task clearly shows that tf.data.Dataset.map is not able to utilize multiple threads. The speed-up in the NUMPY task likely occurs because numpy releases the GIL when calling into its C implementation.
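
The pattern tfaip exploits can be sketched with the standard library: per-sample pure-Python work is distributed over worker processes rather than threads, which side-steps the GIL entirely. This is a conceptual sketch, not tfaip's actual pipeline code:

```python
from concurrent.futures import ProcessPoolExecutor

def preprocess(x):
    # stand-in for an expensive pure-Python transformation on one sample
    return sum(i * i for i in range(x)) % 97

def run_sequential(samples):
    return [preprocess(x) for x in samples]

def run_parallel(samples, workers=4):
    # worker processes each run their own interpreter, so the GIL of the
    # main process no longer serializes the per-sample work
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(preprocess, samples))

if __name__ == "__main__":
    samples = list(range(100, 120))
    # both variants must produce identical results; only the timing differs
    assert run_parallel(samples) == run_sequential(samples)
```

Processes pay a serialization cost per sample, which is why the table above shows tfaip slightly slower than tf.py_function at a single thread but far faster once multiple workers are available.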

Owner

  • Name: Planet AI GmbH
  • Login: Planet-AI-GmbH
  • Kind: organization
  • Location: Rostock

Creating truly intelligent, cognitive systems

JOSS Publication

tfaip - a Generic and Powerful Research Framework for Deep Learning based on Tensorflow
Published
June 22, 2021
Volume 6, Issue 62, Page 3297
Authors
Christoph Wick ORCID
Planet AI GmbH, Warnowufer 60, 18059 Rostock, Germany
Benjamin Kühn
Planet AI GmbH, Warnowufer 60, 18059 Rostock, Germany
Gundram Leifert
Planet AI GmbH, Warnowufer 60, 18059 Rostock, Germany
Konrad Sperfeld
Institute of Mathematics, University of Rostock, 18051 Rostock, Germany
Tobias Strauß
Planet AI GmbH, Warnowufer 60, 18059 Rostock, Germany
Jochen Zöllner ORCID
Planet AI GmbH, Warnowufer 60, 18059 Rostock, Germany, Institute of Mathematics, University of Rostock, 18051 Rostock, Germany
Tobias Grüning ORCID
Planet AI GmbH, Warnowufer 60, 18059 Rostock, Germany
Editor
Arfon Smith ORCID
Tags
Deep Learning Tensorflow Keras Research High-Level Framework Generic

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 72
  • Total Committers: 4
  • Avg Commits per committer: 18.0
  • Development Distribution Score (DDS): 0.194
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
C. Wick w****r@g****m 58
jochen j****r@p****e 9
planetai-gmbh 8****h 3
Arfon Smith a****n 2

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 8
  • Total pull requests: 4
  • Average time to close issues: 6 months
  • Average time to close pull requests: about 1 month
  • Total issue authors: 4
  • Total pull request authors: 3
  • Average comments per issue: 3.38
  • Average comments per pull request: 1.0
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • bertsky (3)
  • andbue (3)
  • Het-Shah (1)
  • swiftRetreat (1)
Pull Request Authors
  • arfon (2)
  • JochenZoellner (1)
  • p42ul (1)

Dependencies

devel_requirements.txt pypi
  • black ==21.6b0 development
  • flake8 * development
  • pre-commit * development
  • pytest * development
  • pytest-timeout * development
  • pytest-xdist * development
docs/requirements.txt pypi
  • Sphinx-Substitution-Extensions *
  • myst-parser *
  • sphinx *
  • sphinx-rtd-theme *
examples/requirements.txt pypi
  • Levenshtein *
  • opencv-python-headless *
  • transformers *
requirements.txt pypi
  • GitPython *
  • adabelief-tf *
  • dataclasses-json *
  • imageio *
  • nptyping *
  • opencv-python-headless *
  • openpyxl *
  • paiargparse ==1.1.2
  • pandas *
  • pillow *
  • pooch ==1.4.0
  • prettytable *
  • python-Levenshtein *
  • scikit-image *
  • tensorflow >=2.4.0,<2.7.0
  • tensorflow_addons >=0.12.0
  • tqdm *
  • typeguard *
  • xlrd ==1.2.0