deeplearn

General deep learning tools

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
○ Committers with academic emails
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (16.6%) to scientific vocabulary

Keywords

bilstm-crf convolutional-neural-networks crf deep-learning pypi pypi-link pytorch pytorch-cnn utility vectorized-features

Scientific Fields

Artificial Intelligence and Machine Learning Computer Science - 31% confidence

Last synced: 4 months ago

Repository

Basic Info

Host: GitHub
Owner: plandes
License: other
Language: Python
Default Branch: master
Homepage: https://plandes.github.io/deeplearn/
Size: 4.92 MB

Statistics

Stars: 2
Watchers: 2
Forks: 0
Open Issues: 0
Releases: 0

Topics

bilstm-crf convolutional-neural-networks crf deep-learning pypi pypi-link pytorch pytorch-cnn utility vectorized-features

Created over 6 years ago · Last pushed 4 months ago

Metadata Files

Readme Changelog Contributing License Citation

Deep Zensols Deep Learning Framework

This deep learning library was designed to provide consistent and reproducible results.

See the full documentation.
See the paper

Features:
* An easy-to-configure framework that allows for programmatic debugging of neural networks.
* Reproducibility of results:
  * All random seed state is persisted in the trained model files.
  * Keys and key order are persisted across the train, validation and test sets.
* Analysis of results with complete metrics available.
* A vectorization framework that allows for pickling tensors.
* Additional layers:
  * Full BiLSTM-CRF and stand-alone CRF implementation using easy-to-configure constituent layers.
  * Easy-to-configure N deep convolution layers with automatic dimensionality calculation and configurable pooling and batch centering.
  * Convolutional layer factory with dimensionality calculation.
  * Recurrent layers that abstract RNN, GRU and LSTM.
  * N deep linear layers.
  * Each layer is configurable with activation, dropout and batch normalization.
* Pandas integration to load data, easily manage it, and report results.
* Multi-processing for time-consuming CPU feature vectorization, requiring little to no coding.
* Resource and tensor deallocation with memory management.
* Real-time performance and loss metrics with plotting while training.
* Thorough unit test coverage.
* Debugging of layers using the easy-to-configure Python logging module and control points.
* A workflow and API to package and distribute models, then automatically download, install and run inference with them in (optionally) two separate code bases.
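As an illustration of the dimensionality calculation mentioned for the convolution layers, the sketch below applies the standard output-size formula for convolution and pooling along one axis. The function names `conv_out_size` and `conv2d_out_shape` are hypothetical and are not part of this library's API; the library's own layer factory may compute this differently.

```python
from typing import Tuple


def conv_out_size(size: int, kernel: int, stride: int = 1,
                  padding: int = 0, dilation: int = 1) -> int:
    """Output length along one axis of a convolution or pooling layer.

    Standard formula: floor((size + 2*padding - dilation*(kernel-1) - 1) / stride) + 1
    """
    effective_kernel = dilation * (kernel - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1


def conv2d_out_shape(height: int, width: int, kernel: int,
                     stride: int = 1, padding: int = 0) -> Tuple[int, int]:
    """Apply the one-axis formula to both axes of a square-kernel layer."""
    return (conv_out_size(height, kernel, stride, padding),
            conv_out_size(width, kernel, stride, padding))


# A 28x28 image through a 5x5 convolution, then 2x2 pooling with stride 2.
after_conv = conv2d_out_shape(28, 28, kernel=5)
after_pool = conv2d_out_shape(*after_conv, kernel=2, stride=2)
```

Chaining the calls this way shows why automatic calculation is convenient: each layer's input shape is the previous layer's output shape, so stacking N layers by hand is error prone.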

Much of the code provides convenience functionality to PyTorch. However, there is functionality that could be used for other deep learning APIs.
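The reproducibility feature listed above, persisting random seed state with the trained model, can be illustrated independently of any deep learning API. The following is a minimal sketch that uses only the Python standard library `random` module as a stand-in for a framework's generators; `save_checkpoint` and `load_checkpoint` are hypothetical names, not functions from this library. In PyTorch the analogous calls would be `torch.get_rng_state` and `torch.set_rng_state`.

```python
import pickle
import random
from typing import Any, Dict


def save_checkpoint(weights: Dict[str, Any]) -> bytes:
    """Serialize the weights together with the current RNG state."""
    return pickle.dumps({"weights": weights, "rng_state": random.getstate()})


def load_checkpoint(blob: bytes) -> Dict[str, Any]:
    """Restore the weights and reset the RNG to the state saved with them."""
    checkpoint = pickle.loads(blob)
    random.setstate(checkpoint["rng_state"])
    return checkpoint["weights"]


random.seed(0)
blob = save_checkpoint({"w": [0.1, 0.2]})
first_draw = [random.random() for _ in range(3)]

# Loading the checkpoint rewinds the generator, so the draws repeat exactly.
weights = load_checkpoint(blob)
second_draw = [random.random() for _ in range(3)]
```

Storing the generator state next to the weights means a later run can continue with the same random sequence the original run would have produced, rather than merely starting from the same initial seed.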

Documentation

See the full documentation.

Obtaining

The easiest way to install the command line program is via the pip installer:

    pip3 install zensols.deeplearn

Binaries are also available on pypi.

Workflow

This package provides a workflow for processing features, training and then testing a model. A high level outline of this process follows:
1. Container objects are used to represent and access data as features.
2. Instances of data points wrap the container objects.
3. Vectorize the features of each data point into tensors.
4. Store the vectorized tensor features to disk so they can be retrieved quickly and frequently.
5. At train time, load the vectorized features into memory and train.
6. Test the model and store the results to disk.
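The first steps of this outline can be sketched in plain Python. This is only an illustration of the idea under stated assumptions, not this library's API: the classes `Flower` and `DataPoint` and the functions `vectorize`, `write_batches` and `read_batches` are hypothetical, plain lists stand in for tensors, and the training and testing steps are left out.

```python
import pickle
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Iterator, List, Tuple

LABELS = ["setosa", "versicolor", "virginica"]
Vector = Tuple[List[float], int]


@dataclass
class Flower:
    """Container object: one raw row of the data set."""
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float
    species: str


@dataclass
class DataPoint:
    """Wraps a container object and gives it a stable key."""
    key: int
    flower: Flower


def vectorize(point: DataPoint) -> Vector:
    """Turn one data point into numeric features and a label index."""
    f = point.flower
    features = [f.sepal_length, f.sepal_width, f.petal_length, f.petal_width]
    return features, LABELS.index(f.species)


def write_batches(points: List[DataPoint], batch_size: int,
                  directory: Path) -> List[Path]:
    """Vectorize once and store each batch on disk for fast reuse."""
    paths = []
    for start in range(0, len(points), batch_size):
        batch = [vectorize(p) for p in points[start:start + batch_size]]
        path = directory / f"batch-{start // batch_size}.pkl"
        path.write_bytes(pickle.dumps(batch))
        paths.append(path)
    return paths


def read_batches(paths: List[Path]) -> Iterator[List[Vector]]:
    """At train time, load the stored batches back into memory."""
    for path in paths:
        yield pickle.loads(path.read_bytes())


# Sample rows for illustration only.
rows = [Flower(5.1, 3.5, 1.4, 0.2, "setosa"),
        Flower(7.0, 3.2, 4.7, 1.4, "versicolor"),
        Flower(6.3, 3.3, 6.0, 2.5, "virginica")]
points = [DataPoint(key, row) for key, row in enumerate(rows)]
with tempfile.TemporaryDirectory() as tmp:
    paths = write_batches(points, batch_size=2, directory=Path(tmp))
    batches = list(read_batches(paths))
```

Separating vectorization from training in this way means the expensive feature work is done once, and every later training run only pays the cost of reading the stored batches.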

To jump right in, see the examples section. However, it is better to peruse the in-depth explanation, which follows the Iris example code and covers:
* The initial data processing, which includes data representation to batch creation.
* Creating and configuring the model.
* Using a facade to train, validate and test the model.
* Analysis of results, including training/validation loss graphs and performance metrics.

Examples

The Iris example is the most basic example of how to use this framework. It is covered in detail in the workflow documentation.

There are also examples in the form of Jupyter notebooks, which include the:
* Iris notebook, which uses a small data set of flower dimensions as a three-label classification,
* MNIST notebook for the handwritten digit data set,
* debugging notebook.

Attribution

This project, or example code, uses:
* PyTorch as the underlying framework.
* Branched code from Torch CRF for the CRF class.
* pycuda for Python integration with CUDA.
* scipy for scientific utility.
* Pandas for prediction output.
* matplotlib for plotting loss curves.

Corpora used include:
* Iris data set
* Adult data set
* MNIST data set

Torch CRF

The CRF class was taken and modified from Kemal Kurniawan's pytorch_crf GitHub repository. See the README.md module documentation for more information. This module was forked from pytorch_crf with modifications. However, the modifications were not merged and the project appears to be inactive.

Important: This project will change to use it as a dependency pending merging of the changes needed by this project. Until then, it will remain as a separate class in this project, which is easier to maintain as the only class/code is the CRF class.

The pytorch_crf repository uses the same license as this repository, which is the MIT License. For this reason, there are no software/package tainting issues.

Known Bugs

The batching process can "step on its own feet" by trying to download, uncompress and compile word vectors in several processes at once. This happens when batching is configured for multi-processing (more than one worker), word vectors are used (i.e. loaded as a resource library), and the word vectors have not yet been downloaded and compiled into binary files.
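One general way to avoid this kind of collision, offered here as an assumption rather than a documented fix for this project, is to prepare the shared resource once in the parent process before any workers start. The sketch below uses a hypothetical helper `prepare_once` and a stand-in `fake_compile` function in place of the real download and compile step.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def prepare_once(target: Path, build: Callable[[Path], None]) -> bool:
    """Build ``target`` if it is missing; return True when a build ran.

    The build writes to a temporary file in the same directory, which is
    then renamed into place, so readers never see a half-written file.
    """
    if target.exists():
        return False
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    os.close(fd)
    tmp_path = Path(tmp_name)
    try:
        build(tmp_path)
        os.replace(tmp_path, target)
    finally:
        # Clean up the temporary file if the build failed part way.
        if tmp_path.exists():
            tmp_path.unlink()
    return True


def fake_compile(path: Path) -> None:
    # Hypothetical stand-in for downloading and compiling word vectors.
    path.write_text("compiled vectors")


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "vectors.bin"
    built_first = prepare_once(target, fake_compile)
    built_second = prepare_once(target, fake_compile)
    contents = target.read_text()
```

Calling such a helper in the parent process before the worker pool is created means every worker finds the compiled file already in place and only reads it.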

Citation

If you use this project in your research please use the following BibTeX entry:

    @inproceedings{landes-etal-2023-deepzensols,
        title = "{D}eep{Z}ensols: A Deep Learning Natural Language Processing Framework for Experimentation and Reproducibility",
        author = "Landes, Paul and Di Eugenio, Barbara and Caragea, Cornelia",
        editor = "Tan, Liling and Milajevs, Dmitrijs and Chauhan, Geeticka and Gwinnup, Jeremy and Rippeth, Elijah",
        booktitle = "Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023)",
        month = dec,
        year = "2023",
        address = "Singapore, Singapore",
        publisher = "Association for Computational Linguistics",
        url = "https://aclanthology.org/2023.nlposs-1.16",
        pages = "141--146"
    }

Changelog

An extensive changelog is available here.

Community

Please star the project and let me know how and where you use this API. Contributions as pull requests, feedback and any other input are welcome.

License

MIT License

Owner

Name: Paul Landes
Login: plandes
Kind: user

Repositories: 90
Profile: https://github.com/plandes

Citation (CITATION.cff)

cff-version: 1.2.0
title: >-
  DeepZensols: Deep Learning Framework
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
date-released: 2023-12-05
repository-code: https://github.com/plandes/deepnlp
authors:
  - given-names: Paul
    family-names: Landes
    email: landes@mailc.net
    affiliation: University of Illinois at Chicago
    orcid: 'https://orcid.org/0000-0003-0985-0864'
preferred-citation:
  type: conference-paper
  authors:
    - given-names: Paul
      family-names: Landes
      email: landes@mailc.net
      affiliation: University of Illinois at Chicago
      orcid: 'https://orcid.org/0000-0003-0985-0864'
    - given-names: Barbara
      family-names: Di Eugenio
      affiliation: University of Illinois at Chicago
    - given-names: Cornelia
      family-names: Caragea
      affiliation: University of Illinois at Chicago
  title: >-
    DeepZensols: A Deep Learning Natural Language Processing Framework for
    Experimentation and Reproducibility
  url: https://aclanthology.org/2023.nlposs-1.16/
  year: 2023
  conference:
    name: >-
      Proceedings of the 3rd Workshop for Natural Language Processing Open
      Source Software, Empirical Methods in Natural Language Processing
    city: Singapore
    country: SG
    date-start: 2023-12-05
    date-end: 2023-12-05

GitHub Events

Total

Delete event: 1
Push event: 52
Create event: 13

Last Year

Delete event: 1
Push event: 52
Create event: 13

Committers

Last synced: over 1 year ago

All Time

Total Commits: 1,150
Total Committers: 1
Avg Commits per committer: 1,150.0
Development Distribution Score (DDS): 0.0

Past Year

Commits: 68
Committers: 1
Avg Commits per committer: 68.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Paul Landes	l**s@m**t	1,150

Committer Domains (Top 20 + Academic)

mailc.net: 1

Issues and Pull Requests

Last synced: 4 months ago

All Time

Total issues: 0
Total pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Total issue authors: 0
Total pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Packages

Total packages: 1
Total downloads:
- pypi 165 last-month

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 43
Total maintainers: 1

pypi.org: zensols-deeplearn

This deep learning library was designed to provide consistent and reproducible results.

Homepage: https://github.com/plandes/deeplearn
Documentation: https://plandes.github.io/deeplearn
License: MIT
Latest release: 1.14.3
published 4 months ago

Versions: 43
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 165 Last month

Rankings

Dependent packages count: 9.0%

Downloads: 15.4%

Average: 25.0%

Dependent repos count: 50.6%

Maintainers (1)

zensols

Last synced: 4 months ago

Dependencies

src/python/requirements.txt pypi

matplotlib *
numpy *
pandas *
scikit-learn *
scipy *
torch *
torchvision *
zensols.install *
zensols.util *

.github/workflows/test.yml actions

actions/checkout v2.4.0 composite
actions/setup-python v2 composite
