https://github.com/cahya-wirawan/text-classification

Text Classification engine using several algorithms in machine learning

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.4%) to scientific vocabulary

Keywords

bayesian machine-learning svm tensorflow text-analysis text-classification
Last synced: 5 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: cahya-wirawan
  • License: gpl-3.0
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 70.3 KB
Statistics
  • Stars: 10
  • Watchers: 3
  • Forks: 3
  • Open Issues: 1
  • Releases: 0
Created almost 9 years ago · Last pushed almost 9 years ago
Metadata Files
Readme License

README.md

Text Classification

The development of this repository has been moved to OpenTC, where it becomes a python package.

This is a text classification engine that uses several machine learning algorithms. The following algorithms will be supported:

  • Naive Bayes (Scikit-learn)
  • Support Vector Machine (Scikit-learn)
  • Convolutional Neural Network (Tensorflow)
  • FastText (Facebookresearch)

The engine runs as a server that listens for commands and for the text to be classified. By default it listens on localhost, port 3333; this can be changed in the YAML configuration file.

Requirements

  • Python 3.0 or newer
  • Scikit-learn
  • Tensorflow 1.0 or newer

textclassificationd.py

synopsis

textclassificationd.py

Description

The daemon listens for incoming connections on a TCP socket and classifies files or text strings on demand. It reads its configuration from /etc/textclassification/textclassification.yml

Commands

Commands are delimited by a newline character. If textclassificationd.py doesn't recognize a command, or the command doesn't meet the requirements specified below, it replies with an error message but keeps waiting for further commands (this behaviour may change in the future).
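The behaviour described above (newline-delimited commands, an error reply for unknown input, and the connection staying open afterwards) can be illustrated with a small sketch. This is not the project's code: the function names, the placeholder version string and the wording of the error reply are all assumptions made for illustration, and only a subset of the commands is handled.

```python
from typing import List, Tuple

# Hypothetical placeholder; the real daemon reports its own version.
PLACEHOLDER_VERSION = "0.0.0"


def handle_line(line: str) -> Tuple[str, bool]:
    """Handle one command line and return (reply, keep_open)."""
    command = line.strip()
    if command == "PING":
        return "PONG", True
    if command == "VERSION":
        return PLACEHOLDER_VERSION, True
    if command == "CLOSE":
        return "", False
    # Unknown command: reply with an error but keep the connection open.
    # The error text is an assumption, not the daemon's actual message.
    return "ERROR: unknown command {!r}".format(command), True


def handle_buffer(buffer: str) -> Tuple[List[str], str, bool]:
    """Consume every complete line in `buffer`.

    Returns the replies, the unconsumed remainder (a partial line still
    waiting for its newline), and whether the connection stays open.
    """
    replies = []
    keep_open = True
    while "\n" in buffer and keep_open:
        line, buffer = buffer.split("\n", 1)
        reply, keep_open = handle_line(line)
        if reply:
            replies.append(reply)
    return replies, buffer, keep_open
```

Keeping the unconsumed remainder matters on a TCP socket, because a single read may end in the middle of a command; the partial line is held back until its terminating newline arrives.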

PING

Check the server's state. It should reply with "PONG".
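A health check from the client side could look like the following sketch. It assumes the daemon is reachable at the default localhost:3333 mentioned above and that a short reply such as "PONG" arrives in a single read; the helper names send_command and is_alive are hypothetical and not part of the project.

```python
import socket


def send_command(command, host="localhost", port=3333, timeout=5.0):
    """Send one newline-terminated command and return the reply text.

    Assumption: the reply is short enough to arrive in a single recv().
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(command.encode("utf-8") + b"\n")
        data = conn.recv(4096)
    return data.decode("utf-8").strip()


def is_alive(host="localhost", port=3333, timeout=5.0):
    """Return True if the server answers PING with PONG."""
    try:
        return send_command("PING", host, port, timeout) == "PONG"
    except OSError:
        # Connection refused, timeout, and similar network errors.
        return False
```

With a daemon running on the default address, calling is_alive() would be expected to return True; if nothing is listening, the connection error is caught and the function returns False.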

VERSION

Print the program version.

RELOAD

Reload the engine.

LIST_CLASSIFIER

List the supported classifiers (at the moment three classifiers are supported: Bayesian, Support Vector Machine and Convolutional Neural Network). It also shows the status of each classifier, either True (enabled) or False (disabled).

SET_CLASSIFIER

Enable or disable a specific classifier.

PREDICT_STREAM

Classify text streams. A newline character is used as the delimiter between sentences.
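Because a newline separates sentences, any sentence that itself contains a line break would be read as two. The sketch below prepares text so that each sentence occupies exactly one line. The helper name to_stream_payload is hypothetical, collapsing whitespace runs into single spaces is a design choice of this sketch, and how the command and its payload are framed on the wire is not specified here, so only the text preparation is shown.

```python
def to_stream_payload(sentences):
    """Join sentences into newline-delimited text, one sentence per line.

    Embedded newlines and other whitespace runs inside a sentence are
    collapsed to single spaces; empty sentences are dropped.
    """
    lines = []
    for sentence in sentences:
        flat = " ".join(sentence.split())
        if flat:
            lines.append(flat)
    return "".join(line + "\n" for line in lines)
```

For example, to_stream_payload(["Hello\nworld", "Second sentence."]) yields two lines, "Hello world" and "Second sentence.", each terminated by a newline.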

PREDICT_FILE

Classify a file. A newline character is used as the delimiter between sentences.

CLOSE

Close the connection.

Owner

  • Name: Cahya Wirawan
  • Login: cahya-wirawan
  • Kind: user
  • Location: Vienna, Austria

System engineer, currently working on NLP, CV and Speech Recognition for fun and curiosity

Committers

Last synced: 8 months ago

All Time
  • Total Commits: 47
  • Total Committers: 1
  • Avg Commits per committer: 47.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Cahya Wirawan c****n@g****m 47

Issues and Pull Requests

Last synced: 8 months ago

All Time
  • Total issues: 1
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 1
  • Total pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • nareshraju93 (1)