https://github.com/cahya-wirawan/text-classification
Text Classification engine using several algorithms in machine learning
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (9.4%) to scientific vocabulary
Repository
Statistics
- Stars: 10
- Watchers: 3
- Forks: 3
- Open Issues: 1
- Releases: 0
Metadata Files
README.md
Text Classification
Development of this repository has moved to OpenTC, where it is now a Python package.
This is a text classification engine that uses several machine learning algorithms. The following algorithms will be supported:
- Naive Bayes (Scikit-learn)
- Support Vector Machine (Scikit-learn)
- Convolutional Neural Network (Tensorflow)
- FastText (Facebookresearch)
The engine runs as a server that listens for commands and for text to classify. By default it listens on localhost, port 3333; this can be changed in the YAML configuration file.
Requirements
- Python 3.0 or newer
- Scikit-learn
- Tensorflow 1.0 or newer
textclassificationd.py
Synopsis
textclassificationd.py
Description
The daemon listens for incoming connections on a TCP socket and classifies files or text strings on demand. It reads its configuration from /etc/textclassification/textclassification.yml
Commands
Each command is terminated by a newline character. If textclassificationd.py does not recognize a command, or the command does not meet the requirements specified below, it replies with an error message and keeps waiting for further commands (this behaviour may change in the future).
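The framing rule above can be illustrated with a small sketch. It is not the daemon's actual code: the helper names `split_commands` and `handle`, the error wording, and the `OK` placeholder reply are all hypothetical. It only shows how newline-terminated commands can be separated from a byte buffer and how an unknown command can yield an error reply without ending the session.

```python
# Hypothetical sketch of newline-delimited command handling.
# Command names come from the README; replies other than "PONG" are placeholders.
KNOWN_COMMANDS = {
    "PING", "VERSION", "RELOAD", "LIST_CLASSIFIER",
    "SET_CLASSIFIER", "PREDICT_STREAM", "PREDICT_FILE", "CLOSE",
}


def split_commands(buffer: bytes):
    """Split received bytes into complete lines and the unfinished remainder."""
    *lines, rest = buffer.split(b"\n")
    return [line.decode("utf-8").strip() for line in lines], rest


def handle(line: str) -> str:
    """Return a reply for one command line; unknown commands get an error reply."""
    name = line.split(" ", 1)[0] if line else ""
    if name not in KNOWN_COMMANDS:
        return "ERROR unknown command"  # assumed wording, not the daemon's
    if name == "PING":
        return "PONG"
    return "OK " + name  # placeholder for the real dispatch


# Two complete commands and one partial command still waiting for its newline.
lines, rest = split_commands(b"PING\nFOO\nVER")
replies = [handle(line) for line in lines]
```

Keeping the unfinished remainder in `rest` lets the caller append the next chunk read from the socket before splitting again.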
PING
Check the server's state. It should reply with "PONG".
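A minimal client sketch for this health check, assuming the daemon is reachable on the default localhost port 3333 and that the reply fits in a single `recv` call. The function names `send_command` and `is_alive` are hypothetical and not part of the project.

```python
import socket


def send_command(command: str, host: str = "localhost", port: int = 3333,
                 timeout: float = 5.0) -> str:
    """Send one newline-terminated command and return the stripped reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("utf-8") + b"\n")
        # Assumption: the whole reply arrives in one chunk of at most 4096 bytes.
        reply = sock.recv(4096)
    return reply.decode("utf-8").strip()


def is_alive(host: str = "localhost", port: int = 3333) -> bool:
    """Return True if the server answers PING with PONG, False on any socket error."""
    try:
        return send_command("PING", host, port) == "PONG"
    except OSError:
        return False
```

Calling `is_alive()` with no arguments checks the default address; pass a different host or port if the YAML configuration overrides them.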
VERSION
Print the program version.
RELOAD
Reload the engine.
LIST_CLASSIFIER
List the supported classifiers (at the moment three classifiers are supported: Bayesian, Support Vector Machine and Convolutional Neural Network). It also shows the status of each classifier, either True (enabled) or False (disabled).
SET_CLASSIFIER
Enable or disable a specific classifier.
PREDICT_STREAM
Classify a text stream. Sentences are delimited by newline characters.
PREDICT_FILE
Classify a file. Sentences are delimited by newline characters.
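Both prediction commands treat each line as one sentence. The sketch below shows one way to prepare text in that one-sentence-per-line form. The helper names `to_sentences` and `to_payload` are hypothetical, and the README does not specify how the payload is attached to PREDICT_STREAM or how the stream is terminated, so this covers only the line splitting.

```python
def to_sentences(text: str):
    """Return the non-empty lines of text, treating each line as one sentence."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def to_payload(sentences):
    """Join sentences with newlines, collapsing inner whitespace (including
    embedded newlines) so that every sentence stays on a single line."""
    return "\n".join(" ".join(s.split()) for s in sentences) + "\n"


sentences = to_sentences("The match ended in a draw.\n\nShares fell sharply today.\n")
payload = to_payload(sentences)
```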
CLOSE
Close the connection.
Owner
- Name: Cahya Wirawan
- Login: cahya-wirawan
- Kind: user
- Location: Vienna, Austria
- Website: https://www.linkedin.com/in/cahyawirawan/
- Twitter: CahyaWr
- Repositories: 171
- Profile: https://github.com/cahya-wirawan
System engineer, currently working on NLP, CV and Speech Recognition for fun and curiosity
Committers
Last synced: 8 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Cahya Wirawan | c****n@g****m | 47 |
Issues and Pull Requests
Last synced: 8 months ago
All Time
- Total issues: 1
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 1
- Total pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- nareshraju93 (1)