https://github.com/artificialzeng/keras-textclassification

中文长文本分类、短句子分类、多标签分类、两句子相似度(Chinese Text Classification of Keras NLP, multi-label classify, or sentence classify, long or short),字词句向量嵌入层(embeddings)和网络层(graph)构建基类,FastText,TextCNN,CharCNN,TextRNN, RCNN, DCNN, DPCNN, VDCNN, CRNN, Bert, Xlnet, Albert, Attention, DeepMoji, HAN, 胶囊网络-CapsuleNet, Transformer-encode, Seq2seq, SWEM, LEAM, TextGCN

https://github.com/artificialzeng/keras-textclassification

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (6.6%) to scientific vocabulary
Last synced: 9 months ago · JSON representation

Repository

中文长文本分类、短句子分类、多标签分类、两句子相似度(Chinese Text Classification of Keras NLP, multi-label classify, or sentence classify, long or short),字词句向量嵌入层(embeddings)和网络层(graph)构建基类,FastText,TextCNN,CharCNN,TextRNN, RCNN, DCNN, DPCNN, VDCNN, CRNN, Bert, Xlnet, Albert, Attention, DeepMoji, HAN, 胶囊网络-CapsuleNet, Transformer-encode, Seq2seq, SWEM, LEAM, TextGCN

Basic Info
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Fork of yongzhuo/Keras-TextClassification
Created almost 6 years ago · Last pushed over 4 years ago

https://github.com/ArtificialZeng/Keras-TextClassification/blob/master/

# [Keras-TextClassification](https://github.com/yongzhuo/Keras-TextClassification)

[![PyPI](https://img.shields.io/pypi/v/Keras-TextClassification)](https://pypi.org/project/Keras-TextClassification/)
[![Build Status](https://travis-ci.com/yongzhuo/Keras-TextClassification.svg?branch=master)](https://travis-ci.com/yongzhuo/Keras-TextClassification)
[![PyPI_downloads](https://img.shields.io/pypi/dm/Keras-TextClassification)](https://pypi.org/project/Keras-TextClassification/)
[![Stars](https://img.shields.io/github/stars/yongzhuo/Keras-TextClassification?style=social)](https://github.com/yongzhuo/Keras-TextClassification/stargazers)
[![Forks](https://img.shields.io/github/forks/yongzhuo/Keras-TextClassification.svg?style=social)](https://github.com/yongzhuo/Keras-TextClassification/network/members)
[![Join the chat at https://gitter.im/yongzhuo/Keras-TextClassification](https://badges.gitter.im/yongzhuo/Keras-TextClassification.svg)](https://gitter.im/yongzhuo/Keras-TextClassification?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)



# Install()

```bash
pip install Keras-TextClassification
```

```python
step2: download and unzip the dir of 'data.rar', : https://pan.baidu.com/s/1I3vydhmFEQ9nuPG2fDou8Q : rket
       cover the dir of data to anaconda, like '/anaconda/3.5.1/envs/tensorflow13/Lib/site-packages/keras_textclassification/data'
step3: goto # Train&Usage() and Predict&Usage()
```

# keras_textclassification,...
    - Electra-fineture(todo)
    - Albert-fineture
    - Xlnet-fineture
    - Bert-fineture
    - FastText
    - TextCNN
    - charCNN
    - TextRNN
    - TextRCNN
    - TextDCNN
    - TextDPCNN
    - TextVDCNN
    - TextCRNN
    - DeepMoji
    - SelfAttention
    - HAN
    - CapsuleNet
    - Transformer-encode
    - SWEM
    - LEAM
    - TextGCN(todo)


# run(, FastText)
    - 1. keras_textclassification/m01_FastText
    - 2. :  train.py,   : python train.py
    - 3. :  predict.py, : python predict.py
    - : pre trainrandom embedding100data

# run(/Embedding/test/sample)
    - bert,word2vec,randomtest/, word2vec(char or word), random-word,  bert(chinese_L-12_H-768_A-12),
    - multi_multi_class/text-cnnmulti-onehot
    - sentence_similarity/bert,data/sim_webank/
    - predict_bert_text_cnn.py
    - tet_char_bert_embedding.py
    - tet_char_bert_embedding.py
    - tet_char_xlnet_embedding.py
    - tet_char_random_embedding.py
    - tet_char_word2vec_embedding.py
    - tet_word_random_embedding.py
    - tet_word_word2vec_embedding.py

# keras_textclassification/data
    - 
      ** github: https://pan.baidu.com/s/1I3vydhmFEQ9nuPG2fDou8Q : rket
    - baidu_qa_2019qatitle17''
       - baike_qa_train.csv
       - baike_qa_valid.csv
    - byte_multi_news20181070fate233, : [byte_multi_news](https://github.com/fate233/toutiao-multilevel-text-classfication-dataset)
       -labels.csv
       -train.csv
       -valid.csv
    - embeddings
       - chinese_L-12_H-768_A-12/(,,
                                  keras-berternie([https://github.com/ArthurRizar/tensorflow_ernie](https://github.com/ArthurRizar/tensorflow_ernie)),
                                  bert-wwm(tf[https://github.com/ymcui/Chinese-BERT-wwm](https://github.com/ymcui/Chinese-BERT-wwm))
       - albert_base_zh/(brightmartalbert, https://github.com/brightmart/albert_zh)
       - chinese_xlnet_mid_L-24_H-768_A-12/(xlnet[https://github.com/ymcui/Chinese-PreTrained-XLNet],24)
       - term_char.txt(, , wiki, )
       - term_word.txt(, , )
       - w2v_model_merge_short.vec(, , , )
       - w2v_model_wiki_char.vec(, , , )
    - model
       - fast_text/

# 
  - 1. base((graph)(embedding)),
  - 2. keras_layerslayer, conf, data, data_preprocess,


# paper
* FastText:   [Bag of Tricks for Efcient Text Classication](https://arxiv.org/abs/1607.01759)
* TextCNN   [Convolutional Neural Networks for Sentence Classication](https://arxiv.org/abs/1408.5882)
* charCNN-kim   [Character-Aware Neural Language Models](https://arxiv.org/abs/1508.06615)
* charCNN-zhang:  [Character-level Convolutional Networks for Text Classication](https://arxiv.org/pdf/1509.01626.pdf)
* TextRNN   [Recurrent Neural Network for Text Classification with Multi-Task Learning](https://www.ijcai.org/Proceedings/16/Papers/408.pdf)
* RCNN      [Recurrent Convolutional Neural Networks for Text Classification](http://www.nlpr.ia.ac.cn/cip/~liukang/liukangPageFile/Recurrent%20Convolutional%20Neural%20Networks%20for%20Text%20Classification.pdf)
* DCNN:       [A Convolutional Neural Network for Modelling Sentences](https://arxiv.org/abs/1404.2188)
* DPCNN:      [Deep Pyramid Convolutional Neural Networks for Text Categorization](https://www.aclweb.org/anthology/P17-1052)
* VDCNN:      [Very Deep Convolutional Networks](https://www.aclweb.org/anthology/E17-1104)
* CRNN:        [A C-LSTM Neural Network for Text Classification](https://arxiv.org/abs/1511.08630)
* DeepMoji:    [Using millions of emojio ccurrences to learn any-domain represent ations for detecting sentiment, emotion and sarcasm](https://arxiv.org/abs/1708.00524)
* SelfAttention: [Attention Is All You Need](https://arxiv.org/abs/1706.03762)
* HAN: [Hierarchical Attention Networks for Document Classification](https://www.cs.cmu.edu/~diyiy/docs/naacl16.pdf)
* CapsuleNet: [Dynamic Routing Between Capsules](https://arxiv.org/pdf/1710.09829.pdf)
* Transformer(encode or decode): [Attention Is All You Need](https://arxiv.org/abs/1706.03762)
* Bert:                  [BERT: Pre-trainingofDeepBidirectionalTransformersfor LanguageUnderstanding]()
* Xlnet:                 [XLNet: Generalized Autoregressive Pretraining for Language Understanding](https://arxiv.org/abs/1906.08237)
* Albert:                [ALBERT: A LITE BERT FOR SELF-SUPERVISED LEARNING OF LANGUAGE REPRESENTATIONS](https://arxiv.org/pdf/1909.11942.pdf)
* RoBERTa:               [RoBERTa: A Robustly Optimized BERT Pretraining Approach](https://arxiv.org/abs/1907.11692)
* ELECTRA:               [ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators](https://openreview.net/pdf?id=r1xMH1BtvB)
* TextGCN:               [Graph Convolutional Networks for Text Classification](https://arxiv.org/abs/1809.05679)


# /
* :   [https://github.com/mosu027/TextClassification](https://github.com/mosu027/TextClassification)
* : [https://github.com/brightmart/text_classification](https://github.com/brightmart/text_classification)
* Kashgari: [https://github.com/BrikerMan/Kashgari](https://github.com/BrikerMan/Kashgari)
* Ipty : [https://github.com/lpty/classifier](https://github.com/lpty/classifier)
* keras: [https://github.com/ShawnyXiao/TextClassification-Keras](https://github.com/ShawnyXiao/TextClassification-Keras)
* keras: [https://github.com/AlexYangLi/TextClassification](https://github.com/AlexYangLi/TextClassification)
* CapsuleNet: [https://github.com/bojone/Capsule](https://github.com/bojone/Capsule)
* transformer: [https://github.com/CyberZHG/keras-transformer](https://github.com/CyberZHG/keras-transformer)
* keras_albert_model: [https://github.com/TinkerMob/keras_albert_model](https://github.com/TinkerMob/keras_albert_model)


# :
```python
from keras_textclassification import train
train(graph='TextCNN', # , , "ALBERT","BERT","XLNET","FASTTEXT","TEXTCNN","CHARCNN",
                       # "TEXTRNN","RCNN","DCNN","DPCNN","VDCNN","CRNN","DEEPMOJI",
                       # "SELFATTENTION", "HAN","CAPSULE","TRANSFORMER"
     label=17,         # , , 
     path_train_data=None, # , , csv, 'label,ques', keras_textclassification/data
     path_dev_data=None, # , , csv, 'label,ques', keras_textclassification/data
     rate=1,             # , 
     hyper_parameters=None) # , json, , embedding'char','random'
```


*!

Owner

  • Name: Dr. Artificial曾小健
  • Login: ArtificialZeng
  • Kind: user
  • Location: Beijing

LLM practitioner/engineer, AI/ML/DL Quant

GitHub Events

Total
Last Year