https://github.com/artificialzeng/keras-textclassification
Chinese long-text classification, short-sentence classification, multi-label classification, and two-sentence similarity (Chinese Text Classification of Keras NLP, multi-label classify, or sentence classify, long or short); base classes for building character/word/sentence embedding layers (embeddings) and network layers (graph); FastText, TextCNN, CharCNN, TextRNN, RCNN, DCNN, DPCNN, VDCNN, CRNN, Bert, Xlnet, Albert, Attention, DeepMoji, HAN, CapsuleNet, Transformer-encode, Seq2seq, SWEM, LEAM, TextGCN
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links (links to: arxiv.org)
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (6.6%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: ArtificialZeng
- License: mit
- Language: Python
- Default Branch: master
- Homepage: https://blog.csdn.net/rensihui
- Size: 520 KB
Statistics
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
https://github.com/ArtificialZeng/Keras-TextClassification/blob/master/
# [Keras-TextClassification](https://github.com/yongzhuo/Keras-TextClassification)
# Install
```bash
pip install Keras-TextClassification
```
```text
step2: download and unzip 'data.rar' from https://pan.baidu.com/s/1I3vydhmFEQ9nuPG2fDou8Q (extraction code: rket),
       then copy the extracted 'data' directory over the one in the installed package, e.g.
       '/anaconda/3.5.1/envs/tensorflow13/Lib/site-packages/keras_textclassification/data'
step3: go to the training and prediction usage sections of this README
```
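Step 2 amounts to copying the extracted `data` directory over the one inside the installed package. A minimal sketch, assuming the archive has already been downloaded and extracted by hand; the helper name `copy_data_dir` and its arguments are hypothetical and not part of the library:

```python
import shutil
from pathlib import Path


def copy_data_dir(extracted_data: str, package_dir: str) -> Path:
    """Copy an extracted 'data' directory into <package_dir>/data.

    Hypothetical helper: files with the same name are overwritten,
    other files already present in the target are left in place.
    """
    source = Path(extracted_data)
    if not source.is_dir():
        raise FileNotFoundError(f"no such directory: {source}")
    target = Path(package_dir) / "data"
    # dirs_exist_ok (Python 3.8+) merges into an already existing target.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

Here `package_dir` would be the installed `keras_textclassification` directory, such as the site-packages path shown in step 2.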
# keras_textclassification models
- Electra-finetune (todo)
- Albert-finetune
- Xlnet-finetune
- Bert-finetune
- FastText
- TextCNN
- charCNN
- TextRNN
- TextRCNN
- TextDCNN
- TextDPCNN
- TextVDCNN
- TextCRNN
- DeepMoji
- SelfAttention
- HAN
- CapsuleNet
- Transformer-encode
- SWEM
- LEAM
- TextGCN(todo)
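# Run (FastText as an example)
- 1. Go to the keras_textclassification/m01_FastText directory
- 2. Train: run train.py, e.g. `python train.py`
- 3. Predict: run predict.py, e.g. `python predict.py`
- Note: by default this uses a random embedding rather than a pre-trained one, with a small sample (100 items) of data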
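The same two steps can be driven from Python. This is a sketch only: `run_model_scripts` is a hypothetical helper, and it assumes the model directory contains `train.py` and `predict.py` as listed above.

```python
import subprocess
import sys
from pathlib import Path


def run_model_scripts(model_dir: str, scripts=("train.py", "predict.py")) -> list:
    """Run each script inside model_dir with the current interpreter, in order."""
    directory = Path(model_dir)
    return_codes = []
    for script in scripts:
        if not (directory / script).is_file():
            raise FileNotFoundError(f"{script} not found in {directory}")
        # cwd=directory mirrors running `python train.py` from inside the folder;
        # check=True raises CalledProcessError if a script exits with an error.
        completed = subprocess.run([sys.executable, script], cwd=directory, check=True)
        return_codes.append(completed.returncode)
    return return_codes
```

For the FastText example this would be called as `run_model_scripts("keras_textclassification/m01_FastText")` from the repository root.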
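# Run (Embedding/test/sample)
- Samples for the bert, word2vec, and random embeddings are in test/: word2vec (char or word), random-word, and bert (chinese_L-12_H-768_A-12)
- multi_multi_class/: a text-cnn example of multi-label classification using multi-onehot labels
- sentence_similarity/: a bert example of two-sentence similarity; see data/sim_webank/ for the data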
- predict_bert_text_cnn.py
- tet_char_bert_embedding.py
- tet_char_xlnet_embedding.py
- tet_char_random_embedding.py
- tet_char_word2vec_embedding.py
- tet_word_random_embedding.py
- tet_word_word2vec_embedding.py
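The multi-label example in `multi_multi_class/` works with multi-onehot labels. The sketch below shows one common way to build and decode such labels in plain Python. It is not taken from the repository: the function names and the 0.5 threshold are assumptions made for illustration.

```python
def build_label_index(samples):
    """Map every distinct label to a column index, in sorted order."""
    labels = sorted({label for sample in samples for label in sample})
    return {label: i for i, label in enumerate(labels)}


def to_multi_hot(sample_labels, label_index):
    """Encode one sample's labels as a 0/1 vector with one slot per known label."""
    vector = [0] * len(label_index)
    for label in sample_labels:
        vector[label_index[label]] = 1
    return vector


def from_scores(scores, label_index, threshold=0.5):
    """Return the labels whose predicted score reaches the threshold."""
    index_label = {i: label for label, i in label_index.items()}
    return [index_label[i] for i, score in enumerate(scores) if score >= threshold]
```

A classifier trained on such targets typically emits one independent score per label, which is why decoding uses a threshold rather than an argmax.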
# keras_textclassification/data
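- Data download (not all of the data is stored on GitHub): https://pan.baidu.com/s/1I3vydhmFEQ9nuPG2fDou8Q (extraction code: rket)
- baidu_qa_2019 (QA corpus; titles are used as samples; 17 classes, one of which is the empty label '')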
- baike_qa_train.csv
- baike_qa_valid.csv
- byte_multi_news (2018 multi-label news corpus, 1070 labels, collected by fate233; source: [byte_multi_news](https://github.com/fate233/toutiao-multilevel-text-classfication-dataset))
- labels.csv
- train.csv
- valid.csv
- embeddings
- chinese_L-12_H-768_A-12/ (Google's pre-trained Chinese BERT; keras-bert can also load ernie ([https://github.com/ArthurRizar/tensorflow_ernie](https://github.com/ArthurRizar/tensorflow_ernie)) and bert-wwm (tf, [https://github.com/ymcui/Chinese-BERT-wwm](https://github.com/ymcui/Chinese-BERT-wwm)))
- albert_base_zh/ (albert trained by brightmart, https://github.com/brightmart/albert_zh)
- chinese_xlnet_mid_L-24_H-768_A-12/ (pre-trained Chinese xlnet, [https://github.com/ymcui/Chinese-PreTrained-XLNet](https://github.com/ymcui/Chinese-PreTrained-XLNet), 24 layers)
- term_char.txt (character dictionary, wiki)
- term_word.txt
- w2v_model_merge_short.vec
- w2v_model_wiki_char.vec
- model
- fast_text/
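# Project layout
- 1. base: base classes for the network (graph) and the embedding layer (embedding)
- 2. keras_layers holds commonly used layers; the other modules are conf, data, and data_preprocess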
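To picture the base-class idea in item 1, here is a toy sketch in plain Python. It is illustrative only: every class and method name is hypothetical, it does not reproduce the repository's actual classes, and simple integer lists stand in for Keras layers and tensors.

```python
from abc import ABC, abstractmethod


class BaseEmbedding(ABC):
    """Turns raw text into a sequence of integer ids."""

    @abstractmethod
    def encode(self, text):
        ...


class CharIndexEmbedding(BaseEmbedding):
    def __init__(self, vocab):
        # Id 0 is reserved for characters outside the vocabulary.
        self.vocab = {ch: i + 1 for i, ch in enumerate(vocab)}

    def encode(self, text):
        return [self.vocab.get(ch, 0) for ch in text]


class BaseGraph(ABC):
    """Holds the shared plumbing; subclasses only define the scoring step."""

    def __init__(self, embedding, num_labels):
        self.embedding = embedding
        self.num_labels = num_labels

    @abstractmethod
    def forward(self, token_ids):
        """Return one score per label."""

    def predict(self, text):
        scores = self.forward(self.embedding.encode(text))
        return max(range(self.num_labels), key=scores.__getitem__)


class BagOfIdsGraph(BaseGraph):
    """Toy model: each token id votes for the label id % num_labels."""

    def forward(self, token_ids):
        scores = [0] * self.num_labels
        for token_id in token_ids:
            scores[token_id % self.num_labels] += 1
        return scores
```

The point of the split is that a new model only has to supply its own `forward`, while encoding and prediction logic live once in the base classes.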
# paper
* FastText: [Bag of Tricks for Efficient Text Classification](https://arxiv.org/abs/1607.01759)
* TextCNN: [Convolutional Neural Networks for Sentence Classification](https://arxiv.org/abs/1408.5882)
* charCNN-kim: [Character-Aware Neural Language Models](https://arxiv.org/abs/1508.06615)
* charCNN-zhang: [Character-level Convolutional Networks for Text Classification](https://arxiv.org/pdf/1509.01626.pdf)
* TextRNN: [Recurrent Neural Network for Text Classification with Multi-Task Learning](https://www.ijcai.org/Proceedings/16/Papers/408.pdf)
* RCNN: [Recurrent Convolutional Neural Networks for Text Classification](http://www.nlpr.ia.ac.cn/cip/~liukang/liukangPageFile/Recurrent%20Convolutional%20Neural%20Networks%20for%20Text%20Classification.pdf)
* DCNN: [A Convolutional Neural Network for Modelling Sentences](https://arxiv.org/abs/1404.2188)
* DPCNN: [Deep Pyramid Convolutional Neural Networks for Text Categorization](https://www.aclweb.org/anthology/P17-1052)
* VDCNN: [Very Deep Convolutional Networks for Text Classification](https://www.aclweb.org/anthology/E17-1104)
* CRNN: [A C-LSTM Neural Network for Text Classification](https://arxiv.org/abs/1511.08630)
* DeepMoji: [Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm](https://arxiv.org/abs/1708.00524)
* SelfAttention: [Attention Is All You Need](https://arxiv.org/abs/1706.03762)
* HAN: [Hierarchical Attention Networks for Document Classification](https://www.cs.cmu.edu/~diyiy/docs/naacl16.pdf)
* CapsuleNet: [Dynamic Routing Between Capsules](https://arxiv.org/pdf/1710.09829.pdf)
* Transformer(encode or decode): [Attention Is All You Need](https://arxiv.org/abs/1706.03762)
* Bert: [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding]()
* Xlnet: [XLNet: Generalized Autoregressive Pretraining for Language Understanding](https://arxiv.org/abs/1906.08237)
* Albert: [ALBERT: A Lite BERT for Self-supervised Learning of Language Representations](https://arxiv.org/pdf/1909.11942.pdf)
* RoBERTa: [RoBERTa: A Robustly Optimized BERT Pretraining Approach](https://arxiv.org/abs/1907.11692)
* ELECTRA: [ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators](https://openreview.net/pdf?id=r1xMH1BtvB)
* TextGCN: [Graph Convolutional Networks for Text Classification](https://arxiv.org/abs/1809.05679)
# References / Acknowledgements
* TextClassification (mosu027): [https://github.com/mosu027/TextClassification](https://github.com/mosu027/TextClassification)
* text_classification (brightmart): [https://github.com/brightmart/text_classification](https://github.com/brightmart/text_classification)
* Kashgari: [https://github.com/BrikerMan/Kashgari](https://github.com/BrikerMan/Kashgari)
* lpty classifier: [https://github.com/lpty/classifier](https://github.com/lpty/classifier)
* keras: [https://github.com/ShawnyXiao/TextClassification-Keras](https://github.com/ShawnyXiao/TextClassification-Keras)
* keras: [https://github.com/AlexYangLi/TextClassification](https://github.com/AlexYangLi/TextClassification)
* CapsuleNet: [https://github.com/bojone/Capsule](https://github.com/bojone/Capsule)
* transformer: [https://github.com/CyberZHG/keras-transformer](https://github.com/CyberZHG/keras-transformer)
* keras_albert_model: [https://github.com/TinkerMob/keras_albert_model](https://github.com/TinkerMob/keras_albert_model)
# Training usage:
```python
from keras_textclassification import train

train(graph='TextCNN',  # model name; one of "ALBERT", "BERT", "XLNET", "FASTTEXT", "TEXTCNN", "CHARCNN",
                        # "TEXTRNN", "RCNN", "DCNN", "DPCNN", "VDCNN", "CRNN", "DEEPMOJI",
                        # "SELFATTENTION", "HAN", "CAPSULE", "TRANSFORMER"
      label=17,  # number of classes
      path_train_data=None,  # training data: csv with a 'label,ques' header; see keras_textclassification/data
      path_dev_data=None,  # validation data: csv with a 'label,ques' header; see keras_textclassification/data
      rate=1,  # proportion of the data to use
      hyper_parameters=None)  # hyperparameters in json form; the default embedding is 'char', 'random'
```
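The comments in the call above describe the training data as a csv file with a 'label,ques' header. A minimal sketch of producing such a file with the standard `csv` module follows. The helper name and sample rows are invented for illustration, and the exact quoting and encoding the library expects are assumptions that should be checked against the files in keras_textclassification/data.

```python
import csv


def write_label_ques_csv(path, rows):
    """Write (label, question) pairs to a csv file with a 'label,ques' header.

    Hypothetical helper; returns the number of data rows written.
    """
    count = 0
    # newline="" lets the csv module control line endings, as its docs recommend.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["label", "ques"])
        for label, ques in rows:
            writer.writerow([label, ques])
            count += 1
    return count
```

The resulting path could then be passed as `path_train_data` (and a second file as `path_dev_data`), with `label` set to the number of distinct labels in the file.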
Owner
- Name: Dr. Artificial曾小健
- Login: ArtificialZeng
- Kind: user
- Location: Beijing
- Website: https://blog.csdn.net/sinat_37574187?type=blog
- Repositories: 171
- Profile: https://github.com/ArtificialZeng
LLM practitioner/engineer, AI/ML/DL Quant