https://github.com/chapzq77/roberta_zh
Chinese pre-trained RoBERTa models: RoBERTa for Chinese
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links (links to: arxiv.org)
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (5.1%) to scientific vocabulary
Last synced: 10 months ago
Repository
Chinese pre-trained RoBERTa models: RoBERTa for Chinese
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Fork of brightmart/roberta_zh
Created almost 7 years ago
· Last pushed almost 7 years ago
https://github.com/chapzq77/roberta_zh/blob/master/
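# RoBERTa for Chinese, TensorFlow & PyTorch

Chinese pre-trained RoBERTa models
-------------------------------------------------
RoBERTa is an improved version of BERT that reaches state-of-the-art results; the models can be loaded directly with Bert.

This project pre-trains RoBERTa on large-scale Chinese text with TensorFlow, and also provides PyTorch versions of the pre-trained models.

*** 2019-09-08: added PyTorch versions and a preliminary comparison with bert-wwm, xlnet and other models ***

Chinese pre-trained RoBERTa models: download
-------------------------------------------------
###### **RoBERTa-zh-Large**

RoBERTa-zh-Large: Google Drive, TensorFlow version, loads directly with Bert

RoBERTa-zh-Large: Google Drive, PyTorch version, loads directly with the PyTorch version of Bert

RoBERTa 24/12-layer training data: 30G of raw text, nearly 300 million sentences, 10 billion Chinese characters (tokens), producing 250 million training instances.

This project uses the same training data as the 24-layer Chinese XLNet project, XLNet_zh.

RoBERTa_zh_L12: TensorFlow version (loads directly with Bert) and PyTorch version (loads directly with the PyTorch version of Bert)

---------------------------------------------------------------
Roberta_l24_zh_base: TensorFlow version, loads directly with Bert

24-layer base version training data: 10G of text

What is RoBERTa:
-------------------------------------------------
A robustly optimized method for pretraining natural language processing (NLP) systems that improves on Bidirectional Encoder Representations from Transformers, or BERT, the self-supervised method released by Google in 2018.

RoBERTa produces state-of-the-art results on the widely used NLP benchmark, General Language Understanding Evaluation (GLUE). The model delivered state-of-the-art performance on the MNLI, QNLI, RTE, STS-B, and RACE tasks and a sizable performance improvement on the GLUE benchmark. With a score of 88.5, RoBERTa reached the top position on the GLUE leaderboard, matching the performance of the previous leader, XLNet-Large.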
(Introduction from Facebook blog)

Release Plan
-------------------------------------------------
1. 24-layer RoBERTa model (roberta_l24_zh), trained on 30G of text: September 8
2. 12-layer RoBERTa model (roberta_l12_zh), trained on 30G of text: September 8
3. 6-layer RoBERTa model (roberta_l6_zh), trained on 30G of text: September 8
4. PyTorch version of the model (roberta_l6_zh_pytorch): September 8
5. 30G Chinese corpus in pre-training format (bert, xlnet, gpt2)
6. Test-set evaluation and comparison of results: September 14

Performance
-------------------------------------------------

### CCF-Sentiment-Analysis

| Model | F1 |
| :------- | :---------: |
| BERT | 80.3 |
| Bert-wwm-ext | 80.5 |
| XLNet | 79.6 |
| Roberta-mid | 80.5 |
| Roberta-large (max_seq_length=512, split_num=1) | 81.25 |

Source: guoday's open-source project for the CCF sentiment analysis task.

### XNLI

| Model | Dev | Test |
| :------- | :---------: | :---------: |
| BERT | 77.8 (77.4) | 77.8 (77.5) |
| ERNIE | 79.7 (79.4) | 78.6 (78.2) |
| BERT-wwm | 79.0 (78.4) | 78.2 (78.0) |
| BERT-wwm-ext | 79.4 (78.6) | 78.7 (78.3) |
| XLNet | 79.2 | 78.7 |
| RoBERTa-zh-base | 79.8 | 78.8 |
| **RoBERTa-zh-Large** | **80.2 (80.0)** | **79.9 (79.5)** |

Notes: the RoBERTa_l24_zh performance may still improve; the BERT-wwm-ext and XLNet results are taken from their respective projects; RoBERTa-zh-base refers to the 12-layer Chinese RoBERTa model.

### LCQMC (Sentence Pair Matching)

| Model | Dev | Test |
| :------- | :---------: | :---------: |
| BERT | 89.4 (88.4) | 86.9 (86.4) |
| ERNIE | 89.8 (89.6) | **87.2** (87.0) |
| BERT-wwm | 89.4 (89.2) | 87.0 (86.8) |
| BERT-wwm-ext | - | - |
| RoBERTa-zh-base | 88.7 | 87.0 |
| **RoBERTa-zh-Large** | **89.9** (89.6) | **87.2** (86.7) |
| RoBERTa-zh-Large (20w_steps) | 89.7 | 87.0 |

Note: the RoBERTa_l24_zh performance may still improve.
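RoBERTa Chinese Version
-------------------------------------------------
The Chinese pre-trained RoBERTa models in this project are trained according to the main ideas of the RoBERTa paper. These include:

1. Improved data generation and training task: next sentence prediction is removed, and each training sequence is drawn contiguously from a single document (see "Model Input Format and Next Sentence Prediction", DOC-SENTENCES).
2. More data: 30G of Chinese text, about 300 million sentences and 10 billion characters (tokens).
3. Longer training: nearly 200,000 steps, seeing about 1.6 billion training instances in total; 24 hours on a Cloud TPU v3-256, which corresponds to roughly a month on a TPU v3-8 (128G of memory).
4. Larger batches: a batch size of 8k.
5. Adjusted optimizer and other hyperparameters.

In addition, the Chinese version uses whole word masking (whole word mask): if some of the WordPiece sub-tokens of a complete word are masked, the remaining sub-tokens of that word are masked as well.

Dynamic masking is not implemented directly. Duplicating each training sample several times and masking each copy differently gives a dynamic mask effect indirectly.

##### Whole Word Mask

The example sentence means "Use a language model to predict the probability of the next word."

| Description | Example |
| :------- | :--------- |
| Original text | 使用语言模型来预测下一个词的probability。 |
| Segmented text | 使用 语言 模型 来 预测 下 一个 词 的 probability 。 |
| Original mask input | 使 用 语 言 [MASK] 型 来 [MASK] 测 下 一 个 词 的 pro [MASK] ##lity 。 |
| Whole word mask input | 使 用 语 言 [MASK] [MASK] 来 [MASK] [MASK] 下 一 个 词 的 [MASK] [MASK] [MASK] 。 |

Sentence Pair Matching: LCQMC
-------------------------------------------------
Using TensorFlow:

1. Clone this project: git clone https://github.com/brightmart/roberta_zh
2. Enter the project directory (roberta_zh).

Assuming the RoBERTa pre-trained model has been downloaded and unpacked into the roberta_zh_large directory of the project, i.e. roberta_zh/roberta_zh_large, run:

    export BERT_BASE_DIR=./roberta_zh_large
    export MY_DATA_DIR=./data/lcqmc
    python run_classifier.py \
      --task_name=lcqmc_pair \
      --do_train=true \
      --do_eval=true \
      --data_dir=$MY_DATA_DIR \
      --vocab_file=$BERT_BASE_DIR/vocab.txt \
      --bert_config_file=$BERT_BASE_DIR/bert_config_large.json \
      --init_checkpoint=$BERT_BASE_DIR/roberta_zh_large_model.ckpt \
      --max_seq_length=128 \
      --train_batch_size=64 \
      --learning_rate=2e-5 \
      --num_train_epochs=3 \
      --output_dir=./checkpoint_lcqmc

Note: task_name is lcqmc_pair. A processor for it has been added to run_classifier.py and registered in processors, which selects the lcqmc task.

For loading the models with PyTorch, see issue 9.

Learning Curve
-------------------------------------------------
[Figure: learning curve]

#### QQ group: 836811304

If you have any questions, you can raise an issue or send me an email: brightmart@hotmail.com.

You can also send a pull request to report your performance on your task, or to add methods for loading the models in PyTorch, and so on.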
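
The whole word masking and mask-duplication ideas described under "RoBERTa Chinese Version" can be illustrated with a minimal sketch. This is not the project's data-generation code. It assumes the tokens already carry `##` markers on every non-initial piece of a word, and the function names, the 0.15 default mask rate, and the `dupe_factor` parameter name (borrowed from the flag of the same name in Google's BERT data script) are choices made for illustration only.

```python
import random

MASK = "[MASK]"

def group_wordpieces(tokens):
    """Group token indices so '##' continuation pieces stay with their head piece."""
    groups = []
    for i, tok in enumerate(tokens):
        if tok.startswith("##") and groups:
            groups[-1].append(i)   # continuation of the previous word
        else:
            groups.append([i])     # start of a new word
    return groups

def whole_word_mask(tokens, mask_prob=0.15, rng=None):
    """Mask whole words until roughly mask_prob of the tokens are masked."""
    rng = rng or random.Random()
    groups = group_wordpieces(tokens)
    budget = max(1, round(len(tokens) * mask_prob))
    rng.shuffle(groups)
    masked = list(tokens)
    n_masked = 0
    for group in groups:
        if n_masked >= budget:
            break
        if n_masked + len(group) > budget:
            continue               # skip words that would exceed the budget
        for i in group:
            masked[i] = MASK       # every piece of the word is masked together
        n_masked += len(group)
    return masked

def masked_copies(tokens, dupe_factor=3, mask_prob=0.15, seed=0):
    """Duplicate one sample and mask each copy independently (indirect dynamic mask)."""
    rng = random.Random(seed)
    return [whole_word_mask(tokens, mask_prob, rng) for _ in range(dupe_factor)]

# Hypothetical English example; Chinese text would first need word segmentation.
tokens = ["the", "pro", "##ba", "##bility", "is", "low"]
for copy in masked_copies(tokens, dupe_factor=3, mask_prob=0.5, seed=0):
    print(" ".join(copy))
```

Each printed copy masks a different set of words, but a word such as `pro ##ba ##bility` is always either fully masked or left intact. Raising `dupe_factor` gives the model more distinct maskings of the same sentence.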

-------------------------------------------------
Contributors also include: skyhawk1990

##### Research supported with Cloud TPUs from Google's TensorFlow Research Cloud (TFRC)

Reference
-------------------------------------------------
1. RoBERTa: A Robustly Optimized BERT Pretraining Approach
2. Pre-Training with Whole Word Masking for Chinese BERT
3. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
4. LCQMC: A Large-scale Chinese Question Matching Corpus
Owner
- Name: 周奇
- Login: chapzq77
- Kind: user
- Repositories: 3
- Profile: https://github.com/chapzq77