https://github.com/chapzq77/roberta_zh

RoBERTa pre-trained models for Chinese (RoBERTa for Chinese)



Repository


Basic Info
  • Host: GitHub
  • Owner: chapzq77
  • Default Branch: master
  • Homepage:
  • Size: 234 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Fork of brightmart/roberta_zh
Created almost 7 years ago · Last pushed almost 7 years ago

https://github.com/chapzq77/roberta_zh/blob/master/

RoBERTa for Chinese, TensorFlow & PyTorch

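Chinese Pre-trained RoBERTa Models
-------------------------------------------------
RoBERTa is an improved version of BERT that reaches State of The Art results; the models can be loaded directly with BERT code.

This project pre-trains RoBERTa on Chinese text with TensorFlow; PyTorch versions of the models are also provided.

*** 2019-09-08: added a PyTorch version and a comparison with bert-wwm, xlnet and other models ***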

RoBERTa Models - Download
-------------------------------------------------
###### ** RoBERTa-zh-Large **
RoBERTa-zh-Large: Google Drive, TensorFlow version; loads directly with BERT

RoBERTa-zh-Large: Google Drive, PyTorch version; loads directly with BERT's PyTorch implementation

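RoBERTa 24/12-layer training data: 30G of raw text, about 300 million sentences and 10 billion Chinese characters (tokens), producing 250 million training examples (instances).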



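24-layer Chinese XLNet pre-trained model: XLNet_zh

RoBERTa_zh_L12: TensorFlow version, loads directly with BERT; PyTorch version, loads directly with BERT's PyTorch implementation

---------------------------------------------------------------

Roberta_l24_zh_base: TensorFlow version, loads directly with BERT

The 24-layer base version was trained on 10G of text.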



What is RoBERTa:
-------------------------------------------------
    A robustly optimized method for pretraining natural language processing (NLP) systems that improves on Bidirectional Encoder Representations from Transformers, or BERT, the self-supervised method released by Google in 2018. 
    
    RoBERTa produces state-of-the-art results on the widely used NLP benchmark, General Language Understanding Evaluation (GLUE). The model delivered state-of-the-art performance on the MNLI, QNLI, RTE, STS-B, and RACE tasks and a sizable performance improvement on the GLUE benchmark. With a score of 88.5, RoBERTa reached the top position on the GLUE leaderboard, matching the performance of the previous leader, XLNet-Large.
    
    (Introduction from Facebook blog)

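Release Plan
-------------------------------------------------
1. 24-layer RoBERTa model (roberta_l24_zh), trained on 30G of text: September 8

2. 12-layer RoBERTa model (roberta_l12_zh), trained on 30G of text: September 8

3. 6-layer RoBERTa model (roberta_l6_zh), trained on 30G of text: September 8

4. PyTorch version of the model (roberta_l6_zh_pytorch): September 8

5. 30G corpus, usable for pre-training (bert, xlnet, gpt2)

6.                                      September 14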

Performance
-------------------------------------------------
### CCF-Sentiment-Analysis

| Model | F1 |
| :------- | :---------: |
| BERT | 80.3 |
| Bert-wwm-ext | 80.5 | 
| XLNet | 79.6 | 
| Roberta-mid | 80.5 |
| Roberta-large (max_seq_length=512, split_num=1) | 81.25 |

Note: these results come from guoday's work on the CCF sentiment analysis task.

### XNLI

| Model | Dev | Test |
| :------- | :---------: | :---------: |
| BERT | 77.8 (77.4) | 77.8 (77.5) |
| ERNIE | 79.7 (79.4) | 78.6 (78.2) |
| BERT-wwm | 79.0 (78.4) | 78.2 (78.0) |
| BERT-wwm-ext | 79.4 (78.6) | 78.7 (78.3) |
| XLNet | 79.2 | 78.7 |
| RoBERTa-zh-base | 79.8 | 78.8 |
| **RoBERTa-zh-Large** | **80.2 (80.0)** | **79.9 (79.5)** |

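Note: RoBERTa_l24_zh was run only a few times, so its Performance may still improve.

The BERT-wwm-ext and XLNet figures are taken from their respective projects; RoBERTa-zh-base refers to the 12-layer Chinese RoBERTa model.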

### LCQMC (Sentence Pair Matching)

| Model | Dev | Test |
| :------- | :---------: | :---------: |
| BERT | 89.4 (88.4) | 86.9 (86.4) |
| ERNIE | 89.8 (89.6) | **87.2** (87.0) |
| BERT-wwm | 89.4 (89.2) | 87.0 (86.8) |
| BERT-wwm-ext | - | - |
| RoBERTa-zh-base | 88.7 | 87.0 |
| **RoBERTa-zh-Large** | **89.9** (89.6) | **87.2** (86.7) |
| RoBERTa-zh-Large (20w_steps) | 89.7 | 87.0 |

Note: RoBERTa_l24_zh was run only a few times, so its Performance may still improve.

? 

RoBERTa Chinese Version
-------------------------------------------------
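The Chinese pre-trained RoBERTa models in this project are trained following the main ideas of the RoBERTa paper, including:

    1. Data generation and task changes: next sentence prediction is removed, and training data is taken contiguously from a single document (see: Model Input Format and Next Sentence Prediction, DOC-SENTENCES).
    
    2. More data: 30G of raw text, about 300 million sentences and 10 billion Chinese characters (tokens).
    
    
    
    3. Longer training: about 200 thousand steps, seeing about 1.6 billion training examples (instances) in total; 24 hours on a Cloud TPU v3-256, comparable to about a month on a TPU v3-8 (128G memory).
    
    4. Larger batches: a very large (8k) batch size.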
    
    5
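
As a rough consistency check on the training figures in this list, the number of instances seen is approximately the number of training steps times the batch size. The sketch below assumes about 200 thousand steps and reads "8k" as 8192; both readings are assumptions rather than values confirmed by the project.

```python
# Rough check: instances seen = training steps x batch size.
# Assumed values (not confirmed by the project): 200,000 steps, batch size 8 * 1024.
steps = 200_000
batch_size = 8 * 1024

instances_seen = steps * batch_size
print(f"{instances_seen:,} instances (~{instances_seen / 1e9:.2f} billion)")
```

Under these assumptions the product is on the order of 1.6 billion instances.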

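In addition, the models in this project are pre-trained with whole word masking (whole word mask): if some of the WordPiece subwords of a complete word are masked, the remaining subwords of the same word are masked as well.

Dynamic masking (dynamic mask) is not implemented directly in this project; duplicating each training sample several times with a different mask per copy gives an indirect dynamic mask effect.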
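
One common way to approximate dynamic masking with statically generated data is to duplicate each training sequence several times and draw a separate random mask for each copy (BERT's data-generation script exposes a `dupe_factor` setting for this purpose). The following is a minimal sketch under simplifying assumptions: masking is per token with a fixed probability, every selected token becomes `[MASK]` (BERT's 80/10/10 replacement rule is omitted), and the function name and parameters are hypothetical rather than taken from this project.

```python
import random

MASK = "[MASK]"

def masked_copies(tokens, dupe_factor=3, mask_prob=0.15, seed=0):
    """Return `dupe_factor` copies of `tokens`, each masked by an independent random draw."""
    rng = random.Random(seed)
    copies = []
    for _ in range(dupe_factor):
        # Each copy gets its own draw, so the mask positions generally differ between copies.
        copies.append([MASK if rng.random() < mask_prob else tok for tok in tokens])
    return copies

tokens = ["we", "use", "a", "language", "model", "to", "predict", "the", "next", "word"]
for copy in masked_copies(tokens, dupe_factor=3, mask_prob=0.3, seed=1):
    print(" ".join(copy))
```

Raising `dupe_factor` exposes the model to more mask patterns per sequence, at the cost of a proportionally larger pre-training dataset.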

##### Whole Word Mask

| Description | Example |
| :------- | :--------- |
| Original text | probability |
| Segmented text |          probability  |
| Original Mask input |     [MASK]   [MASK]       pro [MASK] ##lity  |
| Whole word Mask input |     [MASK] [MASK]  [MASK] [MASK]      [MASK] [MASK] [MASK]  |
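
The difference between the two masking rows can be sketched in a few lines. This is an illustrative sketch, not the project's data-generation code: it assumes word boundaries are given by WordPiece `##` continuation markers and that `probability` is split as `pro ##ba ##bi ##lity`. For Chinese text, word boundaries typically come from a word segmentation tool instead. The function names and parameters are hypothetical.

```python
import random

MASK = "[MASK]"

def group_wordpieces(tokens):
    """Group token indices into words: a '##' piece continues the previous word."""
    words = []
    for i, tok in enumerate(tokens):
        if tok.startswith("##") and words:
            words[-1].append(i)
        else:
            words.append([i])
    return words

def whole_word_mask(tokens, mask_prob=0.15, seed=0):
    """Select words at random; every WordPiece of a selected word becomes [MASK]."""
    rng = random.Random(seed)
    out = list(tokens)
    for word in group_wordpieces(tokens):
        if rng.random() < mask_prob:
            for i in word:
                out[i] = MASK
    return out

tokens = ["the", "pro", "##ba", "##bi", "##lity", "is", "high"]
print(group_wordpieces(tokens))
print(whole_word_mask(tokens, mask_prob=0.5, seed=0))
```

Because selection happens per word rather than per piece, a partially masked result such as `pro [MASK] ##lity` cannot occur: the four pieces of `probability` are either all masked or all left intact.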

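Sentence Pair Matching with LCQMC
-------------------------------------------------

The LCQMC training set contains about 240 thousand Chinese sentence pairs labeled 1 or 0: 1 means the two sentences are semantically similar, 0 means they are not.

TensorFlow version:

    1. Clone this project: git clone https://github.com/brightmart/roberta_zh
    
    2. Change into the project directory (roberta_zh).
    
      This assumes the pre-trained RoBERTa model has been downloaded and unpacked into the roberta_zh_large directory, i.e. roberta_zh/roberta_zh_large.
    
    Run the command: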
  
    export BERT_BASE_DIR=./roberta_zh_large
    export MY_DATA_DIR=./data/lcqmc
    python run_classifier.py \
      --task_name=lcqmc_pair \
      --do_train=true \
      --do_eval=true \
      --data_dir=$MY_DATA_DIR \
      --vocab_file=$BERT_BASE_DIR/vocab.txt \
      --bert_config_file=$BERT_BASE_DIR/bert_config_large.json \
      --init_checkpoint=$BERT_BASE_DIR/roberta_zh_large_model.ckpt \
      --max_seq_length=128 \
      --train_batch_size=64 \
      --learning_rate=2e-5 \
      --num_train_epochs=3 \
      --output_dir=./checkpoint_lcqmc
    
    Note: task_name is lcqmc_pair. A processor for this task has been added to run_classifier.py and registered in processors, so that it handles the LCQMC task.
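
In BERT's `run_classifier.py`, a task is wired in by defining a data processor class and registering it in a `processors` mapping keyed by the task name. The standalone sketch below illustrates that pattern for a sentence-pair task. It does not import from the repository, and the class name, the `read_examples` method, the file name, and the assumed file layout (tab-separated `sentence1`, `sentence2`, `label`) are hypothetical, not the project's actual implementation.

```python
import csv
import os
import tempfile
from dataclasses import dataclass
from typing import List

@dataclass
class InputExample:
    guid: str
    text_a: str
    text_b: str
    label: str

class LcqmcPairProcessor:
    """Hypothetical processor: reads tab-separated (sentence1, sentence2, label) rows."""

    def get_labels(self) -> List[str]:
        return ["0", "1"]

    def read_examples(self, path: str, set_type: str) -> List[InputExample]:
        examples = []
        with open(path, encoding="utf-8", newline="") as f:
            reader = csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
            for i, row in enumerate(reader):
                # Skip malformed rows and rows whose label is unknown (e.g. a header).
                if len(row) != 3 or row[2] not in self.get_labels():
                    continue
                text_a, text_b, label = row
                examples.append(InputExample(f"{set_type}-{i}", text_a, text_b, label))
        return examples

# Registry keyed by task name, in the spirit of the `processors` mapping.
processors = {"lcqmc_pair": LcqmcPairProcessor}

# Small demo on a temporary file with two made-up rows.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "train.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("sentence one\tsentence two\t1\n")
        f.write("another question\ta different question\t0\n")
    examples = processors["lcqmc_pair"]().read_examples(path, "train")

print(examples[0])
```

Keeping the lookup keyed by task name means the `--task_name` flag alone selects which data-loading code runs.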

For loading the models with PyTorch, see issue 9.

Learning Curve 
-------------------------------------------------



#### QQ discussion group: 836811304

If you have any questions, you can raise an issue or send me an email: brightmart@hotmail.com.

You can also send a pull request to report your performance on your task, or to add instructions for loading the models in PyTorch and so on.





-------------------------------------------------
Contributors also include: skyhawk1990


##### Research supported with Cloud TPUs from Google's TensorFlow Research Cloud (TFRC)




Reference
-------------------------------------------------
1. RoBERTa: A Robustly Optimized BERT Pretraining Approach

2. Pre-Training with Whole Word Masking for Chinese BERT

3. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

4. LCQMC: A Large-scale Chinese Question Matching Corpus

Owner

  • Name: 周奇
  • Login: chapzq77
  • Kind: user
