https://github.com/cyberagentailab/japanese-nli-model
This repository provides the code for Japanese NLI model, a fine-tuned masked language model.
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
○codemeta.json file
-
○.zenodo.json file
-
○DOI references
-
✓Academic publication links
Links to: arxiv.org -
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (5.4%) to scientific vocabulary
Keywords
bert
japanese
natural-language-processing
natural-language-understanding
nli
nlp
roberta
sentence-transformers
transformers
Last synced: 5 months ago
·
JSON representation
Repository
This repository provides the code for Japanese NLI model, a fine-tuned masked language model.
Basic Info
- Host: GitHub
- Owner: CyberAgentAILab
- License: cc-by-4.0
- Language: Jupyter Notebook
- Default Branch: main
- Homepage: https://huggingface.co/cyberagent/xlm-roberta-large-jnli-jsick
- Size: 33.2 KB
Statistics
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Topics
bert
japanese
natural-language-processing
natural-language-understanding
nli
nlp
roberta
sentence-transformers
transformers
Created over 3 years ago
· Last pushed over 3 years ago
https://github.com/CyberAgentAILab/japanese-nli-model/blob/main/
# Japanese Natural Language Inference Model
This repository provides the code for [Japanese NLI model](https://huggingface.co/cyberagent/xlm-roberta-large-jnli-jsick), a fine-tuned masked language model.
## Performance
The model showed performance comparable with those reported in [JGLUE](https://github.com/yahoojapan/JGLUE) [Kurihara et al. 2022] and [JSICK](https://github.com/verypluming/JSICK) [Yanaka and Mineshima 2022] papers, in terms of overall accuracy:
| Model | JGLUE-JNLI valid [%] | JSICK test [%] |
|:-------------------------------:|:----:|:-----:|
| [Kurihara et al. 2022] | 91.9 | N/A |
| [Yanaka and Mineshima 2022] | N/A | 89.1 |
| ours using both JNLI and JSICK | 90.9 | 89.0 |
## References
- Hitomi Yanaka and Koji Mineshima. [Compositional Evaluation on Japanese Textual Entailment and Similarity](https://arxiv.org/abs/2208.04826). TACL2022.
- Kentaro Kurihara, Daisuke Kawahara, and Tomohide Shibata. [JGLUE: Japanese General Language Understanding Evaluation](https://aclanthology.org/2022.lrec-1.317/). LREC2022.
- Nils Reimers and Iryna Gurevych. [Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks](https://aclanthology.org/D19-1410/). EMNLP-IJCNLP2019.
- Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmn, Edouard Grave, Myle Ott, Luke Zettlemoyer, and Veselin Stoyanov. [Unsupervised Cross-lingual Representation Learning at Scale](https://aclanthology.org/2020.acl-main.747/). ACL2020.
## Appendix: Hyperparameters
### random seeds
Yes, we tested only a single run :(
```python
torch.manual_seed(0)
random.seed(0)
np.random.seed(0)
```
### dataset order
1. JSICK
1. JGLUE
### labels
We converted string label into integer using the following mapping:
```python
label2int = {"contradiction": 0, "entailment": 1, "neutral": 2}
```
### CrossEncoder
We mimicked `batch_size=128` using gradient accumulation `32 * 4 = 128`.
```python
batch_size=32,
shuffle=True,
epochs=3,
accumulation_steps=4,
optimizer_params={'lr': 5e-5},
warmup_steps=math.ceil(0.1 * len(data)),
```
Owner
- Name: CyberAgent AI Lab
- Login: CyberAgentAILab
- Kind: organization
- Location: Japan
- Website: https://cyberagent.ai/ailab/
- Twitter: cyberagent_ai
- Repositories: 7
- Profile: https://github.com/CyberAgentAILab
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1