Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (4.9%) to scientific vocabulary
Repository
bart-small model release page
Basic Info
Statistics
- Stars: 2
- Watchers: 1
- Forks: 1
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
bart-small
bart-small is a variant of bart-base with reduced size. Most of the hyperparameters are the same as bart-base, apart from those defining the size of the model. In particular, for both the encoder and the decoder we applied the following changes:
- Max positional embeddings: 512
- Hidden size: 512
- Attention heads: 8
- FF size: 2048
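For a rough sense of what these numbers imply, the per-layer parameter count can be estimated from the hidden size and FF size alone (a back-of-the-envelope sketch; it counts only self-attention, feed-forward, and LayerNorm weights, ignores cross-attention in decoder layers, and the bart-base values of d=768, FF=3072 are assumed for comparison since they are not listed above):

```python
def transformer_layer_params(d_model: int, d_ff: int) -> int:
    """Approximate parameter count of one encoder layer."""
    # self-attention: Q, K, V and output projections, each d x d plus bias
    attn = 4 * (d_model * d_model + d_model)
    # feed-forward: d -> d_ff -> d, with biases
    ffn = d_model * d_ff + d_ff + d_ff * d_model + d_model
    # two LayerNorms, each with a scale and shift vector of size d
    norms = 2 * 2 * d_model
    return attn + ffn + norms

small = transformer_layer_params(512, 2048)   # bart-small sizes from above
base = transformer_layer_params(768, 3072)    # assumed bart-base sizes
print(small, base, small / base)
```

By this estimate a bart-small layer holds roughly 45% of the parameters of a bart-base layer, which is the main source of the size reduction.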
Get the model
Thanks to the transformers library, using bart-small is as simple as:
```python
from transformers import BartModel, BartConfig, BartTokenizerFast

config = BartConfig.from_pretrained('lucadiliello/bart-small')
model = BartModel.from_pretrained('lucadiliello/bart-small')
tokenizer = BartTokenizerFast.from_pretrained('lucadiliello/bart-small')
```
Pre-Training
Training hyperparameters:
- GPUs: 8x A100 with DeepSpeed in FP32
- total batch size: 1024
- number of training steps: 200k
- max sequence length: 512
- denoising:
  - probability: 0.3
  - max number of spans per sample: 200
  - whole word denoising (similar to BERT's whole word masking)
  - span length distribution: Poisson (λ=2.5 words)
  - sentence shuffling
- optimization:
  - AdamW
  - lr: triangular with peak 1e-04
  - warmup steps: 10k
  - weight decay: 0.01
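The triangular learning-rate schedule above can be sketched as a plain function of the training step (a minimal sketch, assuming linear warmup to the 1e-04 peak over the first 10k steps and linear decay back to zero at the 200k-step mark; the exact decay endpoint is not stated above):

```python
PEAK_LR = 1e-4        # peak learning rate from the list above
WARMUP_STEPS = 10_000  # warmup steps from the list above
TOTAL_STEPS = 200_000  # total training steps from the list above

def triangular_lr(step: int) -> float:
    """Linear warmup to PEAK_LR, then linear decay to 0 (one triangle)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    # decay phase: from the peak at WARMUP_STEPS down to 0 at TOTAL_STEPS
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)
```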
Datasets:
- BookCorpus
- CC-News
- OpenWebText
- English Wikipedia
Benchmarks
Summarization
| Metric | CNN/DailyMail | XSum |
|---|---|---|
| R1 | 40.2 | 34.8 |
| R2 | 18.2 | 13.0 |
| RL | 37.6 | 27.8 |
Owner
- Name: Luca Di Liello
- Login: lucadiliello
- Kind: user
- Location: San Francisco
- Company: Amazon Alexa
- Website: lucadiliello.github.io
- Repositories: 59
- Profile: https://github.com/lucadiliello
Applied Scientist at Amazon Alexa AI.
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: bart-small
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Luca
    family-names: Di Liello
    email: luca.diliello@unitn.it
    affiliation: University of Trento
    orcid: 'https://orcid.org/0000-0002-9970-5048'
repository-code: 'https://github.com/lucadiliello/bart-small'
url: 'https://huggingface.co/lucadiliello/bart-small'
abstract: >-
  BART-Small is a lighter version of BART-Base with fewer
  attention heads, a smaller feed-forward size and a smaller hidden size.
keywords:
  - 'LM, Denoising, BART'
license: GPL-2.0
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1