https://github.com/bigscience-workshop/bigscience
Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found)
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (9.4%) to scientific vocabulary
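The report does not spell out the formula behind this number. Purely as an illustration (the equal weighting below is a hypothetical assumption, not the generator's actual method), such a score can be sketched as a blend of the boolean indicators with the continuous vocabulary-similarity term:
```
#!/usr/bin/env perl
# Hypothetical sketch only: equal-weight blend of the boolean indicators
# above plus the vocabulary-similarity term. The real report generator's
# weights and formula are not published here.
use strict;
use warnings;

my %found = (             # 1 = indicator present, 0 = absent
    'CITATION.cff'                     => 0,
    'codemeta.json'                    => 1,
    '.zenodo.json'                     => 0,
    'DOI references'                   => 0,
    'Academic publication links'       => 0,
    'Committers with academic emails'  => 0,
    'Institutional organization owner' => 0,
    'JOSS paper metadata'              => 0,
);
my $vocab_similarity = 0.094;          # from the report above

my @terms = (values %found, $vocab_similarity);
my $sum   = 0;
$sum += $_ for @terms;
# Prints ~12.2% with these toy weights, not the report's 13.0%.
printf "illustrative score: %.1f%%\n", 100 * $sum / @terms;
```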
Repository
Basic Info
Statistics
- Stars: 996
- Watchers: 37
- Forks: 99
- Open Issues: 20
- Releases: 0
Metadata Files
README.md
bigscience
Research workshop on large language models - The Summer of Language Models 21
At the moment we have 2 code repos:
- https://github.com/bigscience-workshop/Megatron-DeepSpeed - this is our flagship code base
- https://github.com/bigscience-workshop/bigscience - (this repo) for everything else - docs, experiments, etc.
Currently, the most active segments of this repo are:
- JZ - lots of information about our work environment, which helps us evaluate, plan, and get things done
- Experiments - many experiments are being run; documentation, result tables, scripts, and logs are all there
- Datasets info
- Train - all the information about the current trainings (see below for the most important ones)
We have READMEs for specific aspects, such as:
- hub integration
Trainings
In addition to the detailed chronicles of experiments and findings we keep for some of the main trainings, here is a doc that summarizes the most important findings: Lessons learned
Train 1 - 13B - unmodified Megatron gpt2 - baseline
- the full spec and discussions
- the training script
- checkpoints and logs:
- chronicles
You can watch the training logs live by running this `tail -f`-like script over the remote log file, which gets synced to the hub once an hour:
```
perl -e '$u=shift; $b=0; while(1){($e)=qx[curl -sI $u]=~/content-length: (\d+)/; \
print qx[curl -sr $b-$e -L $u] if $e>$b; $b=$e; sleep 300}' \
https://huggingface.co/bigscience/tr1-13B-logs/resolve/main/main_log.txt
```
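The one-liner is dense, so here is an expanded, commented sketch of the same polling logic (assuming, as the one-liner does, that the server reports `content-length` and supports byte-range requests):
```
#!/usr/bin/env perl
# Same logic as the one-liner above, spelled out:
# poll the remote file's size with a HEAD request; when it grows,
# fetch and print only the new byte range; repeat every 5 minutes.
use strict;
use warnings;

my $url  = shift @ARGV;   # remote log file URL
my $seen = 0;             # bytes already printed

while (1) {
    # HEAD request: read the file's current size from Content-Length.
    my ($len) = qx[curl -sI $url] =~ /content-length: (\d+)/i;
    if (defined $len && $len > $seen) {
        # Range request: fetch only the bytes we have not seen yet.
        print qx[curl -sr $seen-$len -L $url];
        $seen = $len;
    }
    sleep 300;
}
```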
Train 3
Architecture and scaling baseline runs: no fancy tricks, just GPT2. Here are links to the respective tensorboards:
| Size                | 1B3                  | 760M | 350M | 125M |
|---------------------|----------------------|------|------|------|
| C4 + low warmup     | a                    | b    | c    |      |
| OSCAR + low warmup  | f                    |      |      |      |
| C4 + high warmup    | e                    |      |      |      |
| OSCAR + high warmup | d (current baseline) | g    | h    | i    |
| Pile + high warmup  | m                    | j    | k    | l    |
Train 8
104B - unmodified Megatron gpt2 - with extra-wide hidden size to learn how to deal with training instabilities
- the full spec and discussions
- the training script
- checkpoints and logs:
- chronicles
You can watch the training logs live by running this `tail -f`-like script over the remote log file, which gets synced to the hub once an hour:
```
perl -e '$u=shift; $b=0; while(1){($e)=qx[curl -sI $u]=~/content-length: (\d+)/; \
print qx[curl -sr $b-$e -L $u] if $e>$b; $b=$e; sleep 300}' \
https://cdn-lfs.huggingface.co/bigscience/tr8-104B-logs/b2cc478d5ae7c9ec937ea2db1d2fe09de593fa2ec38c171d6cc5dca094cd79f9
```
Train 11
This is the current main training:
tr11-176B-ml
- the full spec and discussions
- the training script
- checkpoints and logs:
- chronicles-prequel
- chronicles
You can watch the training logs live by running this `tail -f`-like script over the remote log file, which gets synced to the hub once an hour (this variant adds `-L` so curl follows redirects, and matches the content-length from the final `HTTP/2 200` response rather than from a redirect):
```
perl -e '$u=shift; $b=0; while(1){($e)=qx[curl -LsI $u]=~/2 200.*?content-length: (\d+)/s; \
print qx[curl -Lsr $b-$e $u] if $e>$b; $b=$e; sleep 300}' \
https://huggingface.co/bigscience/tr11-176B-ml-logs/resolve/main/logs/main/main_log.txt
```
Owner
- Name: BigScience Workshop
- Login: bigscience-workshop
- Kind: organization
- Email: bigscience-contact@googlegroups.com
- Website: https://bigscience.huggingface.co
- Twitter: BigScienceW
- Repositories: 28
- Profile: https://github.com/bigscience-workshop
Research workshop on large language models - The Summer of Language Models 21
GitHub Events
Total
- Issues event: 1
- Watch event: 23
- Fork event: 2
Last Year
- Issues event: 1
- Watch event: 23
- Fork event: 2
Committers
Last synced: 9 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Stas Bekman | s****s@s****g | 680 |
| Muennighoff | n****f@g****m | 171 |
| TevenLeScao | t****o@g****m | 110 |
| thomasw21 | 2****1 | 74 |
| SaulLu | l****m@g****m | 27 |
| ontocord | o****d@g****m | 14 |
| Iz Beltagy | b****y@a****g | 9 |
| Victor SANH | v****h@g****m | 4 |
| HugoLaurencon | h****n@g****m | 3 |
| Max Ryabinin | m****0@g****m | 2 |
| Younes Belkada | 4****a | 2 |
| Christopher Akiki | c****i@p****m | 1 |
| Nouamane Tazi | n****8@g****m | 1 |
| Pierre Colombo | P****o | 1 |
| Thomas Wolf | t****f | 1 |
Issues and Pull Requests
Last synced: 9 months ago
All Time
- Total issues: 19
- Total pull requests: 53
- Average time to close issues: 5 months
- Average time to close pull requests: 19 days
- Total issue authors: 18
- Total pull request authors: 12
- Average comments per issue: 0.79
- Average comments per pull request: 0.94
- Merged pull requests: 40
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- robertLiuLinFeng (2)
- RomanCast (1)
- OhadRubin (1)
- misska1 (1)
- robinfang7 (1)
- BlinkDL (1)
- henan991201 (1)
- cmsflash (1)
- sashavor (1)
- ViktorThink (1)
- yu202147657 (1)
- zhangyipin (1)
- stas00 (1)
- kamalkraj (1)
- celsofranssa (1)
Pull Request Authors
- thomasw21 (21)
- SaulLu (11)
- Muennighoff (5)
- younesbelkada (3)
- TevenLeScao (3)
- cakiki (2)
- ibeltagy (2)
- adammoody (1)
- NouamaneTazi (1)
- EIFY (1)
- eltociear (1)
- thomwolf (1)
Dependencies
- Sphinx ==1.8.5
- bump2version ==0.5.11
- coverage ==4.5.4
- flake8 ==3.7.8
- pip ==19.2.3
- pytest ==4.6.5
- pytest-runner ==5.1
- tox ==3.14.0
- twine ==1.14.0
- watchdog ==0.9.0
- wheel ==0.33.6