Recent Releases of cramming

cramming - New Torch 2.1 Version

This release is the new version, built for PyTorch 2.1. The code is easier to read and has fewer dependencies (a separate flash attention installation is no longer needed), data can now be streamed easily, and training is faster.

The new checkpoints score about 2% higher on GLUE than the old ones with the same budget.

- Python
Published by JonasGeiping over 2 years ago

cramming - Old Version

This release is the old version, which works with PyTorch 1.13.

- Python
Published by JonasGeiping over 2 years ago