Geiping, J., & Goldstein, T. (2022). Cramming: Training a Language Model on a Single GPU in One Day (Version 1.0.0) [Computer software]. https://doi.org/10.48550/arXiv.2212.14034