Recent Releases of hlb-gpt

hlb-gpt - v0.4.0 beta (<100s)

This is a big one. New attention block, new architecture scales, one-parameter scaling to 1.5B, and so much more.

(twitter thread will be updated here when possible).

- Python
Published by tysam-code about 2 years ago

hlb-gpt - v0.3.0 beta (~136-140s)

Hiya there! In this release, we upgrade the MLP a bit to use the SiGLU activation function (over the default non-linearly-gated GELU function), convert the network over to pure bfloat16 (from a mixed-precision setup), and perform various optimizations to bring our training time down another 18-22 seconds or so (woop woop!). For more info, check out the twitter thread detailing some of the tweaks for this patch (https://twitter.com/hi_tysam/status/1639975149951672321)! <3 :D :)))) <3 🎆 🎇 🎇 🎆
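As a rough illustration of the activation change, here is a dependency-free sketch contrasting a plain GELU unit with a gated one. It assumes that "SiGLU" means a SiLU-gated linear unit (one half of the MLP's projection multiplied elementwise by SiLU of the other half); the release notes do not spell this out, and the function names are illustrative, not taken from the repo.

```python
import math
from typing import List


def gelu(x: float) -> float:
    # Exact GELU: x * Phi(x), where Phi is the standard normal CDF.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def silu(x: float) -> float:
    # SiLU (a.k.a. swish): x * sigmoid(x).
    return x / (1.0 + math.exp(-x))


def plain_gelu_unit(pre_activations: List[float]) -> List[float]:
    # Ungated MLP activation: apply GELU to every hidden value.
    return [gelu(v) for v in pre_activations]


def silu_gated_unit(values: List[float], gates: List[float]) -> List[float]:
    # Gated activation (assumed form): each value is scaled by SiLU of its gate.
    # In a real MLP, `values` and `gates` would be the two halves of one
    # wider linear projection of the input.
    return [v * silu(g) for v, g in zip(values, gates)]
```

In the gated form, a gate of zero blocks its value entirely (silu(0) is 0). A large positive gate scales the value by roughly the gate itself.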

- Python
Published by tysam-code about 3 years ago

hlb-gpt - v0.2.0 beta

Hi there! In this release, we add sequence length scheduling and make a few other tweaks! For more info on the sequence length scheduling (and the relevant supporting changes), please see the release tweet at https://twitter.com/hi_tysam/status/1637691454012153856?cxt=HHwWgICzgevsn7otAAAA
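For readers unfamiliar with the idea, sequence length scheduling can be sketched as a function from training step to context length: start short and grow toward the full length. This is a minimal sketch under assumed defaults; the linear ramp, the rounding to a multiple, and every name and number here are hypothetical, not the repo's actual schedule.

```python
def scheduled_seq_len(
    step: int,
    total_steps: int,
    start_len: int = 32,   # hypothetical starting context length
    max_len: int = 256,    # hypothetical final context length
    multiple: int = 8,     # keep lengths hardware-friendly (assumption)
) -> int:
    # Fraction of training completed, clamped to [0, 1].
    frac = min(max(step / total_steps, 0.0), 1.0)
    # Linear ramp from start_len to max_len.
    raw = start_len + frac * (max_len - start_len)
    # Round down to the nearest multiple, then clamp to the valid range.
    rounded = int(raw // multiple) * multiple
    return max(start_len, min(rounded, max_len))
```

A training loop would call this once per step (or every few steps) and crop each batch to the returned length.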

- Python
Published by tysam-code about 3 years ago

hlb-gpt - beta v0.1.0

Greetings. In this release (originally from 3/12/23), we add a few features that cut the training time nearly in half. This tag also includes a hotfix to restore backwards compatibility for people with torch versions less than 2.0.

For a more detailed summary of this release, please check out https://twitter.com/hi_tysam/status/1635123488674697218?cxt=HHwWhMDSpcqJkLEtAAAA
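The notes do not say how the compatibility hotfix works. One common pattern, sketched here as an assumption, is to parse the installed version string and only enable 2.0-only features when it is new enough. The helper names are hypothetical; in practice the string would come from `torch.__version__`.

```python
import re
from typing import Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    # Accepts strings like "1.13.1", "2.0.0+cu117", or "2.1.0a0+git1234".
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def supports_torch_2_features(version: str) -> bool:
    # Tuple comparison handles (1, 13) < (2, 0) correctly, unlike
    # comparing the raw strings.
    return parse_major_minor(version) >= (2, 0)
```

A script could then wrap any 2.0-only call in `if supports_torch_2_features(torch.__version__):` and fall back to the older code path otherwise.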

- Python
Published by tysam-code over 3 years ago

hlb-gpt - baseline 0.0.0

Hi hi hiya there! <3 :D Feel free to check out the README.md on this tag; it has the best summary of this release that I could probably give. (Also, so much typing and proofreading today, as always on release days I suppose. I am beat! :'D)

- Python
Published by tysam-code over 3 years ago