Recent Releases of allamo
allamo - v6.0.0
- Major refactor
- Add support for FSDP2 and TP
- Add support for various activation functions and introduce LRA function
- Add support for FlexAttention, xFormers, FlashAttention3. Improve custom mask and sliding window handling
Full Changelog: https://github.com/chrisociepa/allamo/compare/v5.0.0...v6.0.0
- Python
Published by chrisociepa 12 months ago
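The sliding window handling mentioned in the v6.0.0 notes can be illustrated with a minimal sketch of a causal sliding-window mask. The function name and the window convention (each token sees itself plus the previous `window - 1` tokens) are assumptions made for illustration; they are not taken from allamo's code.

```python
def sliding_window_causal_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Return mask[q][k] == True where query position q may attend to key position k.

    Hypothetical helper: a key is visible if it is not in the future (k <= q)
    and lies within the last `window` positions ending at q.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        [(k <= q) and (q - k < window) for k in range(seq_len)]
        for q in range(seq_len)
    ]
```

With `seq_len=4` and `window=2`, the last query position attends only to key positions 2 and 3.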
allamo - v5.0.0
- Added a hook for external program invocation after saving regular checkpoints
- Implemented support for SFT dataset packing with correct RoPE encoding and without cross-contamination
- Added support for a new data format: ALM
- Introduced support for DPO and DPO-Positive training methods
- Added optional sample buffering in the dataloader
- Added new utility scripts for data preparation and tokenizer replacement
- Fixed bugs in main training scripts and utility scripts
Full Changelog: https://github.com/chrisociepa/allamo/compare/v4.1.0...v5.0.0
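As a rough illustration of what packing "with correct RoPE encoding and without cross-contamination" usually involves, the sketch below restarts position indices for every packed sample and builds a causal mask that blocks attention across sample boundaries. The helper names and the list-based representation are hypothetical; allamo's actual implementation may differ.

```python
def pack_samples(samples: list[list[int]]):
    """Concatenate tokenized samples into one sequence.

    Returns (tokens, position_ids, doc_ids). Hypothetical helper for illustration.
    """
    tokens, position_ids, doc_ids = [], [], []
    for doc_id, sample in enumerate(samples):
        tokens.extend(sample)
        # Restart positions at 0 so rotary embeddings treat each sample independently.
        position_ids.extend(range(len(sample)))
        # Remember which sample every token came from.
        doc_ids.extend([doc_id] * len(sample))
    return tokens, position_ids, doc_ids


def packed_attention_mask(doc_ids: list[int]) -> list[list[bool]]:
    """mask[q][k] is True only for causal pairs that belong to the same sample."""
    n = len(doc_ids)
    return [
        [k <= q and doc_ids[q] == doc_ids[k] for k in range(n)]
        for q in range(n)
    ]
```

The per-token sample index is what prevents cross-contamination: a token from the second sample never attends to tokens from the first, even though they share one packed sequence.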
Published by chrisociepa over 1 year ago
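For readers unfamiliar with the DPO and DPO-Positive methods named in the v5.0.0 notes, here is a minimal per-example sketch of the two losses as they are commonly formulated in the literature. The function name, argument names, and default values are assumptions; the release notes do not describe how allamo computes these losses.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
    dpop_lambda: float = 0.0,
) -> float:
    """Per-example DPO loss; dpop_lambda > 0 adds the DPO-Positive penalty.

    Inputs are summed log-probabilities of the chosen and rejected responses
    under the trained policy and the frozen reference model.
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    # DPO-Positive penalizes the policy when it assigns the chosen response
    # a lower log-probability than the reference model does.
    penalty = dpop_lambda * max(0.0, ref_chosen_logp - policy_chosen_logp)
    z = beta * (chosen_ratio - rejected_ratio - penalty)
    # -log(sigmoid(z)) computed as a numerically stable softplus(-z).
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))
```

With `dpop_lambda=0.0` this reduces to plain DPO; a positive value increases the loss whenever the chosen response becomes less likely than under the reference model.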
allamo - v4.1.0
- Enhanced checkpoint management
- Resolved issues with saving checkpoint configurations in JSON format during HF model imports
- Expanded HF model export capabilities to encompass extra configuration parameters
- Improved DataLoader's memory efficiency by releasing memory more effectively between dataset loads
- Discontinued support for the legacy (Simple) DataLoader (breaking change)
Full Changelog: https://github.com/chrisociepa/allamo/compare/v4.0.0...v4.1.0
Published by chrisociepa almost 2 years ago
allamo - v3.1.0
- Added option to configure FSDP sharding strategy
- Added optional logging of MD5 checksum for model checkpoint at the end of each epoch
- Added option to ignore and overwrite the last checkpoint backup
- Bug fixes in the export script
Full Changelog: https://github.com/chrisociepa/allamo/compare/v3.0.0...v3.1.0
Published by chrisociepa about 2 years ago
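The MD5 checksum logging added in v3.1.0 can be approximated with the standard library. The sketch below hashes a checkpoint file in chunks so that large files do not need to fit in memory; the function name and chunk size are illustrative assumptions, not allamo's actual code.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of the file at `path` (hypothetical helper)."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Logging this value at the end of each epoch gives a cheap way to confirm that a copied or uploaded checkpoint matches the one the trainer wrote.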
allamo - v3.0.0
- Change checkpoint configuration storage to JSON file format (breaking change - use the convert_config_checkpoint_to_json.py script to convert your checkpoints)
- Add support for complex sample formats in DataLoader
- Move util scripts to the scripts directory
Full Changelog: https://github.com/chrisociepa/allamo/compare/v2.2.2...v3.0.0
Published by chrisociepa about 2 years ago
allamo - v2.2.1
- Improvements in scripts for importing Hugging Face model weights
- Enhancements in the script for exporting weights to the Hugging Face format, including added support for Mistral models and setting the output model data type
- Refactoring of RMSNorm implementation within the model
- Truncation of overly long samples in the DataLoader
Full Changelog: https://github.com/chrisociepa/allamo/compare/v2.2.0...v2.2.1
Published by chrisociepa about 2 years ago
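For context on the RMSNorm refactoring listed under v2.2.1, the following is a minimal pure-Python sketch of the standard RMSNorm computation on a single vector. It is not allamo's implementation (which the notes do not show); the function name and the `eps` default are assumptions.

```python
import math


def rms_norm(x: list[float], weight: list[float], eps: float = 1e-6) -> list[float]:
    """Scale x by the reciprocal of its root mean square, then by a learned weight."""
    mean_square = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_square + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]
```

Unlike LayerNorm, this variant does not subtract the mean and has no bias term.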
allamo - v2.2.0
- Scripts to import Llama and Mistral weights from HuggingFace
- Prevent enabling installed FlashAttention2 by default
- Allow configuration of the model's intermediate_size
- Reset training upon checkpoint loading if the loaded iter_num exceeds max_iters
- Adjust the allocated buffer size for rotary embeddings to match the block_size
Full Changelog: https://github.com/chrisociepa/allamo/compare/v2.1.0...v2.2.0
Published by chrisociepa about 2 years ago
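The v2.2.0 rule for resetting training on checkpoint load can be sketched as a small decision function. Here "reset" is assumed to mean restarting from iteration 0, and the function name is hypothetical; only the iter_num and max_iters names come from the release notes.

```python
def resolve_start_iter(loaded_iter_num: int, max_iters: int) -> int:
    """Return the iteration to resume from after loading a checkpoint.

    Assumption: a loaded iter_num that exceeds max_iters means the checkpoint
    comes from a longer, finished run, so training starts over from 0.
    """
    if loaded_iter_num > max_iters:
        return 0
    return loaded_iter_num
```

Under this reading, a checkpoint saved at iteration 5000 and loaded with max_iters set to 1000 starts a fresh run rather than exiting immediately.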