Fast bare-bones BPE for modern tokenizer training
Fast and customizable text tokenization library with BPE and SentencePiece support