Yang, S., & Zhang, Y. (2024). FLA: A Triton-Based Library for Hardware-Efficient Implementations of Linear Attention Mechanism (Version 0.1) [Computer software]. https://github.com/fla-org/flash-linear-attention