🌐 Alibaba Qwen · Significant · Qwen
FlashQLA: High-Performance Linear Attention Kernels Built on TileLang
FlashQLA is a high-performance linear attention kernel library built on TileLang and optimized for agentic AI on personal devices and edge hardware. The implementation achieves a 2-3x forward-pass speedup and 2x…