🤗 HuggingFace · Significant · Hot Aisle
Kimi K2.6 + DFlash: 5.6x Throughput Leap to 508 Tokens/Second on 8x MI300X
Hot Aisle's DFlash optimization reaches 508 tokens/second on 8x MI300X GPUs running Kimi K2.6, a 5.6x throughput improvement over baseline autoregressive serving at 90 tokens/second. The optimization maintains…
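The headline ratio follows from the two throughput figures quoted in the summary. A minimal sketch of that arithmetic is below; the function name `speedup` and the variable names are illustrative only, and the mean time per token is derived from the quoted rates rather than measured.

```python
def speedup(optimized_tps: float, baseline_tps: float) -> float:
    """Return the throughput ratio of an optimized run over a baseline run."""
    if baseline_tps <= 0:
        raise ValueError("baseline throughput must be positive")
    return optimized_tps / baseline_tps


# Figures quoted in the summary (tokens/second).
dflash_tps = 508.0
baseline_tps = 90.0

ratio = speedup(dflash_tps, baseline_tps)

# Mean time per generated token implied by each rate, in milliseconds.
dflash_ms_per_token = 1000.0 / dflash_tps
baseline_ms_per_token = 1000.0 / baseline_tps

print(f"speedup: {ratio:.1f}x")
print(f"ms/token: {dflash_ms_per_token:.2f} vs {baseline_ms_per_token:.2f}")
```

The unrounded ratio is about 5.64, which the headline reports as 5.6x.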