List of AI News about DFlash
| Time | Details |
|---|---|
| 2026-06-24 11:50 | DFlash Boosts Qwen Inference 4x with Zero Loss<br>According to @_avichawla, DFlash speculative decoding lifted a 122B Qwen model from 250 to 1000+ tokens per second with zero quality loss by drafting tokens in parallel. |
| 2026-05-10 06:58 | DFlash Speculative Decoding Delivers 8.5x Speed<br>According to @_avichawla, DFlash speeds LLM inference 8.5x via parallel draft tokens, maintaining accuracy and integrating with vLLM, SGLang, and Transformers. |
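
For readers unfamiliar with the term, the generic draft-then-verify loop behind speculative decoding can be sketched as below. This is a toy illustration under stated assumptions, not DFlash's implementation: `target_model` and `draft_model` are hypothetical stand-ins for a large and a small model, decoding is greedy, and the draft tokens are produced one at a time in a loop, whereas the items above describe DFlash as drafting in parallel. The sketch only shows why the accepted output can match what the target model would have generated on its own.

```python
from typing import Callable, List, Tuple

Token = int
Model = Callable[[List[Token]], Token]  # maps a prefix to its greedy next token


def target_model(prefix: List[Token]) -> Token:
    """Hypothetical 'large' model: a deterministic toy rule."""
    return (sum(prefix) * 31 + len(prefix)) % 50


def draft_model(prefix: List[Token]) -> Token:
    """Hypothetical 'small' model: agrees with the target except at every 4th position."""
    guess = target_model(prefix)
    return guess if len(prefix) % 4 else (guess + 1) % 50


def greedy_decode(model: Model, prompt: List[Token], n_new: int) -> List[Token]:
    """Baseline: one model call per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(model(out))
    return out


def speculative_decode(
    target: Model, draft: Model, prompt: List[Token], n_new: int, k: int = 4
) -> Tuple[List[Token], int]:
    """Greedy speculative decoding; returns the tokens and the number of verify passes."""
    out = list(prompt)
    goal = len(prompt) + n_new
    verify_passes = 0
    while len(out) < goal:
        # 1. Draft k candidate tokens with the cheap model.
        proposal: List[Token] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))

        # 2. Verify. A real system scores all positions in ONE target forward
        #    pass; this toy simulates that pass with a loop of target() calls.
        verify_passes += 1
        accepted: List[Token] = []
        for tok in proposal:
            expected = target(out + accepted)
            if tok != expected:
                accepted.append(expected)  # first mismatch: keep the target's token
                break
            accepted.append(tok)
        else:
            # Every draft token matched, so the same pass yields one extra token.
            accepted.append(target(out + accepted))
        out.extend(accepted)
    return out[:goal], verify_passes
```

Calling `speculative_decode(target_model, draft_model, [1, 2, 3], 20)` returns the same token sequence as `greedy_decode(target_model, [1, 2, 3], 20)` while counting fewer verify passes than generated tokens. Any real speedup depends on how often the draft agrees with the target and on how cheap the draft is.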