List of AI News about GPTQ
| Time | Details |
|---|---|
| 2026-06-24 11:50 | DFlash Boosts Qwen Inference 4x with Zero Loss<br>According to @_avichawla, DFlash speculative decoding raised the throughput of a 122B-parameter Qwen model from 250 to more than 1,000 tokens per second with zero quality loss by drafting tokens in parallel. |