The500Feed.Live

Everything going on in AI - updated daily from 500+ sources

📄 Research · May 20, 2026

PulseCol: Periodically Refreshed Column-Sparse Attention for Accelerating Diffusion Language Models

Inference in diffusion large language models (dLLMs) is computationally expensive because full self-attention must be recomputed at every step of the denoising process, without a KV cache. Recent sparse attention methods for dLLMs mitigate this cost via block-sparse computation, which is applied on...


Source

http://arxiv.org/abs/2605.20813v1