The500Feed.Live
Everything going on in AI - updated daily from 500+ sources
📄 ResearchJune 4, 2026
Where does Absolute Position come from in decoder-only Transformers?
RoPE-trained transformers distinguish absolute position in their attention patterns, even though RoPE encodes only relative offsets in the inner product. We trace this leakage to two architectural components, The causal mask is responsible for the first: its per-query softmax denominator depends on ...
Read Original Article →