The500Feed.Live

Everything going on in AI - updated daily from 500+ sources

← Back to The 500 Feed
📄 ResearchJune 15, 2026

LOCUS: Local Visual Cue Search for Enhancing Fine-Grained Perception in Multimodal Large Language Models

Multimodal Large Language Models (MLLMs) remain unreliable on fine-grained visual perception, even when high-resolution inputs preserve the necessary local details. We identify this limitation as visual context rot: decisive evidence may exist in the full image, yet fail to be reliably selected and ...

Read Original Article →

Source

http://arxiv.org/abs/2606.16586v1