The500Feed.Live

Everything going on in AI - updated daily from 500+ sources

📄 Research | June 16, 2026

Grounding Spoken LLMs in Multi-Speaker Audio via Diarization Conditioning

We propose diarization-conditioned spoken language models (SLMs), a strategy for extending SLMs to far-field multi-talker audio. Rather than adapting the decoder via Serialized Output Training, which risks catastrophic forgetting, we condition the acoustic encoder on diarization masks to extract tar...

Source

http://arxiv.org/abs/2606.18134v1