Unified Data Selection for LLM Reasoning

Effectively training Large Language Models (LLMs) for complex, long chain-of-thought (CoT) reasoning is often bottlenecked by the need for massive amounts of high-quality reasoning data. Existing methods are either computationally expensive or fail to reliably distinguish high-quality from low-quality reasoning samples. To address this...

Source

http://arxiv.org/abs/2605.22389v1