InfoSFT: Learn More and Forget Less with Information-Aware Token Weighting

Supervised fine-tuning (SFT) provides the standard approach for teaching LLMs new behaviors from offline expert demonstrations. However, standard SFT uniformly fits all samples -- including those with low likelihood under the base model -- which can disproportionately drive training updates toward o...
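To make the point about uniform fitting concrete, here is a minimal sketch of a standard token-level SFT loss in which every target token gets equal weight. The probabilities are hypothetical values chosen for illustration, and the helper names are assumptions rather than anything from the paper. The sketch shows only the uniform baseline; it does not implement InfoSFT's weighting scheme, which this excerpt does not describe.

```python
import math

def uniform_sft_losses(token_probs):
    """Per-token negative log-likelihood, every token weighted equally."""
    return [-math.log(p) for p in token_probs]

def loss_shares(token_probs):
    """Fraction of the summed uniform loss contributed by each token."""
    losses = uniform_sft_losses(token_probs)
    total = sum(losses)
    return [loss / total for loss in losses]

def target_logit_grad_magnitudes(token_probs):
    """Magnitude of d(-log p)/d(target logit) under softmax, which is 1 - p."""
    return [1.0 - p for p in token_probs]

# Hypothetical base-model probabilities for four target tokens in one
# demonstration; the last token is very unlikely under the base model.
probs = [0.9, 0.8, 0.85, 0.01]

shares = loss_shares(probs)
grads = target_logit_grad_magnitudes(probs)

for p, share, grad in zip(probs, shares, grads):
    print(f"p={p:.2f}  loss share={share:.3f}  |grad on target logit|={grad:.2f}")
```

With these illustrative numbers, the single low-likelihood token accounts for roughly 90% of the summed loss and has the largest gradient on its target logit, which is the imbalance the abstract attributes to uniform fitting.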

Source

http://arxiv.org/abs/2605.14967v1