The500Feed.Live

Everything going on in AI - updated daily from 500+ sources

← Back to The 500 Feed
📄 ResearchMay 14, 2026

Reinforcement Learning with Semantic Rewards Enables Low-Resource Language Expansion without Alignment Tax

Extending large language models (LLMs) to low-resource languages often incurs an "alignment tax": improvements in the target language come at the cost of catastrophic forgetting in general capabilities. We argue that this trade-off arises from the rigidity of supervised fine-tuning (SFT), which enfo...

Read Original Article →

Source

http://arxiv.org/abs/2605.14366v1