Reinforcing VLAs in Task-Agnostic World Models

Post-training Vision-Language-Action (VLA) models via reinforcement learning (RL) in learned world models has emerged as an effective strategy to adapt to new tasks without costly real-world interactions. However, while using imagined trajectories reduces the sample complexity of policy training, ex...

Source

http://arxiv.org/abs/2605.12334v1