Cost of Structural Learning Under Censored Feedback: A Threshold-Bandit Approach

In many multi-agent applications, tasks yield rewards only when executed by a coalition meeting an unknown size threshold; otherwise, feedback is fully censored. This censorship creates an identifiability problem: agents cannot distinguish stochastic failure from insufficient coordination. We formal...
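The censorship described above can be illustrated with a minimal sketch. It is not the paper's formal model: the function name `execute_task`, the Bernoulli reward, and the choice to represent a censored outcome as `0` are all assumptions made here for illustration.

```python
import random


def execute_task(coalition_size, threshold, success_prob, rng):
    """Return the observed reward for one attempt (hypothetical model)."""
    if coalition_size < threshold:
        # Assumption: censored feedback is observed as 0,
        # which looks the same as a stochastic failure.
        return 0
    # Coalition is large enough: reward is an assumed Bernoulli draw.
    return 1 if rng.random() < success_prob else 0


rng = random.Random(0)

# World A: the task is good, but the coalition of 2 is below the threshold of 3.
too_small = [execute_task(2, threshold=3, success_prob=0.9, rng=rng) for _ in range(1000)]

# World B: the coalition of 2 meets the threshold, but the task never pays off.
never_pays = [execute_task(2, threshold=2, success_prob=0.0, rng=rng) for _ in range(1000)]

# Both worlds yield the same all-zero observations for a coalition of size 2.
print(sum(too_small), sum(never_pays))
```

In this toy setting, an agent that only ever sends two members sees identical feedback in both worlds, so it cannot tell an undersized coalition from an unrewarding task without trying larger coalitions.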

Source

http://arxiv.org/abs/2605.27076v1