Bandit Simulator

Thompson Sampling for Multi-Armed Bandits

Configuration

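Thompson Sampling maintains a posterior distribution over each arm's success rate. At every step it draws one sample from each posterior and plays the arm whose sample is highest. Because an arm with an uncertain estimate will occasionally produce a high sample, exploration and exploitation are balanced automatically, with no separate exploration parameter to tune.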
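
The procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the simulator's actual code: it assumes Bernoulli (success/failure) rewards and a uniform Beta(1, 1) prior on each arm, and the names thompson_sampling, true_rates, and n_steps are hypothetical. It also tracks the per-arm pull counts, total reward, and cumulative expected regret that the result panels refer to.

```python
import random


def thompson_sampling(true_rates, n_steps, seed=0):
    """Beta-Bernoulli Thompson Sampling (illustrative sketch).

    true_rates: hypothetical true success probability of each arm.
    Assumes a Beta(1, 1) prior on every arm's success rate.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    successes = [0] * n_arms  # observed rewards of 1 per arm
    failures = [0] * n_arms   # observed rewards of 0 per arm
    pulls = [0] * n_arms
    total_reward = 0
    regret = 0.0
    best_rate = max(true_rates)

    for _ in range(n_steps):
        # Draw one sample from each arm's Beta posterior.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Play the arm with the highest sampled success rate.
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Simulate a Bernoulli reward from the chosen arm.
        reward = 1 if rng.random() < true_rates[arm] else 0

        # Update the chosen arm's posterior counts.
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1

        pulls[arm] += 1
        total_reward += reward
        # Expected regret: gap between the best arm and the arm played.
        regret += best_rate - true_rates[arm]

    return {
        "pulls": pulls,
        "total_reward": total_reward,
        "regret": regret,
        "successes": successes,
        "failures": failures,
    }


if __name__ == "__main__":
    result = thompson_sampling([0.2, 0.5, 0.7], n_steps=1000)
    print(result["pulls"], result["total_reward"], round(result["regret"], 2))
```

After the run, arm a's posterior is Beta(successes[a] + 1, failures[a] + 1). As evidence accumulates, the posteriors of clearly inferior arms rarely yield the highest sample, so most pulls drift toward the best arm.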

Results

Current Step: 0
Total Reward: 0

Arm Allocation Over Time

Cumulative Regret

Posterior Distributions
