How to Do Thompson Sampling Using Python

I wrote an article titled “How to Do Thompson Sampling Using Python” in the July 2019 issue of Visual Studio Magazine. See http://visualstudiomagazine.1105cms01.com/articles/2019/06/01/thompson-sampling.aspx.

Imagine you’re in a casino standing in front of three slot machines. You have 10 free plays. Each machine pays $1 if you win or $0 if you lose. Each machine pays out according to a different probability distribution and these distributions are unknown to you. Your goal is to win as much money as possible. This is an example of a multi-armed bandit problem.

The scenario maps to many real-world problems. For example, suppose you have three different Internet advertising strategies and you want to determine which of them is the best as quickly as possible. Or suppose you work for a medical company and you want to determine which of three new drugs is the most effective.

There are many different algorithms for multi-armed bandit problems. Thompson sampling is one approach and for many problems the algorithm works better than alternatives such as epsilon-first, epsilon-greedy, and UCB1 (upper constraint bound).

The Thompson sampling algorithms keeps track of the number of wins and losses for each machine / alternative. Then on each trial, the algorithm samples from the mathematical beta function which gives a probability that each machine will win. The machine with the highest probability is played, the win or loss is recorded and a new machine is selected with the updated result counts.



A sampling of artists named Thompson to go with my post about Thompson sampling. From left: Adrian, Carol, Arthur, Mildred

This entry was posted in Machine Learning. Bookmark the permalink.

1 Response to How to Do Thompson Sampling Using Python

  1. Thorsten Kleppe's avatar Thorsten Kleppe says:

    This was a crazy nice lesson. I was testing the code in python, and with a bit of understanding I was trying to implement the algorithm in my home language mql4, and it was working instantly.

    Ok, instead of 50 lines I need 150 now, but thats ok.

    For that I have taken your really nice pseudo random number generator.
    https://jamesmccaffrey.wordpress.com/2019/05/20/a-pseudo-pseudo-random-number-generator/

    And with the beta distribution sampling for C# it was really easy to get the hard parts.
    https://jamesmccaffrey.wordpress.com/2019/06/27/implementing-beta-distribution-sampling-using-c/

    Wow, I have to say I dont understand completly how it works, because your work was so fantastic, the code was writing from alone with all your prefire.

    One more thing, I could not download the code for the “Simplified Naive Bayes Classification Using C#” example, could you fix this please.

    Thanks for everything and have a nice weekend 🙂
    T.K.

Comments are closed.