Monday, September 17, 2012

A/B Testing vs MAB algorithms - It's complicated

Several months ago, a series of blog posts appeared on Hacker News comparing A/B testing and multi-armed bandit techniques.  If you want to review the posts and the discussion, see 20 lines of code that beat A/B testing every time, then Why multi-armed bandit algorithm is not "better" than A/B testing and finally Why Multi-armed Bandit algorithms are superior to A/B testing (with Math).  I participated in those discussions, and ever since I've wanted to write up my thoughts once I had them in a compact enough form.

That has taken an unfortunately long time.  In fact I've given up on saying everything that I want to say in a compact form, and will try to only say what I think is most important.  And even that has wound up less compact than I'd like...

First, a disclaimer.  Website optimization has been a large part of what I've done in the last decade, and I've been a heavy user of A/B testing.  See Effective A/B Testing for a well-regarded tutorial that I did on it several years ago.  I have much less experience with multi-armed bandit approaches.  I don't believe that I am biased.  But if I were, it is clear what my bias would be.

Here is a quick summary of what the rest of this post will attempt to convince you of.
  1. If you have actual traffic and are not running tests, start now.  I don't actually expand on this point, but it is trivially true.  If you've not started testing, it would be a shock if you couldn't find at least one 5-10% improvement in your business within 6 months of trying it.  Odds are that you'll find several.  What is that worth to you?
  2. A/B testing is an effective optimization methodology.
  3. A good multi-armed bandit strategy provides another effective optimization methodology.  Behind the scenes there is more math, there are more assumptions, and there is the possibility of better theoretical characteristics than A/B testing.
  4. Despite this, if you want to be confident in your statistics, want to be able to do more complex analysis, or have certain business problems, A/B testing is likely a better fit.
  5. And finally, if you want an automated "set and forget" approach, particularly if you need to do continuous optimization, bandit approaches should be considered first.

That summary requires a lot of justification.  Read on for that.
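
For readers who have not seen the two approaches side by side, here is a rough sketch of how each one decides which version a visitor sees.  It is only an illustration under my own assumptions: the function names, the 10% exploration rate, and the simulated conversion rates are all made up for the example, and epsilon-greedy is just one of the simplest bandit strategies, not necessarily a good one.

```python
import random


def ab_assign(variants, rng):
    """A/B testing: every visitor gets a variant uniformly at random."""
    return rng.choice(variants)


def epsilon_greedy_assign(stats, rng, epsilon=0.1):
    """Epsilon-greedy bandit (hypothetical helper for this sketch).

    With probability epsilon pick a random variant (explore); otherwise
    pick the variant with the best observed conversion rate (exploit).
    """
    if rng.random() < epsilon:
        return rng.choice(list(stats))

    def observed_rate(name):
        trials, successes = stats[name]
        # Untried variants are scored 1.0 so each is tried at least once.
        return successes / trials if trials else 1.0

    return max(stats, key=observed_rate)


def simulate(true_rates, visitors, policy, seed=0):
    """Simulate visitors; return {variant: [trials, successes]}.

    true_rates is an assumed, made-up conversion rate per variant.
    """
    rng = random.Random(seed)
    stats = {name: [0, 0] for name in true_rates}
    for _ in range(visitors):
        if policy == "ab":
            name = ab_assign(list(stats), rng)
        else:
            name = epsilon_greedy_assign(stats, rng)
        stats[name][0] += 1
        if rng.random() < true_rates[name]:
            stats[name][1] += 1
    return stats


if __name__ == "__main__":
    rates = {"A": 0.1, "B": 0.5}  # deliberately exaggerated difference
    print("A/B test:", simulate(rates, 5000, "ab"))
    print("bandit:  ", simulate(rates, 5000, "bandit"))
```

The A/B version splits traffic evenly for the whole run and leaves the decision to an analysis afterwards.  The bandit version shifts traffic towards whichever variant currently looks best while it is still collecting data, which is the tradeoff the rest of this post is about.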