Bandit: Explore; Exploit; Expect Value

A tool for applying multi-armed bandit analysis to real-life choices.

The multi-armed bandit problem gets its name from the challenge of trying to get the best returns from a set of slot machines with unknown average payouts. You want to know which machine has the best expected value, but learning that takes time – time you could be using to cash in on the machines that already look like they might be the best.
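To make the trade-off concrete, here's a minimal sketch of epsilon-greedy, one classic strategy for this problem: usually pull the machine that looks best so far, but occasionally pick one at random. (This illustrates the general idea rather than what Bandit itself necessarily implements; the Gaussian payouts and parameters here are made up.)

```python
import random

def epsilon_greedy(true_means, pulls=1000, epsilon=0.1):
    """Play `pulls` rounds against machines with the given true mean
    payouts (unknown to the player). Returns estimated means and pull counts."""
    n = len(true_means)
    counts = [0] * n        # times each machine has been pulled
    estimates = [0.0] * n   # running average reward per machine
    for _ in range(pulls):
        if random.random() < epsilon:
            arm = random.randrange(n)                        # explore: random machine
        else:
            arm = max(range(n), key=lambda i: estimates[i])  # exploit: best so far
        reward = random.gauss(true_means[arm], 1.0)          # noisy payout
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts

# Three machines with unknown average payouts; the player discovers
# machine 1 is best while still cashing in along the way.
estimates, counts = epsilon_greedy([1.0, 1.5, 1.2])
print(estimates, counts)
```

Notice the tension: a higher epsilon learns the true payouts faster but wastes more pulls on machines already known to be worse.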

In many ways, ordinary life embeds this problem constantly. We keep spending time on things we make progress on initially and ignore things that don't immediately go our way, because it's reasonable to either wait for our current strategy to sour (as bad options eventually do) or rely on good options tending to be good more often from the start. The problem, as described in probability theory, is just a way of being really specific about that process.

And while our instincts work pretty well for choosing our professions, hobbies, and friends, I've noticed that on fleeting or less important things this balancing act, of exploring new options versus exploiting a good option once we've found one, is handled less than well. What funny thing can I say to get people to like me, I frequently wonder. It'd be nice to have the information, but maximizing laughter is clearly the goal. And this is the tool for that job.

Essentially what I’m saying is, prepare for some gut-busters if you meet me in person. I’ll be the guy in the lab coat. 
