kelly criterion: Nonlinear Function
Created: November 15, 2022
Modified: March 24, 2025


This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.

We are given the opportunity to bet some fraction of our wealth on a coin flip that we win with probability $p$. We can repeat this as many times as we like. How much should we bet?

This seems like a tricky question because it has the form of a sequential decision problem: we can imagine a branching tree of decisions for how much to bet at each step, depending on how the previous bets have gone. Just as finding the optimal first move in chess would require solving the entire game, in the worst case determining the optimal first bet might require gaming out its potential repercussions into the infinite future.

The answer also depends on our utility function with respect to money. However, there are (at least) two natural choices for which the sequential decision problem simplifies dramatically: linear and logarithmic utility.

Formally, let $w_t$ be a random variable for our wealth after $t$ steps, and let $r_t \in \{1, -1\}$ represent the coinflip at step $t$. If at each step we bet a fraction $b_t$ of our current wealth, then our wealth is multiplied by $(1 + r_t b_t)$, that is, either $(1 + b_t)$ or $(1 - b_t)$ depending on the outcome of the coinflip. Then our wealth after $T$ steps will grow as

$$w_T = w_0 \cdot \prod_{t=1}^T (1 + r_t b_t).$$

In the case of linear utility, we try to maximize

$$\mathbb{E}[w_T] = w_0 \cdot \mathbb{E}\left[\prod_{t=1}^T (1 + r_t b_t)\right] = w_0 \cdot \prod_{t=1}^T \mathbb{E}\left[1 + r_t b_t\right],$$

where we can move the expectation inside the product because each step is independent of the others. This shows that we can simply maximize $\mathbb{E}[r_t b_t]$ independently at each step, which means taking $b_t = 1$ (betting the entire bankroll) at all steps, assuming that the $r_t$'s have positive expectation (i.e., $p > 1/2$). This is a counterintuitive strategy, since in the vast majority of cases we will lose everything, but those losses are offset by the enormous compounding of wealth in the world where we win all bets (which happens with probability $p^T$).
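
A quick Monte Carlo sketch (my own illustration, with $p = 0.6$ and $T = 20$ chosen arbitrarily) of how strange this strategy is: the mean wealth matches $(2p)^T$, but the median outcome is zero.

```python
import numpy as np

# Bet the whole bankroll every round (b_t = 1) on a coin with win probability
# p = 0.6. Expected wealth is w0 * (2p)^T, but almost every run ends at zero.
rng = np.random.default_rng(0)
p, T, n_sims, w0 = 0.6, 20, 100_000, 1.0

wins = rng.random((n_sims, T)) < p                   # each flip, each run
wealth = w0 * np.where(wins, 2.0, 0.0).prod(axis=1)  # double or bust each round

print("theoretical mean E[w_T] = w0*(2p)^T:", w0 * (2 * p) ** T)
print("empirical mean:                     ", wealth.mean())
print("median outcome:                     ", np.median(wealth))  # 0.0
print("P(survive all T flips) = p^T:       ", p ** T)
```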

The other (and debatably more natural) assumption is that of logarithmic utility, where we try to maximize

\begin{align*} \mathbb{E}[\log w_T] &= \mathbb{E}\left[\log \prod_{t=1}^T (1 + r_t b_t)\right] + \log w_0 \\ &= \mathbb{E}\left[\sum_{t=1}^T \log(1 + r_t b_t)\right] + \log w_0 \\ &= \sum_{t=1}^T \mathbb{E}\left[\log(1 + r_t b_t)\right] + \log w_0. \end{align*}

Taking logs converts compounding into a simple sum, so linearity of expectation applies and we need only consider the expectation at a single step, i.e., we need only maximize the expected log growth rate

$$\mathbb{E}[\log(1 + r_t b_t)] = p\log(1 + b) + (1 - p) \log(1 - b).$$

This can be solved by straightforward calculus, setting the derivative with respect to $b$ equal to zero,

$$\frac{p}{1 + b} = \frac{1 - p}{1 - b},$$

and solving for the optimal bet (the Kelly criterion),

$$b = 2p - 1.$$

Let's plug in numbers for intuition. If $p = 0.51$, the Kelly criterion says that we should bet $2 \cdot 0.51 - 1 = 0.02$ of our wealth at each round. On the other hand, if $p = 0.8$, we should bet $2 \cdot 0.8 - 1 = 0.6$ of our wealth.
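
As a sanity check, here's a tiny grid search (my own sketch) confirming that $b = 2p - 1$ maximizes the expected log growth rate for these two values of $p$.

```python
import numpy as np

# Maximize p*log(1+b) + (1-p)*log(1-b) numerically over b in [0, 1)
# and compare against the closed-form Kelly fraction 2p - 1.
for p in (0.51, 0.8):
    b = np.linspace(0.0, 0.999, 100_000)
    growth = p * np.log1p(b) + (1 - p) * np.log1p(-b)
    b_star = b[np.argmax(growth)]
    print(f"p = {p}: numerical argmax b ≈ {b_star:.3f}, Kelly 2p - 1 = {2 * p - 1:.2f}")
```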

Since Kelly betting maximizes the expected logarithmic growth rate of your wealth, it follows that, over time, it gives you more money than any other betting procedure with probability approaching 1. More rigorously:

  • Your overall growth is the product of the individual growth rates, so the average growth rate is the geometric mean of the individual growth rates.
  • Thus your average log growth rate is the arithmetic mean of the individual log growth rates.
  • As the number of steps $T$ goes to infinity, this arithmetic mean converges to the expected log growth rate with high probability, by the law of large numbers.
  • Any other procedure (one that doesn't optimize the expected log growth rate) will have a lower expected log growth rate. But the actual log growth rate converges to its expectation over time! And this is uniquely true of the log growth rate, since that's what gives us a sum over timesteps whose average converges to an expectation.

In this sense, the Kelly criterion is an instance of geometric rationality: maximizing a geometric expectation rather than an arithmetic expectation.
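
To see the convergence argument in action, here's a simulation sketch (my own, with illustrative parameters): on the same sequence of flips, the realized per-step log growth of the Kelly fraction separates from that of over- and under-betting, and the Kelly bettor ends up ahead on essentially every sample path.

```python
import numpy as np

# p = 0.6, so the Kelly fraction is 0.2. Compare realized log growth against
# an under-bettor (b = 0.05) and an over-bettor (b = 0.5) on the same flips.
rng = np.random.default_rng(1)
p, T, n_sims = 0.6, 5_000, 1_000
kelly = 2 * p - 1

r = np.where(rng.random((n_sims, T)) < p, 1.0, -1.0)   # coinflip outcomes
log_wealth = {b: np.log1p(r * b).sum(axis=1) for b in (0.05, kelly, 0.5)}

for b, lw in log_wealth.items():
    print(f"b = {b:.2f}: mean log growth per step = {lw.mean() / T:+.4f}")

print("P(Kelly path beats b = 0.50):", (log_wealth[kelly] > log_wealth[0.5]).mean())
print("P(Kelly path beats b = 0.05):", (log_wealth[kelly] > log_wealth[0.05]).mean())
```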

A more memorable formulation

In general the optimal fraction of our bankroll to bet can be expressed as

$$f^* = \frac{\text{edge}}{\text{odds}} = \frac{p^* \cdot x - (1 - p^*)}{x} = \frac{p^* - q}{1 - q},$$

where $p^*$ is the true probability of winning, assumed known to me, and $x = \frac{1 - q}{q}$ are the betting odds ($x : 1$) implying a book (or market) probability $q$.

e.g., if we know there is a $p^* = 0.8$ probability of winning but the bookie thinks it's $q = 2/3$, they'll offer odds of $1/2 : 1$ ($x = 1/2$), so our edge is $p^* \cdot x - (1 - p^*) = 0.8 \cdot 0.5 - 0.2 = 0.2$, and we should bet $f^* = 0.2 / 0.5 = 0.4$ of our bankroll.
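
A throwaway helper (mine, not a standard library function) to make the formula concrete; note that the general expression reduces to $2p - 1$ when the book is fair ($q = 1/2$).

```python
def kelly_fraction(p_true: float, q_book: float) -> float:
    """Kelly bet f* = edge / odds for a binary bet at odds x : 1, x = (1-q)/q.

    Simplifies to (p_true - q_book) / (1 - q_book); with q_book = 1/2
    (even odds) it reduces to 2*p_true - 1.
    """
    x = (1 - q_book) / q_book               # payout odds implied by the book
    edge = p_true * x - (1 - p_true)        # expected profit per unit staked
    return edge / x

print(kelly_fraction(0.8, 2 / 3))   # ≈ 0.4, matching the worked example above
print(kelly_fraction(0.51, 0.5))    # ≈ 0.02, matching the even-odds Kelly bet
```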

Alternative arguments for the Kelly criterion

There are some alternative goals that also lead to the Kelly criterion: for example, maximizing wealth in the median outcome. See:

Another take is that it corresponds to proportional representation of subagents with differing beliefs about the future. https://www.lesswrong.com/posts/o3RLHYviTE4zMb9T9/tyranny-of-the-epistemic-majority

I have two future selves, one for a heads outcome and one for a tails outcome. Say the probability of heads is $p = 0.6$: the heads-outcome self is more likely to exist than the tails-outcome self. If I have $100 and want to maximize expected utility, I bet it all on heads. But in a sense that's unfair. Maybe I prefer proportional representation. My heads-outcome self should get $60 (to bet on heads), and my tails-outcome self should get $40 (to bet on tails). The net effect is that I bet $20 on heads.

In general, I will bet a portion $b = p - (1 - p) = 2p - 1$ of my bankroll on heads. This is the Kelly criterion!

This aggregates nicely. If Kelly also has $100 and believes there's a 90% chance of tails, she will bet $90 - $10 = $80 on tails. Now Mary, who has $200, sees both of us and doesn't know who's right. The best she can do is average our probabilities: I believe $p = 0.6$, Kelly believes $p = 0.1$, so Mary believes $p = 0.35$ and thus bets $200 * (0.35 - 0.65) = -$60 on heads, i.e., $60 on tails. This is equivalent to the net bet of me and Kelly!
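
A few lines (my own check) to confirm the arithmetic: with even odds, each bettor stakes $2p - 1$ of their bankroll on heads, and my bet plus Kelly's equals Mary's.

```python
import math

# Kelly bets at even odds. Positive = dollars bet on heads,
# negative = dollars bet on tails.
def bet_on_heads(bankroll: float, p_heads: float) -> float:
    return bankroll * (2 * p_heads - 1)

me    = bet_on_heads(100, 0.6)     # +20:  $20 on heads
kelly = bet_on_heads(100, 0.1)     # -80:  $80 on tails
mary  = bet_on_heads(200, 0.35)    # -60:  $60 on tails

print(me, kelly, mary)
print(math.isclose(me + kelly, mary))   # True: the sub-bets aggregate to Mary's
```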

The nice conclusion here is that we have an aggregation procedure where we let sub-agents bet independently, in proportion to how likely they are to be correct. And it doesn't matter where we draw the boundaries of the system!

It also turns out that the resulting evolution of the bankrolls is exactly Bayesian updating. That is: Mary started with a belief that my hypothesis ($p = 0.6$) and Kelly's hypothesis ($p = 0.1$) were equally likely to be correct (an odds ratio of $1/1$). The actual outcome of heads was six times more likely under my hypothesis, so she updates her odds ratio to $6/1 \cdot 1/1 = 6$ in favor of me (i.e., if these are the only two hypotheses, I have a $6/7$ chance of being correct), and now believes $p = 6/7 \cdot 0.6 + 1/7 \cdot 0.1 \approx 0.53$.

This is equivalent to noticing that Kelly and I started with equal bankrolls, but my bankroll increased by $20 (my $2p - 1$ bet, won), while Kelly's decreased by $80 (her $2p - 1$ bet, lost). So my new bankroll is $120 and hers is $20, i.e., my bankroll is now six times as large as hers. This is exactly equivalent to Mary's Bayesian updating!
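
And a sketch of the bankroll-as-posterior equivalence, following the numbers above:

```python
# After heads: bankrolls evolve via the even-odds Kelly bets, and the
# resulting bankroll shares match a Bayesian posterior over the two
# hypotheses (starting from a 1:1 prior).
p_me, p_kelly = 0.6, 0.1
w_me, w_kelly = 100.0, 100.0

w_me    *= 1 + (2 * p_me    - 1)    # won the $20 bet  -> ~120
w_kelly *= 1 + (2 * p_kelly - 1)    # lost the $80 bet -> ~20

posterior_me = p_me / (p_me + p_kelly)          # likelihood ratio 6:1 from a 1:1 prior
print(w_me / (w_me + w_kelly), posterior_me)    # both ≈ 6/7 ≈ 0.857
print("Mary's new p(heads):", posterior_me * p_me + (1 - posterior_me) * p_kelly)  # ≈ 0.53
```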


scratch for thinking!

This is all cool but I don't feel like I totally grok it.

one way to get at this is to start by saying I have a bunch of subagents with different beliefs. they start with equal 'influence' (probability of being right). how do I aggregate those beliefs?

let each subagent bet on its beliefs. it has a distribution over future states of the world. if the world is just a coinflip, this reduces to a probability $p$ and we can say that subagents bet influence using the Kelly criterion. then the aggregate belief is the average of their $p$'s, and the updates are Bayesian updates.

does this work for multi-valued outcomes? say we have possible future states $x \in \{1, \ldots, N\}$ and a distribution $p(x) = (p_1, \ldots, p_N)$. How do you bet on this? I guess for each possible outcome you choose how much to bet on it, so we have bets $b_1, \ldots, b_N$. You might get even odds for each bet, but maybe there's some market rate where the bookie puts probability $q_i$ on each outcome. if that outcome comes up, you win a quantity $b_i \cdot \frac{1 - q_i}{q_i}$ proportional to your bet and the odds against that bet.

actually it seems equivalent to consider the proportional representation argument? you have $N$ different 'subselves', each of which controls resources proportional to its probability $p_i$ of existing. You let each of them bet on its preferred outcome. This is effectively spreading your bets: you take $b = p$, betting on each outcome in proportion to its probability. Then each one bets all its money ($b_i = p_i$) and wins $p_i \cdot \frac{1 - q_i}{q_i}$.
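
quick numerical check of the above (my sketch; assumes the full bankroll is staked and the stakes on losing outcomes are forfeited): if you stake $b_i = p_i$ on each outcome at odds $\frac{1 - q_i}{q_i} : 1$, your bankroll is multiplied by $p_i / q_i$ when outcome $i$ comes up, so the expected log growth rate is $\sum_i p_i \log(p_i / q_i)$, a KL divergence (cross-entropy minus entropy), which seems like the thing the todo below is pointing at.

```python
import numpy as np

# Proportional bets b_i = p_i in a multi-outcome "horse race" with book
# probabilities q_i. Assumes the full bankroll is staked and losing stakes
# are forfeited, so the per-step wealth multiplier on outcome i is p_i / q_i.
rng = np.random.default_rng(2)
p = np.array([0.5, 0.3, 0.2])   # my beliefs (made-up numbers)
q = np.array([0.4, 0.4, 0.2])   # bookie's implied probabilities

T = 200_000
outcomes = rng.choice(len(p), size=T, p=p)
empirical_log_growth = np.log(p[outcomes] / q[outcomes]).mean()

print("empirical log growth per step:    ", empirical_log_growth)
print("KL(p || q) = sum p_i log(p_i/q_i):", np.sum(p * np.log(p / q)))
```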

todo:

  • work out the max log growth rate thing. it's gotta correspond to a cross-entropy maximization?