Modified: April 03, 2025
geometric rationality
This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.
Notes on Scott Garrabrant's sequence: https://www.lesswrong.com/s/4hmf7rdfuXDJkxhfg
The geometric integral is to products what the ordinary integral is to sums. We can convert between the two just by working in log space. Many uses of logs in mathematics are really geometric integrals in disguise. The geometric expectation
$$\mathbb{G}_{x\sim P}[f(x)] = \exp\left(\mathbb{E}_{x\sim P}[\log f(x)]\right)$$
is just the expectation in log space. To geometrically maximize a quantity is to maximize its expected log (note that whether we exponentiate back at the end is irrelevant to the maximization, since $\exp$ is monotonic).
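As a quick numerical check of the definition, here is a minimal sketch for a finite distribution. The function names and the example numbers are made up for illustration; nothing here comes from the sequence itself.

```python
import math

def arithmetic_expectation(values, probs):
    """Probability-weighted arithmetic mean: sum_i p_i * v_i."""
    return sum(p * v for v, p in zip(values, probs))

def geometric_expectation(values, probs):
    """exp of the expected log, i.e. prod_i v_i ** p_i (values must be > 0)."""
    return math.exp(sum(p * math.log(v) for v, p in zip(values, probs)))

# Two equally likely outcomes; x and y are defined on the same outcomes.
probs = [0.5, 0.5]
x = [1.0, 4.0]
y = [2.0, 8.0]
xy = [a * b for a, b in zip(x, y)]

print(arithmetic_expectation(x, probs))  # 2.5
print(geometric_expectation(x, probs))   # close to 2.0, i.e. sqrt(1 * 4)

# Geometric expectations commute with products: G[XY] = G[X] * G[Y].
g_xy = geometric_expectation(xy, probs)
g_x_times_g_y = geometric_expectation(x, probs) * geometric_expectation(y, probs)
print(math.isclose(g_xy, g_x_times_g_y))  # True
```

The product identity holds for any joint distribution, because the log of a product is a sum and ordinary expectation is linear.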
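A general pattern is that maximizing an (arithmetic) expected probability, with respect to the distribution $Q$ under which the expectation is taken,
$$\arg\max_Q \mathbb{E}_{x\sim Q}[P(x)]$$
gives you a one-hot or delta distribution at the most probable outcome ($Q = \delta_{x^*}$ with $x^* = \arg\max_x P(x)$), while the corresponding geometric maximization
$$\arg\max_Q \mathbb{G}_{x\sim P}[Q(x)] = \arg\max_Q \mathbb{E}_{x\sim P}[\log Q(x)]$$
does probability matching ($Q = P$). The arithmetic objective is symmetric, $\mathbb{E}_{x\sim Q}[P(x)] = \sum_x P(x)Q(x) = \mathbb{E}_{x\sim P}[Q(x)]$, so the geometric version is the same objective with the expectation over $P$ taken geometrically. That it yields $Q = P$ follows from the observation that this is equivalently just minimizing the cross-entropy $H(P, Q) = -\mathbb{E}_{x\sim P}[\log Q(x)]$.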
This will generally cash out to arithmetic mechanisms that look like "winner-take-all" systems and geometric mechanisms that look more "fair", with proportional representation.
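The delta-versus-matching pattern can be checked numerically. The sketch below is only an illustration: the distribution `P`, the helper names, and the random-search check are my own choices, and it compares against sampled candidates rather than proving anything.

```python
import math
import random

def expected_prob(P, Q):
    """Arithmetic objective: sum_x P(x) * Q(x)."""
    return sum(p * q for p, q in zip(P, Q))

def expected_log_prob(P, Q):
    """Geometric objective in log space: E_{x~P}[log Q(x)] = -H(P, Q)."""
    total = 0.0
    for p, q in zip(P, Q):
        if p == 0.0:
            continue                 # outcomes P never produces contribute nothing
        if q == 0.0:
            return float("-inf")     # Q rules out an outcome that P can produce
        total += p * math.log(q)
    return total

P = [0.6, 0.3, 0.1]
delta = [1.0, 0.0, 0.0]              # all mass on the most probable outcome

rng = random.Random(0)

def random_distribution(n):
    """Random point on the simplex (normalized exponential draws)."""
    weights = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

candidates = [random_distribution(3) for _ in range(1000)]

# No candidate should beat the delta arithmetically, or beat Q = P geometrically.
arith_beaten = any(expected_prob(P, Q) > expected_prob(P, delta) + 1e-12 for Q in candidates)
geo_beaten = any(expected_log_prob(P, Q) > expected_log_prob(P, P) + 1e-12 for Q in candidates)
print(arith_beaten, geo_beaten)      # False False
```

Note that the delta distribution scores negative infinity on the geometric objective: putting zero mass on any outcome that can actually happen is maximally punished in log space, which is what pushes the geometric optimum toward proportional representation.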
- the Kelly criterion geometrically maximizes long-term wealth
- what does this look like? long-term wealth is itself a product of per-step growth factors. when we maximize its expected log, the product decomposes into a sum and we end up just maximizing the expected log single-step growth rate. we could also think of this as: geometric expectations commute with products just as ordinary expectations commute with sums, so geometrically maximizing wealth is equivalent to geometrically maximizing the single-step growth rate. this amounts to weighing possible futures so as to maximize the geometric mean of their growth rates (weighted by probability) rather than their arithmetic mean.
- instantiating the general pattern: Kelly is like letting each possible world 'bet' its probabilistic portion of the bankroll, while maximizing linear utility is letting the most probable world bet the entire bankroll.
- Nash bargaining geometrically maximizes utility, under uncertainty over which player you will be incarnated as.
- given multiple agents and their utilities, if we're maximizing expected/total utility, you maybe just want to give all the cookies to the one person who claims to get the most value from them (the cookie monster). but if we maximize geometric utility, everyone gets the same number of cookies. (if we had a nonuniform veil of ignorance --- you're more likely to be incarnated as one sort of person than another --- then the likelier sorts of people would get proportionately more cookies)
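Both bullets above can be made concrete with a small grid search. This is a sketch under assumptions of my own: a single repeated binary bet with win probability `p` and net odds `b` for the Kelly case, and two agents with utility linear in their share of the cookies for the bargaining case. All names and numbers are hypothetical.

```python
import math

# --- Kelly: one repeated binary bet ------------------------------------
def expected_growth(f, p, b):
    """Arithmetic expectation of the one-step wealth multiplier."""
    return p * (1 + b * f) + (1 - p) * (1 - f)

def expected_log_growth(f, p, b):
    """Expected log of the one-step wealth multiplier (requires f < 1)."""
    return p * math.log(1 + b * f) + (1 - p) * math.log(1 - f)

p, b = 0.6, 1.0                              # win probability, net odds
fractions = [i / 1000 for i in range(1000)]  # candidate bet fractions in [0, 0.999]

f_arithmetic = max(fractions, key=lambda f: expected_growth(f, p, b))
f_geometric = max(fractions, key=lambda f: expected_log_growth(f, p, b))
f_kelly = p - (1 - p) / b                    # closed-form Kelly fraction

print(f_arithmetic)                          # largest fraction on the grid
print(f_geometric, f_kelly)                  # both close to 0.2

# --- Cookies: two agents splitting a divisible batch --------------------
def total_utility(share, values):
    """Sum of utilities when agent 0 gets `share` and agent 1 gets the rest."""
    return values[0] * share + values[1] * (1 - share)

def expected_log_utility(share, values, veil):
    """Expected log utility, weighted by the chance of being each agent."""
    return (veil[0] * math.log(values[0] * share)
            + veil[1] * math.log(values[1] * (1 - share)))

values = (3.0, 1.0)                          # agent 0 is the cookie monster
shares = [i / 100 for i in range(1, 100)]    # agent 0's share, strictly inside (0, 1)

share_arithmetic = max(shares, key=lambda s: total_utility(s, values))
share_uniform = max(shares, key=lambda s: expected_log_utility(s, values, (0.5, 0.5)))
share_skewed = max(shares, key=lambda s: expected_log_utility(s, values, (0.7, 0.3)))

print(share_arithmetic, share_uniform, share_skewed)
```

At even odds the Kelly fraction of 0.2 is exactly what you get by staking 0.6 of the bankroll on winning and 0.4 on losing and netting the two out. In the cookie case, the agents' claimed values drop out of the geometric optimum entirely (they only add a constant in log space), so the split tracks the veil-of-ignorance weights rather than who shouts loudest.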
And also: Thompson sampling geometrically maximizes the probability that you choose the 'best' action, under your posterior uncertainty of what world you're in.
The intuition here is that it's doing exploration by probability matching (taking each action with probability proportional to the posterior mass of worlds in which that's the best action), rather than pure exploitation (taking the single action that's most likely to be the best). Garrabrant calls the latter the 'plurality' policy, which is itself different from just maximizing expected utility. Expected utility doesn't distinguish between uncertainty over possible worlds and uncertainty over action values within each possible world --- it's just one big expectation. But for plurality (and Thompson sampling) it does matter where we draw the boundaries between possible worlds, since there is an inner maximization to determine the 'best action' in each world.
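To see the three policies come apart, here is a toy sketch with a discrete posterior over three worlds. The worlds, actions, utilities, and function names are all invented for illustration; in particular, action `B` is deliberately chosen to be decent everywhere but best nowhere.

```python
import random
from collections import Counter

# Hypothetical posterior over worlds, and utility of each action in each world.
posterior = {"w0": 0.4, "w1": 0.35, "w2": 0.25}
utility = {
    "w0": {"A": 1.0, "B": 0.9, "C": 0.0},
    "w1": {"A": 0.0, "B": 0.9, "C": 1.0},
    "w2": {"A": 0.0, "B": 0.9, "C": 1.0},
}
actions = ["A", "B", "C"]

def best_action(world):
    """Inner maximization: the best action within a single world."""
    return max(actions, key=lambda a: utility[world][a])

def prob_best():
    """Posterior probability that each action is the best one."""
    dist = {a: 0.0 for a in actions}
    for world, p in posterior.items():
        dist[best_action(world)] += p
    return dist

def thompson_sample(rng):
    """Sample a world from the posterior, then act optimally in that world."""
    world = rng.choices(list(posterior), weights=list(posterior.values()))[0]
    return best_action(world)

def plurality_action():
    """The single action most likely to be the best."""
    dist = prob_best()
    return max(actions, key=lambda a: dist[a])

def expected_utility_action():
    """One big expectation over worlds, with no inner maximization."""
    return max(actions, key=lambda a: sum(p * utility[w][a] for w, p in posterior.items()))

rng = random.Random(0)
counts = Counter(thompson_sample(rng) for _ in range(10_000))

print(prob_best())                 # A near 0.4, B at 0.0, C near 0.6
print(plurality_action())          # C
print(expected_utility_action())   # B
print(counts)                      # roughly 40% A and 60% C, never B
```

Thompson sampling reproduces `prob_best()` as its action distribution, plurality collapses that distribution to its mode, and expected utility picks an action that is never the best in any single world.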
- how does this work? Scott claims we can write Thompson sampling as acting according to $\pi^* = \arg\max_{\pi} \mathbb{G}_{h \sim P(h \mid e)}\left[\pi(a^*_h)\right]$, where $a^*_h$ is the best action under a given hypothesis $h$ for what world you're in (and a given percept sequence $e$). This would be equivalent to maximizing the expected log probability of the best action: $\arg\max_{\pi} \mathbb{E}_{h \sim P(h \mid e)}\left[\log \pi(a^*_h)\right]$.
Conventionally I think of Thompson sampling as equivalently, either:
- sampling a world from the posterior, and taking the best action in that world, or