Modified: April 02, 2025
logical induction
This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.
Sources:
- https://www.lesswrong.com/posts/y5GftLezdozEHdXkL/an-intuitive-guide-to-garrabrant-induction
- https://www.lesswrong.com/posts/jtMXj24Masrnq3SpS/logical-induction-for-software-engineers#Defeating_one_trading_algorithm
- Garrabrant et al., Logical Induction (2016) https://arxiv.org/abs/1609.03543
General task: we want to reason about logical uncertainty. What is the nth digit of pi? Is some large number prime? Etc.
Why is Bayesian reasoning insufficient for this?
Naively we could think of taking the ten hypotheses for the nth digit of pi, and running some computation. We have a prior (presumably uniform) on these hypotheses, and imagine that each step of the computation gives us some evidence. Then we update the probabilities as the computation goes.
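To make the naive picture concrete, here's a minimal sketch (my own illustration, not from the sources; the likelihoods are invented) of what such an updater would look like:

```python
# Naive sketch of Bayesian updating over the ten digit-hypotheses.
# The likelihoods are made up for illustration; the point of the discussion
# below is that there is no principled way to define P(evidence | digit)
# for a deterministic computation.

def update(prior, likelihoods):
    """One Bayesian update: posterior(d) proportional to likelihood(d) * prior(d)."""
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    z = sum(unnormalized)
    return [u / z for u in unnormalized]

beliefs = [0.1] * 10             # uniform prior over digits 0-9
evidence = [0.05] * 9 + [0.55]   # hypothetical computation step favoring digit 9
beliefs = update(beliefs, evidence)
print(beliefs)                   # digit 9 now has probability 0.55
```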
I think the sense in which this is true is that Bayesian reasoning takes place in a joint probability model. If I observe one variable, a Bayesian reasoner (by definition) knows the conditional distribution of all other variables. But in practice that conditional distribution can be intractable to compute.
In a sense, it's logically incoherent to have a probability model over digits of pi. We have to forget that we're talking about pi, and just treat this purely as a formal set of ten hypotheses. In order to actually identify the 'pi' we're talking about with the mathematical constant, we'd have to bring mathematics itself into our probability space. We'd have a connected structure of propositions following from (say) the ZFC axioms. And in that joint distribution, there is no longer any meaningful notion of conditional probability: every proposition is unconditionally true (probability 1) or false (probability 0); no additional evidence is needed or possible.
The logical induction criterion
Like a Bayesian reasoner, a logical inductor updates its beliefs following some evidence stream. It's not a full-fledged agent: it only has beliefs, not goals. But these beliefs are probabilistic and so we can view them as betting odds - we assume that the agent is always willing to make bets that have at least fair odds (non-negative expected value) under its current beliefs.
The prototypical stream of evidence we'll consider is proofs of increasing length in Peano arithmetic. On day one, we see all proofs of length 1, day two we see all proofs of length 2, and so on. On day $n$, the agent is uncertain about all propositions not yet proven.
A belief-updating system is a logical inductor if it cannot be efficiently Dutch booked. Any non-Bayesian reasoner will somehow be susceptible to Dutch booking; there exists some strategy for taking an unbounded amount of its money via a series of bets. A logical inductor is one for which such a strategy can't be efficiently computed.
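For reference, here is my paraphrase of the formal criterion in the paper (notation loosely follows theirs):

```latex
% A market \overline{\mathbb{P}} = (\mathbb{P}_1, \mathbb{P}_2, \ldots) satisfies the
% logical induction criterion relative to a deductive process \overline{D} if there
% is no efficiently computable trader \overline{T} that exploits it, i.e. no trader
% whose set of plausible net worths,
\left\{ \mathbb{W}\Big(\textstyle\sum_{i \le n} T_i\Big) \;:\; n \in \mathbb{N},\ \mathbb{W} \text{ a world consistent with } D_n \right\},
% is unbounded above but bounded below.
```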
We can show that such a system (if it exists) will have nice properties:
- its probabilities (prices) converge in the limit (they do not oscillate)
- it is coherent: the limiting probabilities assign 1 to provable statements, 0 to disprovable ones, and respect the laws of probability
- for any efficiently computable sequence of true theorems, the reasoner will eventually learn that by day $f(n)$ it should assign probability 1 to the $n$th theorem in the sequence, where $f$ is any polynomial (since if it didn't learn this, it could be exploited by a strategy that bets on the truth of those theorems)
- It will respect logical relationships: for any efficiently computable sequence of mutually exclusive statements, we will eventually learn that on day $n$ we must assign probabilities that sum to at most 1 across the first $n$ elements of that sequence (since, again, it could otherwise be Dutch booked by a strategy that trades on these relationships)
- It is non-dogmatic: in the limit, it will not assign probabilities of 0 or 1 to statements that have not been proven false or true, respectively
(I don't understand the claim that a system that believes a false statement with probability 1, on only a finite number of days, will lose only a finite amount of money? It seems like it should be willing to bet at unbounded odds. I guess the setup is that we restrict to a finite bet at each timestep, so the only way to lose infinite money is to be wrong infinitely often?)
Note that a logical inductor, by this definition, does not itself have to be efficiently computable. And indeed, the construction for Garrabrant induction is not efficiently computable. It effectively does a brute-force search over market prices for all sentences.
Construction
Can we actually construct an algorithm that satisfies this logical induction criterion? It turns out the answer is yes.
The rough idea seems to be that we take a set of trading 'experts' (all efficiently computable trading strategies) and progressively downweight those that lose too much money.
The construction goes like this:
- First, for any given trading algorithm, we establish a routine that 'defeats' that algorithm (prevents it from making unbounded money in the limit)
- Then we show how to defeat an ensemble of trading algorithms.
- Then we show how to efficiently enumerate all possible efficient trading algorithms, and employ our construction so that by step $n$ we defeat the first $n$ such algorithms.
Since any efficient trading algorithm must appear at some position $k$ in this enumeration, it follows that it can make money only on the first $k$ days. Thus no efficient algorithm can make an unbounded amount of money.
The notion of a 'trading algorithm' is somewhat restricted. We break it down into:
- trading algorithm: a poly-time program that outputs 'trading policies'
- trading policy: a list of pairs of (sentence, trading formula)
- trading formula: a continuous function built from a set of six primitives. It takes as input the history of market prices: for each day, a list of (sentence, credence) pairs. The primitives are:
- buy a number of tokens equal to the credence of a sentence
- buy a constant number of tokens
- take the sum, product, or max of two trading formulas, or the reciprocal of one (see the sketch below)
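Here's a minimal sketch of how I understand this grammar (the encoding and names are mine; the paper's actual definition is more careful, e.g. it uses a 'safe' reciprocal and more general price features):

```python
from dataclasses import dataclass

# Market history: for each day, a dict mapping sentence -> credence in [0, 1].
History = list[dict[str, float]]

@dataclass
class Price:
    """Primitive: the credence of a sentence on a given day."""
    sentence: str
    day: int
    def eval(self, h: History) -> float:
        return h[self.day][self.sentence]

@dataclass
class Const:
    """Primitive: a constant."""
    value: float
    def eval(self, h: History) -> float:
        return self.value

@dataclass
class Sum:
    a: "Formula"
    b: "Formula"
    def eval(self, h: History) -> float:
        return self.a.eval(h) + self.b.eval(h)

@dataclass
class Product:
    a: "Formula"
    b: "Formula"
    def eval(self, h: History) -> float:
        return self.a.eval(h) * self.b.eval(h)

@dataclass
class Max:
    a: "Formula"
    b: "Formula"
    def eval(self, h: History) -> float:
        return max(self.a.eval(h), self.b.eval(h))

@dataclass
class Reciprocal:
    a: "Formula"  # the paper uses a "safe" version bounded away from 1/0
    def eval(self, h: History) -> float:
        return 1.0 / max(self.a.eval(h), 1e-9)

Formula = Price | Const | Sum | Product | Max | Reciprocal

# A trading policy is a list of (sentence, formula) pairs; the formula's value
# is the number of tokens to buy (negative = sell). For example, "buy
# (1 - current price) tokens of phi":
policy = [("phi", Sum(Const(1.0), Product(Const(-1.0), Price("phi", day=0))))]
```

Note that every primitive is continuous in the prices, so any formula built from them is too; continuity is what the fixed-point argument below relies on.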
(TODO: I'm a bit confused about how these formulas work, why it's necessary to specify them this way, and what if any generality we give up)
Any given trading algorithm will take a list of credences (market prices) and output a list of trade volumes. Our goal is to set the credences / prices so that the algorithm only ever buys at $1 or sells at $0 (and thus cannot make any money).
To do this, we consider a "price adjustment" map that increases the price of sentences that the algorithm buys, and decreases the price of sentences that it sells. The prices max out at 1 and bottom out at 0, so the prices of $n$ sentences form a point in the hypercube $[0,1]^n$. Since the trading algorithm is (by assumption) a continuous function of the prices, this map is a continuous map from the hypercube to itself, and so by Brouwer's fixed point theorem it has a fixed point. At the fixed point, we know that our adjustment map isn't changing the price of any sentence. This can only be because, for each sentence, one of the following is true:
- the trading volume is zero
- the price is already $1 (and the trader is buying)
- the price is already $0 (and the trader is selling)
Thus, if we set prices at this fixed point we will be guaranteed to not lose money to this trader.
There is unfortunately no efficient way to find this fixed point. Garrabrant induction therefore does the most inefficient possible thing: it enumerates all rational-valued price vectors until it finds one that is within some tolerance of the fixed-point condition. The tolerance decreases exponentially with the day, so the total profit a trader can extract from the approximation error is bounded.
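As a concrete (and similarly inefficient) sketch of that search, assuming a single trader given as a continuous function from prices to trade volumes (all names here are mine):

```python
import itertools

def is_equilibrium(prices, volumes, eps):
    """Approximate fixed-point condition: for every sentence, either the
    trade volume is ~zero, or the trader buys at price ~$1, or sells at ~$0."""
    return all(
        abs(v) <= eps                    # no trade
        or (v > 0 and p >= 1 - eps)      # buying, but only at ~$1
        or (v < 0 and p <= eps)          # selling, but only at ~$0
        for p, v in zip(prices, volumes)
    )

def find_prices(trader, n_sentences, eps):
    """Enumerate rational price vectors on an ever-finer grid until one
    approximately satisfies the fixed-point condition. Brouwer guarantees an
    exact fixed point exists, so (by continuity of the trader) a fine enough
    grid contains an approximate one."""
    denom = 1
    while True:
        grid = [i / denom for i in range(denom + 1)]
        for prices in itertools.product(grid, repeat=n_sentences):
            if is_equilibrium(prices, trader(list(prices)), eps):
                return list(prices)
        denom *= 2  # refine the grid and try again

# Example: a trader who thinks phi is worth 0.7 and buys in proportion to how
# underpriced it looks. The market front-runs it to its indifference point.
print(find_prices(lambda p: [0.7 - p[0]], n_sentences=1, eps=0.01))  # ~[0.7]
```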
One way to think about this is that the 'price adjustment' function is kind of an automated market maker. In real trading, a trader that buys will drive up prices until they no longer want to buy, and a trader that sells will drive down prices, until they no longer want to sell. The trick here is that we can effectively front-run the trader: using access to its source code, we simulate its behavior to predict the price at which it would stop trading, and then pre-emptively set the price at that level. Thus the trader doesn't actually make any money.
Now suppose we have multiple traders. We'll construct a firm of traders such that, by defeating the firm, we are guaranteed to defeat each individual trader.
First we apply a transformation to limit each trader to a finite budget. On each day, we scale each trader's trades by $\max\left(1, \frac{\text{amount we could lose today}}{\text{budget } b \text{ plus winnings so far}}\right)^{-1}$, so that trades are rescaled only if there is a risk of losing more than the total budget. This preserves the exploitation property: it can be shown that, for any trader that wins unbounded money, there is some budget $b$ such that they will continue to win unbounded money after this transformation. We might not know what this $b$ is, but we can enumerate all possible $b$'s, constructing an exponentially decaying sum of transformed traders, where each budget $b$ is given weight $2^{-b}$ so that the total losses are still bounded, but the $b$-transformed trader has fixed weight so they continue to win an unbounded amount.
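In symbols, my reconstruction of that transformation (with $T_n$ the trader's trades on day $n$):

```latex
\mathrm{Budget}_b(T)_n \;=\; T_n \cdot \max\!\left(1,\; \frac{\text{worst-case loss of } T_n}{b + \text{winnings so far}}\right)^{-1},
\qquad
T^{\mathrm{mix}} \;=\; \sum_{b=1}^{\infty} 2^{-b}\, \mathrm{Budget}_b(T).
```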
Now we enumerate the traders. On day $n$, we will use the first $n$ traders (with budgets bounded as above), with trader $k$ weighted as $2^{-k}$. Thus each strategy is incorporated with a fixed coefficient, so that, if any trader wins an unbounded amount of money, the combined strategy also wins an unbounded amount of money.
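So, roughly, the combined 'firm' on day $n$ is (in my notation, using the budget mixture above):

```latex
T^{\mathrm{firm}}_n \;=\; \sum_{k=1}^{n} 2^{-k}\, T^{\mathrm{mix},(k)}_n
% If any trader T^{(k)} wins unbounded money, its fixed 2^{-k} share forces
% the firm's winnings to be unbounded too.
```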
Now we directly apply our construction for defeating a single trader, developed above, to this combined trader. By defeating the combined trader, we establish that no trader in its ensemble wins an unbounded amount of money.
Can we think of this ensemble trader more procedurally? I want there to be a parliament of experts, so that experts are downweighted as they lose more money. The tricky thing is that an expert that eventually makes unbounded money might still lose an arbitrarily large (but finite) amount of money on the way to making it back. So we effectively make a bunch of 'copies' of each expert, each one of which gets dropped if it loses more than some budget $b$, and we weight these copies according to $2^{-b}$. An expert that has lost $m$ thus has surviving copies with budgets $b > m$, whose total weight is $\sum_{b > m} 2^{-b}$, which o3-mini claims is equal to $2^{-m}$ (a geometric series). This seems like not quite a multiplicative weights update, but it's similar in spirit.
It seems like it then makes sense to think of logical induction as:
- consider a set of trading algorithms.
- give each one a fixed weight, so that the total weight is finite.
- as algorithms lose money, downweight them exponentially.
- take the 'market equilibrium price' (the fixed point) at which this ensemble would be indifferent to trading (see the sketch below).
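A heavily simplified day-loop for this view (my own framing, reusing `find_prices` from the sketch above; the real construction uses the budget mixture rather than a literal exponential downweighting, though per the parliament picture above the effect is comparable):

```python
def logical_inductor_day(n, all_traders, losses, history, n_sentences):
    """One day of the ensemble view: weight traders, combine their demands,
    and price at the firm's indifference point. `losses[k]` is the money
    trader k has lost so far; `find_prices` is the search sketched earlier."""
    traders = all_traders[:n]  # the first n efficiently computable algorithms

    # Fixed 2^-(k+1) prior weight, decayed exponentially with money lost
    # (the 2^-m surviving-copies weight from the parliament picture).
    weights = [2.0 ** -(k + 1) * 2.0 ** -losses[k] for k in range(n)]

    def firm(prices):
        """Combined trade volumes of the weighted ensemble at these prices."""
        demands = [t(prices, history) for t in traders]
        return [sum(w * d[i] for w, d in zip(weights, demands))
                for i in range(n_sentences)]

    # Today's credences: the approximate equilibrium at which the firm is
    # indifferent (no trade, or buying at ~$1 / selling at ~$0).
    return find_prices(firm, n_sentences, eps=2.0 ** -n)
```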
The fixed point doesn't necessarily arise from the traders trading with each other: recall that it was defined even for a single trader!
Does it make sense to think about the fixed-point price as representing the (aggregate) credence of a trader? For a probabilistic agent, the point at which the agent is indifferent to trades is their probabilistic belief. But the 'trading strategies' here are not really probabilistic agents: they are not required to bet in any given situation, there may be many fixed points at which they are indifferent to trades, and they do not themselves need to be rationally consistent or unexploitable. They also incorporate an additional element of bankroll management that is not itself part of a probabilistic reasoner (to get a trading agent from a probabilistic reasoner, you'd need to add on some such strategy, e.g. betting according to the Kelly criterion). But in some loose sense I think it does make sense to think of each trader's fixed point as representing implicit probabilities.
The interesting thing is that when you combine the traders, you don't just aggregate their (implied) probabilities. You aggregate their bets. If you constructed a trader as (probabilistic belief + betting strategy), the betting strategy is only relevant away from the fixed point - the fixed point itself is just the probability. But an ensemble of traders is an ensemble of their betting strategies - an aggressive bettor will move the (implicit) ensemble probability much more than a timid one.
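A toy example of this: suppose traders A and B have linear demands around their beliefs $a$ and $b$, but A bets ten times as aggressively, buying volume $10(a - p)$ at price $p$ while B buys $(b - p)$. The ensemble's indifference point solves

```latex
10(a - p) + (b - p) = 0 \quad\Longrightarrow\quad p = \frac{10a + b}{11}
```

so the implied ensemble probability is a bet-size-weighted average of the two beliefs, not a plain average.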
Solving for the market equilibrium in a logical inductor is complicated. Is there a tractable way to do this?
We could think of the market maker as a 'dumb' bettor that will always do trades at a current 'market price', but increases or decreases its price according to each trade it does. Because it's dumb money, it will probably lose money in expectation against any reasonable trading strategy. But all the money it loses is "fake" - the market making happens "in simulation" using the source code of the trading strategy, in order to ultimately establish an equilibrium price.
I wonder if we can just run an actual market, with simulated traders, including the market maker?