The standard [Markov decision process] formalism includes a reward function; the total (discounted) reward across a trajectory is its…
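As a rough illustration of the total discounted reward mentioned above, here is a minimal sketch. It assumes the per-step rewards of a trajectory are already available as a list and that a discount factor `gamma` in [0, 1] is used; the names `discounted_return`, `rewards`, and `gamma` are illustrative and do not come from the original text.

```python
def discounted_return(rewards, gamma=0.99):
    """Return sum over t of gamma**t * rewards[t] for one trajectory.

    `rewards` and `gamma` are hypothetical names chosen for this sketch.
    """
    total = 0.0
    # Walk the trajectory backwards so each step is a single multiply-add:
    # G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Example trajectory with three steps and a discount factor of 0.5:
# 1.0 + 0.5 * 0.0 + 0.25 * 2.0
example = discounted_return([1.0, 0.0, 2.0], gamma=0.5)
print(example)
```

Setting `gamma=1.0` gives the plain undiscounted sum, which matches the "(discounted)" being optional in the description.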
Different experimental conditions may give rise to different outcomes. For example, let the variable indicate whether a person is…
A nice paper that gets at some subtleties of calibration: Daniel D. Johnson, Daniel Tarlow, David Duvenaud, Chris J. Maddison. Experts Don't…