Introduced by Geoff Hinton (1999): Products of Experts . Each expert produces a probability distribution. These are combined by…
The idea of 'projection' in psychology means to assume that someone else has the same flaws, or foibles, or motivations as you do. It struck…
So the mechanism is if you have tokens you can choose to stake them. And in order to run a network node you must stake some number of tokens…
The policy gradient theorem says that For simplicity we'll assume a fixed initial state and fixed-length finite trajectories, but the…
References: Tegmark and Omohundro, Provably safe systems: the only path to controllable AGI (2023). https://arxiv.org/abs/2309.01933 they…
Proximal methods in optimization The proximal operator of a [ convex ] function $f$ is defined as the minimizer of $f$ plus a distance penalty…
references: paper: https://arxiv.org/abs/1707.06347 great blog post on implementation details: https://iclr-blog-track.github.io/2022/0…
[ 5-MeO-DMT ] [ mescaline ] [ psilocybin ]
Toilet: A bidet. Cold water, warm-water (if a hose from your toilet can reach your sink's plumbing), or internally heated. It saves toilet…
It's tempting to use [ natural gradient ] ascent to optimize a variational distribution. We could also consider using it to optimize the…
A portfolio containing a long (European) call and short (European) put [ option ] with the same strike price and expiry date is equivalent…
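A minimal sketch of the payoff side of this claim, assuming the excerpt refers to the standard put-call parity identity (the sentence above is cut off, so this is an illustration rather than the note's own derivation). The function names are hypothetical. At expiry, a long call plus a short put at the same strike pays the final price minus the strike, whatever the final price turns out to be:

```python
def call_payoff(spot: float, strike: float) -> float:
    """Payoff at expiry of a long European call."""
    return max(spot - strike, 0.0)


def put_payoff(spot: float, strike: float) -> float:
    """Payoff at expiry of a long European put."""
    return max(strike - spot, 0.0)


def long_call_short_put(spot: float, strike: float) -> float:
    """Payoff at expiry of holding the call and having written the put."""
    return call_payoff(spot, strike) - put_payoff(spot, strike)


if __name__ == "__main__":
    strike = 100.0
    for spot in (0.0, 50.0, 100.0, 150.0):
        # The combined payoff is linear in the final price: spot - strike.
        print(spot, long_call_short_put(spot, strike), spot - strike)
```

This checks only the payoff at expiry; it says nothing about discounting or the price of the position today.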
A five-sided carbon ring with one nitrogen: C4H4NH.
General procedure for setting up a new Python project. Create a new git repo and clone into a directory my_new_project Add files…
"Any one who considers arithmetical methods of producing random digits is, of course, in a state of sin." - John von Neumann Young man, in…
Formally, a random variable is a (measurable) function defined on outcomes from a [ probability space ] . That is, in any possible…
a powerful tool for establishing [ causality ]
The rate equation or master equation for a continuous-time Markov [ stochastic process ] describes how the probability density of the…
From a [ utilitarian ] perspective, all of morality follows from improving global utility, and it follows that it'd be better to do this…
In no particular order. Items may move to [ previously read ] if I read them or former reading inbox if I decide I'm not currently…
One model you could have of reading a book is that the book contains information, and once you've read it, you now possess that information…
Why do I want to write more? Because: writing forces thoughts to crystallize. It forces me to draw conclusions about what I believe and who…
[ Nielsen's notes on ASI xrisk ] introduced the thought experiment: If you ask an all-knowing oracle a question like "Can you give me a…
See also [ family recipes ]. Roast chicken and vegetables: preheat oven to ~400. cover a spatchcocked chicken with salted garlic butter at…
The best way to recruit people is to convince them that they will learn and grow by working with your team. Pitches that have 'worked' for…
thoughts on reinforced self-training paper: https://arxiv.org/abs/2308.08998 the basic idea is very simple. we sample additional…
Note : see [ reinforcement learning notation ] for a guide to the notation I'm attempting to use through my RL notes. Three paradigmatic…
https://andyljones.com/posts/rl-debugging.html https://www.reddit.com/r/reinforcementlearning/comments/9sh77q/what_are_your_best_tips_for…
see: [ steering language models ], [ direct preference optimization ] We are given a bunch of pairwise preference evaluations, of the form…
There tends to be a lot going on in RL algorithms, with a whole mess of different quantities defined across timesteps. It's useful to try to…
[ relationship advice ]
see also (maybe combine with?) [ relationship ] Accept [ bids ] as much as possible. Praise your partner in public (and in private). Stay in…
Suppose we want a [ transformer ] to evaluate the inequality returning if and otherwise. For integer , this can be done with a…
The selection operation y = where(c, a, b) returns How can a [ transformer ] layer implement this operation? One approach is to use…
When I was younger---in college or in grad school---I was sometimes conflicted about whether I should prioritize trying to get to correct…
If a model with data has normalizing constant , then the replica trick says that This allows us to analyze the average log-normalizer…
In modern ML, representation learning is the art of trying to find useful abstractions, embodied as encoding networks. We can learn…
To be a successful researcher it's incredibly important to find and join your [ research community ]. Go to conferences (especially to small…
This note lists some ideas and directions for research I'm interested in or excited about. Some are more fleshed out than others, some more…
(see also: [ impact ]) I've been feeling depressed partly because the actual PhD research I did was (in my view) pointless, and more broadly…
People who do research have a very ground-level, zoomed-in view of their field. They know where the current obstacles are, how incredibly…
Reservoir samplers solve the following task: sample $k$ items without replacement from a stream of unknown length $N$. Because the length is…
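A minimal sketch of one classic reservoir sampler (often called Algorithm R), assuming a sample size `k` and any Python iterable as the stream. The function name and the `rng` parameter are illustrative and not taken from the note. The first `k` items fill the reservoir; after that, the item at zero-based index `i` replaces a uniformly chosen slot with probability `k / (i + 1)`:

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return up to k items drawn uniformly without replacement from stream."""
    if rng is None:
        rng = random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick j uniformly from 0..i inclusive; keep the item only if j
            # lands inside the reservoir, which happens with probability k/(i+1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # The stream is consumed once and its length is never needed up front.
    print(reservoir_sample(iter(range(1000)), 5))
```

If the stream has fewer than `k` items, the sketch simply returns all of them in order.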
Teachers or centers I'd be interested to do a retreat with/at: Tucker Peck Michael Taft Tina Rasmussen (Cloud Mountain 13-day retreats…
References: The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A" https://arxiv.org/abs/2309.12288 Studying Large Language…
References: Ludwig Winkler's post on Reverse time stochastic differential equations . Suppose we have a [ stochastic differential equation…
stray thoughts about reward functions (probably related to the [ agent ] abstraction and the [ intentional stance ]) one can make a…
When thinking about the [ reward ] function for a real-world AI system, there is always some causal process that determines reward. For…
Silver, Singh, Precup, and Sutton argue that Reward is enough : maximizing a reward signal implies, on its own, a very broad range of…
Suppose we have a [ Markov decision process ] in which we get reward only at the very end of a long trajectory. Until that point, we have no…
See also: [ cooperative inverse reinforcement learning ], [ love is value alignment ]
four [ right effort ]s: Restraint : avoid unwholesome situations that might give rise to or trigger unwholesome states and patterns. For…
References: Liu, Zaharia, Abbeel. Ring Attention with Blockwise Transformers for Near-Infinite Context (2023). https://arxiv.org/abs/231…
Things that might be useful to log in a [ reinforcement learning ] algorithm: Return of each trajectory. (summarize as mean/std/min/max…
Implement MuZero or something similar. What are the 'state of the art' RL algorithms? What is known and not known about [ value alignment ]?
Suppose we want to maximize reward, but we only get a couple bits of reward data every few hundreds/thousands of actions, whereas we get…
Deriving here just for my own edification. At each timestep a rocket ejects mass at velocity relative to its current reference frame. At…
About 5% of people are gay, so in any given community it's about twenty times harder for a gay person to find a partner than for a straight…
SuccessfulFriend highlighted this distinction which I should really read more about. At a high level it's about the distinction between…
Language is a really natural way to tell AI systems what we want them to do. Some current examples: [ GPT ]-3 and successors (InstructGPT…
In chemistry, a salt is a neutral-ish (not too acidic nor basic) compound held together by an [ ionic bond ]. Salts can be formed by [ acid…
Scheduled sampling is a training procedure for sequence models that attempts to mitigate [ exposure bias ] - the problem in which generation…
The score function is the gradient of a log-density with respect to its parameters: It is the direction that we would move the parameters…
Aapo Hyvärinen: Estimation of Non-Normalized Statistical Models by Score Matching (2005) https://jmlr.org/papers/volume6/hyvarinen05a…
[ unearned confidence ] [ agency and confidence ]
https://forum.effectivealtruism.org/posts/QhPyQTXuGt58Nzxnu/you-are-probably-underestimating-how-good-self-love-can-be
Sometimes it's necessary and right to prioritize my own interests, even if [ global utility ] is ultimately the only metric. Developing…
I need to genuinely care about other people and want the best for them, both in general, and for specific people in my life. Why? Obviously…
Traditional Buddhism describes six "sense bases" or gates: the eye, ear, nose, tongue, body, and mind. Western science usually omits the…
When we talk about "the self", or having a "sense of self", what do we mean? There is an interpretation in terms of [ consciousness ] - that…
Take the statement 'human-level AI is possible'. As a kid, I saw this as obviously true. We can simulate physics, and brains are physical…
There are 14 kinds of serotonin receptors; most (but not all) are [ G protein ]-coupled. The central nervous system has almost all of them…
Shadow work means, roughly speaking, a practice of noticing, loving, and integrating the parts of yourself that you've repressed (your…
Shard theory's basic ontology of RL holds that shards are contextually activated, behavior-steering computations in neural networks…
References: https://generative.ink/posts/simulators/ It seems pretty clear that the intelligence emerging from [ language model ]s is not…
The performance of an investment can be modeled as where the 'market return' is that of some sufficiently broad index such as the S&P 50…
In Korea (and maybe also Japan?) it's common for young guys to bond through physical touch and affection: hugging, holding hands, sitting in…
It's weird that we lie down every day to cease our consciousness, and sometimes to hallucinate. There are physiological benefits to sleep…
It's not a terrible summation of [ depression ] that it starts from seeing no way to achieve your goals. Sometimes that's because it's…
I've been really unhappy about how TFP is developed. It's felt pedantic. I waste a lot of effort thinking about things I don't want to think…
This is the advice I wish I'd had. It's catered to my preferences; caveat emptor. Packing: (for men) bring one pair of long pants, for…
paper 1: http://redwood.berkeley.edu/bruno/papers/VR.pdf basic idea: find a basis such that any given image (or whatever signal) can be…
References: https://redwood.berkeley.edu/wp-content/uploads/2020/08/KanervaP_SDMrelated_models1993.pdf A sparse distributed memory consists…
References: Jacobs, Jordan, Nowlan, Hinton. Adaptive Mixtures of Local Experts (1991) Shazeer et al. Outrageously Large Neural Networks…
I used to be able to say 'superintelligent AI is possible'. Now in industry the notion of 'possible' is 'something I can myself do': by…
What would true wireheading feel like? People have this impression that it'd be thin, exhausting, artificial, ultimately isolating and not…
A stablecoin is a dollar-denominated liability registered on a [ blockchain ]. It can be backed by USD reserves, as Tether allegedly is…
Julian Shapiro recommends keeping a set of six 'Starting principles' that you use to make decisions. That's about all that you can…
A common pattern in [ reinforcement learning ] pedagogy is to develop some idea first in the context of estimating state values , and then…
A [ stochastic process ] is (strictly) stationary if all of its joint distributions are invariant under time displacement. It is wide…
I've always been an evening person more than a [ morning person ]. I often stay up until 1 or 2am, and in the absence of hard constraints it…
Getting language models to align their output with human preferences would be highly useful for [ computational life coach ]ing. What's the…
SDEs are typically written in terms of the differential of a Wiener process (Brownian motion), e.g., Although Wiener processes are nowhere…
A stochastic process is a collection of [ random variable ]s defined on a common [ probability space ] . Equivalently, it is a joint…
Pasting a quote from Adam Smith by way of HN (source http://www.econlib.org/library/Smith/smMS7.html , I should read the whole thing…) that…
A stopping time for a stochastic process is a time-valued That is, integer-valued for discrete-time processes and real-valued for…
A lot of confused discussion around large organizations comes from conflating individual motivations with larger-scale 'structural…
In kindergarten stats, you learn how to build a model that takes in data (a feature vector, image, sound file, etc) and predicts a single…
Revolutionary ideas must live in the blind spots of the current intellectual conversation; otherwise people would already be using them…
Note naming The general goal is to minimize the use of aliasing in links. In cases where these guidelines suggest an unnatural or uncommon…
substantive questions I've had these are things I've wondered about that were never answered properly in the classes in which I learned them…
Substituted tryptamines - PsychonautWiki Tryptamine consists of an [ indole ] moiety plus a two-carbon (ethyl) chain with an amine group. We…
A sugar is any molecule with the empirical formula C(N)H(2N)O(N). These are like alkanes, which are C(N)H(2N + 2), except that each carbon…
A $d$-dimensional vector can represent $d$ distinct orthogonal features, but due to the weirdness of [ high-dimension ]al geometry, it can…
What have I learned in 2.5 years at Google? What did I not realize? The model of research. How low the expectations are. How fake it felt to…
Refs: https://opentheory.net/Qualia_Formalism_and_a_Symmetry_Theory_of_Valence.pdf
My Supernote A5X syncs through Dropbox, but unfortunately Dropbox doesn't support Windows ARM64 machines like the Surface Pro X. Here's my…
Buddhist (Pali) term referring to craving, longing, desire for the world to be other than as it is. This includes craving good things and…
There's a famous quote attributed to Eleanor Roosevelt: "great minds discuss ideas, average ones events, mediocre ones discuss people". This…
A set of methods for maintaining an " attitude of spacious passion ". The particular methods are contingent; if you could maintain the…
A general issue with [ temporal difference ] learning methods, which 'update a guess towards a guess', is that they can end up 'chasing…
Something that confused me for a while is that people in certain communities talk about 'teacher forcing' as though it's a trick or a…
Dave's principles of effective teaching. Motivation is by far the most important thing. A student who wants to learn will learn even with a…
As a researcher, I wonder if there's a 'critical point' of growing an idea when it's important to be [ teaching ] it, whether formally or…
working with Sinclair, Klein, Abbeel, they’ve all got great experience and advice especially for large classes You don’t have to give the…
Rob wants to firm up his foundations. He wants to understand relevant stats, probabilistic models, inference, and maybe work our way up to…
Epistemic status: either this is true or TV is maybe one of the greatest contributions to human utility ever. Unclear. The average American…
From David Silver's slides : TD-learning 'updates a guess towards a guess'. Sutton and Barto define the temporal difference error as the…
Class 1 forehand grip racquet in right hand, as if picking it up from the ground. hand at bottom of handle. Grip the flat side of the…
This page (first brainstormed in an Otter note) is for issues where I feel pulled in several directions. Different principles seem to yield…
Everyone in machine learning talks about tensors, but no one really understands what they are. This page collects several definitions and…
The tensor product of two vector spaces (defined on the same scalar field, we'll assume ) is the vector space of formal sums of…
This post gives a nice, mathematically clear development of basic terms in statistical mechanics. Highlights: Think of a physical system as…
thai holy basil / hot basil thai basil turmeric rice noodles: thin (pad thai) or wide (pad see ew / pad kee mao) oyster sauce, fish sauce…
Status: in conflict with [ negative utility ] ? See also: the [ hedonic treadmill ]. Evolutionarily, 'pain' exists to motivate you to get out…
I used to think that there was a 'best' way to motivate an area. For example, in VI, the ELBO is derived from the KL divergence between a…
[ Tucker Peck ] often mentions this as a thing Sharon Salzberg would say. What does it mean? I don't know - I should ask Tucker to clarify…
It's almost never worth worrying about whether an individual action is the right thing to do. It's like trying to dance while worrying at…
In order for a group of people, like an academic field, or a political elite, to meaningfully converse about a complex topic, they have to…
Metaphor connected to the observation that [ all models are wrong ]. Borges, On Exactitude in Science : ...In that Empire, the Art of…
A point made by [ Michael Taft ] in various talks, e.g. The World is Inside You (also the '[ emptiness ] of perception' described by [ Dan…
Comparing myself to SuccessfulFriend, I might be tempted to think that because he is interested in antitrust law, zoning reform, political…
Andrew Gelman believes that in certain areas of research , like the social sciences, everything is connected. "I’m not expressing…
In every field, there is a store of 'standard' advice that is handed down from mentors to ambitious youngsters. In computer science grad…
The Feynmannian/Sagan/Tyson "scientific" view is that the [ purpose ] of life is understanding : the world is a giant mystery, with layers…
It exists, but is [ empty ], insubstantial, a [ fabrication ]. Foregrounding this view is an important part of [ awakening ] or…
[ things are deeply wrong ]
If I'm managing someone, I want them to be coming up with their own ideas and owning them. Owning their ideas means they will themselves…
tl;dr : the ideas we need to build intelligent systems may be different from those we need to understand them. Both are important, but…
Several ideas here: When I try to tell a story about what I'd like to change about my life, at a high level, I can come at it from different…
For several reasons: multiple object-level causes a telescoping tower of causes at increasing levels of generality or abstraction 'because…
If two statements that both seem true conflict with each other, then it seems like you have a paradox. But the world itself is just as it is…
Pointed out in this tweet: https://twitter.com/AmandaAskell/status/1311776280128479238 but also in many other places over the years…
AI is going to work. Obviously lots of people believe this. But most 'AI' companies and 'AI' investors are hyping applications of current…
No matter what other priorities or any incredibly important goals arise in my life, whether through work, family, or other circumstances…
Write Write regularly: under routine circumstances, at least a few minutes per day. This could be filling in nodes of this graph, blogging…
See also: [ the system is bad ] I find it hard to be okay with a 'normal' life, because that would imply some level of acceptance of the…
These might not be the best thing to do at any point, but they're better than doing nothing. And doing them can create a sense of progress…
See also [ writing inbox ] See also [ ongoing projects ] See, first and foremost, the backlinks below. Crypto trading model. Write a system…
This is one of those things that sounds cliche but is still profound and #fundamental: this is all. There's no great reward in the future…
I want kids, eventually. I want to be able to talk with them, to build a relationship, to see the world through someone else's eyes. I want…
The [ agent ] model of intelligence imposes a sharp distinction between the agent and its environment, where the agent 'chooses' actions…
let's say the signal we see after the intervention is modeled as the combination of the counterfactual forecast and an intervention effect…
[ impermanence ] [ dukkha ] (unsatisfactoriness) [ no-self|annita ] (no-self) Daniel Ingram's summary: things "come and go, don't satisfy…
(fellow student) Smitha has these post-its on her desk: what are you doing? why is it important? are you making progress? I think these…
This is a personal mental image for [ emptiness ] that has been really resonant for me, arising from an experience taking [ MDMA ] with a…
It helps a lot to write down the things you think someone should know about working in a new environment. Even if a new person would figure…
I feel an obligation to try to do big things with my life, because I've had access to rare opportunities. If ten thousand randomly selected…
Television: Arcane ken burns on the vietnam war WandaVision For All Mankind Severance: https://m.imdb.com/title/tt11280740/ Borgen Diplomat…
How should a machine learning model represent text? Word-level and character-level features are obvious options, but both have drawbacks…
Sometimes mentioned as a potential approach to [ AI safety ]. Gwern: Why Tool AIs want to be Agent AIs (roughly: because treating…
Notes on Toolformer: Language Models Can Teach Themselves to Use Tools The basic method is: "Given just a handful of human-written examples…