So the mechanism is: if you have tokens, you can choose to stake them. And in order to run a network node, you must stake some number of tokens…
Modified: November 13, 2021.
The policy gradient theorem says that For simplicity we'll assume a fixed initial state and fixed-length finite trajectories, but the…
Modified: April 02, 2022.
References: Tegmark and Omohundro, Provably safe systems: the only path to controllable AGI (2023). https://arxiv.org/abs/2309.01933 they…
Modified: September 06, 2023.
Proximal methods in optimization The proximal operator of a [ convex ] function f is defined as the minimizer of f plus a distance penalty…
Modified: July 07, 2022.
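As a concrete instance of the proximal operator described in the note above, here is a minimal Python sketch for the absolute-value function f(x) = |x|, whose proximal operator has the well-known closed form of soft-thresholding. The scaling convention (a quadratic penalty weighted by 1/(2*lam)) and the names `soft_threshold`, `prox_objective`, and `prox_by_grid_search` are assumptions made for illustration; the note itself may use a different convention.

```python
def soft_threshold(v, lam):
    """Closed-form prox of f(x) = |x| with step size lam (assumed convention)."""
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0


def prox_objective(x, v, lam):
    """f(x) = |x| plus the quadratic distance penalty to the point v."""
    return abs(x) + (x - v) ** 2 / (2.0 * lam)


def prox_by_grid_search(v, lam, lo=-5.0, hi=5.0, steps=10001):
    """Brute-force minimizer over a uniform grid, used only as a sanity check."""
    best_x, best_val = lo, prox_objective(lo, v, lam)
    for i in range(1, steps):
        x = lo + (hi - lo) * i / (steps - 1)
        val = prox_objective(x, v, lam)
        if val < best_val:
            best_x, best_val = x, val
    return best_x
```

The grid search is deliberately naive: it only checks that the closed form really is the minimizer of "f plus a distance penalty".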
references: paper: https://arxiv.org/abs/1707.06347 great blog post on implementation details: https://iclr-blog-track.github.io/2022/0…
Modified: July 21, 2022.
Modified: .
[ 5-MeO-DMT ] [ mescaline ] [ psilocybin ]
Modified: October 04, 2021.
Toilet: A bidet. Cold water, warm-water (if a hose from your toilet can reach your sink's plumbing), or internally heated. It saves toilet…
Modified: September 25, 2021.
Modified: June 27, 2021.
It's tempting to use [ natural gradient ] ascent to optimize a variational distribution. We could also consider using it to optimize the…
Modified: October 25, 2020.
A portfolio containing a long (European) call and short (European) put [ option ] with the same strike price and expiry date is equivalent…
Modified: October 26, 2021.
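The excerpt above is cut off before it names what the call-plus-short-put portfolio is equivalent to. Assuming it goes on to state standard put-call parity, the following sketch checks the payoff side of that relation: at expiry, a long call minus a short put with the same strike pays the underlying price minus the strike, which is the payoff of a forward struck at the same price. Function names are hypothetical.

```python
def call_payoff(spot, strike):
    """Payoff at expiry of a long European call."""
    return max(spot - strike, 0.0)


def put_payoff(spot, strike):
    """Payoff at expiry of a long European put."""
    return max(strike - spot, 0.0)


def long_call_short_put_payoff(spot, strike):
    """Payoff at expiry of the portfolio: long one call, short one put."""
    return call_payoff(spot, strike) - put_payoff(spot, strike)


def forward_payoff(spot, strike):
    """Payoff at expiry of a long forward with delivery price `strike`."""
    return spot - strike
```

This only covers payoffs at expiry; relating prices before expiry additionally requires discounting the strike, which is not shown here.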
A five-sided carbon ring with one nitrogen: C4H4NH.
Modified: May 14, 2021.
General procedure for setting up a new Python project. Create a new git repo and clone into a directory my_new_project Add files…
Modified: April 09, 2022.
Modified: .
Modified: .
"Any one who considers arithmetical methods of producing random digits is, of course, in a state of sin." - John von Neumann Young man, in…
Modified: February 11, 2022.
Formally, a random variable is a (measurable) function defined on outcomes from a [ probability space ]. That is, in any possible…
Modified: August 27, 2022.
a powerful tool for establishing [ causality ]
Modified: February 07, 2022.
The rate equation or master equation for a continuous-time Markov [ stochastic process ] describes how the probability density of the…
Modified: August 28, 2022.
From a [ utilitarian ] perspective, all of morality follows from improving global utility, and it follows that it'd be better to do this…
Modified: June 07, 2021.
In no particular order. Items may move to [ previously read ] if I read them or former reading inbox if I decide I'm not currently…
Modified: August 28, 2023.
One model you could have of reading a book is that the book contains information, and once you've read it, you now possess that information…
Modified: February 23, 2020.
Modified: February 18, 2020.
Why do I want to write more? Because: writing forces thoughts to crystallize. It forces me to draw conclusions about what I believe and who…
Modified: May 16, 2022.
[ Nielsen's notes on ASI xrisk ] introduced the thought experiment: If you ask an all-knowing oracle a question like "Can you give me a…
Modified: September 29, 2023.
See also [ family recipes ]. Roast chicken and vegetables: preheat oven to ~400. cover a spatchcocked chicken with salted garlic butter at…
Modified: March 03, 2022.
The best way to recruit people is to convince them that they will learn and grow by working with your team. Pitches that have 'worked' for…
Modified: July 19, 2021.
Modified: .
thoughts on reinforced self-training paper: https://arxiv.org/abs/2308.08998 the basic idea is very simple. we sample additional…
Modified: October 24, 2024.
Note: see [ reinforcement learning notation ] for a guide to the notation I'm attempting to use throughout my RL notes. Three paradigmatic…
Modified: April 23, 2022.
https://andyljones.com/posts/rl-debugging.html https://www.reddit.com/r/reinforcementlearning/comments/9sh77q/what_are_your_best_tips_for…
Modified: March 28, 2022.
see: [ steering language models ], [ direct preference optimization ] We are given a bunch of pairwise preference evaluations, of the form…
Modified: .
There tends to be a lot going on in RL algorithms, with a whole mess of different quantities defined across timesteps. It's useful to try to…
Modified: April 23, 2022.
[ relationship advice ]
Modified: February 10, 2022.
see also (maybe combine with?) [ relationship ] Accept [ bids ] as much as possible. Praise your partner in public (and in private). Stay in…
Modified: July 13, 2020.
Modified: December 01, 2022.
Suppose we want a [ transformer ] to evaluate the inequality returning if and otherwise. For integer , this can be done with a…
Modified: February 13, 2023.
The selection operation y = where(c, a, b) returns a where c is true and b otherwise. How can a [ transformer ] layer implement this operation? One approach is to use…
Modified: February 12, 2023.
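The note's transformer construction for where(c, a, b) is elided in the excerpt above, so the sketch below does not attempt to reproduce it. It only pins down the elementwise semantics of the operation, plus one arithmetic identity (a blend with a 0/1 mask) that is equivalent when the condition is encoded as 0 or 1. Both function names are illustrative.

```python
def where(c, a, b):
    """Elementwise selection: take a[i] where c[i] is truthy, else b[i]."""
    return [ai if ci else bi for ci, ai, bi in zip(c, a, b)]


def where_blend(c, a, b):
    """Same result via arithmetic, assuming each c[i] is exactly 0 or 1."""
    return [ci * ai + (1 - ci) * bi for ci, ai, bi in zip(c, a, b)]
```

The arithmetic form matters for intuition because it uses only multiplication and addition, but it requires multiplying two inputs together, which is not something a single linear map can do.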
When I was younger---in college or in grad school---I was sometimes conflicted about whether I should prioritize trying to get to correct…
Modified: February 11, 2022.
Modified: .
If a model with data has normalizing constant , then the replica trick says that This allows us to analyze the average log-normalizer…
Modified: October 22, 2022.
In modern ML, representation learning is the art of trying to find useful abstractions, embodied as encoding networks. We can learn…
Modified: February 11, 2022.
To be a successful researcher it's incredibly important to find and join your [ research community ]. Go to conferences (especially to small…
Modified: February 25, 2022.
This note lists some ideas and directions for research I'm interested in or excited about. Some are more fleshed out than others, some more…
Modified: February 21, 2023.
Modified: .
(see also: [ impact ]) I've been feeling depressed partly because the actual PhD research I did was (in my view) pointless, and more broadly…
Modified: February 11, 2022.
People who do research have a very ground-level, zoomed-in view of their field. They know where the current obstacles are, how incredibly…
Modified: January 16, 2021.
Reservoir samplers solve the following task: sample items without replacement from a stream of unknown length. Because the length is…
Modified: May 10, 2022.
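For the reservoir sampling task described above, a minimal sketch of the classic "Algorithm R" looks like the following. The sample size `k` and the function name are assumptions for illustration; the note may derive a different (for example, weighted or skip-based) variant.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Uniformly sample up to k items without replacement from an iterable.

    Only one pass is made and the stream length is never needed in advance.
    """
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (i + 1) by drawing a
            # slot index uniformly from 0..i (randint is inclusive).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir
```

If the stream has fewer than `k` items, the function simply returns all of them.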
Teachers or centers I'd be interested to do a retreat with/at: Tucker Peck Michael Taft Tina Rasmussen (Cloud Mountain 13-day retreats…
Modified: August 27, 2021.
References: The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A" https://arxiv.org/abs/2309.12288 Studying Large Language…
Modified: January 10, 2024.
References: Ludwig Winkler's post on Reverse time stochastic differential equations. Suppose we have a [ stochastic differential equation…
Modified: August 27, 2022.
stray thoughts about reward functions (probably related to the [ agent ] abstraction and the [ intentional stance ]) one can make a…
Modified: April 06, 2023.
When thinking about the [ reward ] function for a real-world AI system, there is always some causal process that determines reward. For…
Modified: April 12, 2023.
Silver, Singh, Precup, and Sutton argue that Reward is enough: maximizing a reward signal implies, on its own, a very broad range of…
Modified: March 02, 2022.
Suppose we have a [ Markov decision process ] in which we get reward only at the very end of a long trajectory. Until that point, we have no…
Modified: March 03, 2022.
See also: [ cooperative inverse reinforcement learning ], [ love is value alignment ]
Modified: June 12, 2021.
four [ right effort ]s: Restraint: avoid unwholesome situations that might give rise to or trigger unwholesome states and patterns. For…
Modified: October 03, 2024.
References: Liu, Zaharia, Abbeel. Ring Attention with Blockwise Transformers for Near-Infinite Context (2023). https://arxiv.org/abs/231…
Modified: February 19, 2024.
Things that might be useful to log in a [ reinforcement learning ] algorithm: Return of each trajectory. (summarize as mean/std/min/max…
Modified: April 11, 2022.
Implement MuZero or something similar. What are the 'state of the art' RL algorithms? What is known and not known about [ value alignment ]?
Modified: February 21, 2022.
Suppose we want to maximize reward, but we only get a couple bits of reward data every few hundreds/thousands of actions, whereas we get…
Modified: March 03, 2022.
Deriving here just for my own edification. At each timestep a rocket ejects mass at velocity relative to its current reference frame. At…
Modified: September 06, 2022.
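The rocket derivation above loses its symbols in this excerpt, so the following is a sketch under assumed notation: initial mass `m0`, final mass `m1`, and exhaust speed `u` relative to the rocket. It simulates the step-by-step mass ejection using momentum conservation and compares the result with the standard closed form, delta_v = u * ln(m0 / m1) (the Tsiolkovsky rocket equation).

```python
import math


def simulate_delta_v(m0, m1, u, steps=100_000):
    """Accumulate velocity by ejecting equal small masses, one per timestep."""
    dm = (m0 - m1) / steps
    m, v = m0, 0.0
    for _ in range(steps):
        # In the rocket's current rest frame, momentum conservation gives
        # (m - dm) * dv = dm * u for each ejected parcel of mass dm.
        v += u * dm / (m - dm)
        m -= dm
    return v


def tsiolkovsky_delta_v(m0, m1, u):
    """Continuous-limit closed form for the same process."""
    return u * math.log(m0 / m1)
```

As the number of steps grows, the discrete sum approaches the logarithm, which is the limit the derivation takes.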
About 5% of people are gay, so in any given community it's about twenty times harder for a gay person to find a partner than for a straight…
Modified: May 22, 2021.
Modified: February 14, 2021.
SuccessfulFriend highlighted this distinction which I should really read more about. At a high level it's about the distinction between…
Modified: July 13, 2020.
Language is a really natural way to tell AI systems what we want them to do. Some current examples: [ GPT ]-3 and successors (InstructGPT…
Modified: April 07, 2022.
In chemistry, a salt is a neutral-ish (not too acidic nor basic) compound held together by an [ ionic bond ]. Salts can be formed by [ acid…
Modified: January 22, 2022.
Modified: .
Modified: .
Scheduled sampling is a training procedure for sequence models that attempts to mitigate [ exposure bias ] - the problem in which generation…
Modified: October 13, 2022.
The score function is the gradient of a log-density with respect to its parameters: It is the direction that we would move the parameters…
Modified: July 21, 2022.
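To make the score function above concrete, here is a small sketch for a one-dimensional Gaussian, taking the mean `mu` as the only parameter. The analytic score with respect to `mu` is (x - mu) / sigma^2; the code checks this against a finite-difference gradient of the log-density. The choice of distribution and all names are assumptions for illustration.

```python
import math


def gaussian_log_density(x, mu, sigma):
    """Log-density of a normal distribution with mean mu and std sigma."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2.0 * sigma ** 2)


def score_wrt_mu(x, mu, sigma):
    """Analytic gradient of the log-density with respect to mu."""
    return (x - mu) / sigma ** 2


def finite_difference_score(x, mu, sigma, eps=1e-6):
    """Central-difference approximation to the same gradient."""
    upper = gaussian_log_density(x, mu + eps, sigma)
    lower = gaussian_log_density(x, mu - eps, sigma)
    return (upper - lower) / (2.0 * eps)
```

The sign of the score matches the intuition in the note: when x lies above mu, the score is positive, so increasing mu makes the observation more likely.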
Aapo Hyvärinen: Estimation of Non-Normalized Statistical Models by Score Matching (2005) https://jmlr.org/papers/volume6/hyvarinen05a…
Modified: .
Modified: February 22, 2022.
[ unearned confidence ] [ agency and confidence ]
Modified: February 07, 2022.
https://forum.effectivealtruism.org/posts/QhPyQTXuGt58Nzxnu/you-are-probably-underestimating-how-good-self-love-can-be https://twitter.com…
Modified: October 24, 2024.
Modified: .
Sometimes it's necessary and right to prioritize my own interests, even if [ global utility ] is ultimately the only metric. Developing…
Modified: March 07, 2022.
I need to genuinely care about other people and want the best for them, both in general, and for specific people in my life. Why? Obviously…
Modified: April 01, 2022.
Traditional Buddhism describes six "sense bases" or gates: the eye, ear, nose, tongue, body, and mind. Western science usually omits the…
Modified: .
When we talk about "the self", or having a "sense of self", what do we mean? There is an interpretation in terms of [ consciousness ] - that…
Modified: February 03, 2025.
Take the statement 'human-level AI is possible'. As a kid, I saw this as obviously true. We can simulate physics, and brains are physical…
Modified: January 23, 2022.
There are 14 kinds of serotonin receptors; most (but not all) are [ G protein ]-coupled. The central nervous system has almost all of them…
Modified: May 14, 2021.
Shadow work means, roughly speaking, a practice of noticing, loving, and integrating the parts of yourself that you've repressed (your…
Modified: November 07, 2023.
Shard theory's basic ontology of RL holds that shards are contextually activated, behavior-steering computations in neural networks…
Modified: .
Modified: .
Modified: .
References: https://generative.ink/posts/simulators/ It seems pretty clear that the intelligence emerging from [ language model ]s is not…
Modified: February 16, 2023.
The performance of an investment can be modeled as where the 'market return' is that of some sufficiently broad index such as the S&P 50…
Modified: November 30, 2022.
In Korea (and maybe also Japan?) it's common for young guys to bond through physical touch and affection: hugging, holding hands, sitting in…
Modified: February 06, 2023.
It's weird that we lie down every day to cease our consciousness, and sometimes to hallucinate. There are physiological benefits to sleep…
Modified: July 19, 2022.
It's not a terrible summation of [ depression ] that it starts from seeing no way to achieve your goals. Sometimes that's because it's…
Modified: May 16, 2022.
Modified: .
Modified: May 08, 2020.
I've been really unhappy about how TFP is developed. It's felt pedantic. I waste a lot of effort thinking about things I don't want to think…
Modified: April 10, 2021.
This is the advice I wish I'd had. It's catered to my preferences; caveat emptor. Packing: (for men) bring one pair of long pants, for…
Modified: May 13, 2022.
paper 1: http://redwood.berkeley.edu/bruno/papers/VR.pdf basic idea: find a basis such that any given image (or whatever signal) can be…
Modified: .
References: https://redwood.berkeley.edu/wp-content/uploads/2020/08/KanervaP_SDMrelated_models1993.pdf A sparse distributed memory consists…
Modified: March 29, 2024.
References: Jacobs, Jordan, Nowlan, Hinton. Adaptive Mixtures of Local Experts (1991) Shazeer et al. Outrageously Large Neural Networks…
Modified: February 13, 2023.
I used to be able to say 'superintelligent AI is possible'. Now in industry the notion of 'possible' is 'something I can myself do': by…
Modified: February 23, 2020.
What would true wireheading feel like? People have this impression that it'd be thin, exhausting, artificial, ultimately isolating and not…
Modified: March 24, 2023.
Modified: .
A stablecoin is a dollar-denominated liability registered on a [ blockchain ]. It can be backed by USD reserves, as Tether allegedly is…
Modified: March 13, 2022.
Julian Shapiro recommends keeping a set of six 'Starting principles' that you use to make decisions. That's about all that you can…
Modified: February 07, 2022.
A common pattern in [ reinforcement learning ] pedagogy is to develop some idea first in the context of estimating state values, and then…
Modified: March 29, 2022.
A [ stochastic process ] is (strictly) stationary if all of its joint distributions are invariant under time displacement. It is wide…
Modified: August 28, 2022.
I've always been an evening person more than a [ morning person ]. I often stay up until 1 or 2am, and in the absence of hard constraints it…
Modified: March 14, 2023.
Getting language models to align their output with human preferences would be highly useful for [ computational life coach ]ing. What's the…
Modified: July 18, 2021.
SDEs are typically written in terms of the differential of a Wiener process (Brownian motion), e.g., Although Wiener processes are nowhere…
Modified: August 29, 2022.
Modified: .
A stochastic process is a collection of [ random variable ]s defined on a common [ probability space ]. Equivalently, it is a joint…
Modified: August 27, 2022.
Pasting a quote from Adam Smith by way of HN (source http://www.econlib.org/library/Smith/smMS7.html , I should read the whole thing…) that…
Modified: February 10, 2022.
A stopping time for a stochastic process is a time-valued That is, integer-valued for discrete-time processes and real-valued for…
Modified: August 27, 2022.
Modified: .
Modified: March 14, 2022.
Modified: August 02, 2021.
A lot of confused discussion around large organizations comes from conflating individual motivations with larger-scale 'structural…
Modified: December 14, 2022.
In kindergarten stats, you learn how to build a model that takes in data (a feature vector, image, sound file, etc) and predicts a single…
Modified: March 03, 2022.
Revolutionary ideas must live in the blind spots of the current intellectual conversation; otherwise people would already be using them…
Modified: January 24, 2022.
Note naming The general goal is to minimize the use of aliasing in links. In cases where these guidelines suggest an unnatural or uncommon…
Modified: July 22, 2022.
substantive questions I've had these are things I've wondered about that were never answered properly in the classes in which I learned them…
Modified: February 17, 2022.
Substituted tryptamines - PsychonautWiki Tryptamine consists of an [ indole ] moiety plus a two-carbon (ethyl) chain with an amine group. We…
Modified: May 14, 2021.
Modified: February 25, 2022.
Modified: .
A sugar is any molecule with the empirical formula C(N)H(2N)O(N). These are like alkanes, which are C(N)H(2N + 2), except that each carbon…
Modified: February 10, 2022.
A d-dimensional vector can represent d distinct orthogonal features, but due to the weirdness of [ high-dimension ]al geometry, it can…
Modified: September 14, 2022.
What have I learned in 2.5 years at Google? What did I not realize? The model of research. How low the expectations are. How fake it felt to…
Modified: July 10, 2020.
Modified: October 03, 2021.
Refs: https://opentheory.net/Qualia_Formalism_and_a_Symmetry_Theory_of_Valence.pdf
Modified: August 06, 2023.
My Supernote A5X syncs through Dropbox, but unfortunately Dropbox doesn't support Windows ARM64 machines like the Surface Pro X. Here's my…
Modified: August 27, 2022.
Buddhist (Pali) term referring to craving, longing, desire for the world to be other than as it is. This includes craving good things and…
Modified: September 03, 2022.
There's a famous quote attributed to Eleanor Roosevelt: "great minds discuss ideas, average ones events, mediocre ones discuss people". This…
Modified: February 10, 2022.
A set of methods for maintaining an "attitude of spacious passion". The particular methods are contingent; if you could maintain the…
Modified: March 23, 2022.
A general issue with [ temporal difference ] learning methods, which 'update a guess towards a guess', is that they can end up 'chasing…
Modified: April 23, 2022.
Something that confused me for a while is that people in certain communities talk about 'teacher forcing' as though it's a trick or a…
Modified: October 13, 2022.
Dave's principles of effective teaching. Motivation is by far the most important thing. A student who wants to learn will learn even with a…
Modified: April 11, 2020.
As a researcher, I wonder if there's a 'critical point' of growing an idea when it's important to be [ teaching ] it, whether formally or…
Modified: April 05, 2020.
working with Sinclair, Klein, Abbeel, they’ve all got great experience and advice especially for large classes You don’t have to give the…
Modified: February 07, 2022.
Rob wants to firm up his foundations. He wants to understand relevant stats, probabilistic models, inference, and maybe work our way up to…
Modified: January 25, 2022.
Epistemic status: either this is true or TV is maybe one of the greatest contributions to human utility ever. Unclear. The average American…
Modified: January 24, 2022.
From David Silver's slides: TD-learning 'updates a guess towards a guess'. Sutton and Barto define the temporal difference error as the…
Modified: April 04, 2022.
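The definition of the temporal difference error is truncated in the excerpt above. Assuming the note follows the standard one-step form from Sutton and Barto, delta = r + gamma * V(s') - V(s), the tabular TD(0) update can be sketched as below. The dictionary-based value table, the default step size and discount, and the function names are all hypothetical.

```python
def td_error(reward, value_s, value_next, gamma, terminal=False):
    """One-step TD error; the bootstrap term is dropped at terminal states."""
    bootstrap = 0.0 if terminal else gamma * value_next
    return reward + bootstrap - value_s


def td0_update(values, s, reward, s_next, alpha=0.1, gamma=0.99, terminal=False):
    """Move V(s) a fraction alpha toward the bootstrapped target, in place."""
    v_s = values.get(s, 0.0)
    v_next = values.get(s_next, 0.0)
    delta = td_error(reward, v_s, v_next, gamma, terminal)
    values[s] = v_s + alpha * delta
    return delta
```

The target r + gamma * V(s') is itself an estimate, which is the sense in which the update moves "a guess towards a guess".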
Modified: .
Class 1 forehand grip racquet in right hand, as if picking it up from the ground. hand at bottom of handle. Grip the flat side of the…
Modified: .
This page (first brainstormed in an Otter note) is for issues where I feel pulled in several directions. Different principles seem to yield…
Modified: February 10, 2022.
Everyone in machine learning talks about tensors, but no one really understands what they are. This page collects several definitions and…
Modified: July 18, 2022.
The tensor product of two vector spaces (defined on the same scalar field, we'll assume ) is the vector space of formal sums of…
Modified: July 18, 2022.
This post gives a nice, mathematically clear development of basic terms in statistical mechanics. Highlights: Think of a physical system as…
Modified: April 15, 2022.
Modified: .
thai holy basil / hot basil thai basil turmeric rice noodles: thin (pad thai) or wide (pad see ew / pad kee mao) oyster sauce, fish sauce…
Modified: May 16, 2022.
Modified: February 19, 2020.
Status: in conflict with [ negative utility ]? See also: the [ hedonic treadmill ]. Evolutionarily, 'pain' exists to motivate you to get out…
Modified: March 20, 2022.
I used to think that there was a 'best' way to motivate an area. For example, in VI, the ELBO is derived from the KL divergence between a…
Modified: January 23, 2022.
[ Tucker Peck ] often mentions this as a thing Sharon Salzberg would say. What does it mean? I don't know - I should ask Tucker to clarify…
Modified: January 25, 2024.
It's almost never worth worrying about whether an individual action is the right thing to do. It's like trying to dance while worrying at…
Modified: May 16, 2022.
In order for a group of people, like an academic field, or a political elite, to meaningfully converse about a complex topic, they have to…
Modified: February 25, 2022.
Metaphor connected to the observation that [ all models are wrong ]. Borges, On Exactitude in Science : ...In that Empire, the Art of…
Modified: .
A point made by [ Michael Taft ] in various talks, e.g. The World is Inside You (also the '[ emptiness ] of perception' described by [ Dan…
Modified: February 03, 2025.
Comparing myself to SuccessfulFriend, I might be tempted to think that because he is interested in antitrust law, zoning reform, political…
Modified: August 15, 2020.
Andrew Gelman believes that in certain areas of research , like the social sciences, everything is connected. "I’m not expressing…
Modified: June 08, 2021.
In every field, there is a store of 'standard' advice that is handed down from mentors to ambitious youngsters. In computer science grad…
Modified: March 06, 2020.
The Feynmanian/Sagan/Tyson "scientific" view is that the [ purpose ] of life is understanding: the world is a giant mystery, with layers…
Modified: February 25, 2022.
It exists, but is [ empty ], insubstantial, a [ fabrication ]. Foregrounding this view is an important part of [ awakening ] or…
Modified: March 23, 2023.
[ things are deeply wrong ]
Modified: August 27, 2021.
If I'm managing someone, I want them to be coming up with their own ideas and owning them. Owning their ideas means they will themselves…
Modified: July 10, 2020.
tl;dr: the ideas we need to build intelligent systems may be different from those we need to understand them. Both are important, but…
Modified: February 26, 2022.
Several ideas here: When I try to tell a story about what I'd like to change about my life, at a high level, I can come at it from different…
Modified: May 07, 2020.
Modified: February 10, 2022.
For several reasons: multiple object-level causes a telescoping tower of causes at increasing levels of generality or abstraction 'because…
Modified: February 07, 2022.
If two statements that both seem true conflict with each other, then it seems like you have a paradox. But the world itself is just as it is…
Modified: January 23, 2022.
Modified: July 13, 2020.
Pointed out in this tweet: https://twitter.com/AmandaAskell/status/1311776280128479238 but also in many other places over the years…
Modified: February 10, 2022.
AI is going to work. Obviously lots of people believe this. But most 'AI' companies and 'AI' investors are hyping applications of current…
Modified: January 13, 2023.
No matter what other priorities or any incredibly important goals arise in my life, whether through work, family, or other circumstances…
Modified: February 23, 2020.
Write Write regularly: under routine circumstances, at least a few minutes per day. This could be filling in nodes of this graph, blogging…
Modified: February 15, 2020.
See also: [ the system is bad ] I find it hard to be okay with a 'normal' life, because that would imply some level of acceptance of the…
Modified: January 25, 2022.
Modified: February 10, 2022.
These might not be the best thing to do at any point, but they're better than doing nothing. And doing them can create a sense of progress…
Modified: September 12, 2021.
See also [ writing inbox ] See also [ ongoing projects ] See, first and foremost, the backlinks below. Crypto trading model. Write a system…
Modified: November 13, 2021.
This is one of those things that sounds cliche but is still profound and #fundamental: this is all. There's no great reward in the future…
Modified: July 25, 2020.
Modified: .
I want kids, eventually. I want to be able to talk with them, to build a relationship, to see the world through someone else's eyes. I want…
Modified: February 25, 2022.
The [ agent ] model of intelligence imposes a sharp distinction between the agent and its environment, where the agent 'chooses' actions…
Modified: June 27, 2021.
let's say the signal we see after the intervention is modeled as the combination of the counterfactual forecast and an intervention effect…
Modified: February 15, 2022.
[ impermanence ] [ dukkha ] (unsatisfactoriness) [ no-self|anatta ] (no-self) Daniel Ingram's summary: things "come and go, don't satisfy…
Modified: May 19, 2022.
(fellow student) Smitha has these post-its on her desk: what are you doing? why is it important? are you making progress? I think these…
Modified: February 16, 2022.
This is a personal mental image for [ emptiness ] that has been really resonant for me, arising from an experience taking [ MDMA ] with a…
Modified: January 24, 2024.
It helps a lot to write down the things you think someone should know about working in a new environment. Even if a new person would figure…
Modified: September 07, 2020.
I feel an obligation to try to do big things with my life, because I've had access to rare opportunities. If ten thousand randomly selected…
Modified: January 25, 2022.
Television: Arcane ken burns on the vietnam war WandaVision For All Mankind Severance: https://m.imdb.com/title/tt11280740/ Borgen Diplomat…
Modified: March 20, 2022.
How should a machine learning model represent text? Word-level and character-level features are obvious options, but both have drawbacks…
Modified: February 13, 2023.
Sometimes mentioned as a potential approach to [ AI safety ]. Gwern: Why Tool AIs want to be Agent AIs (roughly: because treating…
Modified: April 07, 2022.
Notes on Toolformer: Language Models Can Teach Themselves to Use Tools The basic method is: "Given just a handful of human-written examples…
Modified: February 16, 2023.
Trace of a Linear Operator We define the trace as the sum of diagonal elements of a matrix: Lemma : If and are square, then . Proof…
Modified: March 16, 2022.
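The lemma in the trace note above has lost its symbols in this excerpt. Assuming it is the familiar identity tr(AB) = tr(BA), the following sketch computes the trace as the sum of diagonal elements, as the note defines it, and checks the identity on a small example. Matrices are plain nested lists and the helper names are illustrative.

```python
def trace(matrix):
    """Sum of the diagonal elements of a square matrix (list of rows)."""
    return sum(matrix[i][i] for i in range(len(matrix)))


def matmul(a, b):
    """Naive matrix product of two conformable matrices."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def traces_agree(a, b):
    """Check tr(AB) == tr(BA); exact for integer entries."""
    return trace(matmul(a, b)) == trace(matmul(b, a))
```

With integer entries the comparison is exact; for floating-point matrices a tolerance would be needed instead of `==`.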
There are three main approaches to moral philosophy: [ utilitarian ]ism: you should feed a starving person because it will increase 'global…
Modified: June 07, 2021.
These days we think a lot about using data to train large [ language model ]s. But there's only so much data in the world; eventually we'll…
Modified: October 27, 2022.
I didn't have a good intuitive understanding of the social landscape of being a researcher (and joining a [ research community ]). When…
Modified: February 25, 2022.
If you and I agree of our own volition to exchange X for Y, this implies that we both believe we are gaining value in the trade. If one of…
Modified: February 22, 2022.
The core of the transformer architecture is multi-headed [ attention ]. The transformer block consists of a multi-headed attention layer…
Modified: February 13, 2023.
What does the computational profile of a transformer vs a similar RNN look like? First, the transformer. Let's take the LLaMA 6.7B model…
Modified: October 04, 2023.
In developing intuition about [ transformer ]s it's useful to think about specific primitive operations that can be implemented by a small…
Modified: February 13, 2023.
Incorporating explicit memory and retrieval seems pretty clearly like the next frontier in language modeling and AI more broadly. We have…
Modified: September 03, 2022.