Trace of a Linear Operator We define the trace as the sum of diagonal elements of a matrix: $\operatorname{tr}(A) = \sum_i A_{ii}$. Lemma: If $A$ and $B$ are square, then $\operatorname{tr}(AB) = \operatorname{tr}(BA)$. Proof…
There are three main approaches to moral philosophy: [ utilitarian ]ism: you should feed a starving person because it will increase 'global…
These days we think a lot about using data to train large [ language model ]s. But there's only so much data in the world; eventually we'll…
I didn't have a good intuitive understanding of the social landscape of being a researcher (and joining a [ research community ]). When…
If you and I agree of our own volition to exchange X for Y, this implies that we both believe we are gaining value in the trade. If one of…
The core of the transformer architecture is multi-headed [ attention ]. The transformer block consists of a multi-headed attention layer…
What does the computational profile of a transformer vs a similar RNN look like? First, the transformer. Let's take the LLaMA 6.7B model…
In developing intuition about [ transformer ]s it's useful to think about specific primitive operations that can be implemented by a small…
Incorporating explicit memory and retrieval seems pretty clearly like the next frontier in language modeling and AI more broadly. We have…
According to this reddit post, one of the main takeaways of functional analysis is that the right way to interpret the 'transpose' of a…
SSC link: How general is this phenomenon? You have a belief Your belief colors your perception of something that doesn't inherently…
Sasha Chapin describes trauma as a 'splitting off' of difficult or painful experiences as memories that the mind tries to avoid accessing…
A pitfall with relying too heavily on rational deduction is that lots of logically 'true' conclusions are unimportant, or worse yet…
(notes loosely based on the Berkeley deep RL course lecture) Setup: RL with policy gradients The basic setup is that we want to optimize…
Language is an incredible bottleneck. There are infinitely many true facts about the world, even just in pure math, and yet we communicate…
The reason to try new things is not really because the new things themselves are more exciting than the old ones. The reason is that it…
From Jeff Bezos' 2015 shareholder letter: Some decisions are consequential and irreversible or nearly irreversible – one-way doors – and…
Inspired by Kevin Buzzard's overview of the state of automatic theorem provers. Type theory is like set theory in that sets and types are…
nostalgebraist argues that unconditional love can't and shouldn't exist: A parent might love their child "unconditionally," in the well…
update April 2024: I'm going to leave this here, but I now think about confidence in less of an information-theoretic belief way, and more…
It's a basic law of probability that, given two events A and B, the probability that at least one of them occurs is given by $P(A \cup B) = P(A) + P(B) - P(A \cap B)$. This counts the…
(this note expresses a tendency that I notice in myself. I don't necessarily endorse this tendency but I think it's interesting to…
From a review by [ Oliver Burkeman ] of Jordan Peterson's "Beyond Order" ( https://www.theguardian.com/books/2021/mar/02/beyond-order-by…
in contrast to [ things I believe that no one else believes ], which are intended to be potentially-novel insights about the world…
like a 'useful perspective', but 'lens' implies focus or distortion whereas 'perspective' implies linear projection. Related to [ many models…
Consuming unstructured content from the internet is addictive. Twitter is full of life advice, interesting technical discussion, takes on…
Suppose I have an agent that generates text. I want it to generate text that is [ value alignment|aligned ] with human values. Approaches…
Notes on the Alignment Forum's Value Learning sequence curated by Rohin Shah. ambitious value learning: the idea of learning 'the human…
The standard [ Markov decision process ] formalism includes a reward function; the total (discounted) reward across a trajectory is its…
References: Jacob Eisner, High-Level Explanation of Variational Inference (2011) https://www.cs.jhu.edu/~jason/tutorials/variational.html…
Holy shit. In December on Galiano I was brainstorming about [ continuous structure learning ] and thought of the general trick, for…
The divergence of a vector field measures the extent to which a given point is a source of the field. It…
Inspired by [ Emily ], I'm considering going 'mostly' vegetarian. What would that mean for me? I don't myself buy meats or dairy products…
I don't know quite how to articulate or formalize this, but I get a sense that there is something fundamentally analogue, 'periodic' or…
Why am I doing all of this? If I carve aside hours or days or months to 'fill in' my graph of notes, what am I hoping to get from it? Why is…
Ref: https://arxiv.org/abs/2010.11929 We start by chunking an image into patches, and concatenating each patch with a position embedding…
Telling people about your failures, your fears, your self-doubt, your insecurities can be a path towards deeper connection. Understanding…
How to be warm: https://www.youtube.com/watch?v=1MolmoFuXu4&t=123s
The "strength of weak ties": most good things in life come from people you barely know. This is because your close, regular connections are…
Thinking through: Why the toughest capitalists should root for a wealth tax ( https://www.ft.com/content/e1adf707-b95a-4422-9211-1841cd7ce…
Moxie Marlinspike on web3: https://moxie.org/2022/01/07/web3-first-impressions.html We know that people do not want to run their own…
[ weekly review ] • Plus: What went well? • Minus: What didn't go so well? • Next: What will I focus on next week?
Reference: Mahmood et al., 2014. Weighted importance sampling for off-policy learning with linear function approximation Here's a situation…
I suspect many of these are evergreen. I'm not [ writing ] enough. I'm not keeping up a regular journaling practice.
Just like norms in the Trump administration, there are mental habits, rhythms of life, attitudes towards the world, that are powerfully…
In the course of any person's life, you take in a vast amount of information. You have your own personal experiences, of course, and you…
See also: [ if ever a prof ], [ advice for college students ] Things not directly related to course material that I wish I'd learned earlier…
What will I do when I don't have a job? I don't feel that I have a clear direction. I want to learn and explore. There are lots of [ my…
A story from [ Dan Brown ]: A group of psychologists came to interview the Dalai Lama, the spiritual leader of Tibet. One of the Americans…
As kids, we learned about https://en.wikipedia.org/wiki/The_Game_(mind_game): if you think of the game, you lose. (and have to say "I…
From 2017: wisdom I've acquired: the psychology of depression. :-( and grad school. :-( and being gay. [ dual-process cognition ] theory…
“It was true that I didn’t have much ambition, but there ought to be a place for people without ambition, I mean a better place than the one…
This may be a central point of confusion: how do we define AI systems that have preferences about the real world, so that their goals and…
In software: a library is a collection of tools. You can use some or all of them, in combination with other tools. A framework, on the…
your writing needs to be at the edge of your knowledge, it needs to address the most fascinating people you know or can imagine. That is…
Quote I like from Manuel Blum's advice to grad students, connecting writing to the power of [ Turing machine ]s: STUDYING: You are all…
What is the philosophy of the project? What principles is it betting on? Example from Ben's Ads doc: iterating on an end-to-end pipeline…
Regular writing practices that would be valuable. [ prediction as a model-building exercise ]
The models we use in AI are [ all models are wrong|wrong ] (if maybe still useful). How? Agency The [ agent ] model assumes a separation of…
status: a theory that feels true for my personal trajectory. Totally uncritiqued and unverified that anyone else shares this experience…
yin is being; yang is doing. there is a profound relationship between those two at a deep level and/but there is a whole web of associations…
Sometimes it's daunting how much knowledge there is in the world. For any given area, there are a thousand specialties and subspecialties…
Something that SuccessfulFriend said today: It's rare that someone totally independent comes up with a really good idea. The best ideas come…
A zero-knowledge proof allows a prover to demonstrate that it possesses certain information, without revealing that information to the…
A zk-SNARK, or zero-knowledge Succinct Non-interactive Argument of Knowledge, is a [ zero knowledge ] proof system that is non-interactive…