Audio narrations of LessWrong posts.
The following is a nitpick on an 18-year-old blog post.
This fable is retold a lot. Its progenitor as a rationalist mashal (parable) is probably Yudkowsky's classic sequence article. To adversarially su…
Leo was born at 5am on the 20th of May, at home (this was an accident, but the experience has made me extremely homebirth-pilled). Before that, I was on the minimally-neurotic side when it came to expec…
Daniel notes: This is a linkpost for Vitalik's post. I've copied the text below so that I can mark it up with comments.
...
Special thanks to Balvi volunteers for feedback and review
In April this…
This essay is about the shift in risk-taking toward the worship of jackpots, and its broader societal implications. Imagine you are presented with this coin flip game.
How many times do you flip it?
At…
METR released a new paper with very interesting results on how AI affects developer productivity. I have copied their blog post here in full.
We conduct a randomized controlled trial (RCT) to unde…
I wrote a guide to Redwood's writing:
Section 1 is a quick guide to the key ideas in AI control, aimed at someone who wants to get up to speed as quickly as possible.
Section 2 …
Written in an attempt to fulfill @Raemon's request.
AI is fascinating stuff, and modern chatbots are nothing short of miraculous. If you've been exposed to them and have a curious mind, it's likely …
I've seen many prescriptive contributions to AGI governance take the form of proposals for some radically new structure. Some call for a Manhattan project, others for the creatio…
This is the unedited text of a post I made on X in response to a question asked by @cube_flipper: "you say opus 3 is close to aligned – what's the negative space here, what makes it misaligned?". I …
I think the 2003 invasion of Iraq has some interesting lessons for the future of AI policy.
(Epistemic status: I’ve read a bit about this, talked to AIs about it, and talked to one natsec profession…
People have an annoying tendency to hear the word “rationalism” and think “Spock”, despite direct exhortation against that exact interpretation. But I don’t know of any source directly describing a …
As AI models become more sophisticated, a key concern is the potential for “deceptive alignment” or “scheming”. This is the risk of an AI system becoming aware that its goals do not align with human…
Jordan Taylor*, Connor Kissane*, Sid Black*, Jacob Merizian*, Alex Zelenka-Marin, Jacob Arbeid, Ben Millwood, Alan Cooney, Joseph Bloom
Joseph Bloom, Alan Cooney
This is a research updat…
Hi! We’re Chana and Aric, from the new 80,000 Hours video program.
For over a decade, 80,000 Hours has been talking about the world's most pressing problems in newsletters, articles…
One of the most common (and comfortable) assumptions in AI safety discussions—especially outside of technical alignment circles—is that oversight will save us. Whether it's a hum…
Here are two potential problems you’ll face if you’re an AI lab deploying powerful AI:
It was the July 4 weekend. Grok on Twitter got some sort of upgrade.
Elon Musk: We have improved @Grok significantly.
You should notice a difference when you ask Grok questions.
Indeed we did not…
I've increasingly found right-wing political frameworks to be valuable for thinking about how to navigate the development of AGI. In this post I've copied over a twitter thread I wrote about three r…
Thank you to Arepo and Eli Lifland for looking over this article for errors.
I am sorry that this article is so long. Every time I thought I was done with it I ran into more issues with the model, …