Audio narrations of LessWrong posts.
Audio note: this article contains 61 uses of LaTeX notation, so the narration may be difficult to follow. There's a link to the original text in the episode description.
A lot of our work involves…
For people who care about falsifiable stakes rather than vibes
TL;DR
All timeline arguments ultimately turn on five quantitative pivots. Pick optimistic answers to three of them and your median fore…
Last week I covered that GPT-4o was briefly an (even more than usually) absurd sycophant, and how OpenAI responded to that.
Their explanation at that time was paper thin. It didn’t tell us much tha…
arXiv | project page | Authors: Yang Yue, Zhiqi Chen, Rui Lu, Andrew Zhao, Zhaokai Wang, Yang Yue, Shiji Song, Gao Huang
This paper from Tsinghua finds that RL on verifiable rewar…
A corollary of Sutton's Bitter Lesson is that solutions to AI safety should scale with compute. Let me list a few examples of research directions that aim at this kind of solution:
Reproducing a result from recent work, we study a Gemma 3 12B instance trained to take risky or safe options; the model can then report its own risk tolerance. We find that:
We’ve been looking for joinable endeavors in AI safety outreach over the past weeks and would like to share our findings with you. Let us know if we missed any and we’ll add them to the list.
For co…
I contributed one (1) task to HCAST, which was used in METR's Long Tasks paper. This gave me some thoughts I feel moved to share.
Regarding Baselines and Estimates
METR's tasks have two sources for …
Utilitarianism implies that if we build an AI that successfully maximizes utility/value, we should be ok with it replacing us. Sensible people add caveats related to how hard it’ll be to determine t…
AI 2027 is a Bet Against Amdahl's Law was my attempt to summarize and analyze "the key load-bearing arguments AI 2027 presents for short timelines". There were a lot of great comments – every time I…
Strength
In 1997, with Deep Blue's defeat of Kasparov, computers surpassed human beings at chess. Other games have fallen in more recent years: Go, Starcraft, and League of Legends among them. AI is…
(Disclaimer: Post written in a personal capacity. These are personal hot takes and do not in any way represent my employer's views.)
TL;DR: I do not think we will produce high reliability methods t…
Cryonics Institute and Suspended Animation now have an arrangement where Suspended Animation will conduct a field cryopreservation before shipping the body to Cryonics Institute, thus decreasing tis…
Politico writes:
The [Ukrainian] program […] rewards soldiers with points if they upload videos proving their drones have hit Russian targets. It will soon be integrated with a new online marketplac…
Burnout. Burn out? Whatever, it sucks.
Burnout is a pretty confusing thing, made harder by the fact that our naive reactions tend to be things like “just try harder” or “grit your teeth and push through”, which usuall…
As an employee of the European AI Office, it's important for me to emphasize this point: The views and opinions of the author expressed herein are personal and do not necessarily reflect those of th…
Gemini 2.5 Pro is sitting in the corner, sulking. It's not a liar, a sycophant or a cheater. It does excellent deep research reports. So why does it have so few friends? The answer, of course, is par…
AI progress is driven by improved algorithms and additional compute for training runs. Understanding what is going on with these trends and how they are currently driving progress is helpful for und…
Right before releasing o3, OpenAI updated its Preparedness Framework to 2.0. I previously wrote an analysis of the Preparedness Framework 1.0. I still stand by essentially everything I wrote in that…