LessWrong (30+ Karma)

LessWrong ([email protected])

Audio narrations of LessWrong posts.

Technology Philosophy Society & Culture

Update frequency: every day
Average duration: 18 minutes
Episodes: 577
Years Active: 2025

[Linkpost] “Anthropic Lets Claude Opus 4 & 4.1 End Conversations” by Stephen Martin

This is a link post.

Citing model welfare concerns, Anthropic has given Claude Opus 4 & 4.1 the ability to end ongoing conversations with its user.

Most of the model welfare concerns Anthropic is cit…

00:05:56 | Sat 16 Aug 2025

“The Collider Bias Theory of (Not Quite) Everything” by Jack_S

Quick Summary

Collider bias and Berkson's paradox are pretty common and often neglected
I think it's not just a niche statistical concept: it explains a bunch of interesting stuff, and has some us…

00:19:12 | Sat 16 Aug 2025

“The Inheritors: a book review” by Alex_Altair

I recently read a novel called The Inheritors, by William Golding. It was slow, it was painful, and before I was even done it had become one of my favorite books.

For whatever reason, there is a dif…

00:05:22 | Sat 16 Aug 2025

“Towards data-centric interpretability with sparse autoencoders” by Nick Jiang, lilysun004, lewis smith, Neel Nanda

Nick and Lily are co-first authors on this project. Lewis and Neel jointly supervised this project.

TL;DR

We use sparse autoencoders (SAEs) for four textual data analysis tasks: data diffing, findin…

00:36:04 | Sat 16 Aug 2025

“The Evolution of Agency - A Research Agenda” by Jonas Hallgren, markov

In Douglas Hofstadter's "Gödel, Escher, Bach," he explores how simple elements give rise to complex wholes that seem to possess entirely new properties. An ant colony provides the perfect real-world…

00:14:28 | Sat 16 Aug 2025

“Thoughts on Gradual Disempowerment” by Tom Davidson

Epistemic status: very rough! Spent a couple of days reading the Gradual Disempowerment paper and thinking about my view on it. Won’t spend longer on this, so am sharing rough notes as is

Summary

I…

00:37:29 | Sat 16 Aug 2025

“A philosophical kernel: biting analytic bullets” by jessicata

Sometimes, a philosophy debate has two basic positions, call them A and B. A matches a lot of people's intuitions, but is hard to make realistic. B is initially unintuitive (sometimes radically so),…

00:27:06 | Fri 15 Aug 2025

“Spending Too Much Time At Airports” by Zvi

In honor of Nate Silver's analysis of when to leave for the airport, and because it's been an intense week, I thought I’d offer my thoughts on various related questions.

Buying The Ti…

00:12:13 | Fri 15 Aug 2025

“Misalignment classifiers: Why they’re hard to evaluate adversarially, and why we’re studying them anyway” by charlie_griffin, ollie, oliverfm, Rogan Inglis, Alan Cooney

Even if the misalignment risk from current AI agents is small, it may be useful to start internally deploying misalignment classifiers: language models designed to classify transcripts that represen…

00:35:29 | Fri 15 Aug 2025

[Linkpost] “In defense of the amyloid hypothesis” by dsj

This is a link post.

I wrote a defense of the amyloid hypothesis as an ACX guest post. Scott called it "one of the best things I've read all year, and the first thing on Alzheimers that makes me actu…

00:00:41 | Fri 15 Aug 2025

“Training a Reward Hacker Despite Perfect Labels” by ariana_azarbal, vgillioz, TurnTrout

Summary: Perfectly labeled outcomes in training can still boost reward hacking tendencies in generalization. This can hold even when the train/test sets are drawn from the exact same distribution. W…

00:13:20 | Fri 15 Aug 2025

“Somebody invented a better bookmark” by Alex_Altair

This will only be exciting to those of us who still read physical paper books. But like. Guys. They did it. They invented the perfect bookmark.

Classic paper bookmarks fall out easily. You have to p…

00:03:36 | Thu 14 Aug 2025

[Linkpost] “METR Research Update: Algorithmic vs. Holistic Evaluation” by David Rein

This is a link post.

TL;DR

On 18 real tasks from two large open-source repositories, early-2025 AI agents often implement functionally correct code that cannot be easily used as-is, because of issue…

00:01:24 | Thu 14 Aug 2025

“Should you make stone tools?” by Alex_Altair

Knowing how evolution works gives you an enormously powerful tool to understand the living world around you and how it came to be that way. (Though it's notoriously hard to use this tool correctly, …

00:06:03 | Thu 14 Aug 2025

“Doing A Thing Puts You in The Top 10% (And That Sucks)” by Brendan Long

I've gone snowboarding about 30 times since I started learning a few years ago, but every time I'm on a lift, most of the other riders have been out 90 days just this season[1]. In fact, almost ever…

00:03:29 | Thu 14 Aug 2025

“GPT-5s Are Alive: Synthesis” by Zvi

What do I ultimately make of all the new versions of GPT-5?

The practical offerings and how they interact continue to change by the day. I expect more to come. It will take a while for things to s…

01:05:41 | Wed 13 Aug 2025

“Launching new AIXI research community website + reading group(s)” by Cole Wyeth

We have recently launched a new website / blog / community hub for AIXI / algorithmic information theory (AIT) researchers: https://uaiasi.com/

Marcus Hutter's vision is to strengthen the research c…

00:01:21 | Wed 13 Aug 2025

[Linkpost] “Why Are There So Many Rationalist Cults?” by omark

This is a link post.

Linkpost for Ozy Brennan's August 2025 Asterisk Magazine article.

There's a lot to like about the Rationalist community, but they do have a certain tendency to spawn — shall we s…

00:00:48 | Wed 13 Aug 2025

“Enlightenment AMA” by lsusr

Awakening/satori is the process by which meditation permanently cures[1] a person of suffering. I have noticed that people who have gone through the process of awakening usually have little intrinsi…

00:01:46 | Wed 13 Aug 2025

“Mech Interp Wiki Page and Why You Should Edit Wikipedia” by Noah Birnbaum, JoNeedsSleep

TL;DR:
A couple months ago, we (Jo and Noah) wrote the first Wikipedia article on Mechanistic Interpretability. It was oddly missing despite Mech Interp's visibility in alignment circles. We think W…

00:03:20 | Wed 13 Aug 2025

Disclaimer: The podcast and artwork embedded on this page are the property of LessWrong ([email protected]). This content is not affiliated with or endorsed by eachpod.com.
