LessWrong (30+ Karma)

Audio narrations of LessWrong posts.

Society & Culture, Philosophy, Technology

Update frequency: every day
Average duration: 18 minutes
Episodes: 571
Years Active: 2025

“Will Any Old Crap Cause Emergent Misalignment?” by J Bostock

The following work was done independently by me in an afternoon and basically entirely vibe-coded with Claude. Code and instructions to reproduce can be found here.

Emergent Misalignment was discove…

00:08:40 | Wed 27 Aug 2025

“Attaching requirements to model releases has serious downsides (relative to a different deadline for these requirements)” by ryan_greenblatt

Here's a relatively important question regarding transparency requirements for AI companies: At which points in time should AI companies be required to disclose information? (While I focus on transp…

00:06:38 | Wed 27 Aug 2025

“[Anthropic] A hacker used Claude Code to automate ransomware” by bohaska

Anthropic post title: Detecting and countering misuse of AI: August 2025

Read the full report here. The lines below are from the Anthropic post and have not been edited. Accompanying images are availab…

00:07:25 | Wed 27 Aug 2025

“AI companies have started saying safeguards are load-bearing” by Zach Stein-Perlman

There are two ways to show that an AI system is safe: show that it doesn't have dangerous capabilities, or show that it's safe even if it has dangerous capabilities. Until three months ago, AI compan…

00:11:49 | Wed 27 Aug 2025

“Reports Of AI Not Progressing Or Offering Mundane Utility Are Often Greatly Exaggerated” by Zvi

In the wake of the confusions around GPT-5, this week had yet another round of claims that AI wasn’t progressing, or AI isn’t or won’t create much value, and so on. There were reports that one study…

00:32:27 | Wed 27 Aug 2025

“AI Induced Psychosis: A shallow investigation” by Tim Hua

“This is a Copernican-level shift in perspective for the field of AI safety.” - Gemini 2.5 Pro

“What you need right now is not validation, but immediate clinical help.” - Kimi K2

Two Minute Summary

…

00:56:47 | Wed 27 Aug 2025

“Do-Divergence: A Bound for Maxwell’s Demon” by johnswentworth, David Lorell

Let's start with the classic Maxwell's Demon setup.

We have a container of gas, i.e. a bunch of molecules bouncing around. Down the middle of the container is a wall with a tiny door in it, which c…

00:06:07 | Wed 27 Aug 2025

“Harmless reward hacks can generalize to misalignment in LLMs” by Mia Taylor, Owain_Evans

This post shows the abstract, introduction, and main figures from our new paper "School of Reward Hacks: Hacking harmless tasks generalizes to misaligned behavior in LLMs".

TL;DR: We train LLMs on d…

00:18:14 | Wed 27 Aug 2025

“Aesthetic Preferences Can Cause Emergent Misalignment” by Anders Woodruff

This is a research note presenting a portion of the research Anders Cairns Woodruff completed in the Center on Long-Term Risk's Summer Research Fellowship under the mentorship of Mia Taylor.

The dat…

00:10:17 | Tue 26 Aug 2025

“Hidden Reasoning in LLMs: A Taxonomy” by Rauno Arike

Summary

When discussing the possibility that LLMs will cease to reason in transparent natural language with other AI safety researchers, we have sometimes noticed that we talk past each other: e.g.,…

00:26:01 | Tue 26 Aug 2025

“A Comprehensive Guide to Running” by Declan Molony

Just because you can run, it doesn't mean that you know how to do it properly.

This systematic review showed that:

50% of runners experience an injury each year that prevents them from running for a…

00:36:18 | Tue 26 Aug 2025

[Linkpost] “Breastfeeding and IQ: Effects shrink as you control for more confounders” by Nina Panickssery

This is a link post.

Socioeconomic status, parental education, and parental intelligence have strong effects on child IQ and are themselves correlated with breastfeeding practices. When studies ignor…

00:03:13 | Tue 26 Aug 2025

“New Paper on Reflective Oracles & Grain of Truth” by Cole Wyeth

This is a linkpost for https://www.arxiv.org/pdf/2508.16245

With Marcus Hutter, Jan Leike (@janleike), and Jessica Taylor (@jessicata), I have revisited Leike et al.'s paper "A Formal Solution to t…

00:02:03 | Tue 26 Aug 2025

“Neuroscience of human sexual attraction triggers (3 hypotheses)” by Steven Byrnes

tl;dr

There's a stereotype that male sexual attraction is triggered mainly by appearance, and female sexual attraction is triggered mainly by status.

…Yes I know, this stereotype is grossly oversimpli…

00:26:08 | Tue 26 Aug 2025

“Before LLM Psychosis, There Was Yes-Man Psychosis” by johnswentworth

A studio executive has no beliefs

That's the way of a studio system

We've bowed to every rear of all the studio chiefs

And you can bet your ass we've kissed 'em

Even the birds in the Hollywood hil…

00:05:27 | Mon 25 Aug 2025

“Arguments About AI Consciousness Seem Highly Motivated And At Best Overconfident” by Zvi

I happily admit I am deeply confused about consciousness.

I don’t feel confident I understand what it is, what causes it, which entities have it, what future entities might have it, to what extent …

00:47:45 | Mon 25 Aug 2025

“The Best Materials To Build Any Intuition” by Algon

Many textbooks, tutorials or ... tapes leave out the ways people actually think about a subject, and leave you to fumble your way to your own picture. They don't even attempt to help you build intui…

00:05:31 | Mon 25 Aug 2025

“Kids and Cleaning” by jefftk

Before having kids I thought teaching them to clean up would be similar to the rest of parenting: once they're physically able to do it you start practicing with them, and after a while they're …

00:05:20 | Mon 25 Aug 2025

“Futility Illusions” by silentbob

…or the “it doesn’t make a difference anyway” fallacy.

Improving Productivity is Futile

I once had a coaching call on some generic productivity topic along the lines of “I’m not getting done as m…

00:08:36 | Sun 24 Aug 2025

“Notes on cooperating with unaligned AI” by Lukas Finnveden

These are some research notes on whether we could reduce AI takeover risk by cooperating with unaligned AIs. I think the best and most readable public writing on this topic is “Making deals with ear…

01:01:04 | Sun 24 Aug 2025

Disclaimer: The podcast and artwork embedded on this page are the property of LessWrong. This content is not affiliated with or endorsed by eachpod.com.