Audio narrations of LessWrong posts.
Audio note: this article contains 53 uses of LaTeX notation, so the narration may be difficult to follow. There's a link to the original text in the episode description.
Ariana Azarbal*, Matthew A…
I keep meaning to write up a more substantive follow-up to the Hire (or Become) a Thinking Assistant post. But I think this is still basically the biggest productivity effect size I know of, and more p…
While most people focused on Grok, there was another model release that got uniformly high praise: Kimi K2 from Moonshot.ai.
It's definitely a good model, sir, especially for a cheap-to-run open mo…
Yesterday I covered a few rather important Grok incidents.
Today is all about Grok 4's capabilities and features. Is it a good model, sir?
It's not a great model. It's not the smartest or best mod…
Twitter | Paper PDF
Seven years ago, OpenAI Five had just been released, and many people in the AI safety community expected AIs to be opaque RL agents. Luckily, we ended up with reasoning models th…
Tsvi's context
Some context:
My personal context is that I care about decreasing existential risk, and I think that the broad distribution of efforts put forward by X-deriskers fairly strong…
One billion people use chatbots on a weekly basis. That's 1 in every 8 people on Earth.
How many people have mental health issues that cause them to develop religious delusions o…
Most of the interview is about technological unemployment and AI-driven inequality, but here is the part where Sanders talks about loss of control risk (which Gizmodo put as titl…
Previously, we've shared a few higher-effort project proposals relating to AI control in particular. In this post, we'll share a whole host of less polished project proposals. All of these projects …
Anna and Ed are co-first authors for this work. We’re presenting these results as a research update for a continuing body of work, which we hope will be interesting and useful for others working on …
Audio note: this article contains 270 uses of LaTeX notation, so the narration may be difficult to follow. There's a link to the original text in the episode description.
This post is a companion …
This is a write-up of a brief investigation into shutdown resistance undertaken by the Google DeepMind interpretability team.
TL;DR
Why do models sometimes resist shutdown? Are they ignoring instruc…
Grok 4, which has excellent benchmarks and which xAI claims is ‘the world's smartest artificial intelligence,’ is the big news.
If you set aside the constant need to say ‘No, Grok, No,’ is it a goo…
Summary
In the paper Measuring AI Ability to Complete Long Software Tasks (Kwa & West et al. 2025), METR defined an AI model's 50% time horizon as the length of tasks (measured by how long they take…
This article includes descriptions of content that some users may find distressing.
Testing was conducted on July 10 and 11; safety measures may have changed since then.
I’m a longtime lurker who …
This post is a response to John Wentworth's recent post on Generalized Hangriness, specifically as it applies to outrage, an emotion that is especially likely to make false claims. I expect that some r…
LLMs can be deeply confusing. Thanks to a commission, today we go back to basics.
How did we get such a wide array of confusingly named and labeled models and modes in ChatGPT? What are they, and w…
I assume you are familiar with the METR paper: https://arxiv.org/abs/2503.14499
In case you aren't: the authors measured how long it takes a human to complete some task, then let LLMs do those tasks…
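For readers who want a concrete picture of the "50% time horizon" metric from that paper, here is a minimal illustrative sketch (not from the post, with made-up numbers): fit success probability against log human-completion-time and solve for the task length at which predicted success crosses 50%.

```python
# Hedged sketch, not the authors' code: estimating a 50% time horizon in the
# spirit of the METR paper, assuming you already have per-task human completion
# times and whether the model succeeded on each task.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Illustrative made-up data: (human_minutes, model_succeeded)
human_minutes = np.array([1, 2, 4, 8, 15, 30, 60, 120, 240, 480])
succeeded = np.array([1, 1, 1, 1, 1, 0, 1, 0, 0, 0])

# Logistic fit of success against log task length
X = np.log(human_minutes).reshape(-1, 1)
clf = LogisticRegression().fit(X, succeeded)

# P(success) = 0.5 where the logit is zero: intercept + coef * log(t) = 0
t50 = np.exp(-clf.intercept_[0] / clf.coef_[0][0])
print(f"Estimated 50% time horizon: ~{t50:.0f} human-minutes")
```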
Zack Davis emailed me[1] asking me to weigh in on the moderators' request for comment on their proposal to ban Said_Achmiz. I've had a conflict with Said in the past in this thread and apparently th…
Yes, Prime Minister has a sketch demonstrating how asking a series of leading questions can get a person to a particular conclusion:
“Are you worried about the number of young people without jobs?”
…