FlakeBench and the Digital Wingman: When AI Handles Your Exit Strategy

Author: Mike Breault
Published: Sun 22 Jun 2025

We unpack a tongue‑in‑cheek study from Sigmavic about how large language models generate humane excuses to cancel plans. From FlakeBench’s efficacy, kindness, and humanity metrics to surprising model rankings (Anthropic’s Sonnet comes out on top, while GPT‑4 sometimes lags), we explore what it means when AI mediates our awkward conversations. We connect the science to a humorous Monroe comic on leaving social engagements and discuss the social costs of ghosting, and whether outsourcing our flakiness is a future we actually want.

Note: This podcast was AI-generated, and sometimes AI can make mistakes. Please double-check any critical information.

EachPod
