
DuckDB v1.3.0: The Spatial Join Breakthrough — From Nested Loops to an On-the-Fly R-tree

Author
Mike Breault
Published
Mon 25 Aug 2025

Spatial joins connect datasets by location. In this episode we unpack the dedicated spatial join operator introduced in DuckDB v1.3.0: it buffers the smaller table, builds an in‑memory R-tree over it, and streams the larger table through to probe that index efficiently. We explain why this yields dramatic speedups (e.g., a 58M-row join against 310 neighborhoods dropping from ~30 minutes to under 30 seconds). We trace the journey from the brute-force nested-loop join, through the IE-join optimization on bounding boxes, to the new operator, discuss current limits and ongoing work (larger-than-memory builds, more parallelism), and highlight the implications for geospatial analysis.
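
For listeners who want to see the core idea in code, here is a minimal Python sketch of the "filter by bounding box, then run the exact test" pattern that the approaches discussed in the episode rely on. It is not DuckDB's implementation: all names (`bbox`, `contains`, `nested_loop_join`, `filtered_join`) and the sample data are hypothetical, and a plain list of boxes stands in for the R-tree, which would additionally group boxes hierarchically so that most of them are never visited during a probe.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]
BBox = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def bbox(poly: Polygon) -> BBox:
    """Axis-aligned bounding box of a polygon."""
    xs = [x for x, _ in poly]
    ys = [y for _, y in poly]
    return (min(xs), min(ys), max(xs), max(ys))


def contains(poly: Polygon, p: Point) -> bool:
    """Exact point-in-polygon test via ray casting (even-odd rule)."""
    x, y = p
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def nested_loop_join(points: List[Point], polys: List[Polygon]):
    """Brute force: run the exact test for every (point, polygon) pair."""
    return [(i, j)
            for i, p in enumerate(points)
            for j, poly in enumerate(polys)
            if contains(poly, p)]


def filtered_join(points: List[Point], polys: List[Polygon]):
    """Build phase on the small side, probe phase on the large side."""
    boxes = [bbox(poly) for poly in polys]  # build: buffer the small side
    out = []
    for i, (x, y) in enumerate(points):     # probe: stream the large side
        for j, (x0, y0, x1, y1) in enumerate(boxes):
            # Cheap box check first; the exact test runs only on candidates.
            if x0 <= x <= x1 and y0 <= y <= y1 and contains(polys[j], (x, y)):
                out.append((i, j))
    return out


# Two square "neighborhoods" and four points (made-up sample data).
neighborhoods = [[(0, 0), (2, 0), (2, 2), (0, 2)],
                 [(3, 0), (5, 0), (5, 2), (3, 2)]]
points = [(1, 1), (4, 1), (2.5, 1), (10, 10)]
assert filtered_join(points, neighborhoods) == nested_loop_join(points, neighborhoods)
```

Both functions return the same (point index, polygon index) pairs; the difference is that the filtered version skips the expensive exact test whenever the point falls outside a polygon's bounding box.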


Note: This podcast was AI-generated, and sometimes AI can make mistakes. Please double-check any critical information.

Sponsored by Embersilk LLC
