About Data Pipelines

Author
Lars Wikman, Andreas Ekeroot
Published
Mon 01 Jan 2024
Episode Link
https://share.transistor.fm/s/59b05f05

Lars dove into data pipelines, and emerged bearing arrows and wishing for a lot fewer copies.

What is there to think about regarding data pipelines, and what is interesting about them?

Which tools are out there, and why might you want to use them?

Why all this talk about making fewer copies of data?

What does Lars' current ideal pipeline look like, and where does Elixir fit in?

Quotes

  • I've been reading a lot about data pipelines
  • What's so special about data pipelines?
  • There's a lot of special tooling
  • There's a lot of bad, bad tooling
  • Less than optimal tooling
  • Converging on something bigger
  • He got me eventually
  • All of your steps in one bucket
  • What tools do you associate with data?
  • I inherited a data pipeline
  • BashReduce
  • Iterate on the L and the T
  • The modern data stack
  • And then you demand more work
  • No unnecessary copies
  • Barely a copy
  • Reconnecting with my Python roots