Linear Digressions

Ben Jaffe and Katie Malone

In each episode, your hosts explore machine learning and data science through interesting (and often very unusual) applications.

Science Technology Learning

Update frequency: every 6 days
Average duration: 19 minutes
Episodes: 291
Years Active: 2014 - 2020

Google Flu Trends

It's been a nasty flu season this year. So we were remembering a story from a few years back (but not covered yet on this podcast) about when Google tried to predict flu outbreaks faster than the Cen…

00:12:46 | Mon 26 Mar 2018

How to pick projects for a professional data science team

This week's episode is for data scientists, sure, but also for data science managers and executives at companies with data science teams. These folks all think very differently about the same questi…

00:31:17 | Mon 19 Mar 2018

Autoencoders

Autoencoders are neural nets that are optimized for creating outputs that... look like the inputs to the network. Turns out this is a not-too-shabby way to do unsupervised machine learning with neura…

00:12:41 | Mon 12 Mar 2018

When Private Data Isn't Private Anymore

After all the back-patting around making data science datasets and code more openly available, we figured it was time to also dump a bucket of cold water on everyone's heads and talk about the things…

00:26:20 | Mon 05 Mar 2018

What makes a machine learning algorithm "superhuman"?

A few weeks ago, we podcasted about a neural network that was being touted as "better than doctors" in diagnosing pneumonia from chest x-rays, and how the underlying dataset used to train the algorit…

00:34:48 | Mon 26 Feb 2018

Open Data and Open Science

One interesting trend we've noted recently is the proliferation of papers, articles and blog posts about data science that don't just tell the result--they include data and code that allow anyone to …

00:16:54 | Mon 19 Feb 2018

Defining the quality of a machine learning production system

Building a machine learning system and maintaining it in production are two very different things. Some folks over at Google wrote a paper that shares their thoughts around all the items you might wa…

00:20:29 | Mon 12 Feb 2018

Auto-generating websites with deep learning

We've already talked about neural nets in some detail (links below), and in particular we've been blown away by the way that image recognition from convolutional neural nets can be fed into recurrent…

00:19:24 | Sun 04 Feb 2018

The Case for Learned Index Structures, Part 2: Hash Maps and Bloom Filters

Last week we started the story of how you could use a machine learning model in place of a data structure, and this week we wrap up with an exploration of Bloom Filters and Hash Maps. Just like last …

00:20:41 | Mon 29 Jan 2018

The Case for Learned Index Structures, Part 1: B-Trees

Jeff Dean and his collaborators at Google are turning the machine learning world upside down (again) with a recent paper about how machine learning models can be used as surprisingly effective substi…

00:18:50 | Mon 22 Jan 2018

Challenges with Using Machine Learning to Classify Chest X-Rays

Another installment in our "machine learning might not be a silver bullet for solving medical problems" series. This week, we have a high-profile blog post that has been making the rounds for the las…

00:18:00 | Mon 15 Jan 2018

The Fourier Transform

The Fourier transform is one of the handiest tools in signal processing for dealing with periodic time series data. Using a Fourier transform, you can break apart a complex periodic function into a b…

00:15:39 | Mon 08 Jan 2018

Statistics of Beer

What better way to kick off a new year than with an episode on the statistics of brewing beer?

00:15:20 | Tue 02 Jan 2018

Re-Release: Random Kanye

We have a throwback episode for you today as we take the week off to enjoy the holidays. This week: what happens when you have a Markov chain that generates mashup Kanye West lyrics with Bible verses…

00:09:33 | Sun 24 Dec 2017

Debiasing Word Embeddings

When we covered the Word2Vec algorithm for embedding words, we mentioned parenthetically that the word embeddings it produces can sometimes be a little bit less than ideal--in particular, gender bias…

00:18:20 | Mon 18 Dec 2017

The Kernel Trick and Support Vector Machines

Picking up after last week's episode about maximal margin classifiers, this week we'll go into the kernel trick and how that (combined with maximal margin algorithms) gives us the much-vaunted suppor…

00:17:48 | Mon 11 Dec 2017

Maximal Margin Classifiers

Maximal margin classifiers are a way of thinking about supervised learning entirely in terms of the decision boundary between two classes, and defining that boundary in a way that maximizes the dista…

00:14:21 | Mon 04 Dec 2017

Re-Release: The Cocktail Party Problem

Grab a cocktail, put on your favorite karaoke track, and let’s talk some more about disentangling audio data!

00:13:43 | Mon 27 Nov 2017

Clustering with DBSCAN

DBSCAN is a density-based clustering algorithm for doing unsupervised learning. It's pretty nifty: with just two parameters, you can specify "dense" regions in your data, and grow those regions out …

00:16:14 | Mon 20 Nov 2017

The Kaggle Survey on Data Science

Want to know what's going on in data science these days? There's no better way than to analyze a survey with over 16,000 responses that was recently released by Kaggle. Kaggle asked practicing and aspi…

00:25:20 | Mon 13 Nov 2017

