Linear Digressions

Ben Jaffe and Katie Malone

In each episode, your hosts explore machine learning and data science through interesting (and often very unusual) applications.

Science Technology Learning

Update frequency: every 6 days
Average duration: 19 minutes
Episodes: 291
Years Active: 2014 - 2020

Anscombe's Quartet

Anscombe's Quartet is a set of four datasets that have the same mean, variance and correlation but look very different. It's easy to think that having a good set of summary statistics (like mean, va…

00:15:39 | Mon 19 Jun 2017
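
The claim in the Anscombe's Quartet description can be checked directly. The sketch below is an illustration rather than anything taken from the episode: the data values are Anscombe's 1973 figures as commonly reproduced (they do not appear on this page), and the names QUARTET, pearson and summarize are made up for the example.

```python
from math import sqrt
from statistics import mean, variance

# Anscombe's (1973) values as commonly reproduced; assumed, not listed on this page.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # shared x for datasets I-III
X4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]

QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (X4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summarize(xs, ys):
    """Return the summary statistics named in the episode description."""
    return {
        "mean_x": mean(xs),
        "mean_y": mean(ys),
        "var_x": variance(xs),  # sample variance (n - 1 denominator)
        "var_y": variance(ys),
        "corr": pearson(xs, ys),
    }


if __name__ == "__main__":
    for name, (xs, ys) in QUARTET.items():
        stats = summarize(xs, ys)
        print(name, {key: round(value, 2) for key, value in stats.items()})
```

Running it prints one line per dataset. The four lines agree to about two decimal places, even though a scatter plot of each dataset looks nothing like the others.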

Traffic Metering Algorithms

Originally released June 2016. This episode is for all you (us) traffic nerds--we're talking about the hidden structure underlying traffic on-ramp metering systems. These systems slow down the flow of…

00:18:34 | Mon 12 Jun 2017

Page Rank

The year: 1998. The size of the web: 150 million pages. The problem: information retrieval. How do you find the "best" web pages to return in response to a query? A graduate student named Larry P…

00:19:58 | Mon 05 Jun 2017

Fractional Dimensions

We chat about fractional dimensions, and what the actual heck those are.

00:20:28 | Mon 29 May 2017

Things You Learn When Building Models for Big Data

As more and more data gets collected seemingly every day, and data scientists use that data for modeling, the technical limits associated with machine learning on big datasets keep getting pushed bac…

00:21:39 | Mon 22 May 2017

How to Find New Things to Learn

If you're anything like us, you a) always are curious to learn more about data science and machine learning and stuff, and b) are usually overwhelmed by how much content is out there (not all of it v…

00:17:54 | Mon 15 May 2017

Federated Learning

As machine learning makes its way into more and more mobile devices, an interesting question presents itself: how can we have an algorithm learn from training data that's being supplied as users inte…

00:14:03 | Mon 08 May 2017

Word2Vec

Word2Vec is probably the go-to algorithm for vectorizing text data these days. Which makes sense, because it is wicked cool. Word2Vec has it all: neural networks, skip-grams and bag-of-words implem…

00:17:59 | Mon 01 May 2017

Feature Processing for Text Analytics

It seems like every day there are more and more machine learning problems that involve learning on text data, but text itself makes for fairly lousy inputs to machine learning algorithms. That's why t…

00:17:28 | Mon 24 Apr 2017

Education Analytics

This week we'll hop into the rapidly developing industry around predictive analytics for education. For many of the students who eventually drop out, data science is showing that there might be earl…

00:21:05 | Mon 17 Apr 2017

A Technical Deep Dive on Stanley, the First Self-Driving Car

In our follow-up episode to last week's introduction to the first self-driving car, we will be doing a technical deep dive this week and talking about the most important systems for getting a car to …

00:40:42 | Mon 10 Apr 2017

An Introduction to Stanley, the First Self-Driving Car

In October 2005, 23 cars lined up in the desert for a 140-mile race. Not one of those cars had a driver. This was the DARPA Grand Challenge to see if anyone could build an autonomous vehicle capabl…

00:13:07 | Mon 03 Apr 2017

Feature Importance

Figuring out what features actually matter in a model is harder than you might first guess. When a human makes a decision, you can just ask them--why did you do that? But with machine…

00:20:15 | Mon 27 Mar 2017

Space Codes!

It's hard to get information to and from Mars. Mars is very far away, and expensive to get to, and the bandwidth for passing messages with Earth is not huge. The messages you do pass have to traver…

00:23:56 | Mon 20 Mar 2017

Finding (and Studying) Wikipedia Trolls

You may be shocked to hear this, but sometimes, people on the internet can be mean. For some of us this is just a minor annoyance, but if you're a maintainer or contributor of a large project like W…

00:15:50 | Mon 13 Mar 2017

A Sprint Through What's New in Neural Networks

Advances in neural networks are moving fast enough that, even though it seems like we talk about them all the time around here, it also always seems like we're barely keeping up. So this week we hav…

00:16:56 | Mon 06 Mar 2017

Stein's Paradox

When you're estimating something about some object that's a member of a larger group of similar objects (say, the batting average of a baseball player, who belongs to a baseball team), how should you…

00:27:02 | Mon 27 Feb 2017

Empirical Bayes

Say you're looking to use some Bayesian methods to estimate parameters of a system. You've got the normalization figured out, and the likelihood, but the prior... what should you use for a prior? E…

00:18:57 | Mon 20 Feb 2017

Endogenous Variables and Measuring Protest Effectiveness

Have you been out protesting lately, or watching the protests, and wondered how much effect they might have on lawmakers? It's a tricky question to answer, since usually we need randomly distributed…

00:16:28 | Mon 13 Feb 2017

Calibrated Models

Remember last week, when we were talking about how great the ROC curve is for evaluating models? How things change... This week, we're exploring calibrated risk models, because that's a kind of mod…

00:14:32 | Mon 06 Feb 2017

Disclaimer: The podcast and artwork embedded on this page are the property of Ben Jaffe and Katie Malone. This content is not affiliated with or endorsed by eachpod.com.
