
Alignment Newsletter #106: Evaluating generalization ability of learned reward models

Author: Robert Miles
Published: Wed 01 Jul 2020
Episode Link: https://alignment-newsletter.libsyn.com/alignment-newsletter-106