Explore Google DeepMind's Genie 2, a model that can turn a single image into a dynamic, interactive 3D world with physics, long-horizon memory, and counterfactuals. We'll examine how Genie 2 interprets user inputs, models object interactions and affordances, and enables rapid prototyping for games and simulations. We'll also look at SIMA, the Scalable Instructable Multiworld Agent, which follows natural-language instructions within these worlds, and at the potential implications for embodied AI and real-world training.
Note: This podcast was AI-generated, and sometimes AI can make mistakes. Please double-check any critical information.
Sponsored by Embersilk LLC