Magma: A Foundation Model for Multimodal AI Agents
A state-of-the-art, must-read paper -- "Put the sausage to hot dog"
Most foundation models today still live in a digital world of words.
They’ve become fluent in describing things like objects, actions, and outcomes in a beautiful cascade of tokens. But we’re still struggling to build systems that can properly act within 2D and, more importantly, 3D worlds. Here, the gap between perception and action remains wide. That …