March 13, 2024, 9:20 p.m. | /u/Chronicle112

r/MachineLearning (www.reddit.com)

I guess a lot of people have seen the Figure One demo by now, if not: [https://www.youtube.com/watch?v=Sq1QZB5baNw](https://www.youtube.com/watch?v=Sq1QZB5baNw)


Does anybody have information on what type of model is used for the robotic movements? Is it some form of RL, or offline RL? I understand that the interpretation of images/language happens through some multimodal LLM/VLM, but I want to learn what kind of actions/instructions it outputs in order to, for example, move objects. What input is given to such a …
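For context on what "actions" usually means in setups like this: from what Figure engineers publicly posted after the demo, the stack is hierarchical rather than a single end-to-end RL policy. A multimodal model handles the speech/vision side and picks which learned behavior to run, while fast visuomotor policies (reportedly trained on teleoperation data, i.e. behavior cloning rather than RL) map camera images and proprioception to continuous low-level actions such as wrist poses and finger joint angles, which a whole-body controller then tracks at an even higher rate. Here is a minimal Python sketch of that kind of hierarchy; every class name, dimension, and rate below is an illustrative assumption, not Figure's actual code:

```python
# Hypothetical sketch of a VLM-over-policy control hierarchy.
# A slow multimodal planner picks a skill; a fast learned policy
# outputs continuous actions every control tick.

from dataclasses import dataclass
import numpy as np

@dataclass
class Observation:
    image: np.ndarray    # e.g. (224, 224, 3) RGB camera frame
    proprio: np.ndarray  # joint positions/velocities, gripper state

class VLMPlanner:
    """Slow loop (~1 Hz): multimodal model maps language + image to a skill."""
    def plan(self, image: np.ndarray, instruction: str) -> str:
        # A real system would query a VLM here; stubbed for illustration.
        return "pick_up_apple"

class VisuomotorPolicy:
    """Fast loop (e.g. ~200 Hz): learned policy emits low-level actions."""
    def __init__(self, action_dim: int = 24):
        self.action_dim = action_dim

    def act(self, obs: Observation, skill: str) -> np.ndarray:
        # A trained network (often behavior-cloned from teleop demos)
        # would map (image, proprio, skill) to target joint angles or
        # end-effector setpoints. Zeros stand in for a forward pass.
        return np.zeros(self.action_dim)

# Control loop: the planner runs rarely, the policy runs every tick,
# and a lower-level whole-body controller (not shown) tracks the
# commanded setpoints at a higher rate.
planner, policy = VLMPlanner(), VisuomotorPolicy()
obs = Observation(np.zeros((224, 224, 3)), np.zeros(48))
skill = planner.plan(obs.image, "Can I have something to eat?")
for _ in range(10):                  # stand-in for the high-rate loop
    action = policy.act(obs, skill)  # e.g. joint-space targets
```

So the VLM's "output" to the motion system is typically a discrete skill choice or short-horizon goal, not raw motor torques; the continuous actions come from the learned policy underneath.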
