Jan. 24, 2024, 7:58 p.m.

Simon Willison's Weblog simonwillison.net

Google Research: Lumiere


The latest in text-to-video from Google Research, described as "a text-to-video diffusion model designed for synthesizing videos that portray realistic, diverse and coherent motion".


Most existing text-to-video models generate keyframes and then use other models to fill in the gaps, which frequently leads to a lack of coherency. Lumiere "generates the full temporal duration of the video at once", which avoids this problem.
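To make the architectural contrast concrete, here is a minimal sketch of the two pipeline shapes. This is illustrative pseudocode, not Lumiere's actual method: the function names are hypothetical, and random noise stands in for the output of a diffusion model. The point is structural: the cascaded pipeline samples sparse keyframes and fills the gaps with a separate interpolation step, while the single-pass pipeline produces the whole spatio-temporal volume in one shot, so there is no interpolation seam to introduce incoherence.

```python
import numpy as np


def cascaded_generation(num_frames=16, height=8, width=8, keyframe_stride=4):
    """Keyframe-then-interpolate pipeline (hypothetical sketch).

    A base model generates sparse keyframes; here linear interpolation
    stands in for the separate temporal super-resolution model that
    fills in the intermediate frames.
    """
    rng = np.random.default_rng(0)
    num_keyframes = num_frames // keyframe_stride + 1
    keyframes = rng.normal(size=(num_keyframes, height, width))

    frames = np.empty((num_frames, height, width))
    for t in range(num_frames):
        k, frac = divmod(t, keyframe_stride)
        a = keyframes[k]
        b = keyframes[min(k + 1, num_keyframes - 1)]
        w = frac / keyframe_stride
        # Intermediate frames only ever see their two bracketing
        # keyframes -- the interpolator has no global view of the clip.
        frames[t] = (1 - w) * a + w * b
    return frames


def full_duration_generation(num_frames=16, height=8, width=8):
    """Single-pass pipeline (hypothetical sketch).

    The model denoises the entire spatio-temporal volume jointly,
    so every frame is produced with context from every other frame
    and no separate gap-filling stage exists.
    """
    rng = np.random.default_rng(0)
    return rng.normal(size=(num_frames, height, width))
```

Both functions return the same `(frames, height, width)` volume; the difference is that only the cascaded version routes intermediate frames through a second, locally-scoped model, which is where the coherence problems the paper describes tend to creep in.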


Disappointingly but unsurprisingly, the paper doesn't go into much detail on the training data, …

