March 2, 2024, 4:56 p.m. | /u/Successful-Western27

machinelearningnews www.reddit.com

Training AI models to understand and describe video content requires large captioned datasets, which are expensive to annotate manually. Researchers from Snap, UC Merced, and the University of Trento have now put together a new dataset, Panda-70M, that aims to help.

The dataset pairs 70 million high-resolution YouTube clips with descriptive captions. The key idea is an automated annotation pipeline: multiple cross-modal "teacher" AI models generate candidate captions from different inputs such as video frames, subtitles, and images. …

