all AI news
Ranking Distillation for Open-Ended Video Question Answering with Insufficient Labels
March 22, 2024, 4:45 a.m. | Tianming Liang, Chaolei Tan, Beihao Xia, Wei-Shi Zheng, Jian-Fang Hu
cs.CV updates on arXiv.org arxiv.org
Abstract: This paper focuses on open-ended video question answering, which aims to find the correct answers from a large answer set in response to a video-related question. This is essentially a multi-label classification task, since a question may have multiple answers. However, due to annotation costs, the labels in existing benchmarks are always extremely insufficient, typically one answer per question. As a result, existing works tend to directly treat all the unlabeled answers as negative labels, …
abstract annotation arxiv classification costs cs.cv distillation however labels multiple paper question question answering ranking set type video
More from arxiv.org / cs.CV updates on arXiv.org
Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs
1 day, 13 hours ago |
arxiv.org
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
Global Data Architect, AVP - State Street Global Advisors
@ State Street | Boston, Massachusetts
Data Engineer
@ NTT DATA | Pune, MH, IN