Sept. 2, 2022, 1:15 a.m. | Yan Xia, Zhou Zhao, Shangwei Ye, Yang Zhao, Haoyuan Li, Yi Ren

cs.CL updates on arXiv.org (arxiv.org)

In this paper, we introduce a new task, spoken video grounding (SVG), which
aims to localize the desired video fragments from spoken language descriptions.
Compared with text, using audio requires the model to exploit video-related
phonemes and syllables directly from raw speech. Moreover, we randomly add
environmental noise to the speech audio, further increasing the difficulty of
the task and better simulating real applications. To rectify the discriminative
phonemes and extract video-related information from …
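The abstract does not say how the environmental noise is mixed in; as a rough
illustration only, the sketch below corrupts a speech waveform with a noise clip
at a randomly drawn signal-to-noise ratio. The function name, the SNR range, and
the NumPy-array inputs are assumptions for this sketch, not details from the paper.

```python
import numpy as np

def mix_noise_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Mix a noise clip into a speech clip at the given signal-to-noise ratio (dB)."""
    # Tile or trim the noise so it matches the speech length.
    if len(noise) < len(speech):
        reps = int(np.ceil(len(speech) / len(noise)))
        noise = np.tile(noise, reps)
    noise = noise[: len(speech)]

    # Scale the noise so that speech power / scaled noise power hits the target SNR.
    speech_power = np.mean(speech ** 2) + 1e-12
    noise_power = np.mean(noise ** 2) + 1e-12
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))

    return speech + scale * noise

# Example: corrupt a clip with noise at a random SNR (range is an assumption).
rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)   # stand-in for 1 s of 16 kHz speech
noise = rng.standard_normal(48000)    # stand-in for an environmental noise clip
noisy = mix_noise_at_snr(speech, noise, snr_db=rng.uniform(0.0, 20.0))
```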

arxiv, curriculum learning, video
