Multimodal Frame-Scoring Transformer for Video Summarization. (arXiv:2207.01814v1 [cs.LG])
July 6, 2022, 1:10 a.m. | Jeiyoon Park, Kiho Kwoun, Chanhee Lee, Heuiseok Lim
cs.LG updates on arXiv.org
As the volume of video content has mushroomed in recent years, automatic
video summarization has become useful when we just want to peek at the content
of a video. However, the generic video summarization task has two underlying
limitations. First, most previous approaches take only visual
features as input, leaving features from other modalities behind. Second, existing
datasets for generic video summarization are relatively insufficient to train a
caption generator and multimodal feature extractors. To address these two
problems, …