all AI news
Cross-domain Multi-modal Few-shot Object Detection via Rich Text
March 26, 2024, 4:47 a.m. | Zeyu Shangguan, Daniel Seita, Mohammad Rostami
cs.CV updates on arXiv.org arxiv.org
Abstract: Cross-modal feature extraction and integration have led to steady performance improvements in few-shot learning tasks due to generating richer features. However, existing multi-modal object detection (MM-OD) methods degrade when facing significant domain-shift and are sample insufficient. We hypothesize that rich text information could more effectively help the model to build a knowledge relationship between the vision instance and its language description and can help mitigate domain shift. Specifically, we study the Cross-Domain few-shot generalization of …
abstract arxiv cs.cv detection domain extraction feature feature extraction features few-shot few-shot learning however improvements information integration modal multi-modal object performance sample shift tasks text type via
More from arxiv.org / cs.CV updates on arXiv.org
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
Senior Data Engineer
@ Quantexa | Sydney, New South Wales, Australia
Staff Analytics Engineer
@ Warner Bros. Discovery | NY New York 230 Park Avenue South