Generalizing CLIP to Unseen Domain via Text-Guided Diverse Novel Feature Synthesis
May 7, 2024, 4:47 a.m. | Siyuan Yan, Cheng Luo, Zhen Yu, Zongyuan Ge
cs.CV updates on arXiv.org
Abstract: Vision-language foundation models like CLIP have shown impressive zero-shot generalization, but finetuning on downstream datasets can cause overfitting and a loss of their generalization ability on unseen domains. Although collecting additional data from new domains of interest is possible, this approach is often impractical because annotated data is hard to obtain. To address this, we propose a plug-and-play feature augmentation method called LDFS (Language-Guided Diverse Feature Synthesis) to synthesize new domain features and improve existing …