May 7, 2024, 4:47 a.m. | Siyuan Yan, Cheng Luo, Zhen Yu, Zongyuan Ge

cs.CV updates on arXiv.org arxiv.org

arXiv:2405.02586v1 Announce Type: new
Abstract: Vision-language foundation models like CLIP have shown impressive zero-shot generalization, but finetuning on downstream datasets can cause overfitting and loss of its generalization ability on unseen domains. Although collecting additional data from new domains of interest is possible, this method is often impractical due to the challenges in obtaining annotated data. To address this, we propose a plug-and-play feature augmentation method called LDFS (Language-Guided Diverse Feature Synthesis) to synthesize new domain features and improve existing …

abstract arxiv clip cs.cv data datasets diverse domain domains feature finetuning foundation language loss novel overfitting synthesis text type via vision vision-language zero-shot

Software Engineer for AI Training Data (School Specific)

@ G2i Inc | Remote

Software Engineer for AI Training Data (Python)

@ G2i Inc | Remote

Software Engineer for AI Training Data (Tier 2)

@ G2i Inc | Remote

Data Engineer

@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

Lead Developer (AI)

@ Cere Network | San Francisco, US