Duoduo CLIP: Efficient 3D Understanding with Multi-View Images | allainews.com

June 19, 2024, 2:45 a.m. | Han-Hung Lee, Yiming Zhang, Angel X. Chang

cs.CV updates on arXiv.org arxiv.org

arXiv:2406.11579v1 Announce Type: new
Abstract: We introduce Duoduo CLIP, a model for 3D representation learning that learns shape encodings from multi-view images instead of point-clouds. The choice of multi-view images allows us to leverage 2D priors from off-the-shelf CLIP models to facilitate fine-tuning with 3D data. Our approach not only shows better generalization compared to existing point cloud methods, but also reduces GPU requirements and training time. In addition, we modify the model with cross-view attention to leverage information across …

abstract arxiv clip cs.cv data fine-tuning images off representation representation learning shape shows tuning type understanding view

More from arxiv.org / cs.CV updates on arXiv.org

InstantGroup: Instant Template Generation for Scalable Group of Brain MRI Registration 13 hours ago | arxiv.org

abstract arxiv brain costs +15

Visual Odometry with Neuromorphic Resonator Networks 13 hours ago | arxiv.org

abstract arxiv cs.ai cs.cv +15

CTNeRF: Cross-Time Transformer for Dynamic Neural Radiance Field from Monocular Video 13 hours ago | arxiv.org

arxiv cs.cv dynamic neural radiance field +4

InstructTA: Instruction-Tuned Targeted Attack for Large Vision-Language Models 13 hours ago | arxiv.org

arxiv cs.cv instruction-tuned language +6

Towards Training-free Open-world Segmentation via Image Prompt Foundation Models 13 hours ago | arxiv.org

abstract arxiv computer computer vision +33

Re-initialization-free Level Set Method via Molecular Beam Epitaxy Equation Regularization for Image Segmentation 13 hours ago | arxiv.org

abstract arxiv become continuity +15

ObjFormer: Learning Land-Cover Changes From Paired OSM Data and Optical High-Resolution Imagery via Object-Guided Transformer 13 hours ago | arxiv.org

arxiv cs.ai cs.cv cs.cy +9

Unsupervised Open-Vocabulary Object Localization in Videos 13 hours ago | arxiv.org

abstract advances arxiv attention +21

Enhancing Low-light Light Field Images with A Deep Compensation Unfolding Network 13 hours ago | arxiv.org

arxiv compensation cs.cv eess.iv +6

AI Focused Biochemistry Postdoctoral Fellow

@ Lawrence Berkeley National Lab | Berkeley, CA

View on ai-jobs.net

Senior Data Engineer

@ Displate | Warsaw

View on ai-jobs.net

PhD Student AI simulation electric drive (f/m/d)

@ Volkswagen Group | Kassel, DE, 34123

View on ai-jobs.net

AI Privacy Research Lead

@ Leidos | 6314 Remote/Teleworker US

View on ai-jobs.net

Senior Platform System Architect, Silicon

@ Google | New Taipei, Banqiao District, New Taipei City, Taiwan

View on ai-jobs.net

Fabrication Hardware Litho Engineer, Quantum AI

@ Google | Goleta, CA, USA

View on ai-jobs.net