SparseFusion: Efficient Sparse Multi-Modal Fusion Framework for Long-Range 3D Perception
March 18, 2024, 4:44 a.m. | Yiheng Li, Hongyang Li, Zehao Huang, Hong Chang, Naiyan Wang
cs.CV updates on arXiv.org (arxiv.org)
Abstract: Multi-modal 3D object detection has exhibited significant progress in recent years. However, most existing methods can hardly scale to long-range scenarios due to their reliance on dense 3D features, which substantially escalate computational demands and memory usage. In this paper, we introduce SparseFusion, a novel multi-modal fusion framework fully built upon sparse 3D features to facilitate efficient long-range perception. The core of our method is the Sparse View Transformer module, which selectively lifts regions of …
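The abstract's point that dense 3D features escalate memory use at long range can be illustrated with simple arithmetic. The sketch below is not taken from the paper: the grid extent, voxel size, channel count, and the occupancy fraction are all hypothetical values chosen only to show that a dense bird's-eye-view grid grows quadratically with perception range, while a sparse representation stores only the occupied cells.

```python
def dense_bev_cells(range_m: float, voxel_m: float) -> int:
    """Cells in a square BEV grid covering [-range_m, range_m] on both axes."""
    side = int(round(2 * range_m / voxel_m))
    return side * side


def feature_bytes(num_cells: int, channels: int, bytes_per_value: int = 4) -> int:
    """Memory for one feature map with `channels` float values per cell."""
    return num_cells * channels * bytes_per_value


def sparse_cells(num_dense_cells: int, occupancy: float) -> int:
    """Cells actually stored if only a fraction `occupancy` is kept (assumed value)."""
    return int(num_dense_cells * occupancy)


if __name__ == "__main__":
    VOXEL_M = 0.5      # hypothetical cell size in metres
    CHANNELS = 64      # hypothetical feature width
    OCCUPANCY = 0.05   # hypothetical fraction of non-empty cells

    for range_m in (50.0, 100.0, 200.0):
        dense = dense_bev_cells(range_m, VOXEL_M)
        sparse = sparse_cells(dense, OCCUPANCY)
        print(
            f"range ±{range_m:.0f} m: "
            f"dense {feature_bytes(dense, CHANNELS) / 1e6:.1f} MB, "
            f"sparse {feature_bytes(sparse, CHANNELS) / 1e6:.1f} MB"
        )
```

Doubling the range quadruples the dense cell count. The sparse cost instead tracks the number of occupied cells, which depends on the scene, so the fixed occupancy fraction used here is only a stand-in for it.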