March 14, 2024, 4:46 a.m. | Yiming Wu, Ruixiang Li, Zequn Qin, Xinhai Zhao, Xi Li

Abstract: Vision-based Bird's Eye View (BEV) representation is an emerging perception formulation for autonomous driving. The core challenge is to construct the BEV space from multi-camera features, which is a one-to-many, ill-posed problem. Examining previous BEV representation generation methods, we find that most fall into two types: modeling depths in image views or modeling heights in the BEV space, mostly in an implicit way. In this work, we propose to explicitly model heights …

