SSD-MonoDTR: Supervised Scale-constrained Deformable Transformer for Monocular 3D Object Detection. (arXiv:2305.07270v1 [cs.CV])
cs.CV updates on arXiv.org
Transformer-based methods have recently demonstrated superior performance for
monocular 3D object detection, which predicts 3D attributes from a
single 2D image. Most existing transformer-based methods leverage visual and
depth representations to explore valuable query points on objects, and the
quality of the learned queries has a great impact on detection accuracy.
Unfortunately, existing unsupervised attention mechanisms in transformers are
prone to generating low-quality query features due to inaccurate receptive
fields, especially on hard objects. To tackle this problem, this paper …