June 11, 2024, 4:50 a.m. | Jianmeng Liu, Yichen Liu, Yuyao Zhang, Zeyuan Meng, Yu-Wing Tai, Chi-Keung Tang

cs.CV updates on arXiv.org arxiv.org

arXiv:2406.05543v1 Announce Type: new
Abstract: Recent conditional 3D completion works have mainly relied on CLIP or BERT to encode textual information, which cannot support complex instruction. Meanwhile, large language models (LLMs) have shown great potential in multi-modal understanding and generation tasks. Inspired by the recent advancements of LLM, we present Volume Patch LLM (VP-LLM), which leverages LLMs to perform conditional 3D completion in a single-forward pass. To integrate a 3D model into the LLM tokenization configuration, the incomplete 3D object …

abstract arxiv bert clip cs.ai cs.cv encode information language language models large language large language models llm llms modal multi-modal potential support tasks text textual through type understanding

Senior Data Engineer

@ Displate | Warsaw

Principal Software Engineer

@ Microsoft | Prague, Prague, Czech Republic

Sr. Global Reg. Affairs Manager

@ BASF | Research Triangle Park, NC, US, 27709-3528

Senior Robot Software Developer

@ OTTO Motors by Rockwell Automation | Kitchener, Ontario, Canada

Coop - Technical Service Hub Intern

@ Teradyne | Santiago de Queretaro, MX

Coop - Technical - Service Inside Sales Intern

@ Teradyne | Santiago de Queretaro, MX