April 4, 2024, 4:46 a.m. | Shan Yang, Yongfei Zhang

cs.CV updates on arXiv.org

arXiv:2401.13201v2 Announce Type: replace
Abstract: Multimodal large language models (MLLMs) have achieved satisfactory results on many tasks. However, their performance on person re-identification (ReID) has not yet been explored. This paper investigates how to adapt them for ReID. An intuitive idea is to fine-tune the MLLM with ReID image-text datasets and then use its visual encoder as a backbone for ReID. However, two apparent issues remain: (1) designing instructions for ReID, …
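The retrieval step implied by this setup — embedding a query image with the fine-tuned visual encoder and ranking gallery identities by similarity — can be sketched as follows. This is a minimal illustration, not the paper's method: the encoder is assumed to already produce fixed-length embeddings, and all names here are illustrative.

```python
import numpy as np

def rank_gallery(query_emb, gallery_embs):
    """Rank gallery images by cosine similarity to a query embedding.

    query_emb:    (D,) embedding of the query person image.
    gallery_embs: (N, D) embeddings of the gallery images.
    Returns (order, scores): gallery indices best-first, and the
    cosine-similarity score for each gallery image.
    """
    q = query_emb / np.linalg.norm(query_emb)
    g = gallery_embs / np.linalg.norm(gallery_embs, axis=1, keepdims=True)
    scores = g @ q                       # cosine similarity per gallery image
    order = np.argsort(-scores)          # best match first
    return order, scores

# Toy example with 2-D embeddings: the query is closest to gallery index 1.
gallery = np.array([[1.0, 0.0],
                    [0.6, 0.8],
                    [0.0, 1.0]])
query = np.array([0.5, 0.9])
order, scores = rank_gallery(query, gallery)
print(order[0])  # → 1
```

In a real ReID pipeline the embeddings would come from the fine-tuned visual encoder, and ranking quality would be reported with metrics such as Rank-1 accuracy and mAP over the ordered gallery.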
