March 11, 2024, 4:45 a.m. | Qiuhui Chen, Huping Ye, Yi Hong

cs.CV updates on arXiv.org

arXiv:2403.05141v1 Announce Type: new
Abstract: Understanding 3D medical image volumes is a critical task in the medical domain. However, existing 3D convolution and transformer-based methods have limited semantic understanding of an image volume and also need a large set of volumes for training. Recent advances in multi-modal large language models (MLLMs) provide a new and promising way to understand images with the help of text descriptions. However, most current MLLMs are designed for 2D natural images. To enhance the 3D …
