April 10, 2024, 4:45 a.m. | Yupei Zhang, Li Pan, Qiushi Yang, Tan Li, Zhen Chen

cs.CV updates on arXiv.org

arXiv:2404.06057v1 Announce Type: new
Abstract: Medical multi-modal pre-training has revealed promise in computer-aided diagnosis by leveraging large-scale unlabeled datasets. However, existing methods based on masked autoencoders mainly rely on data-level reconstruction tasks, but lack high-level semantic information. Furthermore, two significant heterogeneity challenges hinder the transfer of pre-trained knowledge to downstream tasks, \textit{i.e.}, the distribution heterogeneity between pre-training data and downstream data, and the modality heterogeneity within downstream data. To address these challenges, we propose a Unified Medical Multi-modal Diagnostic (UMD) …
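To make the "data-level reconstruction" limitation concrete, here is a minimal NumPy sketch of a masked-autoencoder-style objective. The patch sizes, masking ratio, and the trivial mean-predictor are illustrative assumptions, not the paper's method; the point is that the loss is computed purely on raw feature values of masked patches, so no label-level semantic signal enters the objective.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "image" as 16 patches of 8 features each (stand-ins for patch embeddings).
patches = rng.normal(size=(16, 8))

# Mask ~75% of patches, as in typical masked-autoencoder pre-training.
mask = rng.random(16) < 0.75
visible = patches[~mask]

# A real model would encode the visible patches and decode predictions for the
# masked ones; here a crude placeholder predicts the mean visible patch.
pred = np.tile(visible.mean(axis=0), (mask.sum(), 1))

# Data-level objective: feature reconstruction error on masked patches only.
# Nothing in this loss reflects diagnostic labels or high-level semantics.
loss = np.mean((pred - patches[mask]) ** 2)
```

Because the supervision signal is reconstruction of low-level values, a model can minimize this loss without learning the semantic distinctions a downstream diagnostic task needs, which is the gap the abstract motivates.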

