Aug. 29, 2022, 1:14 a.m. | Zecheng Liu, Jia Wei, Rui Li

cs.CV updates on arXiv.org

People perceive the world through different senses, such as sight, hearing,
smell, and touch. Processing and fusing information from multiple modalities
enables Artificial Intelligence to understand the world around us more easily.
However, when some modalities are missing, the number of available modalities
varies across situations, which leads to an N-to-One fusion problem. To solve
this problem, we propose a transformer-based fusion block called TFusion.
Unlike preset formulations or convolution-based methods, the proposed block
automatically …
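The abstract is truncated above, but the N-to-One setting it describes can be sketched in code. The snippet below is not the authors' TFusion implementation (its details are not given here); it is a minimal, hypothetical illustration of transformer-style fusion over a variable number of modality embeddings, assuming each available modality has already been encoded into a feature vector of a shared dimension.

```python
# Minimal sketch of N-to-One fusion with a transformer encoder.
# NOT the paper's TFusion block; names, dimensions, and pooling are assumptions.
import torch
import torch.nn as nn


class NToOneFusionBlock(nn.Module):
    """Fuse a variable number of modality embeddings into one vector."""

    def __init__(self, dim: int = 256, num_heads: int = 4, num_layers: int = 2):
        super().__init__()
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=num_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)

    def forward(self, modality_feats: list[torch.Tensor]) -> torch.Tensor:
        # modality_feats: list of (batch, dim) tensors, one per *available*
        # modality; the list length N may differ between samples or datasets.
        tokens = torch.stack(modality_feats, dim=1)   # (batch, N, dim)
        fused = self.encoder(tokens)                  # attention across modalities
        return fused.mean(dim=1)                      # (batch, dim): the "One"


if __name__ == "__main__":
    block = NToOneFusionBlock(dim=256)
    # Two modalities available in one case, three in another:
    out2 = block([torch.randn(8, 256), torch.randn(8, 256)])
    out3 = block([torch.randn(8, 256)] * 3)
    print(out2.shape, out3.shape)  # torch.Size([8, 256]) in both cases
```

Because self-attention operates over a token sequence of arbitrary length, the same block handles any number of available modalities without a preset fusion formula, which is the property the abstract highlights.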

arxiv cv fusion multimodal transformer
