Feb. 20, 2024, 5:50 a.m. | Husein Zolkepli, Aisyah Razak, Kamarul Adha, Ariff Nazhan

cs.CL updates on arXiv.org

arXiv:2402.11297v1 Announce Type: new
Abstract: We introduce a multimodal large language model designed to comprehend multiple images, multiple audio clips, and combined image-audio inputs within a single multi-turn session. Leveraging state-of-the-art components, we use the SigLIP encoder for visual inputs and the Whisper encoder for audio inputs. Notably, the model is bilingual, understanding both English and Malay. We release two versions of the model: one built on TinyLlama (1.1B parameters) and one on Mistral (7B parameters). With its ability …
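
The abstract describes a familiar adapter-style design: pretrained modality encoders (SigLIP for images, Whisper for audio) whose features are projected into a decoder-only LLM's token-embedding space, LLaVA-style. Below is a minimal sketch of that wiring, not the authors' released code: the linear-projection design, the forward interface, and the Hugging Face checkpoint names are all assumptions, since the abstract names only the encoder and backbone families.

```python
# Sketch of a SigLIP + Whisper multimodal adapter over a causal LLM.
# Checkpoints and the linear-projection design are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import (
    AutoModelForCausalLM,
    SiglipVisionModel,
    WhisperModel,
)

class MultimodalAdapter(nn.Module):
    def __init__(self, llm_name="TinyLlama/TinyLlama-1.1B-Chat-v1.0"):
        super().__init__()
        self.llm = AutoModelForCausalLM.from_pretrained(llm_name)
        self.vision = SiglipVisionModel.from_pretrained(
            "google/siglip-base-patch16-224"
        )
        self.audio = WhisperModel.from_pretrained("openai/whisper-small").encoder
        d_model = self.llm.config.hidden_size
        # Project each modality's hidden states into the LLM embedding space.
        self.vision_proj = nn.Linear(self.vision.config.hidden_size, d_model)
        self.audio_proj = nn.Linear(self.audio.config.d_model, d_model)

    def forward(self, input_ids, pixel_values=None, audio_features=None):
        # Text tokens come from the LLM's own embedding table.
        embeds = [self.llm.get_input_embeddings()(input_ids)]
        if pixel_values is not None:
            v = self.vision(pixel_values).last_hidden_state
            embeds.insert(0, self.vision_proj(v))
        if audio_features is not None:
            a = self.audio(audio_features).last_hidden_state
            embeds.insert(0, self.audio_proj(a))
        # Prepend projected modality features to the text embeddings, so the
        # decoder attends to them like ordinary context tokens.
        inputs_embeds = torch.cat(embeds, dim=1)
        return self.llm(inputs_embeds=inputs_embeds)
```

Prepending modality tokens this way is what lets one session mix several images and audio clips: each input contributes its own span of projected features to the shared context, and the multi-turn history is just the growing token sequence behind them.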
