BT-Adapter: Video Conversation is Feasible Without Video Instruction Tuning

June 28, 2024, 4:47 a.m. | Ruyang Liu, Chen Li, Yixiao Ge, Ying Shan, Thomas H. Li, Ge Li

cs.CV updates on arXiv.org

arXiv:2309.15785v2 Announce Type: replace
Abstract: Recent progress in Large Language Models (LLMs) has spurred various advancements in image-language conversation agents, while how to build a proficient video-based dialogue system is still under exploration. Given the extensive scale of LLMs and visual backbones, minimal GPU memory is left for effective temporal modeling, which is crucial for comprehending and providing feedback on videos. To this end, we propose Branching Temporal Adapter (BT-Adapter), a novel method for extending image-language pretrained models …
