Sept. 7, 2023, 6:27 p.m. | 1littlecoder

1littlecoder www.youtube.com

Qwen-VL (Qwen Large Vision Language Model) is the multimodal vision-language version of Qwen (abbr. Tongyi Qianwen), the large model series proposed by Alibaba Cloud. Qwen-VL accepts images, text, and bounding boxes as inputs, and outputs text and bounding boxes.
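
Since the model can return bounding boxes alongside text, here is a minimal sketch of how such output might be turned into pixel coordinates. It assumes the response marks a grounded region as <ref>label</ref><box>(x1,y1),(x2,y2)</box> with coordinates on a 0-1000 grid; that markup and scale are assumptions not stated in this post, and parse_boxes is a hypothetical helper, not part of any Qwen library. Check the model card before relying on it.

```python
import re
from typing import List, Tuple

# Assumed output markup: <ref>label</ref><box>(x1,y1),(x2,y2)</box>,
# with coordinates on a 0-1000 grid (an assumption, not stated in this post).
BOX_PATTERN = re.compile(
    r"<ref>(.*?)</ref><box>\((\d+),(\d+)\),\((\d+),(\d+)\)</box>"
)


def parse_boxes(
    response: str, width: int, height: int
) -> List[Tuple[str, Tuple[int, int, int, int]]]:
    """Extract (label, pixel box) pairs from a model response string."""
    results = []
    for label, x1, y1, x2, y2 in BOX_PATTERN.findall(response):
        # Rescale from the assumed 0-1000 grid to pixel coordinates.
        box = (
            int(x1) * width // 1000,
            int(y1) * height // 1000,
            int(x2) * width // 1000,
            int(y2) * height // 1000,
        )
        results.append((label.strip(), box))
    return results


if __name__ == "__main__":
    # Hypothetical response text, for illustration only.
    sample = "<ref>the dog</ref><box>(100,200),(500,800)</box>"
    print(parse_boxes(sample, width=640, height=480))
```

Text without any box markup simply yields an empty list, so the helper can be applied to every response without checking first.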

Qwen-VL on Hugging Face with Benchmarks - https://huggingface.co/Qwen/Qwen-VL
Camenduru's Colab Repo - https://github.com/camenduru/Qwen-VL-Chat-colab
Colab Direct Link - https://colab.research.google.com/github/camenduru/Qwen-VL-Chat-colab/blob/main/Qwen_VL_Chat_colab.ipynb

❤️ If you want to support the channel ❤️
Support here:
Patreon - https://www.patreon.com/1littlecoder/
Ko-Fi - https://ko-fi.com/1littlecoder

🧭 Follow me on 🧭
Twitter - https://twitter.com/1littlecoder …

