The Kaitchup – AI on a Budget

The Kaitchup – AI on a Budget

Share this post

The Kaitchup – AI on a Budget
The Kaitchup – AI on a Budget
Run Qwen2-VL on Your Computer with Text, Images, and Video, Step by Step

Run Qwen2-VL on Your Computer with Text, Images, and Video, Step by Step

Your local multimodal chat model

Benjamin Marie's avatar
Benjamin Marie
Sep 02, 2024
∙ Paid
7

Share this post

The Kaitchup – AI on a Budget
The Kaitchup – AI on a Budget
Run Qwen2-VL on Your Computer with Text, Images, and Video, Step by Step
4
1
Share
Example of inference by Qwen2-VL 2B

Alibaba’s Qwen2-VL are vision language models now available as 2B and 7B parameter models. They are generative language models that support multimodal inputs. You can provide Qwen2-VL, along with text, a single image, multiple images, or a 20-minute video!

The models demonstrate impressive performance in visual understanding. Like Microsoft’s Florence-2, Qwen2-VL can perform many types of tasks such as OCR, image captioning, question answering, visual grounding, etc.

Florence-2: Run Multitask Vision-language Models on Your Computer

Florence-2: Run Multitask Vision-language Models on Your Computer

Benjamin Marie
·
July 1, 2024
Read full story

Quantized versions, using the AWQ and GPTQ formats, were also published by Alibaba to facilitate deployment on smaller GPUs.

The Kaitchup – AI on a Budget is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.

In this article, I first review Qwen2-VL’s architecture and performance. Then, we will see how to use Qwen2-VL with multiple images and videos using small GPUs (8 GB and 12 GB). I explain how to set up the model and format the prompt step by step.

The examples detailed in this article are also implemented in this notebook:

Get the notebook (#100)

This post is for paid subscribers

Already a paid subscriber? Sign in
© 2025 The Kaitchup
Privacy ∙ Terms ∙ Collection notice
Start writingGet the app
Substack is the home for great culture

Share