Qwen2.5-VL 7B Instruct

Name: Qwen2.5-VL 7B Instruct
Brand: Qwen

by Qwen

Qwen2.5 VL 7B is a multimodal LLM from the Qwen Team with the following key enhancements: - SoTA understanding of images of various resolution & ratio: Qwen2.5-VL achieves state-of-the-art performance on visual understanding benchmarks, including MathVista, DocVQA, RealWorldQA, MTVQA, etc. - Understanding videos of 20min+: Qwen2.5-VL can understand videos over 20 minutes for high-quality video-based question answering, dialog, content creation, etc. - Agent that can operate your mobiles, robots, etc.: with the abilities of complex reasoning and decision making, Qwen2.5-VL can be integrated with devices like mobile phones, robots, etc., for automatic operation based on visual environment and text instructions. - Multilingual Support: to serve global users, besides English and Chinese, Qwen2.5-VL now supports the understanding of texts in different languages inside images, including most European languages, Japanese, Korean, Arabic, Vietnamese, etc. For more details, see this [blog post](https://qwenlm.github.io/blog/qwen2-vl/) and [GitHub repo](https://github.com/QwenLM/Qwen2-VL). Usage of this model is subject to [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).

Chat with Qwen2.5-VL 7B Instruct

Input Price$0.00/1M tokens

Output Price$0.00/1M tokens

Intelligence15.6

Coding11.9

Specifications

Technical details and pricing.

ProviderQwen

Context Window32,768 tokens

Release DateSep 19, 2024

ModalitiesText, Image → Text

CapabilitiesVision

Benchmarks

12 benchmark scores from Artificial Analysis.

GPQA49.1%

MMLU Pro72.0%

HLE4.2%

LiveCodeBench27.6%

MATH 50085.8%

AIME 202514.0%

AIME16.0%

SciCode26.7%

LCR20.3%

IFBench36.9%

Tau234.5%

TerminalBench Hard4.5%

Composite Indices

Intelligence, Coding, Math

Standard Benchmarks

Academic and industry benchmarks

Frequently Asked Questions

What is Qwen2.5-VL 7B Instruct good for?

Use Qwen2.5-VL 7B Instruct for everyday tasks like writing, summarizing, brainstorming, and getting clear explanations.

How much does Qwen2.5-VL 7B Instruct cost?

Pricing is based on usage. Current rates are $0.00/1M tokens for input and $0.00/1M tokens for output.

Can I try Qwen2.5-VL 7B Instruct for free?

Yes. You can start a chat instantly and test the model before deciding on a plan.

Does Qwen2.5-VL 7B Instruct support images or audio?

Qwen2.5-VL 7B Instruct can understand images.

Similar Models

Other models you might want to explore.

Qwen2.5 7B Instruct

Qwen

Qwen2.5 7B is the latest series of Qwen large language models.

Details →

Qwen2.5 VL 72B Instruct

Qwen

Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, fish, and insects.

Details →

Qwen2.5 72B Instruct

qwen

Qwen2.5 72B is the latest series of Qwen large language models.

Details →

Benchmarks and pricing are sourced from Artificial Analysis where available. OpenRouter specs are used as a fallback.