LLMs by Category
Top AI Models by Category
Compare the latest models across open source, proprietary, uncensored, coding, math, speed, and release freshness.
Most Used AI Models
Popular picks across OpenRouter.
MiniMax M2.5
MiniMax
MiniMax-M2.5 is a SOTA large language model designed for real-world productivity.
Gemini 3 Flash Preview
Gemini 3 Flash Preview is a high speed, high value thinking model designed for agentic workflows, multi turn chat, and coding assistance.
Kimi K2.5
MoonshotAI
Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art visual coding capability and a self-directed agent swarm paradigm.
GLM 5
Z.ai
GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex systems design and long-horizon agent workflows.
DeepSeek V3.2
DeepSeek
DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and agentic tool-use performance.
Grok 4.1 Fast
xAI
Grok 4.1 Fast is xAI's best agentic tool calling model that shines in real-world use cases like customer support and deep research. 2M context window.
Claude Opus 4.6
Anthropic
Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks.
Claude Sonnet 4.5
Anthropic
Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized for real-world agents and coding workflows.
Claude Sonnet 4.6
Anthropic
Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and professional work.
Gemini 2.5 Flash
Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks.
Top Open Source AI Models
Community-driven, inspectable weights.
GLM 5
Z.ai
GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex systems design and long-horizon agent workflows.
Kimi K2.5
MoonshotAI
Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art visual coding capability and a self-directed agent swarm paradigm.
Qwen3.5 397B A17B
Qwen
The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency.
MiniMax M2.5
MiniMax
MiniMax-M2.5 is a SOTA large language model designed for real-world productivity.
GLM 4.7
Z.ai
GLM-4.7 is Z.ai’s latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more stable multi-step reasoning/execution.
DeepSeek V3.2
DeepSeek
DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and agentic tool-use performance.
MiMo-V2-Flash
Xiaomi
MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi.
Kimi K2 Thinking
MoonshotAI
Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning.
Qwen3 Max Thinking
Qwen
Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed for high-stakes cognitive tasks that require deep, multi-step reasoning.
MiniMax M2.1
MiniMax
MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized for coding, agentic workflows, and modern application development.
Top Proprietary AI Models
Frontier closed models.
Gemini 3.1 Pro Preview
Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows.
Claude Opus 4.6
Anthropic
Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks.
Claude Sonnet 4.6
Anthropic
Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and professional work.
GPT-5.2
OpenAI
GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long context perfomance compared to GPT-5.1.
Claude Opus 4.5
Anthropic
Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex software engineering, agentic workflows, and long-horizon computer use.
GPT-5.2-Codex
OpenAI
GPT-5.2-Codex is an upgraded version of GPT-5.1-Codex optimized for software engineering and coding workflows.
Nano Banana Pro (Gemini 3 Pro Image Preview)
Nano Banana Pro is Google’s most advanced image-generation and editing model, built on Gemini 3 Pro.
GPT-5.1
OpenAI
GPT-5.1 is the latest frontier-grade model in the GPT-5 series, offering stronger general-purpose reasoning, improved instruction adherence, and a more natural conversational style compared to GPT-5.
Gemini 3 Flash Preview
Gemini 3 Flash Preview is a high speed, high value thinking model designed for agentic workflows, multi turn chat, and coding assistance.
Claude Opus 4.1
Anthropic
Claude Opus 4.1 is an updated version of Anthropic’s flagship model, offering improved performance in coding, reasoning, and agentic tasks.
Top Coding AI Models
Models tuned for code and developer workflows.
Gemini 3.1 Pro Preview
Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows.
Claude Sonnet 4.6
Anthropic
Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and professional work.
GPT-5.2
OpenAI
GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long context perfomance compared to GPT-5.1.
Claude Opus 4.6
Anthropic
Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks.
Claude Opus 4.5
Anthropic
Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex software engineering, agentic workflows, and long-horizon computer use.
Claude Opus 4.1
Anthropic
Claude Opus 4.1 is an updated version of Anthropic’s flagship model, offering improved performance in coding, reasoning, and agentic tasks.
Gemini 2.5 Pro Preview 05-06
Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks.
Nano Banana Pro (Gemini 3 Pro Image Preview)
Nano Banana Pro is Google’s most advanced image-generation and editing model, built on Gemini 3 Pro.
Claude Sonnet 4.5
Anthropic
Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized for real-world agents and coding workflows.
GPT-5.1
OpenAI
GPT-5.1 is the latest frontier-grade model in the GPT-5 series, offering stronger general-purpose reasoning, improved instruction adherence, and a more natural conversational style compared to GPT-5.
Top OCR AI Models
Models specialised in optical character recognition and document extraction.
PaddleOCR-VL-0.9B
PaddlePaddle
Baidu's 0.9B vision-language OCR model combining a NaViT-style dynamic-resolution encoder with ERNIE-4.5-0.3B. Handles multilingual text, tables, charts, and formulas across 16K context — optimized for efficient on-device document parsing.
olmOCR-2-7B
AllenAI
Allen AI's 7B OCR model fine-tuned from Qwen2.5-VL-7B on curated academic papers and technical documentation. Supports 128K context and extracts structured text from PDFs and scanned documents with high fidelity.
DeepSeek-OCR
DeepSeek
DeepSeek's ~3B MoE OCR model using optical context compression to encode full pages into compact token sequences. Outputs structured Markdown preserving text layout, tables, and mathematical formulas from images and PDFs.
Mistral OCR
Mistral AI
Mistral's dedicated document understanding model (December 2025). Processes PDFs and images page-by-page via API, returning structured Markdown with preserved tables, equations, image bounding boxes, and rich layout metadata.
Top Math AI Models
Math and reasoning specialists.
GPT-5.2
OpenAI
GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long context perfomance compared to GPT-5.1.
GPT-5 Codex
OpenAI
GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and coding workflows.
Gemini 3 Flash Preview
Gemini 3 Flash Preview is a high speed, high value thinking model designed for agentic workflows, multi turn chat, and coding assistance.
DeepSeek V3.2 Speciale
DeepSeek
DeepSeek-V3.2-Speciale is a high-compute variant of DeepSeek-V3.2 optimized for maximum reasoning and agentic performance.
MiMo-V2-Flash
Xiaomi
MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi.
GPT-5.1-Codex
OpenAI
GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software engineering and coding workflows.
Nano Banana Pro (Gemini 3 Pro Image Preview)
Nano Banana Pro is Google’s most advanced image-generation and editing model, built on Gemini 3 Pro.
GLM 4.7
Z.ai
GLM-4.7 is Z.ai’s latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more stable multi-step reasoning/execution.
Kimi K2 Thinking
MoonshotAI
Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning.
GPT-5
OpenAI
GPT-5 is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience.
Fast AI Models
Lowest cost + latency options.
Mercury
Inception
Mercury is the first diffusion large language model (dLLM).
Granite 4.0 Micro
IBM
Granite-4.0-H-Micro is a 3B parameter from the Granite 4 family of models.
Gemini 2.5 Flash Lite Preview 09-2025
Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency.
Nova Micro 1.0
Amazon
Amazon Nova Micro 1.0 is a text-only model that delivers the lowest latency responses in the Amazon Nova family of models at a very low cost.
Gemini 2.5 Flash
Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks.
Gemini 2.5 Flash Lite
Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency.
gpt-oss-120b
OpenAI
gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases.
Grok Code Fast 1
xAI
Grok Code Fast 1 is a speedy and economical reasoning model that excels at agentic coding.
gpt-oss-20b
OpenAI
gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 license.
GPT-5 Codex
OpenAI
GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and coding workflows.
Large Context Window AI Models
Models with 200K+ context windows.
Grok 4.1 Fast
xAI
Grok 4.1 Fast is xAI's best agentic tool calling model that shines in real-world use cases like customer support and deep research. 2M context window.
Grok 4 Fast
xAI
Grok 4 Fast is xAI's latest multimodal model with SOTA cost-efficiency and a 2M token context window.
Gemini 3 Flash Preview
Gemini 3 Flash Preview is a high speed, high value thinking model designed for agentic workflows, multi turn chat, and coding assistance.
Gemini 2.5 Flash
Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks.
Gemini 2.5 Flash Lite
Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency.
Gemini 2.0 Flash
Gemini Flash 2.0 offers a significantly faster time to first token (TTFT) compared to [Gemini Flash 1.5](/google/gemini-flash-1.5), while maintaining quality on par with larger models like [Gemini Pro 1.5](/google/gemini-pro-1.5).
Gemini 3 Pro Preview
Gemini 3 Pro is Google’s flagship frontier model for high-precision multimodal reasoning, combining strong performance across text, image, video, audio, and code with a 1M-token context window.
Gemini 3.1 Pro Preview
Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows.
Gemini 2.5 Pro
Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks.
Gemini 2.5 Flash Lite Preview 09-2025
Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency.
Top Uncensored AI Models
Lightly filtered, high-flexibility models.
Llama 3 8B Lunaris
Sao10K
Lunaris 8B is a versatile generalist and roleplaying model based on Llama 3.
MythoMax 13B
gryphe
One of the highest performing and most popular fine-tunes of Llama 2 13B, with rich descriptions and roleplay. #merge
UnslopNemo 12B
TheDrummer
UnslopNemo v4.1 is the latest addition from the creator of Rocinante, designed for adventure writing and role-play scenarios.
Cydonia 24B V4.1
TheDrummer
Uncensored and creative writing model based on Mistral Small 3.2 24B with good recall, prompt adherence, and intelligence.
Skyfall 36B V2
TheDrummer
Skyfall 36B v2 is an enhanced iteration of Mistral Small 2501, specifically fine-tuned for improved creativity, nuanced writing, role-playing, and coherent storytelling.
Rocinante 12B
TheDrummer
Rocinante 12B is designed for engaging storytelling and rich prose.
Hermes 3 405B Instruct
Nous
Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements across the board.
Hermes 4 70B
Nous
Hermes 4 70B is a hybrid reasoning model from Nous Research, built on Meta-Llama-3.1-70B.
Llama 3.3 Euryale 70B
Sao10K
Euryale L3.3 70B is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/sao10k).
Hermes 3 70B Instruct
Nous
Hermes 3 is a generalist language model with many improvements over [Hermes 2](/models/nousresearch/nous-hermes-2-mistral-7b-dpo), including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements across the board.
Newest AI Models
Fresh releases from OpenRouter.
GPT-5.3-Codex
OpenAI
GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the frontier software engineering performance of GPT-5.2-Codex with the broader reasoning and professional knowledge capabilities of GPT-5.2.
Aion-2.0
AionLabs
Aion-2.0 is a variant of DeepSeek V3.2 optimized for immersive roleplaying and storytelling.
Gemini 3.1 Pro Preview
Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows.
Claude Sonnet 4.6
Anthropic
Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and professional work.
Qwen3.5 Plus
Qwen
The Qwen3.5 native vision-language series Plus models are built on a hybrid architecture that integrates linear attention mechanisms with sparse mixture-of-experts models, achieving higher inference efficiency.
Qwen3.5 397B A17B
Qwen
The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency.
MiniMax M2.5
MiniMax
MiniMax-M2.5 is a SOTA large language model designed for real-world productivity.
GLM 5
Z.ai
GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex systems design and long-horizon agent workflows.
Qwen3 Max Thinking
Qwen
Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed for high-stakes cognitive tasks that require deep, multi-step reasoning.
Claude Opus 4.6
Anthropic
Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks.
Chat with 100+ AI Models in one App.
Use Claude, ChatGPT, Gemini alongside with EU-Hosted Models like Deepseek, GLM-5, Kimi K2.5 and many more.
How to Choose the Right AI Model
A practical guide to picking the best LLM for your use case.
Match the model to the task
General-purpose models like GPT-4o and Claude Sonnet handle most tasks well. For specialized work, coding models (DeepSeek Coder, Codestral) and math models (QwQ, DeepSeek R1) outperform generalists on their respective benchmarks while often costing less per token.
Consider context window size
If you work with long documents, codebases, or multi-turn conversations, context window matters. Models range from 8K to over 1M tokens. Larger windows let you process entire books or repositories in a single prompt, but they increase cost and latency.
Balance cost, speed, and quality
Frontier models deliver the highest benchmark scores but cost more per token and respond slower. Fast models like Gemini Flash, Llama 3 (8B), and Mistral Small can handle routine tasks at a fraction of the cost with sub-second latency — ideal for high-volume applications.
Open source vs. proprietary
Open-source models (Llama, Mistral, Qwen, DeepSeek) let you self-host, fine-tune, and inspect weights. Proprietary models (GPT-4o, Claude, Gemini) often lead on benchmarks and offer managed APIs with built-in safety features. Many teams use both: proprietary for peak performance, open source for cost control and customization.
Check for multimodal capabilities
Some models accept images, audio, or files alongside text. If your workflow involves analyzing screenshots, diagrams, or audio transcriptions, filter for models with vision or audio input support. Models with structured output and function calling are essential for building agents and tool-using applications.
Use benchmarks as a starting point
Scores like GPQA, MMLU Pro, and HLE measure academic knowledge and reasoning. LiveCodeBench and SciCode test practical coding ability. MATH 500 and AIME evaluate mathematical problem-solving. No single benchmark tells the full story — compare scores across categories relevant to your use case, then test with your own prompts.
Data on this page is sourced from OpenRouter and Artificial Analysis. Pricing, speed, and benchmark scores are updated regularly. Try any model instantly using the free chat — no API key required.