Top AI Models by Category

Compare the latest models across open source, proprietary, uncensored, coding, math, speed, and release freshness.

Most Used AI Models

Popular picks in the current catalog.

Kimi K2.7 Code

NEW

MoonshotAI

Kimi K2.7 Code is a coding-focused model in Moonshot AI's Kimi K2 family, built to complete end-to-end programming tasks reliably over long contexts. It uses a native multimodal mixture-of-experts...

Context 262K

Speed 122 tok/s

Input Text, Image

Output Text

Reasoning Yes

Details →

Claude Fable 5

NEW

Anthropic

Claude Fable 5 is a Mythos-class model from Anthropic, built for autonomous knowledge work and coding. It supports text, image, and file inputs with text output, with reasoning support and...

Context 1.0M

Speed 142 tok/s

Input Text, Image, File

Output Text

Reasoning Yes

Details →

Nemotron 3 Ultra

NEW

NVIDIA

NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybrid Transformer-Mamba mixture-of-experts architecture, it...

Context 262K

Speed N/A

Input Text

Output Text

Reasoning Yes

Details →

Qwen3.7 Plus

NEW

Qwen

Qwen3.7-Plus is a cost-effective model in Alibaba's Qwen3.7 series. It supports text and image input with text output, building on the series' text capabilities with a comprehensive upgrade to its...

Context 1.0M

Speed 180 tok/s

Input Text, Image

Output Text

Reasoning Yes

Details →

M3

NEW

MiniMax

MiniMax-M3 is a multimodal foundation model from MiniMax. It supports text, image, and video inputs with text output, a 1M-token context window, and is suited for long-horizon agentic work, coding,...

Context 524K

Speed 54 tok/s

Input Text, Image, Video

Output Text

Reasoning Yes

Details →

Step 3.7 Flash

StepFun

Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts model. It pairs a 196B-parameter language backbone with a vision encoder for native image and video understanding, activating roughly 11B parameters...

Context 256K

Speed 405 tok/s

Input Text, Image, Video

Output Text

Reasoning Yes

Details →

Claude Opus 4.8 (Fast)

Anthropic

Fast-mode variant of Opus 4.8, with identical capabilities and higher output speed at 2x the pricing of regular Opus 4.8. Learn more in Anthropic's docs: https://platform.claude.com/docs/en/build-with-claude/fast-mode

Context 1.0M

Speed N/A

Input Text, Image, File

Output Text

Reasoning Yes

Details →

Claude Opus 4.8

Anthropic

Claude Opus 4.8 is Anthropic's most capable generally available model in the Opus family. It supports text, image, and file inputs with text output, with reasoning support and a 1M-token...

Context 1.0M

Speed 65 tok/s

Input Text, Image, File

Output Text

Reasoning Yes

Details →

Qwen3.7 Max

Qwen

Qwen3.7-Max is the flagship model in Alibaba's Qwen3.7 series. It supports text input and output and is designed for agent-centric workloads, with particular strengths in coding, office and productivity tasks,...

Context 1.0M

Speed 188 tok/s

Input Text

Output Text

Reasoning Yes

Details →

Grok Build 0.1

xAI

Grok Build 0.1 is xAI’s fast coding model trained specifically for agentic software engineering workflows. It supports text and image inputs with text output, and is optimized for interactive coding...

Context 256K

Speed N/A

Input Text, Image

Output Text

Reasoning Yes

Details →

Top Open Source AI Models

Community-driven, inspectable weights.

Kimi K2.6

MoonshotAI

Kimi K2.6 is Moonshot AI's next-generation multimodal model, designed for long-horizon coding, coding-driven UI/UX generation, and multi-agent orchestration. It handles complex end-to-end coding tasks across Python, Rust, and Go, and...

Context 262K

Speed 42 tok/s

Input Text, Image

Output Text

Reasoning Yes

Details →

MiMo-V2.5-Pro

Xiaomi

MiMo-V2.5-Pro is Xiaomi’s flagship model, delivering strong performance in general agentic capabilities, complex software engineering, and long-horizon tasks, with top rankings on benchmarks such as ClawEval, GDPVal, and SWE-bench Pro....

Context 1.0M

Speed 159 tok/s

Input Text

Output Text

Reasoning Yes

Details →

V4 Pro

DeepSeek

DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated parameters, supporting a 1M-token context window. It is designed for advanced reasoning, coding,...

Context 1.0M

Speed 81 tok/s

Input Text

Output Text

Reasoning Yes

Details →

GLM 5.1

Z.ai

GLM-5.1 delivers a major leap in coding capability, with particularly significant gains in handling long-horizon tasks. Unlike previous models built around minute-level interactions, GLM-5.1 can work independently and continuously on...

Context 203K

Speed 83 tok/s

Input Text

Output Text

Reasoning Yes

Details →

Qwen3.6 Plus

Qwen

Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts routing, enabling strong scalability and high-performance inference. Compared to the 3.5 series, it delivers...

Context 1.0M

Speed 53 tok/s

Input Text, Image, Video

Output Text

Reasoning Yes

Details →

M2.7

MiniMax

MiniMax-M2.7 is a next-generation large language model designed for autonomous, real-world productivity and continuous improvement. Built to actively participate in its own evolution, M2.7 integrates advanced agentic capabilities through multi-agent...

Context 197K

Speed 44 tok/s

Input Text

Output Text

Reasoning Yes

Details →

GLM 5 Turbo

Z.ai

GLM-5 Turbo is a new model from Z.ai designed for fast inference and strong performance in agent-driven environments such as OpenClaw scenarios. It is deeply optimized for real-world agent workflows...

Context 262K

Speed N/A

Input Text

Output Text

Reasoning Yes

Details →

Kimi K2.5

MoonshotAI

Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art visual coding capability and a self-directed agent swarm paradigm. Built on Kimi K2 with continued pretraining over approximately 15T mixed...

Context 256K

Speed 40 tok/s

Input Text, Image

Output Text

Reasoning Yes

Details →

V4 Flash

DeepSeek

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and...

Context 1.0M

Speed 106 tok/s

Input Text

Output Text

Reasoning Yes

Details →

Qwen3.5 397B A17B

Qwen

The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. It delivers...

Context 262K

Speed 52 tok/s

Input Text, Image, Video

Output Text

Reasoning Yes

Details →

Top Proprietary AI Models

Frontier closed models.

Claude Fable 5

NEW

Anthropic

Claude Fable 5 is a Mythos-class model from Anthropic, built for autonomous knowledge work and coding. It supports text, image, and file inputs with text output, with reasoning support and...

Context 1.0M

Speed 142 tok/s

Input Text, Image, File

Output Text

Reasoning Yes

Details →

Claude Opus 4.8

Anthropic

Claude Opus 4.8 is Anthropic's most capable generally available model in the Opus family. It supports text, image, and file inputs with text output, with reasoning support and a 1M-token...

Context 1.0M

Speed 65 tok/s

Input Text, Image, File

Output Text

Reasoning Yes

Details →

GPT-5.5

OpenAI

GPT-5.5 is OpenAI’s frontier model designed for complex professional workloads, building on GPT-5.4 with stronger reasoning, higher reliability, and improved token efficiency on hard tasks. It features a 1M+ token...

Context 1.1M

Speed 63 tok/s

Input File, Image, Text

Output Text

Reasoning Yes

Details →

GPT-5.5 Pro

OpenAI

GPT-5.5 Pro is OpenAI’s high-capability model optimized for deep reasoning and accuracy on complex, high-stakes workloads. It features a 1M+ token context window (922K input, 128K output) with support for...

Context 1.1M

Speed 360 tok/s

Input File, Image, Text

Output Text

Reasoning Yes

Details →

Gemini 3.1 Pro Preview

Google

Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows. Building on the multimodal foundation...

Context 1.0M

Speed 136 tok/s

Input Audio, File, Image, Text, Video

Output Text

Reasoning Yes

Details →

GPT-5.4 Image 2

OpenAI

GPT-5.4 Image 2 enables rich multimodal workflows, allowing users to move seamlessly between reasoning, coding, and...

Context 272K

Speed 147 tok/s

Input Image, Text, File

Output Image, Text

Reasoning Yes

Details →

Qwen3.7 Max

Qwen

Qwen3.7-Max is the flagship model in Alibaba's Qwen3.7 series. It supports text input and output and is designed for agent-centric workloads, with particular strengths in coding, office and productivity tasks,...

Context 1.0M

Speed 188 tok/s

Input Text

Output Text

Reasoning Yes

Details →

Gemini 3.5 Flash

Google

Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro level coding and reasoning at Flash-tier cost and speed. It is highly optimized for coding proficiency and parallel agentic execution...

Context 1.0M

Speed 279 tok/s

Input Text, Image, Video, File, Audio

Output Text

Reasoning Yes

Details →

M3

NEW

MiniMax

MiniMax-M3 is a multimodal foundation model from MiniMax. It supports text, image, and video inputs with text output, a 1M-token context window, and is suited for long-horizon agentic work, coding,...

Context 524K

Speed 54 tok/s

Input Text, Image, Video

Output Text

Reasoning Yes

Details →

GPT-5.3-Codex

OpenAI

GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the frontier software engineering performance of GPT-5.2-Codex with the broader reasoning and professional knowledge capabilities of GPT-5.2. It achieves state-of-the-art results...

Context 400K

Speed 196 tok/s

Input Text, Image, File

Output Text

Reasoning Yes

Details →

Top Coding AI Models

Models tuned for code and developer workflows.

Claude Fable 5

NEW

Anthropic

Claude Fable 5 is a Mythos-class model from Anthropic, built for autonomous knowledge work and coding. It supports text, image, and file inputs with text output, with reasoning support and...

Context 1.0M

Speed 142 tok/s

Input Text, Image, File

Output Text

Reasoning Yes

Details →

GPT-5.5

OpenAI

GPT-5.5 is OpenAI’s frontier model designed for complex professional workloads, building on GPT-5.4 with stronger reasoning, higher reliability, and improved token efficiency on hard tasks. It features a 1M+ token...

Context 1.1M

Speed 63 tok/s

Input File, Image, Text

Output Text

Reasoning Yes

Details →

GPT-5.5 Pro

OpenAI

GPT-5.5 Pro is OpenAI’s high-capability model optimized for deep reasoning and accuracy on complex, high-stakes workloads. It features a 1M+ token context window (922K input, 128K output) with support for...

Context 1.1M

Speed 360 tok/s

Input File, Image, Text

Output Text

Reasoning Yes

Details →

GPT-5.4 Image 2

OpenAI

GPT-5.4 Image 2 enables rich multimodal workflows, allowing users to move seamlessly between reasoning, coding, and...

Context 272K

Speed 147 tok/s

Input Image, Text, File

Output Image, Text

Reasoning Yes

Details →

Claude Opus 4.8

Anthropic

Claude Opus 4.8 is Anthropic's most capable generally available model in the Opus family. It supports text, image, and file inputs with text output, with reasoning support and a 1M-token...

Context 1.0M

Speed 65 tok/s

Input Text, Image, File

Output Text

Reasoning Yes

Details →

Gemini 3.1 Pro Preview

Google

Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows. Building on the multimodal foundation...

Context 1.0M

Speed 136 tok/s

Input Audio, File, Image, Text, Video

Output Text

Reasoning Yes

Details →

GPT-5.3-Codex

OpenAI

GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the frontier software engineering performance of GPT-5.2-Codex with the broader reasoning and professional knowledge capabilities of GPT-5.2. It achieves state-of-the-art results...

Context 400K

Speed 196 tok/s

Input Text, Image, File

Output Text

Reasoning Yes

Details →

GPT-5.4 Mini

OpenAI

GPT-5.4 mini brings the core capabilities of GPT-5.4 to a faster, more efficient model optimized for high-throughput workloads. It supports text and image inputs with strong performance across reasoning, coding,...

Context 400K

Speed 178 tok/s

Input File, Image, Text

Output Text

Reasoning Yes

Details →

Claude Sonnet 4.6

Anthropic

Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and professional work. It excels at iterative development, complex codebase navigation, end-to-end project management with...

Context 1.0M

Speed 64 tok/s

Input Text, Image, File

Output Text

Reasoning Yes

Details →

Qwen3.7 Max

Qwen

Qwen3.7-Max is the flagship model in Alibaba's Qwen3.7 series. It supports text input and output and is designed for agent-centric workloads, with particular strengths in coding, office and productivity tasks,...

Context 1.0M

Speed 188 tok/s

Input Text

Output Text

Reasoning Yes

Details →

Top OCR AI Models

Models specialised in optical character recognition and document extraction.

PaddleOCR-VL-0.9B

PaddlePaddle

Baidu's 0.9B vision-language OCR model combining a NaViT-style dynamic-resolution encoder with ERNIE-4.5-0.3B. Handles multilingual text, tables, charts, and formulas within a 16K context, and is optimized for efficient on-device document parsing.

Context 16K

Speed N/A

Input Text, Image

Output Text

Reasoning No

Details →

olmOCR-2-7B

AllenAI

Allen AI's 7B OCR model fine-tuned from Qwen2.5-VL-7B on curated academic papers and technical documentation. Supports 128K context and extracts structured text from PDFs and scanned documents with high fidelity.

Context 128K

Speed N/A

Input Text, Image

Output Text

Reasoning No

Details →

DeepSeek-OCR

DeepSeek

DeepSeek's ~3B MoE OCR model using optical context compression to encode full pages into compact token sequences. Outputs structured Markdown preserving text layout, tables, and mathematical formulas from images and PDFs.

Context N/A

Speed N/A

Input Text, Image

Output Text

Reasoning No

Details →

Mistral OCR

Mistral AI

Mistral's dedicated document understanding model (December 2025). Processes PDFs and images page-by-page via API, returning structured Markdown with preserved tables, equations, image bounding boxes, and rich layout metadata.

Context N/A

Speed N/A

Input Image, PDF

Output Text

Reasoning No

Details →

Top Math AI Models

Math and reasoning specialists.

GPT-5.5 Pro

OpenAI

GPT-5.5 Pro is OpenAI’s high-capability model optimized for deep reasoning and accuracy on complex, high-stakes workloads. It features a 1M+ token context window (922K input, 128K output) with support for...

Context 1.1M

Speed 360 tok/s

Input File, Image, Text

Output Text

Reasoning Yes

Details →

GPT-5.3-Codex

OpenAI

GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the frontier software engineering performance of GPT-5.2-Codex with the broader reasoning and professional knowledge capabilities of GPT-5.2. It achieves state-of-the-art results...

Context 400K

Speed 196 tok/s

Input Text, Image, File

Output Text

Reasoning Yes

Details →

Gemini 3.5 Flash

Google

Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro level coding and reasoning at Flash-tier cost and speed. It is highly optimized for coding proficiency and parallel agentic execution...

Context 1.0M

Speed 279 tok/s

Input Text, Image, Video, File, Audio

Output Text

Reasoning Yes

Details →

V4 Pro

DeepSeek

DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated parameters, supporting a 1M-token context window. It is designed for advanced reasoning, coding,...

Context 1.0M

Speed 81 tok/s

Input Text

Output Text

Reasoning Yes

Details →

MiMo-V2-Flash

Xiaomi

MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. It is a Mixture-of-Experts model with 309B total parameters and 15B active parameters, adopting hybrid attention architecture. MiMo-V2-Flash supports a...

Context 262K

Speed 156 tok/s

Input Text

Output Text

Reasoning Yes

Details →

Gemini 3.1 Pro Preview

Google

Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows. Building on the multimodal foundation...

Context 1.0M

Speed 136 tok/s

Input Audio, File, Image, Text, Video

Output Text

Reasoning Yes

Details →

GLM 4.7 Flash

Z.ai

As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency. It is further optimized for agentic coding use cases, strengthening coding capabilities, long-horizon task planning,...

Context 203K

Speed 111 tok/s

Input Text

Output Text

Reasoning Yes

Details →

Kimi K2.7 Code

NEW

MoonshotAI

Kimi K2.7 Code is a coding-focused model in Moonshot AI's Kimi K2 family, built to complete end-to-end programming tasks reliably over long contexts. It uses a native multimodal mixture-of-experts...

Context 262K

Speed 122 tok/s

Input Text, Image

Output Text

Reasoning Yes

Details →

Grok 4.3

xAI

Grok 4.3 is a reasoning model from xAI. It accepts text and image inputs with text output, and is suited for agentic workflows, instruction-following tasks, and applications requiring high factual...

Context 1.0M

Speed 155 tok/s

Input Text, Image

Output Text

Reasoning Yes

Details →

Claude Opus 4.8

Anthropic

Claude Opus 4.8 is Anthropic's most capable generally available model in the Opus family. It supports text, image, and file inputs with text output, with reasoning support and a 1M-token...

Context 1.0M

Speed 65 tok/s

Input Text, Image, File

Output Text

Reasoning Yes

Details →

Fast AI Models

Lowest-cost, lowest-latency options.

Mercury 2

Inception

Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM). Instead of generating tokens sequentially, Mercury 2 produces and refines multiple tokens in parallel, achieving...

Context 128K

Speed 1162 tok/s

Input Text

Output Text

Reasoning Yes

Details →

LiquidAI: LFM2-24B-A2B

Liquid

LFM2-24B-A2B is the largest model in the LFM2 family of hybrid architectures designed for efficient on-device deployment. Built as a 24B parameter Mixture-of-Experts model with only 2B active parameters per...

Context 33K

Speed 531 tok/s

Input Text

Output Text

Reasoning No

Details →

IBM: Granite 4.1 8B

IBM Granite

Granite 4.1 8B is a dense, decoder-only 8-billion-parameter language model from IBM, part of the Granite 4.1 family. It supports a 131K-token context window and is designed for enterprise tasks...

Context 131K

Speed 421 tok/s

Input Text

Output Text

Reasoning No

Details →

Step 3.7 Flash

StepFun

Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts model. It pairs a 196B-parameter language backbone with a vision encoder for native image and video understanding, activating roughly 11B parameters...

Context 256K

Speed 405 tok/s

Input Text, Image, Video

Output Text

Reasoning Yes

Details →

GPT-5.5 Pro

OpenAI

GPT-5.5 Pro is OpenAI’s high-capability model optimized for deep reasoning and accuracy on complex, high-stakes workloads. It features a 1M+ token context window (922K input, 128K output) with support for...

Context 1.1M

Speed 360 tok/s

Input File, Image, Text

Output Text

Reasoning Yes

Details →

Gemini 3.1 Flash Lite

Google

Gemini 3.1 Flash Lite is Google’s GA high-efficiency multimodal model optimized for low-latency, high-volume workloads. It supports text, image, video, audio, and PDF inputs, and is designed for lightweight agentic...

Context 1.0M

Speed 325 tok/s

Input Text, Image, Video, File, Audio

Output Text

Reasoning Yes

Details →

Gemini 3.5 Flash

Google

Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro level coding and reasoning at Flash-tier cost and speed. It is highly optimized for coding proficiency and parallel agentic execution...

Context 1.0M

Speed 279 tok/s

Input Text, Image, Video, File, Audio

Output Text

Reasoning Yes

Details →

Qwen3.5-Flash

Qwen

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. Compared to the...

Context 1.0M

Speed 259 tok/s

Input Text, Image, Video

Output Text

Reasoning Yes

Details →

M2.1

MiniMax

MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized for coding, agentic workflows, and modern application development. With only 10 billion activated parameters, it delivers a major jump in real-world...

Context 197K

Speed 235 tok/s

Input Text

Output Text

Reasoning Yes

Details →

Grok 4.20 Multi-Agent

xAI

Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents operate in parallel to conduct deep research, coordinate tool use, and synthesize information...

Context 2.0M

Speed 220 tok/s

Input Text, Image, File

Output Text

Reasoning Yes

Details →

Top Image Generation AI Models

Models that generate images from text prompts.

GPT-5.4 Image 2

OpenAI

GPT-5.4 Image 2 enables rich multimodal workflows, allowing users to move seamlessly between reasoning, coding, and...

Context 272K

Speed 147 tok/s

Input Image, Text, File

Output Image, Text

Reasoning Yes

Details →

Nano Banana 2 (Gemini 3.1 Flash Image Preview)

Google

Gemini 3.1 Flash Image Preview, a.k.a. "Nano Banana 2," is Google’s latest state-of-the-art image generation and editing model, delivering Pro-level visual quality at Flash speed. It combines...

Context 66K

Speed N/A

Input Image, Text

Output Image, Text

Reasoning Yes

Details →

Top Audio AI Models

Models with voice and audio output capabilities.

GPT Audio

OpenAI

The gpt-audio model is OpenAI's first generally available audio model. The new snapshot features an upgraded decoder for more natural sounding voices and maintains better voice consistency. Audio is priced...

Context 128K

Speed N/A

Input Text, Audio

Output Text, Audio

Reasoning No

Details →

GPT Audio Mini

OpenAI

A cost-efficient version of GPT Audio. The new snapshot features an upgraded decoder for more natural sounding voices and maintains better voice consistency. Input is priced at $0.60 per million...

Context 128K

Speed 175 tok/s

Input Text, Audio

Output Text, Audio

Reasoning No

Details →

Large Context Window AI Models

Models with 200K+ context windows.

Grok 4.20 Multi-Agent

xAI

Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents operate in parallel to conduct deep research, coordinate tool use, and synthesize information...

Context 2.0M

Speed 220 tok/s

Input Text, Image, File

Output Text

Reasoning Yes

Details →

GPT-5.5 Pro

OpenAI

GPT-5.5 Pro is OpenAI’s high-capability model optimized for deep reasoning and accuracy on complex, high-stakes workloads. It features a 1M+ token context window (922K input, 128K output) with support for...

Context 1.1M

Speed 360 tok/s

Input File, Image, Text

Output Text

Reasoning Yes

Details →

GPT-5.5

OpenAI

GPT-5.5 is OpenAI’s frontier model designed for complex professional workloads, building on GPT-5.4 with stronger reasoning, higher reliability, and improved token efficiency on hard tasks. It features a 1M+ token...

Context 1.1M

Speed 63 tok/s

Input File, Image, Text

Output Text

Reasoning Yes

Details →

Gemini 3.5 Flash

Google

Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro level coding and reasoning at Flash-tier cost and speed. It is highly optimized for coding proficiency and parallel agentic execution...

Context 1.0M

Speed 279 tok/s

Input Text, Image, Video, File, Audio

Output Text

Reasoning Yes

Details →

Gemini 3.1 Flash Lite

Google

Gemini 3.1 Flash Lite is Google’s GA high-efficiency multimodal model optimized for low-latency, high-volume workloads. It supports text, image, video, audio, and PDF inputs, and is designed for lightweight agentic...

Context 1.0M

Speed 325 tok/s

Input Text, Image, Video, File, Audio

Output Text

Reasoning Yes

Details →

V4 Pro

DeepSeek

DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated parameters, supporting a 1M-token context window. It is designed for advanced reasoning, coding,...

Context 1.0M

Speed 81 tok/s

Input Text

Output Text

Reasoning Yes

Details →

MiMo-V2.5-Pro

Xiaomi

MiMo-V2.5-Pro is Xiaomi’s flagship model, delivering strong performance in general agentic capabilities, complex software engineering, and long-horizon tasks, with top rankings on benchmarks such as ClawEval, GDPVal, and SWE-bench Pro....

Context 1.0M

Speed 159 tok/s

Input Text

Output Text

Reasoning Yes

Details →

MiMo-V2.5

Xiaomi

MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic performance at roughly half the inference cost, while surpassing MiMo-V2-Omni in multimodal perception across image and video understanding...

Context 1.0M

Speed 46 tok/s

Input Text, Audio, Image, Video

Output Text

Reasoning Yes

Details →

Gemini 3.1 Pro Preview Custom Tools

Google

Gemini 3.1 Pro Preview Custom Tools is a variant of Gemini 3.1 Pro that improves tool selection behavior by preventing overuse of a general bash tool when more efficient third-party...

Context 1.0M

Speed N/A

Input Text, Audio, Image, Video, File

Output Text

Reasoning Yes

Details →

Gemini 3.1 Pro Preview

Google

Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows. Building on the multimodal foundation...

Context 1.0M

Speed 136 tok/s

Input Audio, File, Image, Text, Video

Output Text

Reasoning Yes

Details →

Top Uncensored AI Models

Lightly filtered, high-flexibility models.

Newest AI Models

Fresh model releases.

Kimi K2.7 Code

NEW

MoonshotAI

Kimi K2.7 Code is a coding-focused model in Moonshot AI's Kimi K2 family, built to complete end-to-end programming tasks reliably over long contexts. It uses a native multimodal mixture-of-experts...

Context 262K

Speed 122 tok/s

Input Text, Image

Output Text

Reasoning Yes

Details →

Claude Fable 5

NEW

Anthropic

Claude Fable 5 is a Mythos-class model from Anthropic, built for autonomous knowledge work and coding. It supports text, image, and file inputs with text output, with reasoning support and...

Context 1.0M

Speed 142 tok/s

Input Text, Image, File

Output Text

Reasoning Yes

Details →

Nemotron 3 Ultra

NEW

NVIDIA

NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybrid Transformer-Mamba mixture-of-experts architecture, it...

Context 262K

Speed N/A

Input Text

Output Text

Reasoning Yes

Details →

Qwen3.7 Plus

NEW

Qwen

Qwen3.7-Plus is a cost-effective model in Alibaba's Qwen3.7 series. It supports text and image input with text output, building on the series' text capabilities with a comprehensive upgrade to its...

Context 1.0M

Speed 180 tok/s

Input Text, Image

Output Text

Reasoning Yes

Details →

M3

NEW

MiniMax

MiniMax-M3 is a multimodal foundation model from MiniMax. It supports text, image, and video inputs with text output, a 1M-token context window, and is suited for long-horizon agentic work, coding,...

Context 524K

Speed 54 tok/s

Input Text, Image, Video

Output Text

Reasoning Yes

Details →

Step 3.7 Flash

StepFun

Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts model. It pairs a 196B-parameter language backbone with a vision encoder for native image and video understanding, activating roughly 11B parameters...

Context 256K

Speed 405 tok/s

Input Text, Image, Video

Output Text

Reasoning Yes

Details →

Claude Opus 4.8 (Fast)

Anthropic

Fast-mode variant of Opus 4.8, with identical capabilities and higher output speed at 2x the pricing of regular Opus 4.8. Learn more in Anthropic's docs: https://platform.claude.com/docs/en/build-with-claude/fast-mode

Context 1.0M

Speed N/A

Input Text, Image, File

Output Text

Reasoning Yes

Details →

Claude Opus 4.8

Anthropic

Claude Opus 4.8 is Anthropic's most capable generally available model in the Opus family. It supports text, image, and file inputs with text output, with reasoning support and a 1M-token...

Context 1.0M

Speed 65 tok/s

Input Text, Image, File

Output Text

Reasoning Yes

Details →

Qwen3.7 Max

Qwen

Qwen3.7-Max is the flagship model in Alibaba's Qwen3.7 series. It supports text input and output and is designed for agent-centric workloads, with particular strengths in coding, office and productivity tasks,...

Context 1.0M

Speed 188 tok/s

Input Text

Output Text

Reasoning Yes

Details →

Grok Build 0.1

xAI

Grok Build 0.1 is xAI’s fast coding model trained specifically for agentic software engineering workflows. It supports text and image inputs with text output, and is optimized for interactive coding...

Context 256K

Speed N/A

Input Text, Image

Output Text

Reasoning Yes

Details →

AI chat subscription

Turn model research into daily AI work.

Use 100+ models, web search, files, and EU-hosted options in one paid chat workspace.

Inference credits

Build with EU-hosted open-source models.

OpenAI-compatible API for GLM, Kimi, DeepSeek and more. Add credits inside the dashboard.

How to Choose the Right AI Model

A practical guide to picking the best LLM for your use case.

Match the model to the task

General-purpose models like GPT-4o and Claude Sonnet handle most tasks well. For specialized work, coding models (DeepSeek Coder, Codestral) and math models (QwQ, DeepSeek R1) outperform generalists on their respective benchmarks while often costing less per token.

Consider context window size

If you work with long documents, codebases, or multi-turn conversations, context window matters. Models range from 8K to over 1M tokens. Larger windows let you process entire books or repositories in a single prompt, but they increase cost and latency.
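
As a rough sketch of the sizing check described above, the snippet below estimates whether a document fits a model's window. The context sizes are the rounded figures listed in this catalog. The ratio of 4 characters per token, the 8,000-token output reserve, and the function names are illustrative assumptions; real token counts depend on each model's tokenizer.

```python
# Context sizes in tokens, rounded as displayed in the catalog above.
CONTEXT_TOKENS = {
    "Mercury 2": 128_000,
    "GLM 5.1": 203_000,
    "Kimi K2.7 Code": 262_000,
    "Claude Opus 4.8": 1_000_000,
}

# Assumption: rough average for English text; the true ratio varies by tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text_chars: int) -> int:
    """Rough token estimate from a character count (rounded up)."""
    return -(-text_chars // CHARS_PER_TOKEN)


def models_that_fit(text_chars: int, reserve_output_tokens: int = 8_000) -> list[str]:
    """Return models whose window holds the prompt plus the reserved output budget."""
    needed = estimate_tokens(text_chars) + reserve_output_tokens
    return [name for name, ctx in CONTEXT_TOKENS.items() if ctx >= needed]


if __name__ == "__main__":
    # A 1,000,000-character document: about 250,000 prompt tokens plus the reserve.
    print(models_that_fit(1_000_000))
```

Under these assumptions a 1,000,000-character document needs about 258,000 tokens, which rules out the 128K and 203K entries in this sample.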

Balance cost, speed, and quality

Frontier models deliver the highest benchmark scores but cost more per token and respond more slowly. Fast models like Gemini Flash, Llama 3 (8B), and Mistral Small can handle routine tasks at a fraction of the cost with sub-second latency, which makes them a good fit for high-volume applications.
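
To make the speed trade-off concrete, the following sketch converts the catalog's tok/s figures into approximate generation time. It assumes the listed speed is sustained output throughput and ignores time to first token, queueing, and network overhead, so the results are optimistic estimates. The function names are hypothetical.

```python
# Output speeds in tokens per second, as listed in the catalog above.
SPEED_TOK_PER_S = {
    "Claude Opus 4.8": 65,
    "Gemini 3.5 Flash": 279,
    "Mercury 2": 1162,
}


def generation_seconds(output_tokens: int, tok_per_s: float) -> float:
    """Time to stream `output_tokens` at a constant speed (assumption: no warm-up)."""
    if tok_per_s <= 0:
        raise ValueError("tok_per_s must be positive")
    return output_tokens / tok_per_s


def rank_by_generation_time(output_tokens: int) -> list[tuple[str, float]]:
    """Return (model, seconds) pairs sorted from fastest to slowest."""
    timings = [
        (name, generation_seconds(output_tokens, speed))
        for name, speed in SPEED_TOK_PER_S.items()
    ]
    return sorted(timings, key=lambda pair: pair[1])


if __name__ == "__main__":
    for name, seconds in rank_by_generation_time(1300):
        print(f"{name}: {seconds:.1f} s")
```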

Open source vs. proprietary

Open-source models (Llama, Mistral, Qwen, DeepSeek) let you self-host, fine-tune, and inspect weights. Proprietary models (GPT-4o, Claude, Gemini) often lead on benchmarks and offer managed APIs with built-in safety features. Many teams use both: proprietary for peak performance, open source for cost control and customization.

Check for multimodal capabilities

Some models accept images, audio, or files alongside text. If your workflow involves analyzing screenshots, diagrams, or audio transcriptions, filter for models with vision or audio input support. Models with structured output and function calling are essential for building agents and tool-using applications.
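
The filtering step above can be sketched as a small capability lookup. The four entries copy the Input, Output, and Reasoning fields from cards in this catalog; the `ModelCard` class and `filter_models` function are hypothetical names for illustration, not part of any catalog API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCard:
    name: str
    inputs: frozenset
    outputs: frozenset
    reasoning: bool


# Field values copied from the catalog cards above.
CATALOG = [
    ModelCard("Gemini 3.1 Pro Preview",
              frozenset({"Audio", "File", "Image", "Text", "Video"}),
              frozenset({"Text"}), True),
    ModelCard("Qwen3.7 Max", frozenset({"Text"}), frozenset({"Text"}), True),
    ModelCard("GPT Audio", frozenset({"Text", "Audio"}),
              frozenset({"Text", "Audio"}), False),
    ModelCard("PaddleOCR-VL-0.9B", frozenset({"Text", "Image"}),
              frozenset({"Text"}), False),
]


def filter_models(catalog, required_inputs=(), required_outputs=(), reasoning=None):
    """Return names of models that support every required modality.

    `reasoning=None` means the reasoning flag is not used as a filter.
    """
    need_in = frozenset(required_inputs)
    need_out = frozenset(required_outputs)
    return [
        card.name
        for card in catalog
        if need_in <= card.inputs
        and need_out <= card.outputs
        and (reasoning is None or card.reasoning == reasoning)
    ]


if __name__ == "__main__":
    print(filter_models(CATALOG, required_inputs={"Image"}))
    print(filter_models(CATALOG, required_outputs={"Audio"}))
```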

Use benchmarks as a starting point

Scores like GPQA, MMLU Pro, and HLE measure academic knowledge and reasoning. LiveCodeBench and SciCode test practical coding ability. MATH 500 and AIME evaluate mathematical problem-solving. No single benchmark tells the full story, so compare scores across the categories relevant to your use case, then test with your own prompts.
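
One way to compare scores across categories is a weighted average that reflects your own priorities. The sketch below is illustrative only: `model_a` and `model_b` are hypothetical, their scores are made-up placeholders rather than real benchmark results, and the weights are an assumption you would set per use case.

```python
# Placeholder scores for two hypothetical models; NOT real benchmark results.
PLACEHOLDER_SCORES = {
    "model_a": {"GPQA": 80, "LiveCodeBench": 60, "AIME": 70},
    "model_b": {"GPQA": 90, "LiveCodeBench": 50, "AIME": 90},
}


def weighted_score(scores: dict, weights: dict) -> float:
    """Weighted average of the benchmarks named in `weights`."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    missing = [bench for bench in weights if bench not in scores]
    if missing:
        raise KeyError(f"missing benchmark scores: {missing}")
    return sum(scores[bench] * w for bench, w in weights.items()) / total_weight


def rank_models(all_scores: dict, weights: dict) -> list[str]:
    """Return model names sorted from highest to lowest weighted score."""
    return sorted(
        all_scores,
        key=lambda name: weighted_score(all_scores[name], weights),
        reverse=True,
    )


if __name__ == "__main__":
    coding_heavy = {"LiveCodeBench": 3, "GPQA": 1}  # assumption: coding matters most
    print(rank_models(PLACEHOLDER_SCORES, coding_heavy))
```

Changing the weights can flip the ranking, which is why a shortlist from benchmark numbers should still be checked against your own prompts.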

Model catalog, pricing, speed, and benchmark scores are updated regularly. Try any model instantly using the free chat; no API key is required.