Top AI Models by Category
Compare the latest models across open source, proprietary, uncensored, coding, math, speed, and release freshness.
Most Used AI Models
Popular picks in the current catalog.
Kimi K2.7 Code
NEW · MoonshotAI
Kimi K2.7 Code is a coding-focused model in Moonshot AI's Kimi K2 family, built to complete end-to-end programming tasks reliably over long contexts. It uses a native multimodal mixture-of-experts...
Claude Fable 5
NEW · Anthropic
Claude Fable 5 is a Mythos-class model from Anthropic, built for autonomous knowledge work and coding. It supports text, image, and file inputs with text output, with reasoning support and...
Nemotron 3 Ultra
NEW · NVIDIA
NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybrid Transformer-Mamba mixture-of-experts architecture, it...
Qwen3.7 Plus
NEW · Qwen
Qwen3.7-Plus is a cost-effective model in Alibaba's Qwen3.7 series. It supports text and image input with text output, building on the series' text capabilities with a comprehensive upgrade to its...
M3
NEW · MiniMax
MiniMax-M3 is a multimodal foundation model from MiniMax. It supports text, image, and video inputs with text output, a 1M-token context window, and is suited for long-horizon agentic work, coding,...
Step 3.7 Flash
StepFun
Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts model. It pairs a 196B-parameter language backbone with a vision encoder for native image and video understanding, activating roughly 11B parameters...
Claude Opus 4.8 (Fast)
Anthropic
Fast-mode variant of Claude Opus 4.8, with identical capabilities and higher output speed at 2x the price of regular Opus 4.8. Learn more in Anthropic's docs: https://platform.claude.com/docs/en/build-with-claude/fast-mode
Claude Opus 4.8
Anthropic
Claude Opus 4.8 is Anthropic's most capable generally available model in the Opus family. It supports text, image, and file inputs with text output, with reasoning support and a 1M-token...
Qwen3.7 Max
Qwen
Qwen3.7-Max is the flagship model in Alibaba's Qwen3.7 series. It supports text input and output and is designed for agent-centric workloads, with particular strengths in coding, office and productivity tasks,...
Grok Build 0.1
xAI
Grok Build 0.1 is xAI's fast coding model trained specifically for agentic software engineering workflows. It supports text and image inputs with text output, and is optimized for interactive coding...
Top Open Source AI Models
Community-driven, inspectable weights.
Kimi K2.6
MoonshotAI
Kimi K2.6 is Moonshot AI's next-generation multimodal model, designed for long-horizon coding, coding-driven UI/UX generation, and multi-agent orchestration. It handles complex end-to-end coding tasks across Python, Rust, and Go, and...
MiMo-V2.5-Pro
Xiaomi
MiMo-V2.5-Pro is Xiaomi's flagship model, delivering strong performance in general agentic capabilities, complex software engineering, and long-horizon tasks, with top rankings on benchmarks such as ClawEval, GDPVal, and SWE-bench Pro....
V4 Pro
DeepSeek
DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated parameters, supporting a 1M-token context window. It is designed for advanced reasoning, coding,...
GLM 5.1
Z.ai
GLM-5.1 delivers a major leap in coding capability, with particularly significant gains in handling long-horizon tasks. Unlike previous models built around minute-level interactions, GLM-5.1 can work independently and continuously on...
Qwen3.6 Plus
Qwen
Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts routing, enabling strong scalability and high-performance inference. Compared to the 3.5 series, it delivers...
M2.7
MiniMax
MiniMax-M2.7 is a next-generation large language model designed for autonomous, real-world productivity and continuous improvement. Built to actively participate in its own evolution, M2.7 integrates advanced agentic capabilities through multi-agent...
GLM 5 Turbo
Z.ai
GLM-5 Turbo is a new model from Z.ai designed for fast inference and strong performance in agent-driven environments such as OpenClaw scenarios. It is deeply optimized for real-world agent workflows...
Kimi K2.5
MoonshotAI
Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art visual coding capability and a self-directed agent swarm paradigm. Built on Kimi K2 with continued pretraining over approximately 15T mixed...
V4 Flash
DeepSeek
DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and...
Qwen3.5 397B A17B
Qwen
The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. It delivers...
Top Proprietary AI Models
Frontier closed models.
Claude Fable 5
NEW · Anthropic
Claude Fable 5 is a Mythos-class model from Anthropic, built for autonomous knowledge work and coding. It supports text, image, and file inputs with text output, with reasoning support and...
Claude Opus 4.8
Anthropic
Claude Opus 4.8 is Anthropic's most capable generally available model in the Opus family. It supports text, image, and file inputs with text output, with reasoning support and a 1M-token...
GPT-5.5
OpenAI
GPT-5.5 is OpenAI's frontier model designed for complex professional workloads, building on GPT-5.4 with stronger reasoning, higher reliability, and improved token efficiency on hard tasks. It features a 1M+ token...
GPT-5.5 Pro
OpenAI
GPT-5.5 Pro is OpenAI's high-capability model optimized for deep reasoning and accuracy on complex, high-stakes workloads. It features a 1M+ token context window (922K input, 128K output) with support for...
Gemini 3.1 Pro Preview
Gemini 3.1 Pro Preview is Google's frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows. Building on the multimodal foundation...
GPT-5.4 Image 2
OpenAI
It enables rich multimodal workflows, allowing users to seamlessly move between reasoning, coding, and...
Qwen3.7 Max
Qwen
Qwen3.7-Max is the flagship model in Alibaba's Qwen3.7 series. It supports text input and output and is designed for agent-centric workloads, with particular strengths in coding, office and productivity tasks,...
Gemini 3.5 Flash
Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro level coding and reasoning at Flash-tier cost and speed. It is highly optimized for coding proficiency and parallel agentic execution...
M3
NEW · MiniMax
MiniMax-M3 is a multimodal foundation model from MiniMax. It supports text, image, and video inputs with text output, a 1M-token context window, and is suited for long-horizon agentic work, coding,...
GPT-5.3-Codex
OpenAI
GPT-5.3-Codex is OpenAI's most advanced agentic coding model, combining the frontier software engineering performance of GPT-5.2-Codex with the broader reasoning and professional knowledge capabilities of GPT-5.2. It achieves state-of-the-art results...
Top Coding AI Models
Models tuned for code and developer workflows.
Claude Fable 5
NEW · Anthropic
Claude Fable 5 is a Mythos-class model from Anthropic, built for autonomous knowledge work and coding. It supports text, image, and file inputs with text output, with reasoning support and...
GPT-5.5
OpenAI
GPT-5.5 is OpenAI's frontier model designed for complex professional workloads, building on GPT-5.4 with stronger reasoning, higher reliability, and improved token efficiency on hard tasks. It features a 1M+ token...
GPT-5.5 Pro
OpenAI
GPT-5.5 Pro is OpenAI's high-capability model optimized for deep reasoning and accuracy on complex, high-stakes workloads. It features a 1M+ token context window (922K input, 128K output) with support for...
GPT-5.4 Image 2
OpenAI
It enables rich multimodal workflows, allowing users to seamlessly move between reasoning, coding, and...
Claude Opus 4.8
Anthropic
Claude Opus 4.8 is Anthropic's most capable generally available model in the Opus family. It supports text, image, and file inputs with text output, with reasoning support and a 1M-token...
Gemini 3.1 Pro Preview
Gemini 3.1 Pro Preview is Google's frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows. Building on the multimodal foundation...
GPT-5.3-Codex
OpenAI
GPT-5.3-Codex is OpenAI's most advanced agentic coding model, combining the frontier software engineering performance of GPT-5.2-Codex with the broader reasoning and professional knowledge capabilities of GPT-5.2. It achieves state-of-the-art results...
GPT-5.4 Mini
OpenAI
GPT-5.4 mini brings the core capabilities of GPT-5.4 to a faster, more efficient model optimized for high-throughput workloads. It supports text and image inputs with strong performance across reasoning, coding,...
Claude Sonnet 4.6
Anthropic
Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and professional work. It excels at iterative development, complex codebase navigation, end-to-end project management with...
Qwen3.7 Max
Qwen
Qwen3.7-Max is the flagship model in Alibaba's Qwen3.7 series. It supports text input and output and is designed for agent-centric workloads, with particular strengths in coding, office and productivity tasks,...
Top OCR AI Models
Models specialised in optical character recognition and document extraction.
PaddleOCR-VL-0.9B
PaddlePaddle
Baidu's 0.9B vision-language OCR model combining a NaViT-style dynamic-resolution encoder with ERNIE-4.5-0.3B. Handles multilingual text, tables, charts, and formulas across 16K context; optimized for efficient on-device document parsing.
olmOCR-2-7B
AllenAI
Allen AI's 7B OCR model fine-tuned from Qwen2.5-VL-7B on curated academic papers and technical documentation. Supports 128K context and extracts structured text from PDFs and scanned documents with high fidelity.
DeepSeek-OCR
DeepSeek
DeepSeek's ~3B MoE OCR model using optical context compression to encode full pages into compact token sequences. Outputs structured Markdown preserving text layout, tables, and mathematical formulas from images and PDFs.
Mistral OCR
Mistral AI
Mistral's dedicated document understanding model (December 2025). Processes PDFs and images page-by-page via API, returning structured Markdown with preserved tables, equations, image bounding boxes, and rich layout metadata.
Top Math AI Models
Math and reasoning specialists.
GPT-5.5 Pro
OpenAI
GPT-5.5 Pro is OpenAI's high-capability model optimized for deep reasoning and accuracy on complex, high-stakes workloads. It features a 1M+ token context window (922K input, 128K output) with support for...
GPT-5.3-Codex
OpenAI
GPT-5.3-Codex is OpenAI's most advanced agentic coding model, combining the frontier software engineering performance of GPT-5.2-Codex with the broader reasoning and professional knowledge capabilities of GPT-5.2. It achieves state-of-the-art results...
Gemini 3.5 Flash
Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro level coding and reasoning at Flash-tier cost and speed. It is highly optimized for coding proficiency and parallel agentic execution...
V4 Pro
DeepSeek
DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated parameters, supporting a 1M-token context window. It is designed for advanced reasoning, coding,...
MiMo-V2-Flash
Xiaomi
MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. It is a Mixture-of-Experts model with 309B total parameters and 15B active parameters, adopting hybrid attention architecture. MiMo-V2-Flash supports a...
Gemini 3.1 Pro Preview
Gemini 3.1 Pro Preview is Google's frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows. Building on the multimodal foundation...
GLM 4.7 Flash
Z.ai
As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency. It is further optimized for agentic coding use cases, strengthening coding capabilities, long-horizon task planning,...
Kimi K2.7 Code
NEW · MoonshotAI
Kimi K2.7 Code is a coding-focused model in Moonshot AI's Kimi K2 family, built to complete end-to-end programming tasks reliably over long contexts. It uses a native multimodal mixture-of-experts...
Grok 4.3
xAI
Grok 4.3 is a reasoning model from xAI. It accepts text and image inputs with text output, and is suited for agentic workflows, instruction-following tasks, and applications requiring high factual...
Claude Opus 4.8
Anthropic
Claude Opus 4.8 is Anthropic's most capable generally available model in the Opus family. It supports text, image, and file inputs with text output, with reasoning support and a 1M-token...
Fast AI Models
Lowest-cost, lowest-latency options.
Mercury 2
Inception
Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM). Instead of generating tokens sequentially, Mercury 2 produces and refines multiple tokens in parallel, achieving...
LiquidAI: LFM2-24B-A2B
Liquid
LFM2-24B-A2B is the largest model in the LFM2 family of hybrid architectures designed for efficient on-device deployment. Built as a 24B parameter Mixture-of-Experts model with only 2B active parameters per...
IBM: Granite 4.1 8B
IBM Granite
Granite 4.1 8B is a dense, decoder-only 8-billion-parameter language model from IBM, part of the Granite 4.1 family. It supports a 131K-token context window and is designed for enterprise tasks...
Step 3.7 Flash
StepFun
Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts model. It pairs a 196B-parameter language backbone with a vision encoder for native image and video understanding, activating roughly 11B parameters...
GPT-5.5 Pro
OpenAI
GPT-5.5 Pro is OpenAI's high-capability model optimized for deep reasoning and accuracy on complex, high-stakes workloads. It features a 1M+ token context window (922K input, 128K output) with support for...
Gemini 3.1 Flash Lite
Gemini 3.1 Flash Lite is Google's GA high-efficiency multimodal model optimized for low-latency, high-volume workloads. It supports text, image, video, audio, and PDF inputs, and is designed for lightweight agentic...
Gemini 3.5 Flash
Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro level coding and reasoning at Flash-tier cost and speed. It is highly optimized for coding proficiency and parallel agentic execution...
Qwen3.5-Flash
Qwen
The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. Compared to the...
M2.1
MiniMax
MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized for coding, agentic workflows, and modern application development. With only 10 billion activated parameters, it delivers a major jump in real-world...
Grok 4.20 Multi-Agent
xAI
Grok 4.20 Multi-Agent is a variant of xAI's Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents operate in parallel to conduct deep research, coordinate tool use, and synthesize information...
Top Image Generation AI Models
Models that generate images from text prompts.
GPT-5.4 Image 2
OpenAI
It enables rich multimodal workflows, allowing users to seamlessly move between reasoning, coding, and...
Nano Banana 2 (Gemini 3.1 Flash Image Preview)
Gemini 3.1 Flash Image Preview, a.k.a. "Nano Banana 2," is Google's latest state-of-the-art image generation and editing model, delivering Pro-level visual quality at Flash speed. It combines...
Top Audio AI Models
Models with voice and audio output capabilities.
GPT Audio
OpenAI
The gpt-audio model is OpenAI's first generally available audio model. The new snapshot features an upgraded decoder for more natural sounding voices and maintains better voice consistency. Audio is priced...
GPT Audio Mini
OpenAI
A cost-efficient version of GPT Audio. The new snapshot features an upgraded decoder for more natural sounding voices and maintains better voice consistency. Input is priced at $0.60 per million...
Large Context Window AI Models
Models with 200K+ context windows.
Grok 4.20 Multi-Agent
xAI
Grok 4.20 Multi-Agent is a variant of xAI's Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents operate in parallel to conduct deep research, coordinate tool use, and synthesize information...
GPT-5.5 Pro
OpenAI
GPT-5.5 Pro is OpenAI's high-capability model optimized for deep reasoning and accuracy on complex, high-stakes workloads. It features a 1M+ token context window (922K input, 128K output) with support for...
GPT-5.5
OpenAI
GPT-5.5 is OpenAI's frontier model designed for complex professional workloads, building on GPT-5.4 with stronger reasoning, higher reliability, and improved token efficiency on hard tasks. It features a 1M+ token...
Gemini 3.5 Flash
Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro level coding and reasoning at Flash-tier cost and speed. It is highly optimized for coding proficiency and parallel agentic execution...
Gemini 3.1 Flash Lite
Gemini 3.1 Flash Lite is Google's GA high-efficiency multimodal model optimized for low-latency, high-volume workloads. It supports text, image, video, audio, and PDF inputs, and is designed for lightweight agentic...
V4 Pro
DeepSeek
DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated parameters, supporting a 1M-token context window. It is designed for advanced reasoning, coding,...
MiMo-V2.5-Pro
Xiaomi
MiMo-V2.5-Pro is Xiaomi's flagship model, delivering strong performance in general agentic capabilities, complex software engineering, and long-horizon tasks, with top rankings on benchmarks such as ClawEval, GDPVal, and SWE-bench Pro....
MiMo-V2.5
Xiaomi
MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic performance at roughly half the inference cost, while surpassing MiMo-V2-Omni in multimodal perception across image and video understanding...
Gemini 3.1 Pro Preview Custom Tools
Gemini 3.1 Pro Preview Custom Tools is a variant of Gemini 3.1 Pro that improves tool selection behavior by preventing overuse of a general bash tool when more efficient third-party...
Gemini 3.1 Pro Preview
Gemini 3.1 Pro Preview is Google's frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows. Building on the multimodal foundation...
Top Uncensored AI Models
Lightly filtered, high-flexibility models.
Newest AI Models
Fresh model releases.
Kimi K2.7 Code
NEW · MoonshotAI
Kimi K2.7 Code is a coding-focused model in Moonshot AI's Kimi K2 family, built to complete end-to-end programming tasks reliably over long contexts. It uses a native multimodal mixture-of-experts...
Claude Fable 5
NEW · Anthropic
Claude Fable 5 is a Mythos-class model from Anthropic, built for autonomous knowledge work and coding. It supports text, image, and file inputs with text output, with reasoning support and...
Nemotron 3 Ultra
NEW · NVIDIA
NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybrid Transformer-Mamba mixture-of-experts architecture, it...
Qwen3.7 Plus
NEW · Qwen
Qwen3.7-Plus is a cost-effective model in Alibaba's Qwen3.7 series. It supports text and image input with text output, building on the series' text capabilities with a comprehensive upgrade to its...
M3
NEW · MiniMax
MiniMax-M3 is a multimodal foundation model from MiniMax. It supports text, image, and video inputs with text output, a 1M-token context window, and is suited for long-horizon agentic work, coding,...
Step 3.7 Flash
StepFun
Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts model. It pairs a 196B-parameter language backbone with a vision encoder for native image and video understanding, activating roughly 11B parameters...
Claude Opus 4.8 (Fast)
Anthropic
Fast-mode variant of Claude Opus 4.8, with identical capabilities and higher output speed at 2x the price of regular Opus 4.8. Learn more in Anthropic's docs: https://platform.claude.com/docs/en/build-with-claude/fast-mode
Claude Opus 4.8
Anthropic
Claude Opus 4.8 is Anthropic's most capable generally available model in the Opus family. It supports text, image, and file inputs with text output, with reasoning support and a 1M-token...
Qwen3.7 Max
Qwen
Qwen3.7-Max is the flagship model in Alibaba's Qwen3.7 series. It supports text input and output and is designed for agent-centric workloads, with particular strengths in coding, office and productivity tasks,...
Grok Build 0.1
xAI
Grok Build 0.1 is xAI's fast coding model trained specifically for agentic software engineering workflows. It supports text and image inputs with text output, and is optimized for interactive coding...
AI chat subscription
Turn model research into daily AI work.
Use 100+ models, web search, files, and EU-hosted options in one paid chat workspace.
Inference credits
Build with EU-hosted open-source models.
OpenAI-compatible API for GLM, Kimi, DeepSeek and more. Add credits inside the dashboard.
How to Choose the Right AI Model
A practical guide to picking the best LLM for your use case.
Match the model to the task
General-purpose models like GPT-4o and Claude Sonnet handle most tasks well. For specialized work, coding models (DeepSeek Coder, Codestral) and math models (QwQ, DeepSeek R1) outperform generalists on their respective benchmarks while often costing less per token.
Consider context window size
If you work with long documents, codebases, or multi-turn conversations, context window matters. Models range from 8K to over 1M tokens. Larger windows let you process entire books or repositories in a single prompt, but they increase cost and latency.
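As a rough illustration of the point above, the sketch below estimates whether a document fits in a given context window. It assumes about 4 characters per token, which is a common rule of thumb for English text and not a property of any particular model; the function names are hypothetical, and a model's own tokenizer should be used when the margin is tight.

```python
import math

# Assumption: ~4 characters per token for English text. Real tokenizers vary
# by model and language, so treat the result as an estimate only.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token count derived from the character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_context(text: str, context_tokens: int,
                    reserved_output_tokens: int = 0) -> bool:
    """Check whether the prompt plus reserved output room fits the window."""
    return estimate_tokens(text) + reserved_output_tokens <= context_tokens


# Example: a 40,000-character document against an 8K window and a 1M window.
document = "a" * 40_000
small_window_ok = fits_in_context(document, 8_000, reserved_output_tokens=1_000)
large_window_ok = fits_in_context(document, 1_000_000, reserved_output_tokens=1_000)
```

Reserving room for the model's output matters because prompt and response usually share the same window.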
Balance cost, speed, and quality
Frontier models deliver the highest benchmark scores but cost more per token and respond more slowly. Fast models like Gemini Flash, Llama 3 (8B), and Mistral Small can handle routine tasks at a fraction of the cost with sub-second latency, which makes them a good fit for high-volume applications.
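To make the cost side of this trade-off concrete, here is a minimal sketch of per-request and monthly cost arithmetic. The prices and traffic figures are hypothetical placeholders, not taken from any provider's price list; the only assumption about pricing is that input and output tokens are billed separately per million tokens.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_million: float,
                 output_price_per_million: float) -> float:
    """Cost of one request in dollars, given per-million-token prices."""
    total = (input_tokens * input_price_per_million
             + output_tokens * output_price_per_million)
    return total / 1_000_000


def monthly_cost(cost_per_request: float, requests_per_day: int,
                 days: int = 30) -> float:
    """Scale a per-request cost to a monthly figure."""
    return cost_per_request * requests_per_day * days


# Hypothetical prices: a "frontier" tier at $3 / $15 per million tokens
# and a "fast" tier at $0.10 / $0.40 per million tokens.
frontier = request_cost(2_000, 500, 3.00, 15.00)
fast = request_cost(2_000, 500, 0.10, 0.40)

frontier_monthly = monthly_cost(frontier, requests_per_day=10_000)
fast_monthly = monthly_cost(fast, requests_per_day=10_000)
```

Running the same calculation with your real token counts and current prices shows quickly whether a cheaper tier is worth testing for a given workload.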
Open source vs. proprietary
Open-source models (Llama, Mistral, Qwen, DeepSeek) let you self-host, fine-tune, and inspect weights. Proprietary models (GPT-4o, Claude, Gemini) often lead on benchmarks and offer managed APIs with built-in safety features. Many teams use both: proprietary for peak performance, open source for cost control and customization.
Check for multimodal capabilities
Some models accept images, audio, or files alongside text. If your workflow involves analyzing screenshots, diagrams, or audio transcriptions, filter for models with vision or audio input support. Models with structured output and function calling are essential for building agents and tool-using applications.
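The filtering step described above can be sketched as a simple capability check over catalog entries. The `ModelEntry` shape and the three sample entries are hypothetical and only illustrate the idea; a real catalog would expose its own fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    """Hypothetical catalog record; field names are illustrative only."""
    name: str
    inputs: frozenset
    function_calling: bool = False


def filter_models(catalog, required_inputs, need_function_calling=False):
    """Keep models that accept every required input type (and tools if asked)."""
    required = frozenset(required_inputs)
    return [
        m for m in catalog
        if required <= m.inputs
        and (m.function_calling or not need_function_calling)
    ]


# Placeholder entries, not real models.
catalog = [
    ModelEntry("model-a", frozenset({"text"}), function_calling=True),
    ModelEntry("model-b", frozenset({"text", "image"}), function_calling=True),
    ModelEntry("model-c", frozenset({"text", "image", "audio"})),
]

vision_models = filter_models(catalog, ["image"])
vision_agents = filter_models(catalog, ["image"], need_function_calling=True)
```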
Use benchmarks as a starting point
Scores like GPQA, MMLU Pro, and HLE measure academic knowledge and reasoning. LiveCodeBench and SciCode test practical coding ability. MATH 500 and AIME evaluate mathematical problem-solving. No single benchmark tells the full story - compare scores across categories relevant to your use case, then test with your own prompts.
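One way to compare scores across the categories that matter to you is a weighted average, sketched below. The scores and weights are made-up placeholders, and the weighting scheme itself is an illustrative assumption, not how this catalog ranks models.

```python
def weighted_score(scores: dict, weights: dict) -> float:
    """Weighted average of the benchmark categories named in `weights`."""
    total_weight = sum(weights.values())
    return sum(scores[name] * w for name, w in weights.items()) / total_weight


def rank_models(all_scores: dict, weights: dict) -> list:
    """Return model names sorted from highest to lowest weighted score."""
    return sorted(
        all_scores,
        key=lambda name: weighted_score(all_scores[name], weights),
        reverse=True,
    )


# Placeholder scores on a 0-100 scale; not real benchmark results.
all_scores = {
    "model-a": {"coding": 80, "math": 60},
    "model-b": {"coding": 70, "math": 90},
}

# A coding-heavy use case weights coding 7:3 over math.
coding_heavy = {"coding": 7, "math": 3}
ranking = rank_models(all_scores, coding_heavy)
```

Changing the weights changes the ranking, which is why the final step is still to test the shortlisted models with your own prompts.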
Model catalog, pricing, speed, and benchmark scores are updated regularly. Try any model instantly using the free chat - no API key required.