The best LLMs for your use case:
Qwen's native multimodal MoE model with 397B total parameters and 17B active, featuring hybrid Gated Delta Networks for strong reasoning and vision capabilities.
Speed:
Intelligence:
Price: (1M Tokens)
$0.60 / 3.60
Inputs:
JSON Mode:
Function Calling:
Benchmarks:
MMMU (Multimodal - Vision)
MMLU-Pro (General Knowledge)
Multilingual MMLU (Multilingual)
EQBench (Creative Writing)
LongBenchv2 (Summarization)
BFCL (Agents and Function Calling)
SimpleQA (General Knowledge)
GPQA-Diamond (General Knowledge)
LMArena (Chat)
WebDevArena (Code)
LiveCodeBench (Code)
1T-parameter MoE flagship from Moonshot with long-horizon coding, agent swarms scaling to 300 sub-agents, and state-of-the-art reasoning.
Speed:
Intelligence:
Price: (1M Tokens)
$1.20 / 4.50
Inputs:
JSON Mode:
Function Calling:
Benchmarks:
MMMU (Multimodal - Vision)
GPQA-Diamond (General Knowledge)
EQBench (Creative Writing)
LMArena (Chat)
LiveCodeBench (Code)
MMLU-Pro (General Knowledge)
WebDevArena (Code)
LongBenchv2 (Summarization)
BFCL (Agents and Function Calling)
SimpleQA (General Knowledge)
Use case:
Multimodal - Vision
Features:
Long Context Handling