The best LLMs for your use case:
Hybrid instruct + reasoning model (232Bx22B MoE) optimized for high-throughput, cost-efficient inference and distillation.
Speed:
Intelligence:
Price (1M tokens): $0.20
Inputs:
JSON Mode:
Function Calling:
Benchmarks:
BFCL (Agents and Function Calling)
LiveBench (General Knowledge)
EQBench (Creative Writing)
LiveCodeBench (Code)
Aider Polyglot (Code)
MGSM (Multilingual)
GPQA-Diamond (General Knowledge)
MMLU-Pro (General Knowledge)
LongBenchv2 (Summarization)
Multilingual MMLU (Multilingual)
WebDevArena (Code)
LMArena (Chat)
Decoder-only model built for advanced language processing tasks.
Speed:
Intelligence:
Price (1M tokens): $1.20
Inputs:
JSON Mode:
Function Calling:
Benchmarks:
BFCL (Agents and Function Calling)
MMLU-Pro (General Knowledge)
EQBench (Creative Writing)
LiveCodeBench (Code)
LongBenchv2 (Summarization)
Multilingual MMLU (Multilingual)
LiveBench (General Knowledge)
MGSM (Multilingual)
LMArena (Chat)
Use case:
Agents and Function Calling