The best open LLMs for your use case:

1. Kimi K2.6 (Moonshot)

1T-parameter MoE flagship from Moonshot with long-horizon coding, agent swarms scaling to 300 sub-agents, and state-of-the-art reasoning.

Price (per 1M tokens): $1.20 / $4.50
Cached input (per 1M tokens): $0.20
Context: 262,144 tokens
Inputs: Image, Text

Benchmarks:

SciCode (Coding Agents): 52.2
MCP-Mark (Agents and Function Calling): 55.9
MMMU-Pro (Multimodal - Vision): 79.4
Apex Agents (Agents and Function Calling): 27.9
FrontierCode (Coding Agents): 3.8
LiveCodeBench (Coding Agents): 89.6
Terminal-Bench 2.0 (Coding Agents): 66.7
Claw-Eval (Agents and Function Calling): 62.3

2. MiniMax-M3 (MiniMax)

Next-generation reasoning model from MiniMax with frontier agentic, coding, and multimodal performance. Strong scores on SWE-Bench, BrowseComp, OmniDocBench, and IMO/USAMO competition reasoning.

Price (per 1M tokens): $0.30 / $1.20
Cached input (per 1M tokens): $0.06
Context: 524,288 tokens
Inputs: Image, Text

Benchmarks:

GPQA-Diamond (General Knowledge): 92.9
Video-MME v2 (Multimodal - Vision): 85.4
Claw-Eval (Agents and Function Calling): 74.5
SWE-Bench Verified (Coding Agents): 80.5
SWE-Bench Pro (Coding Agents): score not given
Apex Agents (Agents and Function Calling): 27.7
MMMU-Pro (Multimodal - Vision): 78.1

Use case: Chat
Features: Long Context Handling