The best open LLMs for your use case:

1. Kimi K3 (Moonshot)

Moonshot AI's 2.8-trillion-parameter open-weight frontier model with native vision, 1M-token context, and state-of-the-art agentic coding and tool use.

Price: (1M Tokens)

$3.00 input / $15.00 output

Cached input: (1M Tokens)

$0.30

Context: (tokens)

1,048,576

Inputs:

Image, Text

Benchmarks:

HLE (General Knowledge): 43.5
GPQA-Diamond (General Knowledge): 93.5
MMMU-Pro (Multimodal - Vision): 81.6
Terminal-Bench 2.1 (Coding Agents): 88.3
FrontierSWE (Coding Agents): 81.2
DeepSWE (Coding Agents): no score listed
Program Bench (Coding Agents): 77.8
MCP-Atlas (Agents and Function Calling): 84.2

2. DeepSeek-V4-Pro (DeepSeek)

DeepSeek's frontier 1.6T-parameter Mixture-of-Experts model (49B active per token) with hybrid attention built for long-context, low-cost reasoning. Runs in FP4 with a 512K-token context window.

Price: (1M Tokens)

$1.74 input / $3.48 output

Cached input: (1M Tokens)

$0.20

Context: (tokens)

512,000

Inputs:

Image, Text

Benchmarks:

GPQA-Diamond (General Knowledge): 90.1
HLE (General Knowledge): 37.7
MMLU-Pro (General Knowledge): 87.5
SimpleQA (General Knowledge): 57.9
LiveCodeBench (Coding Agents): 93.5
SWE-Bench Verified (Coding Agents): 80.6
Terminal-Bench 2.0 (Coding Agents): 67.9
GDPval-AA (Agents and Function Calling): 1554
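
The per-token prices listed for the two models can be turned into a rough per-request cost estimate. The sketch below is illustrative only and rests on assumptions that the listing does not state outright: that each price pair is input / output in USD per 1M tokens, and that cached input tokens are billed at the cached rate instead of the regular input rate. The names `Pricing`, `PRICES`, and `estimate_cost` are hypothetical helpers, not part of any provider's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per 1M tokens, copied from the listing."""
    input_per_m: float
    output_per_m: float
    cached_input_per_m: float


# Assumption: each "a / b" price pair in the listing means input / output.
PRICES = {
    "Kimi K3": Pricing(3.00, 15.00, 0.30),
    "DeepSeek-V4-Pro": Pricing(1.74, 3.48, 0.20),
}


def estimate_cost(pricing, input_tokens, output_tokens, cached_tokens=0):
    """Estimate the USD cost of one request.

    Assumption: cached_tokens is the part of input_tokens served from a
    prompt cache and billed at the cached rate instead of the input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * pricing.input_per_m
        + cached_tokens * pricing.cached_input_per_m
        + output_tokens * pricing.output_per_m
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Example workload: 100k input tokens (80k of them cached), 2k output tokens.
    for name, pricing in PRICES.items():
        cost = estimate_cost(pricing, 100_000, 2_000, cached_tokens=80_000)
        print(f"{name}: ${cost:.4f}")
```

Actual billing may differ (minimum charges, separate cache-write fees, or tiered long-context pricing), so treat the result as an order-of-magnitude comparison rather than an invoice.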
