The best LLMs for your use case:

1. Qwen3.5 397B-A17B by Qwen

Qwen's natively multimodal MoE model with 397B total parameters and 17B active per token, featuring hybrid Gated Delta Networks for strong reasoning and vision capabilities.
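
The total-vs-active split is what makes a model this size practical to serve: only the routed experts run for any given token. A rough sketch of the arithmetic, using the card's parameter counts and the common ~2 FLOPs-per-parameter rule of thumb (the flat routing model is a simplification for illustration, not Qwen's published accounting):

```python
# Illustrative only: why 17B active parameters out of 397B total makes
# MoE inference cheap. The 2-FLOPs-per-parameter rule of thumb and the
# assumption that all active parameters are routed expert weights are
# simplifications, not the model's actual layer accounting.
TOTAL_PARAMS = 397e9    # stored in memory (all experts)
ACTIVE_PARAMS = 17e9    # actually used per token (routed experts)

flops_dense = 2 * TOTAL_PARAMS    # a dense 397B model touches every weight
flops_moe = 2 * ACTIVE_PARAMS     # the MoE touches only the routed slice

print(f"per-token compute vs. dense: {flops_moe / flops_dense:.1%}")
# -> ~4.3%: near-flagship capacity at a fraction of the per-token cost
```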

Price (per 1M tokens): $0.60 input / $3.60 output

Inputs: Image, Text

JSON Mode: Supported

Function Calling: Supported
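
Both fields correspond to standard OpenAI-compatible chat-completions parameters. A minimal sketch of exercising each, assuming a hypothetical base_url and model id (substitute your provider's real values) and an invented get_weather tool:

```python
# Sketch of JSON mode and function calling via an OpenAI-compatible API.
# The base_url, api_key, and model id below are placeholders, not real values.
from openai import OpenAI

client = OpenAI(base_url="https://example.com/v1", api_key="YOUR_KEY")

# JSON mode: constrain the reply to valid JSON.
resp = client.chat.completions.create(
    model="qwen3.5-397b-a17b",  # placeholder model id
    messages=[{"role": "user",
               "content": "Return {\"city\": ..., \"country\": ...} for Paris."}],
    response_format={"type": "json_object"},
)
print(resp.choices[0].message.content)

# Function calling: declare a tool and let the model emit a structured call.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for illustration
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]
resp = client.chat.completions.create(
    model="qwen3.5-397b-a17b",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)
print(resp.choices[0].message.tool_calls)
```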

Benchmarks:

#1 Multilingual MMLU (Multilingual): 88.5
#1 GPQA-Diamond (General Knowledge): 88.4
#1 MMLU-Pro (General Knowledge): 87.8
#1 LongBench v2 (Summarization): 63.2
#1 MMMU (Multimodal - Vision): 85
#2 BFCL (Agents and Function Calling): 72.9
#3 LMArena (Chat): 1447
#3 LiveCodeBench (Code): 83.6

2. Llama 4 Maverick (17Bx128E) by Meta

Meta's 128-expert MoE model with 17B active parameters, built for multilingual image/text understanding, creative writing, and enterprise-scale applications.
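
Image/text understanding here means the model accepts mixed-content messages. A minimal sketch in the OpenAI-compatible chat format, assuming a hypothetical endpoint, model id, and image URL:

```python
# Sketch of a multimodal (image + text) request in the OpenAI-compatible
# chat format. The base_url, model id, and image URL are placeholders.
from openai import OpenAI

client = OpenAI(base_url="https://example.com/v1", api_key="YOUR_KEY")

resp = client.chat.completions.create(
    model="llama-4-maverick-17b-128e",  # placeholder model id
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this chart in one sentence."},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/chart.png"}},
        ],
    }],
)
print(resp.choices[0].message.content)
```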

Price (per 1M tokens): $0.27

Inputs: Image, Text

JSON Mode: Supported

Function Calling: Supported
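
With prices for both models on the table, a back-of-envelope comparison is straightforward. A sketch, assuming the Qwen figure above is input/output pricing, that Maverick's single listed figure applies to all tokens, and an invented workload of 1M input and 200k output tokens:

```python
# Back-of-envelope cost comparison using the per-1M-token prices on the
# cards above. Assumptions: Qwen's figure is split input/output pricing,
# Maverick's single figure is a flat rate, and the workload is invented.
QWEN_IN, QWEN_OUT = 0.60, 3.60   # $ per 1M tokens (input, output)
MAVERICK_FLAT = 0.27             # $ per 1M tokens (as listed)

input_tokens, output_tokens = 1_000_000, 200_000

qwen_cost = (input_tokens / 1e6) * QWEN_IN + (output_tokens / 1e6) * QWEN_OUT
maverick_cost = ((input_tokens + output_tokens) / 1e6) * MAVERICK_FLAT

print(f"Qwen3.5 397B-A17B: ${qwen_cost:.2f}")     # -> $1.32
print(f"Llama 4 Maverick:  ${maverick_cost:.2f}")  # -> $0.32
```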

Benchmarks:

#2 Multilingual MMLU (Multilingual): 84.6
#4 MMMU (Multimodal - Vision): 73.4
#10 GPQA-Diamond (General Knowledge): 69.8
#11 WebDev Arena (Code): 1015
#12 EQBench (Creative Writing): 628.6
#14 LMArena (Chat): 1269
#15 BFCL (Agents and Function Calling): 53.32
#18 LiveCodeBench (Code): 43.4
#20 MMLU-Pro (General Knowledge): 62.9

Use case: Multilingual