The best LLMs for your use case:

1. Qwen3.5 9B (Qwen)

Compact 9B dense model from Qwen that scores well for its size on knowledge and coding benchmarks, at a low per-token price.

Speed:

Intelligence:

Price (per 1M tokens):

$0.10 / $0.15

Inputs:

Image, Text

JSON Mode:

Function Calling:

Benchmarks:

Rank | Benchmark | Category | Score
#3 | EQBench | Creative Writing | 1210.1
#3 | Multilingual MMLU | Multilingual | 75
#4 | LongBenchv2 | Summarization | 48.9
#4 | BFCL | Agents and Function Calling | 65
#5 | MMLU-Pro | General Knowledge | 82.5
#6 | GPQA-Diamond | General Knowledge | 81.7
#6 | LMArena | Chat | 1313
#6 | LiveCodeBench | Code | 82.7
#7 | SimpleQA | General Knowledge | 18

2. GPT-OSS 120B (OpenAI)

OpenAI's open-source 120B-parameter model, which uses MXFP4 quantization for efficient inference.

Speed:

Intelligence:

Price (per 1M tokens):

$0.15 / $0.60

Inputs:

Image, Text

JSON Mode:

Function Calling:

Benchmarks:

Rank | Benchmark | Category | Score
#4 | EQBench | Creative Writing | 1152
#2 | Multilingual MMLU | Multilingual | 79.3
#4 | LMArena | Chat | 1355
#5 | WebDevArena | Code | 1090
#7 | GPQA-Diamond | General Knowledge | 73.1
#8 | SimpleQA | General Knowledge | 16.8

Use case:

Creative Writing

Features:

Low Latency