Active filters: quantllm
codewithdark/Llama-3.2-3B-4bit • 3B
codewithdark/Llama-3.2-3B-GGUF-4bit • 3B
codewithdark/Llama-3.2-3B-4bit-mlx • Text Generation • 3B
QuantLLM/Llama-3.2-3B-4bit-mlx • Text Generation • 3B
QuantLLM/Llama-3.2-3B-2bit-mlx • Text Generation • 3B
QuantLLM/Llama-3.2-3B-8bit-mlx • Text Generation • 3B
QuantLLM/Llama-3.2-3B-5bit-mlx • Text Generation • 3B
QuantLLM/Llama-3.2-3B-5bit-gguf • 3B
QuantLLM/Llama-3.2-3B-2bit-gguf • 3B
QuantLLM/functiongemma-270m-it-8bit-gguf • 0.3B
QuantLLM/functiongemma-270m-it-4bit-gguf • 0.3B
QuantLLM/functiongemma-270m-it-4bit-mlx • Text Generation • 0.3B
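The bit-width suffixes in the repo names above (2bit, 4bit, 5bit, 8bit) denote weight quantization. As an illustration only, here is a minimal sketch of symmetric round-to-nearest k-bit quantization; the actual GGUF and MLX quantization schemes are more elaborate (block-wise scales, mixed formats), so this is not their exact algorithm:

```python
def quantize(weights, bits):
    # Map floats to signed k-bit integers in [-qmax, qmax]
    # using a single per-tensor scale (symmetric quantization).
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in weights) / qmax
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Recover approximate float weights from the integers.
    return [x * scale for x in q]

w = [0.12, -0.5, 0.33, 0.9]
q4, s4 = quantize(w, 4)      # 4-bit: integers in [-7, 7]
approx = dequantize(q4, s4)  # lossy reconstruction of w
```

Lower bit widths (e.g. the 2bit variants listed) shrink the file further but reconstruct the weights less accurately, which is the size/quality trade-off these repos expose.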