nm-testing/Llama-3.3-70B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Tensor
nm-testing/Qwen3-32B-FP8-dynamic-QKV-Cache-FP8-Per-Tensor
nm-testing/Qwen3-32B-FP8-dynamic-QKV-Cache-FP8-Per-Head
nm-testing/Qwen3-32B-QKV-Cache-FP8-Per-Tensor
nm-testing/Qwen3-32B-QKV-Cache-FP8-Per-Head
nm-testing/Llama-3.1-8B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Tensor
nm-testing/Llama-3.1-8B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Head
nm-testing/Llama-3.1-8B-Instruct-QKV-Cache-FP8-Per-Tensor
nm-testing/Llama-3.1-8B-Instruct-QKV-Cache-FP8-Per-Head
nm-testing/DeepSeek-R1-Distill-Qwen-32B-NVFP4
Text Generation • 19B • 2.23k • 1
nm-testing/tinysmokeqwen3moe-W4A16-first-only
2.54M • 1
nm-testing/tinysmokeqwen3moe
2.93M • 3
nm-testing/Meta-Llama-3-8B-Instruct-MXFP4
5B • 78
nm-testing/granite-4.0-h-small-FP8-block
Text Generation • 32B • 14
nm-testing/Llama-3.1-8B-Instruct-QKV-Cache-FP8
8B
nm-testing/Llama3_2_1B_speculator.eagle3
0.4B • 87.4k
nm-testing/Llama-3.1-8B-Instruct-KV-Cache-FP8
nm-testing/TinyLlama-1.1B-Chat-v1.0-NVFP4-test132
0.7B
nm-testing/TinyLlama-1.1B-Chat-v1.0-awq-asym-test-awq-asym
0.3B
nm-testing/TinyLlama-1.1B-Chat-v1.0-NVFP4-1105
nm-testing/TinyLlama-1.1B-Chat-v1.0-NVFP4-test011
nm-testing/TinyLlama-1.1B-Chat-v1.0-NVFP4-test
nm-testing/Kimi-Linear-48B-A3B-Instruct-FP8-DYNAMIC
49B • 11
nm-testing/llama2.c-stories42M-pruned2.4
• 971
nm-testing/gpt-oss-20B.eagle3.unconverted-drafter
nm-testing/random-weights-llama3.1.8b-2layer-eagle3-unconverted
• 79
nm-testing/Llama-4-Scout-17B-16E-Instruct-BLOCK-FP8
Text Generation • 109B • 1
nm-testing/Llama-4-Maverick-17B-128E-Instruct-block-FP8
Text Generation • 2
nm-testing/Qwen3-VL-235B-A22B-Instruct-FP8-BLOCK
Text Generation