πŸ¦™ Llama-CPP-Python Pre-built Wheels (Python 3.13)

The solution for Hugging Face "Build Timeout" errors on the Free CPU Tier.

If you are using Python 3.13 on a Hugging Face Free Space, compiling llama-cpp-python from source usually crashes or times out. This repository provides pre-compiled manylinux wheels that install in seconds.


πŸš€ Why use these wheels?

  • No Compilation: Skips the 15+ minute build process.
  • Python 3.13 Support: Specifically built for the latest Python version.
  • Generic CPU Build: Compiled with GGML_NATIVE=OFF, so the binary avoids host-specific CPU instructions and runs on HF's shared CPUs without "Illegal Instruction" crashes or core dumps.
  • Lightweight: The wheel is only ~4.3 MB, versus the time and disk overhead of a full source build.

πŸ› οΈ How to use in your HF Space

Option A: Using requirements.txt

Simply paste this direct link into your requirements.txt file:

https://huggingface.co/James040/llama-cpp-python-wheels/resolve/main/llama_cpp_python-0.3.16-cp313-cp313-linux_x86_64.whl


Option B: Using a Dockerfile
If you are using a custom Docker setup, add this line:
RUN pip install https://huggingface.co/James040/llama-cpp-python-wheels/resolve/main/llama_cpp_python-0.3.16-cp313-cp313-linux_x86_64.whl
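If pip rejects the wheel with a "not a supported wheel on this platform" error, your Space is probably not running CPython 3.13 on x86_64. A quick sanity check (the `wheel_matches` helper below is a hypothetical sketch, not part of this repo):

```python
import platform
import sys


def wheel_matches(python_tag: str, machine: str) -> bool:
    """Check whether this interpreter matches a wheel's Python tag and CPU arch.

    A simplified check: compares only the CPython version tag (e.g. "cp313")
    and the machine architecture (e.g. "x86_64"), ignoring ABI/platform tags.
    """
    runtime_tag = f"cp{sys.version_info.major}{sys.version_info.minor}"
    return runtime_tag == python_tag and platform.machine() == machine


# The wheel in this repo is tagged cp313 / x86_64:
compatible = wheel_matches("cp313", "x86_64")
```

If `compatible` is False, either switch the Space's Python version to 3.13 or fall back to building llama-cpp-python from source.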


πŸ“¦ Build Specifications

These wheels were built with an automated pipeline on GitHub.

| Specification  | Value                          |
|----------------|--------------------------------|
| Python Version | 3.13                           |
| Platform       | Linux x86_64 (manylinux)       |
| Build Flags    | GGML_NATIVE=OFF, GGML_BLAS=OFF |
| Build Source   | Jameson040/my_lama-wheels      |
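For reference, a roughly equivalent build can be reproduced locally by passing the same flags through `CMAKE_ARGS`, which is the standard llama-cpp-python mechanism for configuring the native build (the actual pipeline configuration may differ in detail):

```shell
# Force a source build with generic (non-native) CPU code paths,
# matching the GGML_NATIVE=OFF / GGML_BLAS=OFF flags above.
CMAKE_ARGS="-DGGML_NATIVE=OFF -DGGML_BLAS=OFF" \
  pip install llama-cpp-python==0.3.16 --no-binary llama-cpp-python
```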