---
license: mit
tags:
- llama-cpp-python
- cuda
- nvidia
- blackwell
- windows
- prebuilt-wheels
- python
- machine-learning
- large-language-models
- gpu-acceleration
---
| | |
# llama-cpp-python 0.3.9 Prebuilt Wheel with CUDA Support for Windows

This repository provides a prebuilt Python wheel for **llama-cpp-python** (version 0.3.9) with NVIDIA CUDA support for Windows 10/11 (x64) systems. The wheel enables GPU-accelerated inference for large language models (LLMs) via the `llama.cpp` library and eliminates the need to compile from source. It targets Python 3.10 and supports NVIDIA GPUs up to and including the latest Blackwell architecture.

## Available Wheel
- `llama_cpp_python-0.3.9-cp310-cp310-win_amd64.whl` (Python 3.10, CUDA 12.8)

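The wheel filename encodes its compatibility tags (`cp310` for CPython 3.10, `win_amd64` for 64-bit Windows). As a quick sanity check before installing, those tags can be parsed and compared against the running interpreter. The helper below is an illustrative sketch, not part of pip:

```python
import sys

WHEEL = "llama_cpp_python-0.3.9-cp310-cp310-win_amd64.whl"

def wheel_tags(filename):
    """Split a wheel filename into (name, version, python_tag, abi_tag, platform_tag)."""
    stem = filename[: -len(".whl")]
    name, version, py_tag, abi_tag, plat_tag = stem.split("-")
    return name, version, py_tag, abi_tag, plat_tag

def matches_interpreter(filename):
    """Return True if this wheel's Python and platform tags fit the current runtime."""
    _, _, py_tag, _, plat_tag = wheel_tags(filename)
    want_py = f"cp{sys.version_info.major}{sys.version_info.minor}"
    # win_amd64 requires 64-bit CPython on Windows.
    is_win64 = sys.platform == "win32" and sys.maxsize > 2**32
    return py_tag == want_py and (plat_tag != "win_amd64" or is_win64)
```

If `matches_interpreter(WHEEL)` is `False`, pip will refuse the wheel with a "not a supported wheel on this platform" error, so checking first can save a download.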
## Compatibility
The prebuilt wheel targets NVIDIA Blackwell GPUs but has also been tested and confirmed working on previous-generation NVIDIA GPUs. Verified cards include:
- NVIDIA RTX 5090 (Blackwell)
- NVIDIA RTX 3090 (Ampere)

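The CUDA 12.8 build matters here: it is the first CUDA toolkit release that can target Blackwell's compute capability 12.0 while still supporting older architectures such as Ampere. The snippet below is a hand-written illustrative mapping of the tested GPUs to their compute capabilities, not an NVIDIA API:

```python
# Illustrative mapping of the GPUs listed above to CUDA compute capability.
# Hand-maintained sketch for documentation purposes only.
COMPUTE_CAPABILITY = {
    "NVIDIA RTX 5090": (12, 0),  # Blackwell
    "NVIDIA RTX 3090": (8, 6),   # Ampere
}

def covered_by_this_build(gpu_name):
    """Rough check: this CUDA 12.8 wheel covers Ampere (8.x) through Blackwell (12.x)."""
    major, _minor = COMPUTE_CAPABILITY[gpu_name]
    return 8 <= major <= 12
```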
## Installation
To install the wheel, run the following command in your Python 3.10 environment:

```bash
pip install llama_cpp_python-0.3.9-cp310-cp310-win_amd64.whl