Installing the LLM Serving Engine vLLM on Windows
Prerequisites
An NVIDIA GPU is required, and the CUDA toolkit must be installed.
vllm-windows
GitHub repository: https://github.com/SystemPanic/vllm-windows
Install Dependencies
torch
pip install torch==2.11.0+cu126 torchvision==0.26.0+cu126 torchaudio==2.11.0+cu126 --index-url https://download.pytorch.org/whl/cu126
llguidance xgrammar
pip install llguidance xgrammar
wheel
Download the .whl file from the vllm-windows Releases page and install it:
pip install vllm-0.19.0+cu124-cp312-cp312-win_amd64.whl --extra-index-url https://download.pytorch.org/whl/nightly/cu126
vllm
pip install vllm
Start the Service
vllm serve Qwen/Qwen2.5-1.5B-Instruct
Specifying parameters
vllm serve Qwen/Qwen2.5-1.5B-Instruct --gpu-memory-utilization 0.7 --max-model-len 4096 --max-num-seqs 4
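Once the server is up, it exposes an OpenAI-compatible API on http://localhost:8000 by default. A minimal sketch of a chat request follows; the helper name `build_chat_request` is ours, and the port and model name assume the defaults from the serve command above:

```python
import json
import urllib.request

# Build an OpenAI-compatible chat-completion request for the local vLLM server.
# The base URL (port 8000) and model name follow the serve command above;
# adjust both if you changed them.
def build_chat_request(prompt: str,
                       model: str = "Qwen/Qwen2.5-1.5B-Instruct",
                       base_url: str = "http://localhost:8000/v1"):
    url = f"{base_url}/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    return url, payload

# To actually send the request once "vllm serve" is running:
# url, payload = build_chat_request("Hello!")
# req = urllib.request.Request(url, data=json.dumps(payload).encode(),
#                              headers={"Content-Type": "application/json"})
# print(urllib.request.urlopen(req).read().decode())
```
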
Possible Errors
Insufficient GPU memory
ValueError: Free memory on device cuda:0 (6.89/8.0 GiB) on startup is less than desired GPU memory utilization (0.9, 7.2 GiB). Decrease GPU memory utilization or reduce GPU memory used by other processes.
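The numbers in the message explain the failure: with the default --gpu-memory-utilization of 0.9, vLLM tries to reserve 0.9 × 8.0 = 7.2 GiB, but only 6.89 GiB is free at startup. A quick arithmetic check of what fits, using the values from the error above:

```python
# Values taken from the error message above.
total_gib = 8.0    # total VRAM on cuda:0
free_gib = 6.89    # free VRAM at startup

def required_gib(utilization: float, total: float = total_gib) -> float:
    """VRAM vLLM will try to reserve for a given --gpu-memory-utilization."""
    return utilization * total

# The default 0.9 does not fit; 0.7 (as in the serve command above) does.
assert required_gib(0.9) > free_gib   # 7.2 GiB needed, only 6.89 free
assert required_gib(0.7) < free_gib   # 5.6 GiB fits
```

Lowering --gpu-memory-utilization (or closing other processes that hold VRAM) is the fix the error itself suggests.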
CUDA not installed
Download CUDA from: https://developer.nvidia.com/cuda-toolkit-archive
ValueError: CUDA_LIB_PATH is not set. CUDA_LIB_PATH need to be set with the absolute path to CUDA root folder on Windows (for example, set CUDA_LIB_PATH=C:\CUDA\v12.4)
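This error appears when the CUDA_LIB_PATH variable is missing. You can set it per session in cmd with `set CUDA_LIB_PATH=...` (as the message suggests), or set it from Python before vLLM starts. A sketch, where the path is the example from the error message and must be replaced with your actual CUDA root:

```python
import os

# Set CUDA_LIB_PATH for the current process before vLLM is imported/started.
# The path below is the example from the error message; replace it with the
# root folder of your actual CUDA installation.
def set_cuda_lib_path(cuda_root: str) -> str:
    os.environ["CUDA_LIB_PATH"] = cuda_root
    return os.environ["CUDA_LIB_PATH"]

set_cuda_lib_path(r"C:\CUDA\v12.4")
```

To make the variable permanent, run `setx CUDA_LIB_PATH "C:\CUDA\v12.4"` in a command prompt or add it under System Properties → Environment Variables.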
Posted: 2026-05-11