Fixing the "no kernel image" error on GTX 10-series GPUs

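Today I wanted to see how well a P104-100 handles AI image generation, and ran into a PyTorch error. The fix turned out to be fairly representative of a whole class of problems, so here is a short write-up.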

Reproducing the problem

I tried running ComfyUI on the P104-100 and got this error:

torch.AcceleratorError: CUDA error: no kernel image is available for execution on the device
Search for `cudaErrorNoKernelImageForDevice' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

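This puzzled me: both cards were on the same driver version (the 580 series), yet the V100 ran fine while the P104-100 failed.
So I ran a quick check of the GPU's compute capability and got the following: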

import torch
print(torch.cuda.get_device_properties(0))
/home/rin/.local/lib/python3.12/site-packages/torch/cuda/__init__.py:283: UserWarning: 
    Found GPU0 NVIDIA P104-100 which is of cuda capability 6.1.
    Minimum and Maximum cuda capability supported by this version of PyTorch is
    (7.0) - (12.0)
    
  warnings.warn(
/home/rin/.local/lib/python3.12/site-packages/torch/cuda/__init__.py:304: UserWarning: 
    Please install PyTorch with a following CUDA
    configurations:  12.6 following instructions at
    https://pytorch.org/get-started/locally/
    
  warnings.warn(matched_cuda_warn.format(matched_arches))
/home/rin/.local/lib/python3.12/site-packages/torch/cuda/__init__.py:283: UserWarning: 
    Found GPU1 NVIDIA P104-100 which is of cuda capability 6.1.
    Minimum and Maximum cuda capability supported by this version of PyTorch is
    (7.0) - (12.0)
    
  warnings.warn(
/home/rin/.local/lib/python3.12/site-packages/torch/cuda/__init__.py:326: UserWarning: 
NVIDIA P104-100 with CUDA capability sm_61 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_70 sm_75 sm_80 sm_86 sm_90 sm_100 sm_120.
If you want to use the NVIDIA P104-100 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

  warnings.warn(
_CudaDeviceProperties(name='NVIDIA P104-100', major=6, minor=1, total_memory=8109MB, multi_processor_count=15, uuid=eb87a003-8cda-0e4c-b06d-768185c4dc13, pci_bus_id=1, pci_device_id=0, pci_domain_id=0, L2_cache_size=2MB)

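That settles it. The V100 works because its compute capability is 7.0, which only just makes the cutoff (sm_70 is the lowest architecture this PyTorch build supports), while the Pascal architecture behind the 10-series cards has compute capability 6.1 and has just been dropped. The fix is straightforward: find the last PyTorch version that still supports compute capability 6.1.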
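
The comparison above can be sketched as a small standalone check. This is a simplified illustration, not PyTorch's actual internal logic: the helper names `parse_arch` and `build_covers_device` are hypothetical, and the matching rule assumes that an `sm_XY` binary runs on a device with the same major version and an equal or higher minor version, and that a `compute_XY` (PTX) entry can be JIT-compiled for any device at or above that capability.

```python
import re


def parse_arch(arch):
    """Split an entry such as 'sm_61' or 'compute_120' into (kind, major, minor)."""
    match = re.fullmatch(r"(sm|compute)_(\d+)[a-z]?", arch)
    if match is None:
        raise ValueError(f"unrecognized architecture string: {arch!r}")
    number = int(match.group(2))
    # The last digit is the minor version: 61 -> (6, 1), 120 -> (12, 0).
    # A trailing letter (for example 'sm_90a') is ignored in this simplified sketch.
    return match.group(1), number // 10, number % 10


def build_covers_device(capability, arch_list):
    """Return True if some entry in arch_list can run on a device with this capability."""
    dev_major, dev_minor = capability
    for arch in arch_list:
        kind, major, minor = parse_arch(arch)
        # Assumption: compiled binaries are compatible within one major version only.
        if kind == "sm" and major == dev_major and minor <= dev_minor:
            return True
        # Assumption: PTX can be JIT-compiled for any device that is at least as new.
        if kind == "compute" and (major, minor) <= (dev_major, dev_minor):
            return True
    return False


# Architecture list copied from the warning printed above.
arches = ["sm_70", "sm_75", "sm_80", "sm_86", "sm_90", "sm_100", "sm_120"]
print(build_covers_device((6, 1), arches))  # P104-100 -> False
print(build_covers_device((7, 0), arches))  # V100 -> True
```

On a real machine, the two inputs could come from `torch.cuda.get_device_capability(0)` and `torch.cuda.get_arch_list()` instead of being typed in by hand.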

Fixing the error

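After some digging, I learned that PyTorch 2.7.1 is the last version that supports the Pascal architecture, so all we need to do is uninstall the current PyTorch and downgrade to 2.7.1.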

pip3.12 uninstall torch torchvision torchaudio
pip3.12 install torch==2.7.1 torchvision==0.22.1 torchaudio==2.7.1
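
To confirm that the downgrade actually took effect in the interpreter ComfyUI uses, something like the following could be run with the same Python 3.12. It is only a sketch: `release_of` and `check_pins` are hypothetical helper names, and the expected versions are simply the ones pinned in the command above.

```python
from importlib import metadata

# Versions pinned by the pip command above.
EXPECTED = {"torch": "2.7.1", "torchvision": "0.22.1", "torchaudio": "2.7.1"}


def release_of(version):
    """Drop a local build tag such as '+cu126' from a version string."""
    return version.split("+", 1)[0]


def check_pins(expected=EXPECTED, lookup=metadata.version):
    """Map each package name to (installed release or None, matches the pin?)."""
    report = {}
    for name, wanted in expected.items():
        try:
            found = release_of(lookup(name))
        except metadata.PackageNotFoundError:
            found = None  # package is not installed in this environment
        report[name] = (found, found == wanted)
    return report


for name, (found, ok) in check_pins().items():
    print(f"{name}: installed={found} expected={EXPECTED[name]} ok={ok}")
```

The `lookup` parameter exists only so the check can be exercised without the packages installed; in normal use the default reads the installed package metadata.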

Then run print(torch.cuda.get_device_properties(0)) again to verify; the warnings should be gone:

import torch
print(torch.cuda.get_device_properties(0))
_CudaDeviceProperties(name='NVIDIA P104-100', major=6, minor=1, total_memory=8109MB, multi_processor_count=15, uuid=eb87a003-8cda-0e4c-b06d-768185c4dc13, L2_cache_size=2MB)
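
Reading the device properties does not launch any GPU kernel, whereas the original error only appears when a kernel actually runs. As an extra check, a tiny computation on the GPU can be attempted. This is a minimal sketch: `cuda_smoke_test` is a hypothetical helper, and it assumes the failure surfaces as a `RuntimeError` (or a subclass of it).

```python
def cuda_smoke_test(device_index=0):
    """Run one tiny computation on the GPU and report what happened."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    if not torch.cuda.is_available():
        return "no CUDA device visible"
    try:
        # Allocating and multiplying on the device launches real kernels.
        x = torch.ones(1024, device=f"cuda:{device_index}")
        # .item() copies the result back to the host, which forces a sync
        # so that asynchronous kernel errors are raised here.
        total = (x * 2).sum().item()
    except RuntimeError as exc:
        return f"kernel launch failed: {exc}"
    return "ok" if total == 2048.0 else f"unexpected result: {total}"


print(cuda_smoke_test())
```

If an unsupported build is still installed, the "no kernel image" message would be expected to show up in the returned string.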

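If the properties print cleanly like this, with no capability warnings, ComfyUI should run normally too.
Finally, for reference, here is my software environment: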

  • Python 3.12
  • ComfyUI 0.3.76