RuntimeError: CUDA error: no kernel image is available for execution on the device

kugancity 2022. 9. 11. 19:28

After swapping in an RTX 3090 and installing PyTorch, I hit the error below.

 

 

$ python3
Python 3.8.10 (default, Jun 22 2022, 20:18:18)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.rand(10).to("cuda")
/home/.local/lib/python3.8/site-packages/torch/cuda/__init__.py:146: UserWarning:
NVIDIA GeForce RTX 3090 with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
If you want to use the NVIDIA GeForce RTX 3090 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

...
    nonzero_finite_vals = torch.masked_select(tensor_view, torch.isfinite(tensor_view) & tensor_view.ne(0))
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

 

A quick search turned up the advice to install a CUDA 11 build of PyTorch instead of the CUDA 10 build, which is not compatible with the 3090.

 

You’ve most likely installed the binaries with the CUDA10.2 runtime, which is incompatible with your 3090. Install the pip wheels or conda binaries with CUDA11 and it should work.
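The warning above already contains both pieces of information needed to spot the mismatch: the device capability (sm_86) and the list of architectures the installed binary was built for. The following is a minimal sketch of that comparison, assuming a simplified rule (a binary built for the same major architecture with an equal or lower minor version can run on the device) and ignoring PTX JIT fallback. The helper names parse_sm, is_supported and check_local_install are hypothetical; torch.cuda.get_device_capability and torch.cuda.get_arch_list are the PyTorch calls it relies on when torch is present.

```python
import re


def parse_sm(arch):
    """Return (major, minor) for a string like 'sm_86', or None for other entries."""
    match = re.fullmatch(r"sm_(\d+)(\d)", arch)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_supported(capability, arch_list):
    """Simplified check (an assumption, not PyTorch's exact logic):
    some sm_XY entry must share the device's major version and have
    a minor version no higher than the device's."""
    dev_major, dev_minor = capability
    for arch in arch_list:
        parsed = parse_sm(arch)
        if parsed is None:
            continue  # skip entries such as 'compute_80'
        major, minor = parsed
        if major == dev_major and minor <= dev_minor:
            return True
    return False


def check_local_install():
    """Run the same check against the local PyTorch install, if there is one."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    capability = torch.cuda.get_device_capability(0)
    return is_supported(capability, torch.cuda.get_arch_list())


# Values copied from the warning: an sm_86 device, a build for sm_37..sm_70.
print(is_supported((8, 6), ["sm_37", "sm_50", "sm_60", "sm_70"]))
```

Calling check_local_install() on the affected machine should give the same answer for the real install, provided the installed PyTorch is recent enough to expose torch.cuda.get_arch_list.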

 

For reference, the version of the system-wide CUDA toolkit can be checked with nvcc --version.

On this machine it is indeed 10.1.

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
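If the toolkit version needs to be read from a script rather than by eye, the release line of the output can be parsed. This is a rough sketch that assumes the output keeps the "release X.Y" wording shown above; parse_nvcc_release and system_toolkit_release are hypothetical helper names.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Extract (major, minor) from the 'release X.Y' part of nvcc output."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def system_toolkit_release():
    """Run `nvcc --version` if nvcc is on PATH; return (major, minor) or None."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run(
        [nvcc, "--version"], capture_output=True, text=True, check=False
    )
    return parse_nvcc_release(result.stdout)


# The release line from the output above.
sample = "Cuda compilation tools, release 10.1, V10.1.243"
print(parse_nvcc_release(sample))
```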

 

Further searching showed that it is enough to append a +cu110 suffix (the CUDA 11.0 build) to the version being installed, as below.


pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 -f https://download.pytorch.org/whl/torch_stable.html

(base) $ pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 -f https://download.pytorch.org/whl/torch_stable.html
Looking in links: https://download.pytorch.org/whl/torch_stable.html
Collecting torch==1.7.1+cu110
  Downloading https://download.pytorch.org/whl/cu110/torch-1.7.1%2Bcu110-cp39-cp39-linux_x86_64.whl (1156.7 MB)
     |████████████████████████████████| 1156.7 MB 73 kB/s
Collecting torchvision==0.8.2+cu110
  Downloading https://download.pytorch.org/whl/cu110/torchvision-0.8.2%2Bcu110-cp39-cp39-linux_x86_64.whl (12.8 MB)
     |████████████████████████████████| 12.8 MB 38.1 MB/s
Requirement already satisfied: numpy in ./anaconda3/lib/python3.9/site-packages (from torch==1.7.1+cu110) (1.21.5)
Requirement already satisfied: typing-extensions in ./anaconda3/lib/python3.9/site-packages (from torch==1.7.1+cu110) (4.1.1)
Requirement already satisfied: pillow>=4.1.1 in ./anaconda3/lib/python3.9/site-packages (from torchvision==0.8.2+cu110) (9.0.1)
Installing collected packages: torch, torchvision
Successfully installed torch-1.7.1+cu110 torchvision-0.8.2+cu110
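The +cu110 part of the version string is what selects the CUDA build. As a small illustration, the sketch below splits such a version string and turns the tag into a CUDA version. It assumes the tag convention seen in this post and in similar wheels (cu102 for 10.2, cu110 for 11.0), where the last digit is the minor version; split_torch_version and cuda_from_tag are hypothetical helper names.

```python
import re


def split_torch_version(version):
    """Split '1.7.1+cu110' into ('1.7.1', 'cu110'); the tag is None if absent."""
    base, _, tag = version.partition("+")
    return base, (tag or None)


def cuda_from_tag(tag):
    """Map 'cu110' to (11, 0). Returns None for None or non-CUDA tags like 'cpu'.

    Assumes the last digit of the tag is the minor version."""
    if tag is None:
        return None
    match = re.fullmatch(r"cu(\d+)(\d)", tag)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


base, tag = split_torch_version("1.7.1+cu110")
print(base, tag, cuda_from_tag(tag))
```

The same helpers can be applied to torch.__version__ after installation to confirm which build actually ended up in the environment.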

 

The error is now gone.

 

$ python3
Python 3.9.12 (main, Apr  5 2022, 06:56:58)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
True
>>> torch.randn(3,3).to("cuda:0")
tensor([[-0.9963, -1.6286,  0.9210],
        [-1.0565, -1.7602,  1.5008],
        [-0.0684,  1.7714,  0.2132]], device='cuda:0')
>>>
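One caveat when verifying: torch.cuda.is_available() on its own does not prove the build matches the GPU, since in the failing session the error only surfaced once a kernel actually ran. The sketch below is one possible smoke test under that assumption: it runs a tiny computation on the GPU, forces synchronization so that an asynchronous error shows up here, and reports which CUDA runtime the PyTorch build carries. The function name cuda_smoke_test and the shape of the returned dict are hypothetical.

```python
def cuda_smoke_test():
    """Run one small GPU computation and report the result as a dict."""
    try:
        import torch
    except ImportError:
        return {"ok": False, "reason": "torch is not installed"}

    if not torch.cuda.is_available():
        return {"ok": False, "reason": "CUDA is not available to this PyTorch build"}

    try:
        x = torch.randn(3, 3, device="cuda:0")
        # .item() copies the result to the host, which forces the kernels to finish.
        (x @ x).sum().item()
        torch.cuda.synchronize()
    except RuntimeError as exc:
        # e.g. "no kernel image is available for execution on the device"
        return {"ok": False, "reason": str(exc)}

    return {
        "ok": True,
        "torch": torch.__version__,
        "cuda_runtime": torch.version.cuda,  # CUDA version the build was compiled with
        "device": torch.cuda.get_device_name(0),
        "capability": torch.cuda.get_device_capability(0),
    }


print(cuda_smoke_test())
```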

 

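Note that the version reported by nvcc --version does not change. The pip wheel bundles its own CUDA runtime and leaves the system toolkit untouched; the CUDA version that PyTorch itself was built with can be read from torch.version.cuda.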

 

(base) $ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243

 

References: https://ssaru.github.io/2021/05/05/20210505-til_install_rtx3090_supported_pytorch/

https://discuss.pytorch.org/t/nvidia-geforce-rtx-3090-with-cuda-capability-sm-86-is-not-compatible-with-the-current-pytorch-installation/141940/8
