
  • RuntimeError: CUDA error: no kernel image is available for execution on the device
    After swapping the GPU for an RTX 3090 and installing PyTorch, the following error occurred.



    $ python3
    Python 3.8.10 (default, Jun 22 2022, 20:18:18)
    [GCC 9.4.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import torch
    >>> torch.rand(10).to("cuda")
    /home/.local/lib/python3.8/site-packages/torch/cuda/__init__.py:146: UserWarning:
    NVIDIA GeForce RTX 3090 with CUDA capability sm_86 is not compatible with the current PyTorch installation.
    The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
    If you want to use the NVIDIA GeForce RTX 3090 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
        nonzero_finite_vals = torch.masked_select(tensor_view, torch.isfinite(tensor_view) & tensor_view.ne(0))
    RuntimeError: CUDA error: no kernel image is available for execution on the device
    CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.


    A search turned up the advice to install a CUDA 11 build of PyTorch instead of the CUDA 10 build, which is not compatible with the 3090.


    You’ve most likely installed the binaries with the CUDA10.2 runtime, which is incompatible with your 3090. Install the pip wheels or conda binaries with CUDA11 and it should work.
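
The warning in the log comes down to whether the installed build ships kernels for the GPU's architecture. A minimal sketch of that check follows. It assumes the simplified rule that a binary built for `sm_Xy` runs on a device of capability X.z when z >= y, and it ignores PTX (`compute_XX`) entries. The helper names `parse_arch` and `build_supports` are made up for this note and are not part of PyTorch.

```python
def parse_arch(tag):
    """Turn 'sm_86' into (8, 6); return None for anything else, e.g. 'compute_80'."""
    if not tag.startswith("sm_"):
        return None
    digits = tag[len("sm_"):]
    if not digits.isdigit() or len(digits) < 2:
        return None
    # Last digit is the minor revision, the rest is the major architecture.
    return int(digits[:-1]), int(digits[-1])


def build_supports(capability, arch_list):
    """Simplified check: same major architecture, build minor <= device minor."""
    major, minor = capability
    for tag in arch_list:
        arch = parse_arch(tag)
        if arch is not None and arch[0] == major and arch[1] <= minor:
            return True
    return False


if __name__ == "__main__":
    # Values taken from the warning above: an sm_86 device and an sm_37..sm_70 build.
    print(build_supports((8, 6), ["sm_37", "sm_50", "sm_60", "sm_70"]))

    # On a machine with PyTorch and a GPU, the real values can be queried instead.
    try:
        import torch
    except ImportError:
        torch = None
    if torch is not None and torch.cuda.is_available():
        print(build_supports(torch.cuda.get_device_capability(0),
                             torch.cuda.get_arch_list()))
```

With the values from the warning, the check fails because none of the listed architectures belongs to the 8.x family.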


    For reference, the version of the CUDA toolkit installed on the system can be checked with nvcc --version.

    Version 10.1 is indeed what is installed.

    $ nvcc --version
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2019 NVIDIA Corporation
    Built on Sun_Jul_28_19:07:16_PDT_2019
    Cuda compilation tools, release 10.1, V10.1.243


    According to what I found, appending a CUDA tag (+cu110) to the version being installed, as in the command below, is all it takes.

    pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 -f https://download.pytorch.org/whl/torch_stable.html

    (base) $ pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 -f https://download.pytorch.org/whl/torch_stable.html
    Looking in links: https://download.pytorch.org/whl/torch_stable.html
    Collecting torch==1.7.1+cu110
      Downloading https://download.pytorch.org/whl/cu110/torch-1.7.1%2Bcu110-cp39-cp39-linux_x86_64.whl (1156.7 MB)
         |████████████████████████████████| 1156.7 MB 73 kB/s
    Collecting torchvision==0.8.2+cu110
      Downloading https://download.pytorch.org/whl/cu110/torchvision-0.8.2%2Bcu110-cp39-cp39-linux_x86_64.whl (12.8 MB)
         |████████████████████████████████| 12.8 MB 38.1 MB/s
    Requirement already satisfied: numpy in ./anaconda3/lib/python3.9/site-packages (from torch==1.7.1+cu110) (1.21.5)
    Requirement already satisfied: typing-extensions in ./anaconda3/lib/python3.9/site-packages (from torch==1.7.1+cu110) (4.1.1)
    Requirement already satisfied: pillow>=4.1.1 in ./anaconda3/lib/python3.9/site-packages (from torchvision==0.8.2+cu110) (9.0.1)
    Installing collected packages: torch, torchvision
    Successfully installed torch-1.7.1+cu110 torchvision-0.8.2+cu110
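
The `+cu110` suffix in the command above is a local version tag that names the CUDA runtime the wheel was built against. The sketch below decodes such a tag. It assumes the convention that the last digit is the minor version and the remaining digits are the major version (so `cu110` means 11.0 and `cu102` means 10.2). `cuda_version_from_wheel` is a hypothetical helper, not a pip or PyTorch function.

```python
import re


def cuda_version_from_wheel(version):
    """Map '1.7.1+cu110' to '11.0'; return None when there is no +cuXYZ tag."""
    match = re.search(r"\+cu(\d{2,})$", version)
    if match is None:
        return None
    digits = match.group(1)
    # Assumed convention: last digit = minor version, leading digits = major version.
    return f"{digits[:-1]}.{digits[-1]}"


if __name__ == "__main__":
    for version in ("1.7.1+cu110", "0.8.2+cu110"):
        print(version, "->", cuda_version_from_wheel(version))
```

Both packages installed above carry the same tag, which is worth checking because torch and torchvision wheels are built as matched pairs.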


    With the new build installed, the error is gone.


    $ python3
    Python 3.9.12 (main, Apr  5 2022, 06:56:58)
    [GCC 7.5.0] :: Anaconda, Inc. on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import torch
    >>> torch.cuda.is_available()
    >>> torch.randn(3,3).to("cuda:0")
    tensor([[-0.9963, -1.6286,  0.9210],
            [-1.0565, -1.7602,  1.5008],
            [-0.0684,  1.7714,  0.2132]], device='cuda:0')


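    Note that the version reported by nvcc --version does not change: the pip wheel bundles its own CUDA runtime and leaves the system-wide toolkit untouched.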


    (base) $ nvcc --version
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2019 NVIDIA Corporation
    Built on Sun_Jul_28_19:07:16_PDT_2019
    Cuda compilation tools, release 10.1, V10.1.243
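
The system toolkit and the runtime PyTorch actually uses can be compared side by side. The sketch below pulls the release number out of the nvcc output shown above and, if PyTorch is importable, prints `torch.version.cuda` next to it. `nvcc_release` and `NVCC_SAMPLE` are names invented for this note, and the sample text is the output pasted above rather than a live call to nvcc.

```python
import re

# Output of `nvcc --version` as pasted in this note.
NVCC_SAMPLE = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
"""


def nvcc_release(output):
    """Extract the 'release X.Y' number from nvcc --version output, or None."""
    match = re.search(r"release (\d+\.\d+)", output)
    return match.group(1) if match else None


if __name__ == "__main__":
    print("system toolkit (nvcc):", nvcc_release(NVCC_SAMPLE))
    try:
        import torch
        # CUDA version the installed PyTorch build was compiled against (None on CPU-only builds).
        print("runtime bundled with PyTorch:", torch.version.cuda)
    except ImportError:
        print("PyTorch is not installed in this interpreter")
```

The two numbers are allowed to differ. What matters for the error in this note is the version bundled with PyTorch, not the one nvcc reports.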


    Reference: https://ssaru.github.io/2021/05/05/20210505-til_install_rtx3090_supported_pytorch/


