Unable to enum CUDA GPUs: no CUDA-capable device is detected. What to do?
No CUDA-capable device is detected although requirements are installed
Here is some information about my system:
I also verified that the kernel headers are installed:
Installation of CUDA
So my system meets all the prerequisites. I then followed the instructions for installation via apt-get (I installed cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb ).
PATH and LD_LIBRARY_PATH are set to point to the required locations:
The NVIDIA drivers also seem to be up to date:
Information about the CUDA compiler driver:
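As a rough illustration of the PATH and LD_LIBRARY_PATH check mentioned above, the following Python sketch reports whether both variables contain the toolkit directories and whether nvcc can be found. It is not the author's command. The prefix /usr/local/cuda-8.0 is an assumption based on the toolkit's default install location, and check_cuda_env is a hypothetical helper name.

```python
import os
import shutil

# Assumed install prefix (default for the CUDA 8.0 packages); the question
# does not show the actual paths.
CUDA_HOME = "/usr/local/cuda-8.0"


def check_cuda_env(env, cuda_home=CUDA_HOME):
    """Report whether PATH and LD_LIBRARY_PATH mention the CUDA directories."""
    bin_dir = os.path.join(cuda_home, "bin")
    lib_dir = os.path.join(cuda_home, "lib64")
    # Strip trailing slashes so "/x/lib64/" and "/x/lib64" compare equal.
    path_entries = [p.rstrip("/") for p in env.get("PATH", "").split(os.pathsep)]
    lib_entries = [
        p.rstrip("/") for p in env.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    ]
    return {
        "bin_in_path": bin_dir in path_entries,
        "lib_in_ld_library_path": lib_dir in lib_entries,
        # Full path of nvcc if it is reachable through PATH, else None.
        "nvcc": shutil.which("nvcc", path=env.get("PATH", "")),
    }


if __name__ == "__main__":
    print(check_cuda_env(dict(os.environ)))
```

A correct environment only shows that the toolkit is reachable. It says nothing about whether the driver can see a device.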
The instructions mention that this could be a file permission problem:
If a CUDA-capable device and the CUDA Driver are installed but deviceQuery reports that no CUDA-capable devices are present, this likely means that the /dev/nvidia* files are missing or have the wrong permissions.
Those files did not have the execute flag, which I then added:
However, after running deviceQuery (which still fails), some of the permissions are reset:
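To make the permission check from the quoted instructions concrete, here is a minimal Python sketch that lists the /dev/nvidia* nodes with their mode bits. It assumes that what matters for character device nodes is read and write access (NVIDIA's own startup script example creates them with mode 666), not the execute flag. describe_device_nodes is a hypothetical helper name.

```python
import glob
import os
import stat


def describe_device_nodes(pattern="/dev/nvidia*"):
    """Return (path, mode string, world_rw) for every node matching pattern."""
    report = []
    for path in sorted(glob.glob(pattern)):
        mode = os.stat(path).st_mode
        # Device nodes are opened for reading and writing; the execute bit
        # is not what the CUDA runtime needs.
        world_rw = bool(mode & stat.S_IROTH) and bool(mode & stat.S_IWOTH)
        report.append((path, stat.filemode(mode), world_rw))
    return report


if __name__ == "__main__":
    nodes = describe_device_nodes()
    if not nodes:
        print("no /dev/nvidia* nodes found (is the kernel module loaded?)")
    for path, mode, world_rw in nodes:
        print(f"{path}  {mode}  world read/write: {world_rw}")
```

If the nodes exist and are readable and writable by the user running deviceQuery, permissions are probably not the cause, and the driver installation itself becomes the more likely suspect.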
Maybe related
Samples build fails
When I try to build the CUDA samples via make, it fails for one of them with the message
Which indeed seems to be missing:
Although the corresponding header file is there:
Problem with static linking
The error raised by deviceQuery suggests a problem with static linking:
AFAIK LD_LIBRARY_PATH is only responsible for dynamic linking. I found this question, where one suggestion is to add /usr/lib/nvidia-current to the linker path. However, this directory doesn’t exist in my installation:
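One way to separate a linking problem from a driver problem is to load the driver library dynamically and ask it directly for the device count. The sketch below does this from Python with ctypes, using the driver API functions cuInit and cuDeviceGetCount. The library name libcuda.so.1 is an assumption about how the driver is installed, and count_cuda_devices is a hypothetical helper name. If the message in question is deviceQuery's usual "(CUDART static linking)" banner, that text may only describe how the sample was built rather than point to an error.

```python
import ctypes

CUDA_SUCCESS = 0
CUDA_ERROR_NO_DEVICE = 100  # driver API code for "no CUDA-capable device"


def count_cuda_devices(libname="libcuda.so.1"):
    """Return (status, count); status is None if the library cannot be loaded."""
    try:
        libcuda = ctypes.CDLL(libname)
    except OSError:
        # The dynamic loader could not find or open the driver library.
        return None, 0
    status = libcuda.cuInit(0)
    if status != CUDA_SUCCESS:
        return status, 0
    count = ctypes.c_int(0)
    status = libcuda.cuDeviceGetCount(ctypes.byref(count))
    return status, count.value


if __name__ == "__main__":
    status, count = count_cuda_devices()
    if status is None:
        print("libcuda could not be loaded")
    elif status == CUDA_ERROR_NO_DEVICE:
        print("driver library loaded, but it reports no CUDA-capable device")
    else:
        print(f"status={status}, devices={count}")
```

A status of None points at the library search path, while a status of 100 means the library loaded fine and the driver itself sees no device.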
no CUDA-capable device is detected (using ubuntu 12.04.4 server) [closed]
I recently installed the CUDA toolkit 5.5 with driver 331.67 (I have a GeForce GTX 680). For some reason, I cannot run any of the test scripts:
I followed the steps in the "getting started guide" here
and made a script to create the character device files at startup (as I am running the server edition of Ubuntu, such graphics files aren’t created by default):
Here is some info on the nvidia module
EDIT #1 I tried downgrading to driver 319.76:
1 Answer
So it turns out the main error I was encountering was due to the fact that there was a version mismatch between the nvidia kernel module and the driver component. Here are the steps I took which helped me find a resolution.
2) Having installed the kernel modules from the repos, I just picked the corresponding driver component with correct version. If you don’t know the version of your installed kernel module you can use modprobe and modinfo. For example, on my system
The module nvidia_304_updates was installed from the repos (package nvidia-updates-current). Its exact version is found with modinfo
After downloading and installing the corresponding driver component from the archive on the NVIDIA website, I was able to run the command
And the original script I was trying to execute
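As an illustration of the version check described in this answer (not the author's script), the following Python sketch reads the version of the loaded kernel module from /proc/driver/nvidia/version and compares it with a user-space driver version supplied by the caller. The regular expression assumes the usual "Kernel Module  <version>" wording of that file, and all function names are hypothetical.

```python
import re


def parse_kernel_module_version(text):
    """Extract the first dotted version number after 'Kernel Module'."""
    match = re.search(r"Kernel Module.*?\b(\d+\.\d+(?:\.\d+)?)\b", text)
    return match.group(1) if match else None


def read_kernel_module_version(path="/proc/driver/nvidia/version"):
    """Return the loaded module's version, or None if the file is unreadable."""
    try:
        with open(path) as fh:
            return parse_kernel_module_version(fh.read())
    except OSError:
        # The file is absent when the nvidia kernel module is not loaded.
        return None


def versions_match(kernel_version, userspace_version):
    """True only when both versions are known and identical."""
    return kernel_version is not None and kernel_version == userspace_version


if __name__ == "__main__":
    print("kernel module version:", read_kernel_module_version())
```

The user-space version to compare against would come from whichever driver package or installer was used, for example the number in the installer's file name.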
RuntimeError: cuda runtime error (100) : no CUDA-capable device is detected #8
VictorZuanazzi commented Nov 15, 2019
great work with the library!
I am trying to install it, but I am getting a CUDA error. I have been using PyTorch on the GPUs without problems until now.
The full line reads: RuntimeError: cuda runtime error (100) : no CUDA-capable device is detected at /opt/conda/conda-bld/pytorch_1570910687650/work/aten/src/THC/THCGeneral.cpp:50
I am using python 3.7.3 and pytorch 1.3.
The output of nvidia-smi is:
Fri Nov 15 16:13:25 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.56       Driver Version: 418.56       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX TIT.    On   | 00000000:04:00.0 Off |                  N/A |
| 22%   41C    P8    18W / 250W |     11MiB / 12212MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX TIT.    On   | 00000000:06:00.0 Off |                  N/A |
| 22%   38C    P8    17W / 250W |     11MiB / 12212MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 108.    On   | 00000000:07:00.0 Off |                  N/A |
| 31%   34C    P8     8W / 250W |   1283MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 108.    On   | 00000000:08:00.0 Off |                  N/A |
| 31%   33C    P8     8W / 250W |     10MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   4  TITAN X (Pascal)    On   | 00000000:0C:00.0 Off |                  N/A |
| 23%   35C    P8     8W / 250W |     10MiB / 12196MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   5  TITAN X (Pascal)    On   | 00000000:0E:00.0 Off |                  N/A |
| 23%   30C    P8     8W / 250W |     10MiB / 12196MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
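When nvidia-smi lists the GPUs but the CUDA runtime reports error 100, one thing worth ruling out is the CUDA_VISIBLE_DEVICES environment variable, since an empty value hides every device from CUDA applications. The issue does not say whether this was the cause here, so the Python sketch below is only a diagnostic aid. explain_visible_devices is a hypothetical helper name, and the PyTorch calls run only if torch is importable.

```python
import os


def explain_visible_devices(env):
    """Describe what CUDA_VISIBLE_DEVICES in the given mapping implies."""
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return "CUDA_VISIBLE_DEVICES is unset: all GPUs are visible to CUDA"
    if value.strip() == "":
        return "CUDA_VISIBLE_DEVICES is empty: no GPU is visible to CUDA"
    return f"CUDA_VISIBLE_DEVICES={value!r}: only these devices are visible"


if __name__ == "__main__":
    print(explain_visible_devices(os.environ))
    try:
        import torch  # optional; only used to cross-check the result
        print("torch.cuda.is_available():", torch.cuda.is_available())
        print("torch.cuda.device_count():", torch.cuda.device_count())
    except ImportError:
        print("PyTorch is not installed in this environment")
```

Running it in the same shell or job script that launches the failing program matters, because schedulers and wrapper scripts often set the variable themselves.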
Tensorflow complains that no CUDA-capable device is detected
I’m trying to run some Tensorflow code, and I get what seems to be a common problem:
The key pieces of that error message seem to be:
How can I install compatible versions? Where is that libcuda version coming from?
Background
A few months ago, I tried installing Tensorflow with GPU support, but the versions either broke my display or wouldn’t work with Tensorflow. Finally, I got it working by following a tutorial on how to install multiple versions of the CUDA libraries on the same machine. That worked at the time, but when I came back to the project after a few months, it has stopped working. I assume that some driver got upgraded during that time.
Investigation
The first thing I tried was to see what versions I have of the nvidia drivers and libcuda package.
Looks like it’s 390.30. Why does the error message say that libcuda reported 390.77?
Again, everything looks like it’s 390.30. There were some packages that had version 390.77, but they were in the rc status. I guess I installed that version and later removed it, so the configuration files were left behind. I purged the configuration files with commands like this:
Now, there are no packages at all with version 390.77.
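The search for leftover packages described above can be sketched in Python by filtering the output of dpkg -l for entries whose status column is rc (removed, configuration files remaining). This is an illustration rather than the commands actually used in the question, and removed_but_configured is a hypothetical helper name.

```python
import subprocess


def removed_but_configured(dpkg_output):
    """Return (name, version) pairs for lines whose status column is 'rc'."""
    leftovers = []
    for line in dpkg_output.splitlines():
        fields = line.split()
        # dpkg -l columns: status, name, version, architecture, description
        if len(fields) >= 3 and fields[0] == "rc":
            leftovers.append((fields[1], fields[2]))
    return leftovers


if __name__ == "__main__":
    try:
        result = subprocess.run(
            ["dpkg", "-l"], capture_output=True, text=True, check=False
        )
        for name, version in removed_but_configured(result.stdout):
            print(name, version)
    except FileNotFoundError:
        print("dpkg is not available on this system")
```

Each reported package could then be purged, for example with dpkg --purge followed by the package name.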
I tried reinstalling CUDA, to see if it had been compiled with the wrong version.
That didn’t make any difference.
Finally, I tried running nvidia-smi.
All of this is running on Ubuntu 18.04 with Python 3.6.7, and my graphics card is NVIDIA Corporation GM107M [GeForce GTX 960M] (rev a2).
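Regarding the question of where the libcuda version comes from: the driver library is normally installed as a file named libcuda.so.<version>, so a stray copy left behind by a different driver release shows up in its file name even when no package claims it. The Python sketch below scans a few library directories for such files. The directory list is an assumption about a typical Ubuntu layout, and find_libcuda_versions is a hypothetical helper name.

```python
import glob
import os
import re

# Assumed search locations; adjust for the distribution in use.
DEFAULT_DIRS = ("/usr/lib/x86_64-linux-gnu", "/usr/lib64", "/usr/lib")


def find_libcuda_versions(dirs=DEFAULT_DIRS):
    """Map each libcuda.so.<version> file found to its version string."""
    found = {}
    for directory in dirs:
        for path in glob.glob(os.path.join(directory, "libcuda.so.*")):
            # Skip the libcuda.so.1 symlink; only dotted versions are matched.
            match = re.fullmatch(
                r"libcuda\.so\.(\d+\.\d+(?:\.\d+)?)", os.path.basename(path)
            )
            if match:
                found[path] = match.group(1)
    return found


if __name__ == "__main__":
    for path, version in sorted(find_libcuda_versions().items()):
        print(version, path)
```

Any version printed here that differs from the kernel module version in /proc/driver/nvidia/version would be a candidate for the mismatch reported in the error message.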