Nvidia-detect error on HP Z4 (CentOS 6.9)

Hi all,

I'm testing an installation of NVIDIA drivers on an HP Z4 workstation
(NVIDIA Quadro P600) with CentOS 6.9. Running nvidia-detect with this
setup gives the following output:

# nvidia-detect
Error getting device_class

nvidia-detect also quits with exit code 255.
Could this be a bug in nvidia-detect? Or is it an unsupported configuration?

The following hardware is detected. It seems the OS sees some sort of
unknown Intel device:

# lspci | grep VGA
00:1f.5 Non-VGA unclassified device: Intel Corporation Device a2a4
21:00.0 VGA compatible controller: NVIDIA Corporation Device 1cb2 (rev a1)

# lspci -n | egrep '00:1f.5|21:00.0'
00:1f.5 0000: 8086:a2a4
21:00.0 0300: 10de:1cb2 (rev a1)

Tested with the following versions (with identical results):

nvidia-detect-390.25-1.el6.elrepo.x86_64.rpm
nvidia-detect-390.48-1.el6.elrepo.x86_64.rpm

Comments

Re: Nvidia-detect error on HP Z4 (CentOS 6.9)

By Phil Perry at 04/13/2018 - 14:50

On 13/04/18 16:21, Danny Smit wrote:
Hi Danny, I'm the author of nvidia-detect.

nvidia-detect scans the PCI bus and checks the returned device_class for
display controllers. In your case, the scan is not returning any devices
(or rather, it is not returning the device_class for any PCI devices).

The internal error checking is displaying the above error message and
exiting with the appropriate error code, as intended.

Good question, and I've no idea why it's not working on your machine.

Your device is supported:

$ nvidia-detect -l | grep -i 1cb2
[10de:1cb2] NVIDIA Corporation GP107GL [Quadro P600]

Support was added in the 375.39 NVIDIA driver. I assume the driver works
as expected for you?

If you are able to offer any more clues, please feel free to open a bug
report on elrepo.org/bugs for us to track. Happy to help if I can.

Phil

Re: Nvidia-detect error on HP Z4 (CentOS 6.9)

By Danny Smit at 04/13/2018 - 17:33

On Fri, Apr 13, 2018 at 8:50 PM, Phil Perry < ... at elrepo dot org> wrote:
Yes, it works perfectly fine.

I created a bug report: http://elrepo.org/bugs/view.php?id=839

This part makes me wonder though. If I'm correct, the second column in
the "lspci -n" output is the class identification. If you look at the
first line of the lspci output, lspci (or the kernel?) doesn't seem to
recognize the class of some other Intel device, which probably has
nothing to do with the NVIDIA device at all. Its numeric class
identification is "0000" though (for unclassified?).
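To make that reading of the columns concrete, here is a minimal sketch in C that pulls the class, vendor, and device IDs out of one line of "lspci -n" output. The struct and function names are hypothetical and made up for this illustration; this is not code from nvidia-detect, and it assumes the default slot format without a PCI domain prefix.

```c
#include <stdio.h>

/* Hypothetical record for one line of `lspci -n` output. */
struct pci_entry {
    unsigned int dev_class; /* 16-bit class code, e.g. 0x0300 for VGA */
    unsigned int vendor;    /* vendor ID, e.g. 0x10de for NVIDIA */
    unsigned int device;    /* device ID, e.g. 0x1cb2 */
};

/* Parse a line such as "21:00.0 0300: 10de:1cb2 (rev a1)".
 * Returns 0 on success, -1 if the line does not have that shape. */
int parse_lspci_n_line(const char *line, struct pci_entry *out)
{
    char slot[32]; /* bus:device.function, not used further here */

    if (sscanf(line, "%31s %4x: %4x:%4x",
               slot, &out->dev_class, &out->vendor, &out->device) != 4)
        return -1;
    return 0;
}
```

Applied to the two lines above, this should yield class 0x0000 for the Intel device (8086:a2a4) and class 0x0300 for the NVIDIA card (10de:1cb2). In the PCI class code list, base class 0x00 is the "unclassified" category, which is consistent with the "Non-VGA unclassified device" label that lspci prints.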

Could it be that in the piece of code below (from nvidia-detect), the
device_class is zero because of that line, and nvidia-detect exits?

if (!dev->device_class) {
    fprintf(stderr, "Error getting device_class\n");
    ret = -1;
    goto exit;
}
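
The suspected failure mode can be reproduced in isolation with a small sketch. The struct, the function name, and the surrounding loop are assumptions made for this illustration; only the zero check and the error message mirror the snippet quoted above.

```c
#include <stdio.h>

/* Hypothetical stand-in for the fields a PCI scan would report. */
struct fake_dev {
    unsigned int device_class; /* 16-bit class code */
    unsigned int vendor_id;
    unsigned int device_id;
};

/* Walks the device list the way the quoted check suggests: a class of
 * zero is treated as a read failure and aborts the whole scan.
 * Returns -1 on abort, otherwise the number of NVIDIA display devices. */
int scan_strict(const struct fake_dev *devs, int n)
{
    int found = 0;

    for (int i = 0; i < n; i++) {
        if (!devs[i].device_class) {
            fprintf(stderr, "Error getting device_class\n");
            return -1;
        }
        /* Base class 0x03 is "display controller"; 0x10de is NVIDIA. */
        if ((devs[i].device_class >> 8) == 0x03 &&
            devs[i].vendor_id == 0x10de)
            found++;
    }
    return found;
}
```

Fed the two devices from the lspci output in listing order, the scan hits the Intel device with class 0x0000 first and gives up before it ever looks at the NVIDIA card. If the ret value of -1 ends up as the process exit status (an assumption, since the rest of the function is not quoted here), the shell would report it as 255 because only the low 8 bits are kept, which would match the exit code seen in the original report.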

Re: Nvidia-detect error on HP Z4 (CentOS 6.9)

By Phil Perry at 04/14/2018 - 07:51

On 13/04/18 22:33, Danny Smit wrote:
Thank you.

Yes, I totally missed that in your posted output earlier. You are
correct, I never anticipated a device class of zero (unclassified
device) when I wrote that error checking code, and indeed that is what
is causing the error to be triggered. So it's a bug in the error
checking code, nice catch!
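
The actual patch is not shown in this thread. As a rough sketch of one way such a scan could tolerate an unclassified device, the loop can simply skip anything that is not a display controller instead of treating class zero as an error. All names below are hypothetical and are not taken from nvidia-detect.

```c
#include <stddef.h>

#define PCI_VENDOR_NVIDIA      0x10de
#define PCI_BASE_CLASS_DISPLAY 0x03

/* Hypothetical stand-in for the fields a PCI scan would report. */
struct pci_id {
    unsigned int device_class; /* 16-bit class code */
    unsigned int vendor_id;
    unsigned int device_id;
};

/* Returns the first NVIDIA display controller in the list, or NULL.
 * A class of 0x0000 is a valid "unclassified" value, so it is skipped
 * like any other non-display device rather than reported as an error. */
const struct pci_id *find_nvidia_display(const struct pci_id *devs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if ((devs[i].device_class >> 8) != PCI_BASE_CLASS_DISPLAY)
            continue; /* also covers class 0x0000 */
        if (devs[i].vendor_id == PCI_VENDOR_NVIDIA)
            return &devs[i];
    }
    return NULL;
}
```

With the two devices from the report, the Intel entry (class 0x0000) is skipped and the Quadro P600 entry (class 0x0300, 10de:1cb2) is returned. A genuine read failure would then need to be detected some other way, for example from the return value of whatever call fills in the device information.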

Anyway, I've fixed it and am just waiting for our build systems to be
available to rebuild and push a release for you (will be later today).
I'll update the bug report once that is done, and perhaps you could then
confirm it's working as expected for you.

Thanks again for the catch.

Phil

Re: Nvidia-detect error on HP Z4 (CentOS 6.9)

By Danny Smit at 04/17/2018 - 03:38

On Sat, Apr 14, 2018 at 1:51 PM, Phil Perry < ... at elrepo dot org> wrote:
I can confirm that it works now on the same hardware. nvidia-detect is
able to detect the required driver version as expected.

Thanks for the quick response!

Re: Nvidia-detect error on HP Z4 (CentOS 6.9)

By Phil Perry at 04/17/2018 - 14:15

On 17/04/18 08:38, Danny Smit wrote:
Brilliant - thanks Danny

Re: Nvidia-detect error on HP Z4 (CentOS 6.9)

By Akemi Yagi at 04/13/2018 - 11:23

On Fri, Apr 13, 2018 at 8:21 AM, Danny Smit <danny.smit. ... at gmail dot com> wrote:
You want to post this to the elrepo mailing list.

Akemi