PC crashing - BSOD Watchdog violation 0x133 (possibly nVidia issue)

Recently I have built a 3D rendering station for a client of mine. The machine runs, but crashes about once or twice a day with a BSOD (watchdog violation 0x133). BlueScreenView mentions either nvlddmkm.sys or ntoskrnl.exe as the culprit.

Things I have tried so far:

- Clean install Windows 10

- Clean install Windows 7

- DDU removal of graphics drivers

- latest nVidia drivers

- older nVidia drivers

- extensive RAM testing

Specs:

- Asus X99-WS 10G E

- Intel 6850K

- 7x Zotac 1080ti Blower Edition (transformed to water cooling)

- 64GB RAM

- Samsung 960 Pro 512GB (system drive)

- 3x WD Red in RAID 0 (data drive)

- 2x Cooler Master V1200 PSU

- Custom EK watercooling loop

- Obsidian 900D


Speccy Snapshot: http://speccy.piriform.com/results/qchWoHD0m7mueWTEVccxFlD

Any help or insight on this issue would be greatly appreciated, as this issue now causes great delays in my client's workflow. 

On my end, I spent more than a week testing different scenarios, which led to different results.

First, here are more details of my specs:
http://speccy.piriform.com/results/WCUj3JJBveoxwgg23G0b6WM


- Windows 10 Pro 64-bit - version 1703 (1709 deferred so far, but I will test it as well)
- Intel Core i7 5960X @ 3.00GHz - no OC
- RAM 64.0GB 
- ASUS X99-E WS 
- SSD 960 Patriot
- 4 x NVIDIA GeForce GTX 780 (EVGA) (06G-P4-3787-RX) 
- Custom EK watercooling loop
- Obsidian 900D


No SLI bridge


Things I have tried so far

- Clean install of Windows 10 (1507).
- Update to 1703. By default, Windows installs everything it can, including the graphics driver (388.13) = crash
- DDU (to downgrade the graphics driver)
- Install 368.69 = stable
- Install 378.92 = stable
- Anything from 380.xx onward = crash
- 391.11 = crash
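
The results above can be written down as data to make the regression window explicit. This is only a minimal sketch: the names `RESULTS`, `version_key` and `regression_window` are made up for illustration, and only the four driver versions with an exact number in the list are included (the generic "380.xx" entry is left out because it is not an exact version).

```python
from typing import Dict, Optional, Tuple

# Results copied from the list above: True = stable, False = crash.
RESULTS: Dict[str, bool] = {
    "368.69": True,
    "378.92": True,
    "388.13": False,
    "391.11": False,
}


def version_key(version: str) -> Tuple[int, ...]:
    """Turn '378.92' into (378, 92) so versions sort numerically."""
    return tuple(int(part) for part in version.split("."))


def regression_window(
    results: Dict[str, bool],
) -> Tuple[Optional[str], Optional[str]]:
    """Return (newest stable version, oldest crashing version)."""
    ordered = sorted(results, key=version_key)
    stable = [v for v in ordered if results[v]]
    crashing = [v for v in ordered if not results[v]]
    newest_stable = stable[-1] if stable else None
    oldest_crashing = crashing[0] if crashing else None
    return newest_stable, oldest_crashing


if __name__ == "__main__":
    print(regression_window(RESULTS))
```

Any driver released between the two returned versions is a candidate for the next test, which narrows the window one install at a time.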


Things I've also tried after reading through a thousand forums

1 Disable Fast Startup and hibernation
2 Remove the NVIDIA audio drivers
3 Force the standard SATA AHCI controller
4 Turn off driver updates via Group Policy / registry / Cab
5 Manually raise the TDR values to 60


1 - https://www.tenforums.com/tutorials/4189-turn-off-fast-startup-windows-10-a.html

2 -REMOVE AUDIO DRIVER. http://www.tomshardware.com/forum/id-2813505/dpc-watchdog-violation-bsod-nvidia.html (2015)

"most of the watchdog timeout problems I have looked at that involved the nvida driver were caused by a conflict in the sound support for the motherboard sound driver and the high definition sound driver for the nvidia driver. 
You are not likely to get update windows 10 sound drivers for this motherboard but you might disable the nvidia high definition sound support in windows control panel if you are not actually using sound to your monitors via your video cable (HDMI)
(this assumes the other causes of video hangs have been eliminated as a potential problem source, IE overheating, overclocking, power problems, mixed build of GPU drivers)"

3 - https://answers.microsoft.com/en-us/windows/forum/windows_10-performance/pc-crashing-bsod-watchdog-violation-1x133-possibly/e954c392-12a4-4728-8447-ae470bdcc61a

4 - https://www.tenforums.com/tutorials/48277-enable-disable-driver-updates-windows-update-windows-10-a.html

5 - https://support.allegorithmic.com/documentation/display/SPDOC/GPU+drivers+crash+with+long+computations
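
For item 5, the TDR timeouts are DWORD values under the `GraphicsDrivers` registry key. Below is a minimal sketch that only generates the text of a .reg file; it does not touch the registry. The post does not say which TDR values were changed, so setting both `TdrDelay` and `TdrDdiDelay` to 60 seconds is an assumption, and the helper name `tdr_reg_file` is made up for illustration.

```python
# Registry key that holds the GPU timeout detection and recovery settings.
TDR_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"


def tdr_reg_file(seconds: int = 60) -> str:
    """Build .reg file text that sets both TDR timeouts (in seconds).

    Assumption: both TdrDelay and TdrDdiDelay get the same value.
    """
    if not 0 <= seconds <= 0xFFFFFFFF:
        raise ValueError("seconds must fit in a 32-bit DWORD")
    dword = f"{seconds:08x}"  # .reg files store DWORDs as 8 hex digits
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{TDR_KEY}]",
        f'"TdrDelay"=dword:{dword}',
        f'"TdrDdiDelay"=dword:{dword}',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(tdr_reg_file(60))
```

The generated text can be saved as a .reg file and reviewed before importing it; a reboot is needed for new TDR values to take effect.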


Most of my tests were done with

- OPEN HARDWARE MONITOR to check/control GPU usage (MSI Afterburner tried as well) - all temperatures under 50°C.
- WHOCRASHED (always there to catch our good friend DPC_WATCHDOG_VIOLATION - ntoskrnl.exe & nvlddmkm.sys)

I've tested a couple of applications to reproduce real-life situations

- Substance Painter 2, which uses both OpenGL and Iray (tricky part: it allows CPU + GPU*)
- Octane Render Standalone 3.08-test6.1 & C4D plugin 3.08-test6.1
- Netflix in a browser (chrome and edge)

and also some benchmarks
- OCTANEBENCH 3.06.2 & OCTANEBENCH 2.17
- Valley Benchmark 1.0

-------------------------------------------------------------------------------------------------------------------------------------------------------------


So my main problem is moving forward with Windows 10 and the latest versions of my software, which require updated NVIDIA drivers.

Octane 3.08 RC1 - The driver needs to be updated to 390.xx in order to support CUDA 9.1
Substance Painter 2.6.1 (Iray 2016.3.1.4.0) works well, whereas Substance Painter 2017.4.2 (Iray 2017.1.2.4.0) and later need driver 388.xx at minimum in order to use the GPU (I found no info on which CUDA version it uses, but I'm guessing Iray might use CUDA 9)

*NB: with driver 382.05 (360.xx - 383.xx) and Substance Painter 2017.4.2, Iray won't work unless the CPU is checked. Even though all the GPUs are checked and the cards are listed, they aren't doing any work; only the CPU is. There is no message about outdated GPU drivers, which is confusing.



Not sure if this could be a reason:
Even though I bought all 4 cards on the same day from the same vendor, I realized that they are slightly different:

3 cards with 10DE - 1007 (also known as GK110 [GeForce GTX 780 Rev. 2])
PCI\VEN_10DE&DEV_1007&SUBSYS_37873842&REV_A1
PCI\VEN_10DE&DEV_1007&SUBSYS_37873842
PCI\VEN_10DE&DEV_1007&CC_030000
PCI\VEN_10DE&DEV_1007&CC_0300

BIOS 80.80.58.00.82

1 card with 10DE - 1004 (also known as GK110 [GeForce GTX 780])
PCI\VEN_10DE&DEV_1004&SUBSYS_17873842&REV_A1
PCI\VEN_10DE&DEV_1004&SUBSYS_17873842
PCI\VEN_10DE&DEV_1004&CC_030000
PCI\VEN_10DE&DEV_1004&CC_0300

BIOS 80.80.45.00.80
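
To see exactly where the two card variants differ, the most specific hardware ID of each can be split into its fields. This is a rough sketch: `HWID_PATTERN` and `parse_hwid` are hypothetical names, and the pattern only covers the `VEN/DEV/SUBSYS/REV` form, not the `CC_` class-code variants listed above.

```python
import re
from typing import Dict

# Matches IDs such as PCI\VEN_10DE&DEV_1007&SUBSYS_37873842&REV_A1.
HWID_PATTERN = re.compile(
    r"^PCI\\VEN_(?P<vendor>[0-9A-F]{4})&DEV_(?P<device>[0-9A-F]{4})"
    r"(?:&SUBSYS_(?P<subsys>[0-9A-F]{8}))?(?:&REV_(?P<rev>[0-9A-F]{2}))?$",
    re.IGNORECASE,
)


def parse_hwid(hwid: str) -> Dict[str, str]:
    """Split a PCI hardware ID into vendor, device, subsys and rev fields."""
    match = HWID_PATTERN.match(hwid.strip())
    if match is None:
        raise ValueError(f"unsupported hardware ID: {hwid!r}")
    return {k: v.upper() for k, v in match.groupdict().items() if v is not None}


# The two most specific IDs from the lists above.
rev2_card = parse_hwid(r"PCI\VEN_10DE&DEV_1007&SUBSYS_37873842&REV_A1")
rev1_card = parse_hwid(r"PCI\VEN_10DE&DEV_1004&SUBSYS_17873842&REV_A1")

# Fields whose values differ between the two variants.
differing = sorted(k for k in rev2_card if rev2_card[k] != rev1_card.get(k))

if __name__ == "__main__":
    print(differing)  # ['device', 'subsys']
```

So the vendor and revision match, while the device ID and subsystem ID differ, on top of the different BIOS versions.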


After realizing that,
- I decided to block the 3 cards (1007) with Group Policy (https://www.howtogeek.com/263851/how-to-prevent-windows-from-automatically-updating-specific-drivers/)
- and to run the latest driver, 391.11, with only 1 card (1004) through various tests
= everything is stable.

Reintroducing even one other card causes a BSOD.



Conclusion: the common factors of the BSOD are
Windows 10, the latest NVIDIA drivers, and multiple GPUs



So, if anybody in this community has a clue or a solution, that would be welcome!



Good to know 

- Clean install Windows 7
With the latest update, hotfix, and latest 391.11 drivers, everything is stable!

But it is Win 7.

Come on.


Bonus Track

I tried to figure out what was introduced in the 380.xx drivers compared with the older 370.xx drivers, but I couldn't pinpoint exactly what it is, apart from the obvious version bumps: HD Audio Driver 1.3.34.26 and NVIDIA PhysX System Software 9.17.0329.


https://us.download.nvidia.com/Windows/378.92/378.92-win10-win8-win7-desktop-release-notes.pdf (latest 370.xx)

Software Module Versions
• NView - 148.47
• HD Audio Driver - 1.3.34.23 
• NVIDIA PhysX System Software - 9.16.0318
• GeForce Experience - 3.4.0.70
• CUDA - 8.0

Existing Support
This release supports the following APIs:
• Open Computing Language (OpenCL™ software) 1.2 for NVIDIA® Kepler™, Maxwell™, and Pascal™ GPUs
• OpenGL® 4.5
• Vulkan® 1.0
• DirectX 11
• DirectX 12 (Windows 10, for Kepler, Maxwell, and Pascal GPUs)


VS 


https://us.download.nvidia.com/Windows/381.65/381.65-win10-win8-win7-desktop-release-notes.pdf (first 380.xx)

Software Module Versions
• NView - 148.47
• HD Audio Driver - 1.3.34.26 < newer version
• NVIDIA PhysX System Software - 9.17.0329 < newer version
• GeForce Experience - 3.4.0.70
• CUDA - 8.0 

Existing Support
This release supports the following APIs:
• Open Computing Language (OpenCL™ software) 1.2 for NVIDIA® Kepler™, Maxwell™, and Pascal™ GPUs
• OpenGL® 4.5
• Vulkan® 1.0
• DirectX 11
• DirectX 12 (Windows 10, for Kepler, Maxwell, and Pascal GPUs)



New Features
• Added support for Windows 10 Creators Update.
• Added DTS X and Dolby Atmos support for 5.1.2 speaker configuration.
• Added Dolby Vision support for games.
• Added NVIDIA® Ansel™ support for Snake Pass and Kona.
• NVIDIA Control Panel
• Display page: Added the option to override the Windows 10 control of desktop color settings.
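
The module comparison between the two release notes can also be done mechanically. A small sketch, with the version strings copied from the two "Software Module Versions" lists above; `changed_modules` is a hypothetical helper name. It can only surface bundled module versions, not changes inside the kernel-mode driver itself.

```python
from typing import Dict, Optional, Tuple

# Module versions as listed in the 378.92 and 381.65 release notes above.
MODULES_378_92: Dict[str, str] = {
    "NView": "148.47",
    "HD Audio Driver": "1.3.34.23",
    "NVIDIA PhysX System Software": "9.16.0318",
    "GeForce Experience": "3.4.0.70",
    "CUDA": "8.0",
}
MODULES_381_65: Dict[str, str] = {
    "NView": "148.47",
    "HD Audio Driver": "1.3.34.26",
    "NVIDIA PhysX System Software": "9.17.0329",
    "GeForce Experience": "3.4.0.70",
    "CUDA": "8.0",
}


def changed_modules(
    old: Dict[str, str], new: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {module: (old version, new version)} for modules that differ."""
    names = sorted(set(old) | set(new))
    return {
        name: (old.get(name), new.get(name))
        for name in names
        if old.get(name) != new.get(name)
    }


if __name__ == "__main__":
    for name, (before, after) in changed_modules(MODULES_378_92, MODULES_381_65).items():
        print(f"{name}: {before} -> {after}")
```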



In addition, here is the release that introduced support for CUDA 9.0


https://us.download.nvidia.com/Windows/387.92/387.92-win10-win8-win7-desktop-release-notes.pdf

What’s New in Version 387.92 WHQL 

Software Module Versions
• NView - 148.47
• HD Audio Driver - 1.3.35.1
• NVIDIA PhysX System Software - 9.17.0524
• GeForce Experience - 3.9.0.97
• CUDA - 9.0 

New Features
• Added support for OpenGL 4.6
• Added NVIDIA GameStream support for HDR under Windows 10
• Added Fast Sync support for SLI
• NVIDIA Maxwell GPUs: Any resolution less than 4k
• NVIDIA Pascal & later GPUs: Any resolution
• Implemented improved behavior for full-screen Vulkan swapchains using VK_KHR_win32_surface.
• Added support for the DirectX Intermediate Language (DXIL)
• Includes full support for DirectX 12 Shader Model 6.0, features such as Wavemath, and the DirectX Shader Compiler.
• Supported only on NVIDIA Kepler and later GPUs.




LINKS

https://en.wikipedia.org/wiki/Windows_10_version_history
https://www.laptopmag.com/articles/disable-automatic-driver-downloads-on-windows-10
https://support.allegorithmic.com/documentation/display/SPDOC/GPU+drivers+crash+with+long+computations
https://render.otoy.com/forum/viewtopic.php?f=85&t=64382
https://pci-ids.ucw.cz/read/PC/10de/1007 + https://pci-ids.ucw.cz/read/PC/10de/1004
https://render.otoy.com/forum/viewtopic.php?f=85&t=65728
http://www.guru3d.com/files-categories/videocards-nvidia-geforce-vista-%7C-7.html

A 3D Rendering professional advised me to use NVIDIA driver version 382.53 for Quad GPU setup and it does work in Windows 10 Pro, with the ASUS X299 Sage motherboard (which is PLX-powered, like the X99-E WS) without issues.

However, as you said, the older drivers don't support the latest version of the CUDA API.

On an ASRock X99 WS motherboard, with all other components the same, I have no problem at all using 4x GPUs with latest NVIDIA drivers.

The ASUS Workstation motherboards use PLX chips for the PCI-E slots. A number of drivers are possibly involved in the communication between the CPU and the NVIDIA GPU: NVIDIA's driver, Windows Kernel, Intel Chipset driver (how the PLX PCI-E switch is handled).

I don't want to blame NVIDIA because this is likely a bug that may involve NVIDIA as well as Microsoft, but it is only NVIDIA who has the tools and the know-how to pinpoint the source of the problem and fix it.

I have opened a ticket with NVIDIA (Reference: 180226-000525) and my understanding is that it is on their radar.

Ok thank you for your input, I'm going to test this out.

I'm also wondering if it might have to do with the resolution.
I'm using two 32-inch 4K monitors (BenQ 32-Inch IPS 4K BL3201PH) on DisplayPort cables (Cable Matters Gold Plated Cable 10 Feet), each connected to a different card: 1004 + 1007

https://www.amazon.com/gp/product/B00O1B5M9I/ref=oh_aui_search_detailpage?ie=UTF8&psc=1

https://www.amazon.com/gp/product/B005H3Q5E0/ref=oh_aui_search_detailpage?ie=UTF8&psc=1

In Win 7, I'm using them at a resolution of 2560 x 1440, as the rendering in Win 7 at 150% is horrible and 125% is too small.

In Win 10, I'm using them at the recommended resolution, 4K (3840 x 2160), with nice, sharp rendering, scaled to 150%.

Not sure if that would make a difference, but who knows? :)


You can use the DDU utility from Windows Safe Mode to clean your machine of the current display driver files, then install the driver that you need, selecting Custom Installation and Clean Installation.

Yeah, I'm now well aware of the whole process, thank you!

I'm now running the 382.53 driver and, so far, no issues with a couple of benchmarks. Will know more tomorrow. Thanks!

Is there a way to know what changes between the different driver versions and which change is causing the issue?

https://us.download.nvidia.com/Windows/382.53/382.53-win10-win8-win7-desktop-release-notes.pdf

Only NVIDIA can determine what's going on within its driver. That's because, besides new features, NVIDIA has to also accommodate changes made in the Windows Kernel by Microsoft.

I've read that Windows 7 doesn't suffer from the same issues. Its kernel is stable and the NVIDIA driver is well tested on that OS.

Windows 8, 8.1 and, now Windows 10 are a mess. There are major changes in how hardware resources are managed and application developers can't fix kernel related issues; they can only report them to Microsoft and wait for fixes to be applied in the next patch, which often breaks something else.

That's the problem of continuous updates in Windows 10. We were served a half baked, buggy OS and now suffer from the consequences of poorly coordinated, continuous changes.

Currently, in Windows 10, 3D rendering professionals suffer from GPU-related system crashes, and audio professionals suffer from audio dropouts and pops caused by high DPC Latency.

Thank you for this insight, very helpful to know that this is happening somewhere else as well.

Regarding driver 382.53, unfortunately, it failed as well. The difference is that the system recovers twice before the BSOD. It is not a clean recovery like a TDR recovery: it just hangs, the cursor and sound freeze, then the cursor and sound come back, then it freezes again, and finally it dies.


Reverted to 378.92. Stable with Win 10 v1703.

Quick question, which version of Win 10 are you running with 382.53? 

382.53 is stable on Win 10 Pro 64bit 1709 (OS Build 16299.251).

It has only frozen once, while running heavy GPU workloads on all 4x GPUs, when I attempted to run GPU-z, at the time the utility was scanning the system (on the splash screen). Hasn't happened again and I've tested this scenario several times.

Hello! 

I'm also on 382.53 with Windows 10 on an ASUS X99-E WS
4x 1080 Ti, Win 10 Pro 64bit 1709 (OS Build 16299.251)

The ONLY stable driver atm is 382.53. I've tested pretty much everything. And like another member, I need to update my drivers to run the latest version of Octane Render.

I hope the problem will be solved soon. If we need to send multiple messages to Microsoft or NVIDIA, I'm here :)
The more of us there are, the faster the updates will come.

Cheers
Max

In response to Ticket Ref. 180226-000525 NVIDIA Support has confirmed that the bug has been filed, but they cannot commit to how soon it will be resolved.

Looks like both MS and NVIDIA update their software too often and release it without sufficient, timely testing.

ASUS Support does not exist (in the UK, at least). They don't pick up the phone or answer to tickets submitted via their web site. I used FB Messenger to inform ASUS Rog UK about the problem and they promised to forward it to HQ.

It always helps when more people open tickets with a clear explanation of the problem. If you haven't opened a ticket with ASUS, NVIDIA and Microsoft, please consider doing so.

Question Info


Last updated February 1, 2022 Views 6,937 Applies to: