Seemingly random memory_management or ntoskrnl.exe leading to BSOD in Windows 7 and 8.1

Hi,

  I have a laptop which periodically crashes on startup and/or during first use but, after a few reboots, is generally fine.  I've had Windows 7 and currently have 8.1 installed; I switched hoping that it was a problem with the factory restore image, but alas the problem persists across Windows versions.  I've tried running the manufacturer's built-in and 3rd party memory and HD checking tools, but none reported any errors, which makes me tend to believe a driver issue is at fault.

  I have the two most recent minidumps plus a MemDiag.bin (Zip of files on OneDrive) which apparently also found some problems, but I'm unsure how to debug this further.  Looking through the forum posts I've found what appear to be similar incidents (such as http://answers.microsoft.com/en-us/windows/forum/windows8_1-performance/random-bsods-all-the-time-even-after/d492f8b2-4145-41f4-9992-68dc73e080a3) so I understand it's possible to debug these issues, so I'd really appreciate any input on how best to go about this.  I have a professional background in programming and product support, so hopefully I'll be technical enough to follow any instructions - just keen not to be a leech and to learn the skills to help myself and others :)

Thanks in advance,

  Andy

 
Question Info

Last updated August 6, 2018 Views 2,688 Applies to:


Hi Andy,


Both of the attached DMP files are of the MEMORY_MANAGEMENT (1a) bug check.

This indicates that a severe memory management error occurred.

BugCheck 1A, {41793, fffff680095c19d0, 2, 1}

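- The 1st parameter of the bug check is 0x41793. The current bug check reference lists this value as a corrupted page table page, with the 2nd parameter pointing to the last processed PTE and the 3rd and 4th parameters giving the number of non-zero PTEs found and expected.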
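
For anyone following along without a debugger, the BugCheck line can be split into its code and four parameters with a few lines of Python. This is only a sketch: `parse_bugcheck` is a hypothetical helper name, and it assumes the line has exactly the `BugCheck XX, {p1, p2, p3, p4}` shape shown above.

```python
import re

def parse_bugcheck(line):
    """Split a WinDbg 'BugCheck' line into (code, [p1, p2, p3, p4]) as integers."""
    m = re.match(r"\s*BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}", line)
    if m is None:
        raise ValueError("not a BugCheck line: %r" % line)
    code = int(m.group(1), 16)  # WinDbg prints the code in hex without a 0x prefix
    params = [int(p.strip(), 16) for p in m.group(2).split(",")]
    return code, params

code, params = parse_bugcheck("BugCheck 1A, {41793, fffff680095c19d0, 2, 1}")
print(hex(code), [hex(p) for p in params])
```

The same helper works on the other BugCheck lines quoted in this thread, which makes it easier to compare parameters across dumps.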

3: kd> dt nt!_MMPFN fffff680095c19d0
   +0x000 u1               : <unnamed-tag>
   +0x008 u2               : <unnamed-tag>
   +0x010 PteAddress       : (null)
   +0x010 VolatilePteAddress : (null)
   +0x010 Lock             : 0n0
   +0x010 PteLong          : 0
   +0x018 u3               : <unnamed-tag>
   +0x01c NodeBlinkLow     : 0
   +0x01e Unused           : 0y0000
   +0x01e VaType           : 0y0000
   +0x01f ViewCount        : 0 ''
   +0x01f NodeFlinkLow     : 0 ''
   +0x020 OriginalPte      : _MMPTE
   +0x028 u4               : <unnamed-tag>

Interpreted as an _MMPFN entry, the structure at that address is zeroed out, which usually indicates corruption.

I've checked your loaded modules list and it's pretty clean; I see no 3rd party software of the kind that usually raises red flags for memory corruption. Just to be sure, though, before I recommend hardware diagnostics regarding memory, let's enable Driver Verifier:

Driver Verifier:

What is Driver Verifier?

Driver Verifier is included in Windows 8, 7, Windows Server 2008 R2, Windows Vista, Windows Server 2008, Windows 2000, Windows XP, and Windows Server 2003 to promote stability and reliability; you can use this tool to troubleshoot driver issues. Windows kernel-mode components can cause system corruption or system failures as a result of an improperly written driver, such as an earlier version of a Windows Driver Model (WDM) driver.

Essentially, if there's a 3rd party driver believed to be at issue, enabling Driver Verifier will help flush out the rogue driver if it detects a violation.

Before enabling Driver Verifier, it is recommended to create a System Restore Point:

Vista - START | type rstrui - create a restore point
Windows 7 - START | type create | select "Create a Restore Point"
Windows 8 - http://www.eightforums.com/tutorials/4690-restore-point-create-windows-8-a.html

How to enable Driver Verifier:

Start > type "verifier" without the quotes > Select the following options -

1. Select - "Create custom settings (for code developers)"
2. Select - "Select individual settings from a full list"
3. Check the following boxes -
- Special Pool
- Pool Tracking
- Force IRQL Checking
- Deadlock Detection
- Security Checks (Windows 7 & 8)
- DDI compliance checking (Windows 8)
- Miscellaneous Checks
4. Select  - "Select driver names from a list"
5. Click on the "Provider" tab. This will sort all of the drivers by the provider.
6. Check EVERY box that is NOT provided by Microsoft / Microsoft Corporation.
7. Click on Finish.
8. Restart.
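
The same selection can also be expressed as a verifier.exe command line. The sketch below builds the `/flags` value from the options ticked in step 3. The bit values are assumptions taken from memory of the Driver Verifier documentation, so confirm them (for example with `verifier /?`) before running anything, and note that `verifier_command`, `example.sys` and `sample.sys` are placeholder names, not drivers from this thread.

```python
# Assumed bit values for the options listed in step 3; verify against the
# Driver Verifier documentation for your Windows version before use.
VERIFIER_FLAGS = {
    "Special Pool": 0x1,
    "Force IRQL Checking": 0x2,
    "Pool Tracking": 0x8,
    "Deadlock Detection": 0x20,
    "Security Checks": 0x100,
    "Miscellaneous Checks": 0x800,
    "DDI compliance checking": 0x20000,
}

def verifier_command(options, drivers):
    """Build a 'verifier /flags ... /driver ...' command string (not executed here)."""
    flags = 0
    for name in options:
        flags |= VERIFIER_FLAGS[name]  # combine the selected options bit by bit
    return "verifier /flags 0x%x /driver %s" % (flags, " ".join(drivers))

# Placeholder driver names; substitute the non-Microsoft drivers from step 6.
cmd = verifier_command(VERIFIER_FLAGS.keys(), ["example.sys", "sample.sys"])
print(cmd)
```

The script only prints the command so it can be reviewed first; it would still need to be run from an elevated command prompt and followed by a restart.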

Important information regarding Driver Verifier:

- If Driver Verifier finds a violation, the system will BSOD.

- After enabling Driver Verifier and restarting, you may not be able to get back into normal Windows if the culprit driver loads at start-up, because Driver Verifier will flag it and, as stated above, force a BSOD.

If this happens, do not panic, do the following:

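- Boot into Safe Mode by repeatedly tapping the F8 key during boot-up. (On Windows 8 / 8.1, F8 is usually not available by default; after repeated failed boots the recovery environment should appear, and Safe Mode is under Troubleshoot > Advanced options > Startup Settings.)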

- Once in Safe Mode - Start > Search > type "cmd" without the quotes.

- To turn off Driver Verifier, type "verifier /reset" (without the quotes) at the command prompt.

- Restart and boot into normal Windows.

If your OS became corrupt or you cannot boot into Windows after disabling verifier via Safe Mode:

- Boot into Safe Mode by repeatedly tapping the F8 key during boot-up.

- Once in Safe Mode - Start > type "system restore" without the quotes.

- Choose the restore point you created earlier.

How long should I keep Driver Verifier enabled for?

It varies; experts and analysts have different recommendations. Personally, I recommend keeping it enabled for at least 24 hours. If you don't BSOD by then, disable Driver Verifier.

My system BSOD'd, where can I find the crash dumps?

They will be located in %systemroot%\Minidump
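
To pick out the most recent dumps to upload, something like the following could be used. It is a minimal sketch that assumes a default install where %systemroot% resolves through the `SystemRoot` environment variable; `newest_dumps` is a hypothetical helper name, and reading the folder may require an elevated prompt.

```python
import os

def newest_dumps(folder, limit=5):
    """Return up to `limit` .dmp paths in `folder`, newest first."""
    paths = [
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(".dmp")
    ]
    paths.sort(key=os.path.getmtime, reverse=True)  # most recently written first
    return paths[:limit]

if __name__ == "__main__":
    # Falls back to C:\Windows if the SystemRoot variable is not set.
    minidump_dir = os.path.join(os.environ.get("SystemRoot", r"C:\Windows"), "Minidump")
    if os.path.isdir(minidump_dir):
        for path in newest_dumps(minidump_dir):
            print(path)
```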

Any other questions can most likely be answered by this article:
http://support.microsoft.com/kb/244617

Regards,

Patrick

Debugger/Reverse Engineer.


Patrick,

  Thanks for the prompt and detailed response.  TBH, I was concerned that it might be a hardware RAM failure, but when memcheck etc. didn't find any problems I presumed it was software/driver related.

  As recommended, I've put Driver Verifier on, however as you'd expect with this kind of random problem I'm now unable to cause the machine to BSOD!  Anything and everything I've tried which previously caused it to fail (plus some very RAM and CPU-intensive multitasking) is all performing fine, as is continued reboots of the laptop.

  With this in mind, is there any other option you can advise to try and force the error and/or check the hardware in a way I may be unfamiliar with?  I'm open to all options, but appreciate I may just have to leave Driver Verifier on until such time as the system BSOD's then come back to the forum.

Thanks again,

  Andy


Unfortunately, that's just something I cannot say as there is a reason for every crash, and since we don't exactly know what's causing yours just yet, we cannot force it. With this said, the best we can do is leave verifier enabled and have you use the system as much as you can doing all of your normal tasks.

Regards,

Patrick

Debugger/Reverse Engineer.


Hi Patrick,

  After a period of running without error, the laptop's finally thrown another BSOD.  Minidump log can be found here - I believe the error message mentioned ndis_driver, but I could be wrong as it flashed up very quickly.  Again, if there's anything I can do to help with the investigation please let me know, otherwise I await any input anyone can add.

Regards,

  Andy


Hi Andy,

The attached DMP file is of the BUGCODE_NDIS_DRIVER (7c) bug check.

This bug check indicates that a problem occurred with an NDIS driver.

BugCheck 7C, {1f, ffffe000027491a0, 1, 0}

^^ The 1st parameter of the bug check is 0x1f (see the BugCheck line above), and the 2nd parameter is treated here as a miniport address.

3: kd> !ndiskd.miniport ffffe000027491a0


MINIPORT

    [Pointer is unavailable; cannot dereference]

    Ndis Handle        ffffe000027491a0
    Ndis API Version   [Unreadable version value]
    Adapter Context    ffffe0000274b000
    Miniport Driver    ffffe0000274a068 - [Unreadable MiniBlock]  [Unreadable version value]
    Ndis Verifier      [Unreadable value]

    Media Type         802.3
    Physical Medium    802.3
    Device Path        String with 144 characters [Buffer at ffffc000005f88f0 is not available]
    Device Object      [DeviceObject at ffffe0000274a0a8 is not readable]
    MAC Address        [MAC address at ffffcf8001cfcfb0 is unavailble]

^^ We likely cannot read the Miniport Driver or Device Object fields because this is a minidump rather than a kernel dump. Anyway, from this, we can see that the type of media that disconnected was 802.3, which is WIRED ethernet.

STATE

    Miniport           PAUSING
    Device PnP         Started
    Datapath           00000002          ← DIVERTED_BECAUSE_MEDIA_DISCONNECTED
    NBL Status         NDIS_STATUS_MEDIA_DISCONNECTED
    Operational status [Unreadable value]
    Operational flags  [Unreadable value]
    Admin status       [Unreadable value]
    Media              MediaDisconnected
    Device Power       [Unreadable value]
    Driver Power       [Unreadable value]
    References         [Ref.ReferenceCount at ffffe0000274a328 is not readable]
    User Handles       0
    Total Resets       0
    Pending OID        None

^^ From the above we can see that 802.3 was reported disconnected, and because of that, there was a pause. The miniport cannot send data at this time because its media is disconnected.  NDIS will intercept transmitted packets and immediately return them with an unsuccessful status code.

The problem is, we likely never return from this pause and/or disconnected state, at least we didn't at the time of the crash. If we take a look at the call stack:

3: kd> kv
Child-SP          RetAddr           : Args to Child                                                           : Call Site
ffffd000`20186938 fffff800`00e542fd : 00000000`0000007c 00000000`0000001f ffffe000`027491a0 00000000`00000001 : nt!KeBugCheckEx
ffffd000`20186940 fffff800`00e3f485 : ffffe000`027491a0 00000000`00000000 ffffe000`02748020 ffffe000`0274a5e8 : ndis!ndisBugCheckEx+0x1d
ffffd000`20186980 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ndis!NdisMPauseComplete+0x1b235

^^ NDIS calls a miniport driver's MiniportPause function to initiate a pause request for a miniport adapter. The miniport adapter remains in the Pausing state until the pause operation is complete.

After a miniport driver completes all outstanding send requests and NDIS returns all the network data structures in outstanding receive indications to the driver, the driver calls NdisMPauseComplete to complete the pending pause request. After the driver calls NdisMPauseComplete, the miniport adapter is in the Paused state.

NDIS calls the MiniportRestart function to initiate a restart request for a miniport adapter that is paused.

As we can see however, this never occurred, and instead, ndis called into a bugcheck which brought down the system.
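
When comparing several dumps, it can help to reduce each `kv` listing to just its call sites. The sketch below does that for the stack quoted above; `call_sites` is a hypothetical helper, and the regular expression assumes the `module!function+0xoffset` layout WinDbg printed here.

```python
import re

# Matches the trailing "module!function+0xoffset" call site on a kv stack line.
CALL_SITE = re.compile(r":\s*(\w+)!(\w+)(?:\+0x([0-9a-fA-F]+))?\s*$")

def call_sites(kv_output):
    """Return (module, function, offset) tuples, top frame first."""
    frames = []
    for line in kv_output.splitlines():
        m = CALL_SITE.search(line)
        if m:
            offset = int(m.group(3), 16) if m.group(3) else 0
            frames.append((m.group(1), m.group(2), offset))
    return frames

stack = """\
Child-SP          RetAddr           : Args to Child                                                           : Call Site
ffffd000`20186938 fffff800`00e542fd : 00000000`0000007c 00000000`0000001f ffffe000`027491a0 00000000`00000001 : nt!KeBugCheckEx
ffffd000`20186940 fffff800`00e3f485 : ffffe000`027491a0 00000000`00000000 ffffe000`02748020 ffffe000`0274a5e8 : ndis!ndisBugCheckEx+0x1d
ffffd000`20186980 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ndis!NdisMPauseComplete+0x1b235
"""
for module, function, offset in call_sites(stack):
    print(module, function, hex(offset))
```

The header row is skipped automatically because it does not contain a `module!function` pair.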

-------------------

1. Ensure ALL of your network drivers are up to date via the manufacturer's website.

2. How are you connecting to the internet on this laptop? Are you using ethernet (cable plugged into the laptop), or are you surfing the internet wirelessly? If you are surfing wirelessly, remove the wireless from the equation and surf on ethernet for a while to see if the crashes stop. If you're using a USB wireless adapter instead of the laptop's built-in wireless, remove it.

3. If worst comes to worst, this is a faulty NIC within your laptop, and it will need to be sent in for repair to the manufacturer.

Regards,

Patrick

Debugger/Reverse Engineer.


Patrick,

  Again, thanks for the quick and detailed response.  In answer to your questions:

1) I'll get on that right away
2) Generally-speaking the laptop's connected wirelessly, however I have the option to run it wired in a couple of rooms so we'll see if that stops the crashes

Thanks again,

  Andy


My pleasure, and thanks for the information, Andy!

I look forward to your update.

Regards,

Patrick

Debugger/Reverse Engineer.


Hi Andy,

Just checking in, how's the system?

Regards,

Patrick

Debugger/Reverse Engineer.


Patrick,

  Thanks for what turned out to be a very timely check in ;)

  Everything had, broadly speaking, been working fine until lunchtime today, when the laptop decided to blue-screen once again.  The minidump log can be found here: http://1drv.ms/1dA4ULX. I suspect it's going to point towards a hardware failure (RAM, I'd guess), but I await your take on the contents.

Thanks in advance,

  Andy


Andy, you would be correct in your unfortunate assumption.

Good news: We solved the 0x7C bug checks.

Bad news: We have a disk drive problem now, it appears.

KERNEL_DATA_INPAGE_ERROR (7a)

This bug check indicates that the requested page of kernel data from the paging file could not be read into memory.

BugCheck 7A, {fffff6e0000940b8, ffffffffc000003f, 8128f880, ffffc00012817e98}

ERROR_CODE: (NTSTATUS) 0xc000003f - {Bad CRC}  A cyclic redundancy check (CRC) checksum error occurred.

BUGCHECK_STR:  0x7a_c000003f

^^ No good, not what we like to see.
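
For reference, the 2nd parameter of a 0x7A bug check is an NTSTATUS value, shown in the dump sign-extended to 64 bits. The sketch below masks it back to 32 bits and looks it up in a small table. The 0xC000003F entry comes from the dump above; the other names and one-line hints are paraphrased from memory of the bug check reference and should be treated as assumptions. `describe_inpage_status` is a hypothetical helper name.

```python
# NTSTATUS values commonly seen with bug check 0x7A. Hints are paraphrased
# and only a starting point for troubleshooting, not a diagnosis.
INPAGE_STATUS = {
    0xC000000E: ("STATUS_NO_SUCH_DEVICE", "drive unavailable (hardware or configuration)"),
    0xC000003F: ("STATUS_CRC_ERROR", "checksum mismatch while reading the page"),
    0xC000009A: ("STATUS_INSUFFICIENT_RESOURCES", "lack of nonpaged pool resources"),
    0xC000009C: ("STATUS_DEVICE_DATA_ERROR", "bad blocks on the disk"),
    0xC000009D: ("STATUS_DEVICE_NOT_CONNECTED", "cabling or controller problem"),
    0xC000016A: ("STATUS_DISK_OPERATION_FAILED", "bad blocks on the disk"),
    0xC0000185: ("STATUS_IO_DEVICE_ERROR", "cabling, termination or controller problem"),
}

def describe_inpage_status(param2):
    """Return (status, name, hint) for the 2nd parameter of a 0x7A bug check."""
    status = param2 & 0xFFFFFFFF  # drop the sign-extension bits shown in the dump
    name, hint = INPAGE_STATUS.get(status, ("unknown", "not in this table"))
    return status, name, hint

status, name, hint = describe_inpage_status(0xFFFFFFFFC000003F)
print(hex(status), name, "-", hint)
```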

----------------------

Let's run Chkdsk (paste log afterwards) and then Seatools:

Chkdsk:
There are various ways to run Chkdsk:


Method 1:

Start > Search bar > Type cmd (right-click the result and choose "Run as administrator" to open an elevated CMD)

Elevated CMD should now be opened, type the following:

chkdsk x: /r

x implies your drive letter, so if your hard drive in question is letter c, it would be:

chkdsk c: /r

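If chkdsk reports that the volume is in use and asks whether to schedule the check for the next restart, answer Y, then restart the system and let chkdsk run.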

Method 2:


    Open the "Computer" window
    Right-click on the drive in question and select "Properties"
    Select the "Tools" tab
    In the Error-checking area, click <Check Now> (labelled <Check> on Windows 8 / 8.1).

If you'd like to get a log file that contains the chkdsk results, do the following:

Press Windows Key + R and type powershell.exe in the run box

Paste the following command and press enter afterwards:

get-winevent -FilterHashTable @{logname="Application"; id="1001"}| ?{$_.providername -match "wininit"} | fl timecreated, message | out-file Desktop\CHKDSKResults.txt

This will output a .txt file on your Desktop containing the results of the chkdsk.
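
Before pasting the whole log, the key figure can be checked quickly. The sketch below searches the saved text for the "KB in bad sectors" summary line that English-language chkdsk output normally contains; `bad_sector_kb` is a hypothetical helper and the sample text is made up for illustration, not taken from this laptop.

```python
import re

def bad_sector_kb(chkdsk_text):
    """Return the KB reported 'in bad sectors', or None if the line is absent."""
    m = re.search(r"(\d+)\s+KB in bad sectors", chkdsk_text)
    return int(m.group(1)) if m else None

# Made-up sample in the usual chkdsk summary layout.
sample = "  976657407 KB total disk space.\n          4 KB in bad sectors.\n"
print(bad_sector_kb(sample))
```

When reading the real file, note that Windows PowerShell's out-file writes UTF-16 by default, so it would likely need to be opened with `encoding="utf-16"`. Any non-zero value here is worth mentioning when you post the log.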

If chkdsk turns out okay, run Seatools -

http://www.seagate.com/support/downloads/seatools/

You can run it via Windows or DOS. The only difference is the environment you're running it in. If you believe device driver issues in Windows may cause conflicts or false positives, it may be wise to choose the most minimal testing environment (DOS).

Run all tests EXCEPT: Fix All and anything Advanced.

Regards,

Patrick

Debugger/Reverse Engineer.
