FYI: SMT Configuration Error Affecting AMD Ryzen Processors in Windows

An error was discovered by a user on the AnandTech Forums (Agent-47): Windows is treating each CPU thread of a Ryzen processor as if it were a real core with its own L2 and L3 cache. This was tested with an 8-core, 16-thread Ryzen CPU (obviously).

I do not own one of these CPUs yet, and I've had reservations due to the mysterious performance delta between SMT on and off, a teething problem AMD themselves are aware of. It would appear that Windows could be a culprit behind this performance delta. I do not know whether Microsoft is aware of this SMT error. It's possible a fix is already in the works; I have no way of knowing. If the rumor is true that the next Xbox will use a Ryzen-based APU, it's reasonable to assume such an error could also impact the performance of a Windows-based gaming console.

Please discuss. Could this Windows error really be impacting performance for SMT that much? Will the error be fixed? Any Windows Devs on here that know about this error?

Last updated June 28, 2019

Well yes and no...

There are two sides to this... Recently AMD marketing has been less than convincing (they seem to have a track record of this: Bulldozer, R600 and so on). They seem unable to choose their battles: are they going for prosumer/productivity or gaming?

Now, Ryzen seems like an excellent product. It shows enormous potential in heavy multitasking and/or heavily multithreaded software, at half the price (or less) of at least the 6900K.

It really seems as though everything was a bit rushed. Everybody keeps talking about teething problems... (granted, the X99 platform had issues as well). So gaming performance is lacking, and I think most well-informed people can forgive that. But even those who need it for other tasks will still have to deal with low memory speeds, the lack of quad-channel RAM, and maybe even this plethora of other weird...

To answer your question:


Well, let's say AMD and Microsoft can and will fix the scheduler issue. Will the performance delta shown in those tests with SMT enabled and disabled REALLY change that much? We are talking about 3-10% performance in, from what I gather, very specific situations. I'm not convinced, as we've heard this song and dance before with Bulldozer and the Windows scheduler and so on.

Also, all this SMT talk seems mainly based on the whole gaming discussion, which brings me back to this: how much will it truly change? While the issue does need attention, I'm worried that the reasoning behind it is based on this gaming talk and that a fix won't change much in the grander scheme of things.

To me at least, it seems like there are other, more interesting problems to solve for the whole Ryzen platform. If and when these are fixed, Ryzen will be a great option for some workloads.

As of now, there are issues surrounding Ryzen for which I don't see Windows being the culprit...

Now, with my somewhat limited knowledge: is this directly related to the whole core complex vs. MCM discussion, i.e. the claim that Ryzen is a 2x quad-core package with the design drawbacks that come with that (shared and non-shared L2, L3...)? Can a scheduler really fix problems at this level?

Can any devs elaborate on this?

That's very clear! Windows Sysinternals is seeing 136 MB of L2 + L3 cache! So answer me... which Core iX has 136 MB of L2 + L3 cache? That's a clear case of an unoptimized architecture, and yes, currently Windows 10 is not ready for Ryzen. That is very, very disappointing. I hope the Microsoft developer team solves this horrendous issue. If possible, yesterday!
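
For what it's worth, the 136 MB figure is consistent with the per-core L2 and per-CCX L3 being counted once for every logical processor. Here is a rough arithmetic sketch in Python. It assumes the 8-core/16-thread part has 512 KB of L2 per core and 8 MB of L3 per CCX; those sizes, the names used, and the "counted once per thread" explanation are my assumptions, not something confirmed in this thread.

```python
# Assumed cache sizes for an 8-core / 16-thread Ryzen (not taken from this thread).
L2_PER_CORE_MB = 0.5   # 512 KB of L2 per physical core
L3_PER_CCX_MB = 8      # 8 MB of L3 shared by the four cores of one CCX
CORES = 8
CCX_COUNT = 2
THREADS = 16           # logical processors with SMT enabled


def actual_cache_mb():
    """Cache that physically exists: one L2 per core, one L3 per CCX."""
    return CORES * L2_PER_CORE_MB + CCX_COUNT * L3_PER_CCX_MB


def misreported_cache_mb():
    """Total if every logical processor is credited with its own L2 and L3."""
    return THREADS * L2_PER_CORE_MB + THREADS * L3_PER_CCX_MB


print(actual_cache_mb())       # 20.0
print(misreported_cache_mb())  # 136.0
```

So 16 x 0.5 MB + 16 x 8 MB = 136 MB, versus roughly 20 MB of L2 + L3 that would actually be on the chip under these assumptions.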

The scheduler can't fix the latency between caches on different CCXs, but it can mitigate it by keeping a process's threads running on the same CCX.

On the PS4, for example, cross-CCX latency is 190 cycles, while same-CCX latency is around 20 cycles.

If you keep moving processes from one CCX to another, you're certainly impacting overall performance.

P.S.: CCX is the AMD core complex.
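
To put rough numbers on that, here is a small Python sketch that averages the two latencies quoted above (190 and 20 cycles) according to how often an access has to cross to the other CCX. The linear mix and the function name are my own simplification; real behavior depends on cache state, the workload, and how often the scheduler actually migrates threads.

```python
SAME_CCX_CYCLES = 20    # figure quoted above for same-CCX latency
CROSS_CCX_CYCLES = 190  # figure quoted above for cross-CCX latency


def average_latency_cycles(cross_percent):
    """Average latency when `cross_percent` % of accesses cross the CCX boundary.

    Hypothetical linear model: every access is either same-CCX or cross-CCX.
    """
    if not 0 <= cross_percent <= 100:
        raise ValueError("cross_percent must be between 0 and 100")
    same_percent = 100 - cross_percent
    total = same_percent * SAME_CCX_CYCLES + cross_percent * CROSS_CCX_CYCLES
    return total / 100


print(average_latency_cycles(0))    # 20.0
print(average_latency_cycles(10))   # 37.0
print(average_latency_cycles(50))   # 105.0
```

Under this simple model, even 10% of accesses going to the other CCX nearly doubles the average latency, which is why keeping threads on one CCX could matter.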

Bug Article

I just want MS/AMD to address the bug. I am not worried about how much fixing it improves games. We ought to be able to use the software and hardware we bought with hard-earned money properly in all fields, not just games. AMD/Microsoft, please fix this ASAP. Any news?

True enough. I suppose in Windows it's a simple matter of setting core affinity for software that doesn't need more than 8 threads to run properly, manually assigning it the cores of one CCX.

That also makes me wonder whether it's a good idea to use core parking control with Ryzen to keep all cores active all the time. I don't quite have my Ryzen system together yet, as the mobo is being delivered on the back of a turtle, so I don't actually know what the situation is with core parking and Ryzen. Just a thought that popped up while responding.

It is stupid to first assume that someone is maliciously causing you harm. It's far more likely that they are being stupid.

Yes, just managing core affinity per program (making all its threads use a single CCX) should be enough to mitigate this problem in a simple way.
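
As a sketch of what "one CCX" means as an affinity mask, here is a small Python helper. It assumes logical processors are numbered with SMT siblings next to each other and CCX 0 first (so logical 0-7 would be CCX 0 on an 8-core/16-thread chip). That numbering and the function name are assumptions; check your own system's topology before relying on it.

```python
def ccx_affinity_mask(ccx_index, cores_per_ccx=4, threads_per_core=2):
    """Return a bitmask selecting every logical processor of one CCX.

    Assumes logical processors are numbered CCX by CCX, with the SMT
    siblings of a core adjacent (an assumption, not verified here).
    """
    logical_per_ccx = cores_per_ccx * threads_per_core
    first = ccx_index * logical_per_ccx
    mask = 0
    for cpu in range(first, first + logical_per_ccx):
        mask |= 1 << cpu  # set the bit for this logical processor
    return mask


print(hex(ccx_affinity_mask(0)))                      # 0xff
print(hex(ccx_affinity_mask(1)))                      # 0xff00
print(hex(ccx_affinity_mask(1, threads_per_core=1)))  # 0xf0 (SMT off)
```

The hex value is the kind of mask that `start /affinity` takes in a Windows command prompt, e.g. `start /affinity FF game.exe`, where `game.exe` is just a placeholder.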

I have no idea how this problem relates to core parking. As far as I know, cores are parked when there is not enough workload and keeping everything on fewer cores is more energy efficient. If your question is about parking cores on each CCX, and how that could affect scheduling by forcing a process/thread to move to another CCX, then I can't answer.

I don't know for sure how the Windows scheduler and core parking algorithm work, but if they fix it for Zen, I expect that they're going to park more and more cores of one CCX until the whole CCX is parked, then start parking cores from the other CCX. And, as they keep track of how much work the cores are doing, I expect them to switch which CCX gets parked from time to time, whenever possible (user AFK, for example), so all the cores are used evenly.

About slow delivery, I'm in the same boat.

This is not true. Current tests show that improving the scheduler to be on par with the Intel scheduler could improve single-thread throughput by as much as 56% in gaming.

Try using Windows 7. There are no issues with Hyper-Threading or SMT on Windows 7.