02-21-2019 08:41 AM - edited 02-21-2019 08:43 AM
Is it possible that, without the Hyper-V hypervisor layer, the Lenovo-specific settings are sent directly to the hardware, whereas with Hyper-V the virtualized primary domain (Windows 10 Pro) can only pass some of those settings to the most recent Intel chipset hardware, because Hyper-V lacks support for passing them through?
There's a similar issue reported for Lenovo hardware in Qubes, where the hypervisor (Xen in that case) doesn't expose all of the hardware interfaces that the primary virtualized domain (in this case Fedora-23 in dom0) needs to manage the thermals, so the BIOS hardware defaults are used (most typical impact: reduced battery life). Standard Fedora-23 running without the hypervisor doesn't have the issue.
02-22-2019 04:14 PM
I was able to dig in a little bit into this issue today. Hyper-V definitely is affecting writing to the MSR_TEMPERATURE_TARGET register (0x1A2).
I downloaded RWEverything for Windows 10 and added the register 0x1A2 manually to the list. With Hyper-V enabled, the MSR bits 29-24 (the temperature offset, i.e. how many degrees below the target temperature throttling should start) have hexadecimal value 14, or 20 °C, and the MSR bits 23-16 (the temperature target) have hexadecimal value 64, or 100 °C. This perfectly explains why thermal throttling kicks in 20 °C too soon.
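To make the arithmetic above concrete, here is a minimal sketch that decodes the two fields from a raw register value. It assumes the bit layout exactly as described in this post (bits 29-24 = offset, bits 23-16 = target); the function name and the example raw value are made up for illustration, and the code does not read the MSR itself.

```python
# Decode the fields of MSR_TEMPERATURE_TARGET (0x1A2) as described in the post.
# Assumption: bits 29-24 hold the throttling offset, bits 23-16 the target, both in °C.
# decode_temperature_target is a hypothetical helper, not part of any tool.

def decode_temperature_target(raw: int) -> dict:
    offset_c = (raw >> 24) & 0x3F   # 6-bit field, bits 29-24
    target_c = (raw >> 16) & 0xFF   # 8-bit field, bits 23-16
    return {
        "offset_c": offset_c,
        "target_c": target_c,
        # Throttling begins this many degrees below the target.
        "throttle_start_c": target_c - offset_c,
    }

# Illustrative raw value built from the numbers reported above
# (offset 0x14, target 0x64); the other bits are assumed to be zero.
example_raw = (0x14 << 24) | (0x64 << 16)
decoded = decode_temperature_target(example_raw)
print(decoded)
```

With the reported values this gives a throttling start of 100 - 20 = 80 °C.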
Normally you would be able to change bits 29-24 through RWEverything to control the throttling offset, but unfortunately Hyper-V interferes with this. Whenever I tried to change the values (or even just hit "Done" without any modifications), I would get a BSOD. When I disabled Hyper-V, I could alter the register values without crashes.
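For anyone editing the value by hand, the new register contents can be worked out like this. This is only a sketch of the bit arithmetic, assuming the offset lives in bits 29-24 as described in this thread; `with_offset` is a hypothetical helper, and actually writing the result to the MSR still needs a tool such as RWEverything.

```python
# Compute a new MSR_TEMPERATURE_TARGET value with a different throttling offset.
# Assumption: the offset occupies bits 29-24; all other bits are left untouched.
# This does NOT write to the MSR, it only does the arithmetic.

OFFSET_SHIFT = 24
OFFSET_MASK = 0x3F << OFFSET_SHIFT  # 6-bit field

def with_offset(raw: int, new_offset_c: int) -> int:
    if not 0 <= new_offset_c <= 0x3F:
        raise ValueError("offset must fit in 6 bits (0-63)")
    # Clear the old offset field, then insert the new one.
    return (raw & ~OFFSET_MASK) | (new_offset_c << OFFSET_SHIFT)

# Illustrative starting value: offset 0x14, target 0x64, other bits assumed zero.
current = 0x14640000
no_offset = with_offset(current, 0)
print(hex(no_offset))
```

Only the offset field changes; the target in bits 23-16 is preserved.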
Curiously, even without Hyper-V, the throttling offset was set to 18 °C and I couldn't see temperatures larger than that. So, I guess the rabbit hole goes even deeper. I didn't test disabling Intel Virtualization technology.
02-22-2019 06:14 PM - edited 02-22-2019 06:16 PM
I agree that repasting helps, but your understanding of how thermal paste works and how it is applied at the factory is completely wrong. First, as Linus Tech Tips already tested, too much thermal paste has absolutely no thermal disadvantage (https://www.youtube.com/watch?v=r2MEAnZ3swQ): it won't lower temps, but it also won't raise them. Second, the thermal paste is pre-applied to the heatsink; the workers just install the heatsink with the paste already on it. If you look up the heatsink unit in the parts lookup, you can see that thermal paste is already applied. The reason temps are so high without repasting is purely that they use horrible thermal paste; it has nothing to do with how the paste is applied.
02-25-2019 08:09 PM
02-25-2019 10:23 PM
I have a guess for what's going on.
* The hardware MSR_TEMPERATURE_TARGET register always sets the activation offset to 0x14 (possibly on startup or more regularly). This explains similar issues in Linux: https://www.reddit.com/r/thinkpad/comments/870u0a/t480s_linux_throttling_bug/
* When using Hyper-V, Windows enables "Device Guard" which provides some security features:
However, this feature may also affect the ability to set the MSR register from the host OS (only the virtual secure mode can access the registers). This was reported elsewhere;
I think that security feature explains the blue screen when using RWEverything to change the register. Presumably some Lenovo software sets the temperature offset, but it skips setting it in this mode?
I've tried to disable "Device Guard" using the group policy editor, but haven't had any luck. Disabling Hyper-V does disable Device Guard, though.
02-26-2019 12:28 AM
For me it is not Hyper-V that makes the difference but the BIOS setting, so VMware would not help.
I just tried again: I disabled the Hyper-V Windows feature and rebooted, but it is still throttling at 80 degrees.
02-27-2019 02:28 AM
Was just about to place a large X1E order but saw this thread and am now holding off until there is some official clarification from Lenovo. We need Hyper-V as we're a software development team. Is there really still no fix for this? Looking at the BIOS change logs, it seems like the regression was introduced in the 1.15 release some months ago and has still not been fixed. What a shame.
02-27-2019 02:33 AM
@nbevans I don't think 1.15 caused this. I just had my second motherboard replacement and, in the process, got downgraded to 1.13. I've now tested with 1.13, 1.15, 1.17 and 1.18 and can say that the issue has been present in all of those BIOS versions for me.