  • Posts: 15
  • Registered: ‎06-24-2020
  • Location: United States of America
  • Views: 130
  • Message 421 of 484

Re:[X1C6/T480s] low cTDP and trip temperature in Linux

2020-06-24, 3:19 AM

@ Dirdi wrote:

@ NariOX wrote:

@Dirdi My MX150 will still freak out if it reaches over 80C, this is done in firmware (I believe). While my CPU has been "fixed", I still have a system that I cannot use to the 94C specified by nvidia.

Today, I got my hands on a T480 with a dedicated NVIDIA GeForce MX150. I ran a GPU benchmark for about an hour and the system was stable at ~91°C GPU and ~95°C CPU:

@Dirdi First of all, thank you for your instructions. They made me realize the mistake I made configuring thermald.

What GPU benchmark did you run? I tried running Unigine Valley on Fedora 30 with thermald 2.2 on kernel 5.6, both with NVIDIA set as the Primary GPU and with PRIME Render Offload, and had the same issue as @NariOX.

And I agree with you that there is no need for a firmware workaround for a T480. I don't even think this issue was a bug. It was just a feature that had not been implemented in Linux.

  • Posts: 41
  • Registered: ‎01-14-2017
  • Location: United States of America
  • Views: 789
  • Message 422 of 484

2020-06-24, 12:26 PM

@ NariOX wrote:

  • My `nvidia-smi` doesn't show anything odd:

    Temperature
        GPU Current Temp : 77 C
        GPU Shutdown Temp : 102 C
        GPU Slowdown Temp : 97 C
        GPU Max Operating Temp : 94 C

Fascinating, I would have expected that the Max Operating Temp was set to a lower value.

 

  • My thermald service was modified to include `--workaround-enabled`, and my `thermal-conf.xml` and `thermal-conf.xml.auto` are both copied from your post, but with the product name changed to mine (acquired with `sudo dmidecode | grep -e "Product Name"`, which reports `Product Name: 20L5CTO1WW`).
  • `thermald` is running. It does report a warning

    thermald[7984]: [1592918783][WARN]sysfs open failed

but I don't know if that means anything. Running it with `loglevel=info` shows that the error comes from:

    [1592918735][INFO]sysfs read failed constraint_0_max_power_uw

I get similar warnings, but they do not seem to cause problems:

Jun 23 17:31:52 laptop systemd[1]: Starting Thermal Daemon Service...

Jun 23 17:31:52 laptop systemd[1]: Started Thermal Daemon Service.

Jun 23 17:31:52 laptop thermald[1048]: [1592926312][WARN]22 CPUID levels; family:model:stepping 0x6:8e:a (6:142:10)

Jun 23 17:31:52 laptop thermald[1048]: [1592926312][WARN]Polling mode is enabled: 4

Jun 23 17:31:52 laptop thermald[1048]: [1592926312][WARN]sensor id 10 : No temp sysfs for reading raw temp

Jun 23 17:31:52 laptop thermald[1048]: [1592926312][WARN]sensor id 10 : No temp sysfs for reading raw temp

Jun 23 17:31:52 laptop thermald[1048]: [1592926312][WARN]sensor id 10 : No temp sysfs for reading raw temp

Jun 23 17:31:52 laptop thermald[1048]: [1592926312][WARN]sysfs open failed

 

 

  • Running NVIDIA 440.82.
  • I have tested with `bumblebee`, with `nvidia-xrun` and with PRIME offloading. Same results.

Until now I have always used primusrun (apt install primus) instead of optirun: "$ primusrun programThatNeedsdGPU". By the way, I installed the NVIDIA driver via the package manager.

 

One thing I did notice is that if I run `dptfxtract`, the generated `thermal-conf.xml.auto` only has `<Preference>QUIET</Preference>`, but no `PERFORMANCE` preference, even though I have set all options to `performance` in the BIOS and even on the Windows install that I keep on an external drive.

Here is what my BIOS power settings look like:

Have you tried the "Maximum Performance" / "Maximize Performance" options?

 

I have also tried with the laptop flat on the table and with it where I have it normally (a stand, where it is probably around 70 degrees inclined up). I do have the extended battery, so it doesn't quite stay all the way down to the table, but that shouldn't be a problem, right?

No, I do not think so, since my T480 is docked to a ThinkPad Ultra Dock, also has the extended battery and does not lie flat on the table.

 

Are you using `ACPI_OSI` kernel parameter?

No.

 

@IvanW What exactly was the mistake in your case?

Does it work for you now?

 

I have to admit that "benchmark" might have been the wrong word. I actually installed a space flight simulation game, set all graphics options to high or max, activated the in-game autopilot and let it fly around for about an hour. On a second monitor I ran s-tui and "watch -dn1 -- 'sensors; nvidia-smi'" to observe temperatures (~91°C) and GPU utilization (~100%).

  • Posts: 49
  • Registered: ‎11-02-2012
  • Location: United States of America
  • Views: 374
  • Message 423 of 484

2020-06-24, 1:07 PM

@Dirdi 

Yes, I have it set to maximize performance. What I am wondering is whether Lenovo Vantage can set some permanent option in the EC. I have tried it using an external drive with Windows, but maybe I did it wrong? Also, can you send the output of `nvidia-smi -q -d clock` when you are above 80C? It's possible the GPU is reaching higher temps, but the clock is throttled (the temps might be high due to high CPU temps).

As for the benchmark, I have been using FurMark. Before throttling, I get about 55 fps (47 fps using primus); after throttling, it goes down to 23 fps. The iGPU does about 28 fps, for reference.

@MarkRHPearson: I have compiled a kernel with MJG59's patches, but the 80C limit on the GPU remains. One interesting thing I have found is that after suspend, the GPU clocks and the fan go back to "normal". Maybe that is a clue? Also, nvidia-smi reports the throttle reason as "SW Power Cap".
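
For anyone who wants to log temperature, clock and the "SW Power Cap" state side by side while a benchmark runs, here is a minimal Python sketch. It assumes the installed driver supports the `temperature.gpu`, `clocks.gr` and `clocks_throttle_reasons.sw_power_cap` query fields (`nvidia-smi --help-query-gpu` lists what is available on a given system). The helper names `parse_sample` and `read_gpu_sample` are made up for illustration.

```python
import csv
import io
import subprocess

# Query fields; assumed to be supported by the installed driver.
QUERY_FIELDS = [
    "temperature.gpu",
    "clocks.gr",
    "clocks_throttle_reasons.sw_power_cap",
]


def parse_sample(line):
    """Parse one 'csv,noheader,nounits' line into a dict."""
    row = next(csv.reader(io.StringIO(line), skipinitialspace=True))
    temp_c, clock_mhz, sw_cap = (field.strip() for field in row)
    return {
        "temp_c": int(temp_c),
        "clock_mhz": int(clock_mhz),
        # nvidia-smi prints "Active" or "Not Active" for throttle reasons.
        "sw_power_cap": sw_cap == "Active",
    }


def read_gpu_sample():
    """Run nvidia-smi once and return the parsed sample for the first GPU."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=" + ",".join(QUERY_FIELDS),
            "--format=csv,noheader,nounits",
        ],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_sample(result.stdout.splitlines()[0])
```

Calling `read_gpu_sample()` in a loop once per second during a FurMark run should show whether the clock drops before or after the temperature reaches 80C.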

 

  • Posts: 15
  • Registered: ‎06-24-2020
  • Location: United States of America
  • Views: 130
  • Message 424 of 484

2020-06-24, 10:43 PM

@ Dirdi wrote:

@ NariOX wrote:

  • My `nvidia-smi` doesn't show anything odd:

    Temperature
        GPU Current Temp : 77 C
        GPU Shutdown Temp : 102 C
        GPU Slowdown Temp : 97 C
        GPU Max Operating Temp : 94 C

Fascinating, I would have expected that the Max Operating Temp was set to a lower value.

 

I rebooted into Windows, installed GPU-Z and NVIDIA Inspector, and discovered that performance level profile 3 (P0) has a temperature limit of 70 C on Windows. NVIDIA Inspector says this can be changed, but I was too chicken to do it. On Linux, my current temperature limit is 80 C, the same as @NariOX's.

I'm guessing the temperature limits differ because on Windows the NVIDIA driver comes from Lenovo, who probably customized it, whereas on Linux I'm using the driver from NVIDIA.

Do you know if the T480 that you benchmarked is using a custom overclock profile created with gwe or some other tool? That would explain the difference in GPU temps on Linux.

 

 

@IvanW What exactly was the mistake in your case?

Does it work for you now?

 

The mistake was not reading the thermald man page. Yes, it works for me now.

 

  • Posts: 41
  • Registered: ‎01-14-2017
  • Location: United States of America
  • Views: 789
  • Message 425 of 484

2020-06-25, 1:03 PM

@NariOX Do I understand correctly that it is no longer your CPU that gets throttled but "only" your dGPU? If so, I do not think that this problem is directly related to DPTF. I found this thread over at the NVIDIA forums where other MX150 owners report the same issue on Windows across various manufacturers (Lenovo, Acer, Dell, etc.).

  • Posts: 49
  • Registered: ‎11-02-2012
  • Location: United States of America
  • Views: 374
  • Message 426 of 484

2020-06-25, 2:03 PM

But it is odd that `nvidia-smi` reports higher temperatures as the throttling limit. Were you able to check whether your MX150 was throttling when you got to 91C?

One last thing: can you also check your VBIOS version? I have 86.08.3B.00.38 (you can check with nvidia-smi -q | grep VBIOS).

If I got an official "NVIDIA requires us to throttle at 80C" (aka: "we have no choice") or "we have limited the dGPU to 80C to increase longevity" (aka: "it's a feature, not a bug"), I would be content, but I want to make sure my machine is performing as well as it was designed to.

  • Posts: 41
  • Registered: ‎01-14-2017
  • Location: United States of America
  • Views: 789
  • Message 427 of 484

2020-06-25, 2:52 PM

@ NariOX wrote:
 

But it is odd that `nvidia-smi` reports higher temperatures as the throttling limit. Were you able to check if your MX150 was throttling when you got to 91C?

One last thing, can you also check your VBIOS version? I have 86.08.3B.00.38 (you can check with nvidia-smi -q | grep VBIOS).

But if I got an official "NVidia requires us to throttle at 80C" or "we have limited the dGPU to 80C to increase longevity" (aka: "it's a feature, not a bug"), I would be content. Aside from furmark, I can't get my GPU to go beyond 80C (I'm using conductonaut :P)

 

In the meantime I ran the Unigine Heaven benchmark, and my MX150 was throttled to ~423 MHz from the very beginning, while dGPU temperatures did not rise above 80°C at any moment. However, when I ran "stress" in parallel, the CPU heated the dGPU to over 90°C and the dGPU still ran at ~423 MHz. Thus I suspect that the temperature not rising above 80°C is a result of the throttling to ~423 MHz and not vice versa. The fan spun up to ~4000 RPM but dropped back to 3500 RPM almost immediately after I stopped the dGPU benchmark and stress, and back to ~3200 RPM once the CPU temperature dropped to ~55°C. The CPU was only throttled when it reached 95°C. Except for the dGPU throttling, I would consider all those observations to be expected behavior.

"nvidia-smi -q -d power" alternately reported "Idle" and "SW Power Cap" as the reasons for throttling the dGPU. While I agree with you that the dGPU is throttled too early, I do not see a direct connection to DPTF. I think this could be either an NVIDIA driver issue or a question of power management (the connected power supply does not provide enough power to run the dGPU at higher clock rates), or something else. I am willing to investigate this issue further, but I do not think that this thread is the right place. Therefore I suggest starting a new thread regarding this issue. My VBIOS version is 86.08.28.00.59.

 

  • Posts: 49
  • Registered: ‎11-02-2012
  • Location: United States of America
  • Views: 374
  • Message 428 of 484

2020-06-25, 3:25 PM

Agreed. I have started a new one here. Fingers crossed. :)

  • Posts: 1
  • Registered: ‎07-05-2020
  • Location: Finland
  • Views: 15
  • Message 429 of 484

2020-07-05, 6:56 PM

Hi!

 

I've been having this problem for a while now and have been watching this thread on and off.

 

I have a T480 running Linux Mint 19.3 with kernel 5.3.0-59 and Secure Boot enabled.

 

rdmsr -f 29:24 -d 0x1a2 reports a critical temperature offset of 30 degrees, which keeps the CPU below 70°C at all times.

This of course results in very poor performance under heavy load. I'm also experiencing throttling down to 200 MHz from time to time.

I can't use throttled because of Secure Boot, and I didn't have any luck with thermald v2.2.

Nariox's script (https://gist.github.com/nariox/11e5284373eb4c858f817e060911ec03) sets the temperature offset to 3 degrees and allows temps to climb over 90.

After some time, though, I guess the embedded controller does something, and the offset becomes -13 while the cores plummet to 200 MHz.

Running the script again corrects the offset, but core speeds don't return to their previous level, seemingly because the power limit is now lower.

Continuously running Nariox's script and setting the MCHBAR values as per the static fix section of throttled seems to allow for maximum performance all the time.

I still need to test this for longer periods. However, I'm hopeful as the above brings cores back when throttled to 200MHz.
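
To make the `rdmsr -f 29:24 -d 0x1a2` reading above easier to interpret, here is a small Python sketch that decodes a raw value of MSR 0x1a2 (MSR_TEMPERATURE_TARGET). It assumes the layout used in this post, with the offset in bits 29:24, and additionally assumes that bits 23:16 hold TjMax, which is how Intel documents this register for recent Core CPUs; the width of the offset field can differ between CPU generations. The function names and the sample value (TjMax of 100) are hypothetical and only for illustration.

```python
def bit_field(value, high, low):
    """Return bits high..low (inclusive) of value, like `rdmsr -f high:low`."""
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)


def decode_temperature_target(msr_value):
    """Decode a raw MSR 0x1a2 value (assumed layout, see lead-in)."""
    tjmax_c = bit_field(msr_value, 23, 16)   # assumed: TjMax in degrees C
    offset_c = bit_field(msr_value, 29, 24)  # offset below TjMax, as in the post
    return {
        "tjmax_c": tjmax_c,
        "offset_c": offset_c,
        # Temperature at which throttling is expected to start.
        "throttle_at_c": tjmax_c - offset_c,
    }


# Hypothetical raw value: offset 30 with an assumed TjMax of 100.
sample = (30 << 24) | (100 << 16)
print(decode_temperature_target(sample))
```

With an offset of 30 and a TjMax of 100, the throttle point comes out at 70, which matches the behavior described above; an offset of 3 would move it to 97.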

  • Posts: 8
  • Registered: ‎01-27-2020
  • Location: France
  • Views: 90
  • Message 430 of 484

2020-07-31, 8:05 AM

I never managed to get thermald (2.2) to fix the problem, and now that Lockdown is activated whenever SecureBoot is on, the following error repeats in a loop in my dmesg:

 

Lockdown: thermald: /dev/mem,kmem,port is restricted; see man kernel_lockdown.7

 

Does anyone know if I can still run thermald without having to deactivate SecureBoot?
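
This does not answer the question, but to confirm which lockdown mode the kernel is actually in, the file `/sys/kernel/security/lockdown` can be read; it lists the modes with the active one in square brackets, as described in kernel_lockdown(7). A minimal Python sketch follows, with hypothetical helper names, assuming the file is present on the running kernel.

```python
from pathlib import Path

# Present on kernels built with the lockdown LSM (assumption for this sketch).
LOCKDOWN_PATH = Path("/sys/kernel/security/lockdown")


def active_lockdown_mode(text):
    """Return the bracketed (active) mode, e.g. 'integrity', or None."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_lockdown_mode(path=LOCKDOWN_PATH):
    """Read the current lockdown mode, or None if the file is unavailable."""
    try:
        return active_lockdown_mode(path.read_text())
    except OSError:
        return None
```

If `read_lockdown_mode()` returns `integrity` or `confidentiality`, access to `/dev/mem` and similar interfaces is restricted, which is consistent with the dmesg message above.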
