  • Posts: 8
  • Registered: ‎03-25-2021
  • Location: Sweden
  • Views: 55
  • Message 1 of 2

GPU throttling on Thinkpad T14 Gen1 with NVIDIA Geforce MX330

2021-05-13, 20:27 PM

I am experiencing severe throttling on my ThinkPad T14 Gen 1 with an NVIDIA GeForce MX330. I followed the guides to install the drivers (https://rpmfusion.org/Howto/NVIDIA) and to make the NVIDIA GPU primary (https://docs.fedoraproject.org/en-US/quick-docs/how-to-set-nvidia-as-primary-gpu-on-optimus-based-laptops/). I am running driver version 465.27 on Fedora 34 Workstation with kernel 5.11.19.

 

I am seeing constant throttling, even when the machine is idle. Right now, just idling, I get:

 

nvidia-smi -q -d PERFORMANCE

 

==============NVSMI LOG==============

Timestamp                                 : Sat May  8 13:19:52 2021
Driver Version                            : 465.27
CUDA Version                              : 11.3

Attached GPUs                             : 1
GPU 00000000:2D:00.0
    Performance State                     : P0
    Clocks Throttle Reasons
        Idle                              : Not Active
        Applications Clocks Setting       : Not Active
        SW Power Cap                      : Not Active
        HW Slowdown                       : Not Active
            HW Thermal Slowdown           : Not Active
            HW Power Brake Slowdown       : Not Active
        Sync Boost                        : Not Active
        SW Thermal Slowdown               : Active
        Display Clock Setting             : Not Active


Here, SW Thermal Slowdown indicates that the GPU is being throttled, even though it is only at 59 degrees Celsius. Running glxgears and checking the clocks, I get:

 

nvidia-smi -q -d CLOCK

 

==============NVSMI LOG==============

Timestamp                                 : Sat May  8 13:23:43 2021
Driver Version                            : 465.27
CUDA Version                              : 11.3

Attached GPUs                             : 1
GPU 00000000:2D:00.0
    Clocks
        Graphics                          : 139 MHz
        SM                                : 139 MHz
        Memory                            : 405 MHz
        Video                             : 544 MHz
    Applications Clocks
        Graphics                          : N/A
        Memory                            : N/A
    Default Applications Clocks
        Graphics                          : N/A
        Memory                            : N/A
    Max Clocks
        Graphics                          : 1911 MHz
        SM                                : 1911 MHz
        Memory                            : 3504 MHz
        Video                             : 1708 MHz
    Max Customer Boost Clocks
        Graphics                          : N/A
    SM Clock Samples
        Duration                          : 18446744073709.55 sec
        Number of Samples                 : 100
        Max                               : 1531 MHz
        Min                               : 139 MHz
        Avg                               : 0 MHz
    Memory Clock Samples
        Duration                          : 18446744073709.55 sec
        Number of Samples                 : 100
        Max                               : 3504 MHz
        Min                               : 405 MHz
        Avg                               : 0 MHz
    Clock Policy
        Auto Boost                        : N/A
        Auto Boost Default                : N/A


With the graphics clock at 139 MHz against a maximum of 1911 MHz, the GPU is clearly being heavily throttled.
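
To put a number on it, here is a minimal Python sketch that parses the text of the two nvidia-smi queries above and reports the active throttle reasons and the current-to-max graphics clock ratio. The helper names (active_throttle_reasons, section_mhz) are my own, the sample strings are trimmed copies of the output in this post, and the parsing assumes the "Key : Value" layout shown here; other driver versions may format things differently.

```python
# Trimmed copies of the nvidia-smi output shown in this post.
PERFORMANCE_SAMPLE = """\
    Clocks Throttle Reasons
        Idle                              : Not Active
        SW Power Cap                      : Not Active
        HW Slowdown                       : Not Active
        SW Thermal Slowdown               : Active
"""

CLOCK_SAMPLE = """\
    Clocks
        Graphics                          : 139 MHz
        SM                                : 139 MHz
    Max Clocks
        Graphics                          : 1911 MHz
        SM                                : 1911 MHz
"""


def active_throttle_reasons(text):
    """Return the names of all fields whose value is exactly 'Active'."""
    reasons = []
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep and value.strip() == "Active":
            reasons.append(key.strip())
    return reasons


def section_mhz(text, section, field):
    """Return the integer MHz value of `field` under the heading `section`.

    A heading is any non-empty line without a ' : ' separator.
    Raises ValueError if the field is reported as N/A.
    """
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if not sep:
            if line.strip():
                current = line.strip()
            continue
        if current == section and key.strip() == field:
            return int(value.split()[0])
    raise KeyError(f"{section}/{field} not found")


if __name__ == "__main__":
    print(active_throttle_reasons(PERFORMANCE_SAMPLE))
    cur = section_mhz(CLOCK_SAMPLE, "Clocks", "Graphics")
    top = section_mhz(CLOCK_SAMPLE, "Max Clocks", "Graphics")
    print(f"{cur} MHz of {top} MHz ({cur / top:.1%})")
```

On the values above, SW Thermal Slowdown is the only active reason and the graphics clock runs at roughly 7% of its maximum.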

 

My guess is that this is related to the following settings:

 

nvidia-smi -q -d TEMPERATURE

==============NVSMI LOG==============

Timestamp                                 : Sat May  8 13:25:04 2021
Driver Version                            : 465.27
CUDA Version                              : 11.3

Attached GPUs                             : 1
GPU 00000000:2D:00.0
    Temperature
        GPU Current Temp                  : 56 C
        GPU Shutdown Temp                 : 102 C
        GPU Slowdown Temp                 : 97 C
        GPU Max Operating Temp            : 57 C
        GPU Target Temperature            : N/A
        Memory Current Temp               : N/A
        Memory Max Operating Temp         : N/A


Interestingly, if I enable thermald with the --adaptive flag, I get this:

 

==============NVSMI LOG==============

Timestamp                                 : Sat May  8 13:29:56 2021
Driver Version                            : 465.27
CUDA Version                              : 11.3

Attached GPUs                             : 1
GPU 00000000:2D:00.0
    Temperature
        GPU Current Temp                  : 56 C
        GPU Shutdown Temp                 : 102 C
        GPU Slowdown Temp                 : 97 C
        GPU Max Operating Temp            : 75 C
        GPU Target Temperature            : N/A
        Memory Current Temp               : N/A
        Memory Max Operating Temp         : N/A


And the throttling goes away and performance is suddenly much improved.
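
The difference is easier to see as headroom between the current temperature and GPU Max Operating Temp. Below is a small Python sketch using the values from the two outputs above. The function names are mine, and treating GPU Max Operating Temp as the point where SW Thermal Slowdown kicks in is my assumption based on the behaviour I observe, not something I have seen documented.

```python
# Temperature blocks copied from the two nvidia-smi outputs in this post.
WITHOUT_THERMALD = """\
        GPU Current Temp                  : 56 C
        GPU Shutdown Temp                 : 102 C
        GPU Slowdown Temp                 : 97 C
        GPU Max Operating Temp            : 57 C
        GPU Target Temperature            : N/A
"""

WITH_THERMALD = """\
        GPU Current Temp                  : 56 C
        GPU Shutdown Temp                 : 102 C
        GPU Slowdown Temp                 : 97 C
        GPU Max Operating Temp            : 75 C
        GPU Target Temperature            : N/A
"""


def parse_temps(text):
    """Map each 'Name : NN C' line to an int; N/A fields are skipped."""
    temps = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        parts = value.split()
        if sep and len(parts) == 2 and parts[1] == "C" and parts[0].isdigit():
            temps[key.strip()] = int(parts[0])
    return temps


def headroom(temps):
    """Degrees left before the (assumed) software throttling threshold."""
    return temps["GPU Max Operating Temp"] - temps["GPU Current Temp"]


if __name__ == "__main__":
    for label, sample in (
        ("without thermald", WITHOUT_THERMALD),
        ("with thermald --adaptive", WITH_THERMALD),
    ):
        print(f"{label}: {headroom(parse_temps(sample))} C of headroom")
```

Without thermald there is only 1 C of headroom at idle; with thermald --adaptive it grows to 19 C.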

 

So apparently thermald can change this setting, but I cannot do so manually, since “GPUMaxOperatingTempThreshold” is a read-only attribute:

 

nvidia-settings -a GPUMaxOperatingTempThreshold=80

 

ERROR: The attribute 'GPUMaxOperatingTempThreshold' specified in assignment 'GPUMaxOperatingTempThreshold=80' cannot be assigned (it is a read-only
       attribute).


I am now on Fedora 34 but I saw the exact same problem on Ubuntu 20.10.

 

I don’t really know what’s going on here, but it seems strange that I should have to run thermald just to escape this throttling problem (and even then, I think 75 C is too low a temperature to throttle at). To be honest, I don’t really understand the interplay between GPU Slowdown Temp and GPU Max Operating Temp; the two sound synonymous to me.

 

Here’s the full output from nvidia-smi:

 

Sat May  8 15:23:05 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 465.27       Driver Version: 465.27       CUDA Version: 11.3     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:2D:00.0 Off |                  N/A |
| N/A   67C    P0    N/A /  N/A |    578MiB /  2002MiB |      7%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2762      G   /usr/libexec/Xorg                 293MiB |
|    0   N/A  N/A      2953      G   /usr/bin/gnome-shell               88MiB |
|    0   N/A  N/A      4524      G   ...AAAAAAAAA= --shared-files      134MiB |
|    0   N/A  N/A      5395      G   ...e/Steam/ubuntu12_32/steam       18MiB |
|    0   N/A  N/A      5604      G   ./steamwebhelper                    1MiB |
|    0   N/A  N/A      6303      G   ...AAAAAAAAA= --shared-files        6MiB |
|    0   N/A  N/A      7422      G   anki                               27MiB |
|    0   N/A  N/A     21305      G   /usr/bin/gjs                        2MiB |
+-----------------------------------------------------------------------------+

 

There is nothing in particular going on in the logs when throttling occurs:

 

dmesg | grep -iP "nvidia|gpu|graphics|video|thermal"

 

[    0.000000] Command line: BOOT_IMAGE=(hd1,gpt2)/vmlinuz-5.11.18-300.fc34.x86_64 root=UUID=c1325a0f-113a-4ad5-bf64-be324ff943b8 ro rootflags=subvol=root rhgb quiet rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1
[    0.156686] Reserving Intel graphics memory at [mem 0x8b800000-0x8f7fffff]
[    0.165326] Kernel command line: BOOT_IMAGE=(hd1,gpt2)/vmlinuz-5.11.18-300.fc34.x86_64 root=UUID=c1325a0f-113a-4ad5-bf64-be324ff943b8 ro rootflags=subvol=root rhgb quiet rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1
[    0.332283] mce: CPU0: Thermal monitoring enabled (TM1)
[    0.350029] thermal_sys: Registered thermal governor 'fair_share'
[    0.350031] thermal_sys: Registered thermal governor 'bang_bang'
[    0.350032] thermal_sys: Registered thermal governor 'step_wise'
[    0.350033] thermal_sys: Registered thermal governor 'user_space'
[    0.551119] ACPI: Added _OSI(Linux-Dell-Video)
[    0.551119] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)
[    0.705965] ACPI: \_SB_.PR00: _OSC native thermal LVT Acked
[    0.827183] pci 0000:00:02.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff]
[    1.111107] efifb: No BGRT, not showing boot graphics
[    1.115656] thermal LNXTHERM:00: registered as thermal_zone0
[    1.115660] ACPI: Thermal Zone [THM0] (79 C)
[    1.834077] ACPI: Video Device [GFX0] (multi-head: yes  rom: no  post: no)
[    1.834348] input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input13
[    1.850519] ACPI: Video Device [PEGP] (multi-head: no  rom: yes  post: no)
[    1.850562] input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/device:4c/LNXVIDEO:01/input/input14
[    5.062263] thinkpad_acpi: This ThinkPad has standard ACPI backlight brightness control, supported by the ACPI video driver
[    5.063220] intel_pch_thermal 0000:00:12.0: enabling device (0000 -> 0002)
[    5.072854] proc_thermal 0000:00:04.0: enabling device (0000 -> 0002)
[    5.076995] proc_thermal 0000:00:04.0: Creating sysfs group for PROC_THERMAL_PCI
[    5.232387] nvidia: loading out-of-tree module taints kernel.
[    5.232400] nvidia: module license 'NVIDIA' taints kernel.
[    5.278696] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[    5.299920] nvidia-nvlink: Nvlink Core is being initialized, major device number 511
[    5.300795] nvidia 0000:2d:00.0: enabling device (0006 -> 0007)
[    5.442330] RAPL PMU: hw unit of domain pp1-gpu 2^-14 Joules
[    5.539976] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  465.27  Thu Apr 22 23:21:03 UTC 2021
[    5.568987] nvidia_uvm: module uses symbols from proprietary module nvidia, inheriting taint.
[    5.576283] nvidia-uvm: Loaded the UVM driver, major device number 509.
[    5.596264] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  465.27  Thu Apr 22 23:12:47 UTC 2021
[    5.615240] [drm] [nvidia-drm] [GPU ID 0x00002d00] Loading driver
[    5.637909] thermal thermal_zone7: failed to read out thermal zone (-61)
[    6.298111] videodev: Linux video capture interface: v2.00
[    6.425911] uvcvideo: Found UVC 1.10 device Integrated Camera (04f2:b6d0)
[    6.435983] uvcvideo: Found UVC 1.50 device Integrated Camera (04f2:b6d0)
[    6.753420] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:2d:00.0 on minor 1
[    7.267256] uvcvideo: Found UVC 1.00 device HD Pro Webcam C920 (046d:0892)
[    7.269010] usbcore: registered new interface driver uvcvideo
[    7.269011] USB Video Class driver (1.1.1)
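
To double-check that nothing was overlooked, here is a small Python sketch that scans saved dmesg output for lines mentioning a failure or error. The sample lines are copied from the output above; the function name and the keyword list are my own choice and are certainly not exhaustive.

```python
import re

# A few lines copied from the dmesg output in this post.
DMESG_SAMPLE = """\
[    5.278696] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[    5.615240] [drm] [nvidia-drm] [GPU ID 0x00002d00] Loading driver
[    5.637909] thermal thermal_zone7: failed to read out thermal zone (-61)
[    6.753420] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:2d:00.0 on minor 1
"""

# Hypothetical keyword list; extend it as needed.
SUSPICIOUS = re.compile(r"\b(fail(ed|ure)?|error|timeout)\b", re.IGNORECASE)


def suspicious_lines(text):
    """Return the lines of `text` that contain one of the keywords."""
    return [line for line in text.splitlines() if SUSPICIOUS.search(line)]


if __name__ == "__main__":
    for line in suspicious_lines(DMESG_SAMPLE):
        print(line)
```

On the sample this returns two lines: the module signature message, which is the usual one for an unsigned out-of-tree module, and the thermal_zone7 read failure. Whether the latter is related to the throttling is unclear.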

 

I have posted this previously on the NVIDIA forums (https://forums.developer.nvidia.com/t/severe-throttling-on-thinkpad-t14-gen-1-with-geforce-mx330/177366), but haven't gotten any response or directions so far.

  • Message 2 of 2

Re:GPU throttling on Thinkpad T14 Gen1 with NVIDIA Geforce MX330

2021-05-23, 9:49 AM

Some more information on things I’ve been trying out but which haven’t helped so far
