Blue Screen Again
900Trophy
Posts: 4
Registered: ‎10-26-2011
Location: United States

Machine Check Exceptions with T520i (Linux)

I have a T520 (or T520i) which is great.

I just flashed to the latest BIOS (1.32) and it did not resolve my problem, which I've had since purchase.

mcelog (Machine Check Exception log) is showing stuff like this all the time:

 

CPU 1 THERMAL EVENT TSC a600a328282
TIME 1319660817 Wed Oct 26 15:26:57 2011
Processor 1 below trip temperature. Throttling disabled
STATUS c000000088220800 MCGSTATUS 0
MCGCAP c07 APICID 1 SOCKETID 0
CPUID Vendor Intel Family 6 Model 42
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
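
For anyone who wants to compare records across machines, here is a minimal sketch that pulls the CPU number, the epoch timestamp and the raw STATUS value out of a record like the one above. The function name `parse_mcelog_record` is made up for illustration, and the field layout is assumed only from the records quoted in this thread; it is not an official mcelog parser.

```python
import re
from datetime import datetime, timezone


def parse_mcelog_record(text):
    """Return CPU number, event time (UTC) and raw STATUS from one record.

    Hypothetical helper: the layout is inferred from the records pasted
    in this thread, not from any mcelog specification.
    """
    cpu = re.search(r"^CPU (\d+) ", text, re.MULTILINE)
    when = re.search(r"^TIME (\d+)", text, re.MULTILINE)
    status = re.search(r"^STATUS ([0-9a-fA-F]+)", text, re.MULTILINE)
    if not (cpu and when and status):
        raise ValueError("not a complete mcelog record")
    return {
        "cpu": int(cpu.group(1)),
        # TIME is seconds since the Unix epoch; convert it to UTC.
        "time_utc": datetime.fromtimestamp(int(when.group(1)), tz=timezone.utc),
        # STATUS is printed as hexadecimal without a 0x prefix.
        "status": int(status.group(1), 16),
    }


# The record from the first post, copied verbatim.
RECORD = """\
CPU 1 THERMAL EVENT TSC a600a328282
TIME 1319660817 Wed Oct 26 15:26:57 2011
Processor 1 below trip temperature. Throttling disabled
STATUS c000000088220800 MCGSTATUS 0
MCGCAP c07 APICID 1 SOCKETID 0
"""

if __name__ == "__main__":
    print(parse_mcelog_record(RECORD))
```

Working from the epoch value rather than the printed local time makes it easier to line up logs posted from different time zones.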

 

According to the kernel folks, this is a hardware problem, since the BIOS itself is tripping the MCE.

What's going on here?

 

Fanfold Paper
homeyclaus
Posts: 8
Registered: ‎10-11-2011
Location: VA, USA

Re: Machine Check Exceptions with T520i (Linux)

[ Edited ]

What distro, release, and kernel release are you running?

Did you change any of the defaults for fan management?

 

ETA: it could be that your fan has failed or is failing. I don't know about your BIOS setup, but there may be a page in the BIOS that shows the temperatures and fan speeds. When you get the machine check, reboot, look at that screen, and see whether the temperature falls while the system is under BIOS control; that would indicate a configuration issue or something being overridden in the thermal controls. If it's still a problem when you're at that page, you most definitely have a failing fan.

Blue Screen Again
900Trophy
Posts: 4
Registered: ‎10-26-2011
Location: United States

Re: Machine Check Exceptions with T520i (Linux)

I haven't changed defaults. openSUSE 11.4 + Tumbleweed (for x86_64), kernel 2.6.37 and 3.0+.

Currently on 3.0.7-45-desktop.
The kernel folks tell me that this is a hardware/BIOS issue and not related to the kernel, since MCE is generated /by/ the hardware (the embedded controller, if I recall properly).
Fanfold Paper
MTJ
Posts: 21
Registered: ‎09-28-2011
Location: Sweden

Re: Machine Check Exceptions with T520i (Linux)

[ Edited ]

I've got those on a T420s in Linux. They have been observed on all new Lenovo machines (Sandy Bridge based) in Win7 as well. It happens whenever the machine is put under some load and the CPUs warm up a bit (e.g. playing a simple 3D game like Supertuxkart). It doesn't matter whether the fan is controlled by the BIOS or by the user. It seems to be yet another case of poor programming from the Lenovo BIOS hackers.

 

Edit: Just to show a little piece from my own /var/log/mcelog

 

MCE 6
CPU 0 THERMAL EVENT TSC 1399f3fc7a
TIME 1319559793 Tue Oct 25 18:23:13 2011
Processor 0 below trip temperature. Throttling disabled
STATUS c000000088220800 MCGSTATUS 0
MCGCAP c07 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 42
Hardware event. This is not a software error.

 

At least there isn't a scaaaary warning to contact my hardware vendor. I'm on Kubuntu 11.10 and installed the mcelog tool just because there were cryptic mce events in dmesg.

 

/MTJ

Fanfold Paper
MTJ
Posts: 21
Registered: ‎09-28-2011
Location: Sweden

Re: Machine Check Exceptions with T520i (Linux)

In fact, when I just now installed a kernel module for controlling battery charging, and unplugged the AC cord to see if it worked, the same boring mce errors reared their heads without the CPUs being in any way taxed.

 

Unplugged at 06:50, all CPU instances got the mce. Used the computer very lightly, and at 07:13 the CPUs got mce-s again:

 

MCE 0
CPU 1 THERMAL EVENT TSC 2cebb307e55f
TIME 1319777430 Fri Oct 28 06:50:30 2011
Processor 1 below trip temperature. Throttling disabled
STATUS c0000000882a0c08 MCGSTATUS 0
MCGCAP c07 APICID 1 SOCKETID 0
CPUID Vendor Intel Family 6 Model 42
Hardware event. This is not a software error.

[...]

MCE 0
CPU 1 THERMAL EVENT TSC 3037eb5e9897
TIME 1319778777 Fri Oct 28 07:12:57 2011
Processor 1 below trip temperature. Throttling disabled
STATUS c0000000882d0c08 MCGSTATUS 0
MCGCAP c07 APICID 1 SOCKETID 0
CPUID Vendor Intel Family 6 Model 42
Hardware event. This is not a software error.

[...]

MCE 6
CPU 3 THERMAL EVENT TSC 3037eb755374
TIME 1319778777 Fri Oct 28 07:12:57 2011
Processor 3 below trip temperature. Throttling disabled
STATUS c0000000882c0808 MCGSTATUS 0
MCGCAP c07 APICID 3 SOCKETID 0
CPUID Vendor Intel Family 6 Model 42
Hardware event. This is not a software error.
MCE 7
CPU 0 THERMAL EVENT TSC 3037eb755b06
TIME 1319778777 Fri Oct 28 07:12:57 2011
Processor 0 below trip temperature. Throttling disabled
STATUS c0000000882c0808 MCGSTATUS 0
MCGCAP c07 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 42

 

Harmless, but irritating... The messages seem to say 'Hey, we CPUs are cool, so we won't be throttling our speeds...'. I can't interpret the wording otherwise.
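
One way to read the STATUS values in these records is to look at individual bits. The sketch below rests on two assumptions that are not stated anywhere in this thread: (1) that mcelog prints "below trip temperature" when bit 0 of STATUS is clear, and (2) that the kernel tags the kind of thermal interrupt in the top two bits (0 = core throttle, 1 = core power limit, 2 = package throttle, 3 = package power limit). Both should be checked against the mcelog and kernel sources for your versions before relying on them; the names `EVENT_TAGS` and `describe_status` are made up for illustration.

```python
# ASSUMPTION: meaning of the top two STATUS bits, as described in the
# lead-in. This mapping is not confirmed by anything in the thread.
EVENT_TAGS = {
    0: "core throttle",
    1: "core power limit",
    2: "package throttle",
    3: "package power limit",
}


def describe_status(status):
    """Split a 64-bit STATUS value into the two fields discussed above."""
    return {
        # ASSUMPTION: bit 0 set would mean "above trip temperature".
        "bit0_set": bool(status & 1),
        "event_tag": EVENT_TAGS[(status >> 62) & 0x3],
    }


# STATUS values copied from the records posted above.
SAMPLES = [
    0xC000000088220800,
    0xC0000000882A0C08,
    0xC0000000882D0C08,
    0xC0000000882C0808,
]

if __name__ == "__main__":
    for value in SAMPLES:
        print(f"{value:016x}", describe_status(value))
```

All four values have bit 0 clear and the top two bits set. If the assumed mapping is right, that would make them package power limit notifications rather than temperature trips, which would fit the "not throttling" wording.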

 

/MTJ

 

Paper Tape
andre_lenovo
Posts: 3
Registered: ‎10-31-2011
Location: Germany

Re: Machine Check Exceptions with T520i (Linux)

Same problem here, using Linux kernel 3.1.0: when the CPU is put under slight load, these error messages occur. According to "sensors", the temperature never rises above 65 °C (= 149 °F). The nasty part is that the CPU speed is limited to 800 MHz until the hardware decides to disable throttling...
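
To check whether the cores really are pinned at 800 MHz while the messages appear, the current frequency can be read from the cpufreq files in sysfs. This is a minimal sketch, assuming the usual `/sys/devices/system/cpu/cpuN/cpufreq/scaling_cur_freq` files (values in kHz) are present; the helper names and the 800 MHz threshold are illustrative, the threshold being taken from the figure reported above.

```python
from pathlib import Path


def current_freqs_mhz(sysfs_root="/sys/devices/system/cpu"):
    """Map each cpuN directory name to its current frequency in MHz."""
    freqs = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_cur_freq"
    for path in sorted(Path(sysfs_root).glob(pattern)):
        cpu = path.parent.parent.name  # e.g. "cpu0"
        freqs[cpu] = int(path.read_text().strip()) / 1000.0  # kHz -> MHz
    return freqs


def throttled_cpus(freqs, floor_mhz=800.0):
    """Return the CPUs running at or below floor_mhz (hypothetical cut-off)."""
    return [cpu for cpu, mhz in freqs.items() if mhz <= floor_mhz]


if __name__ == "__main__":
    freqs = current_freqs_mhz()
    print(freqs)
    print("at or below 800 MHz:", throttled_cpus(freqs))
```

Note that 800 MHz may also simply be the lowest idle frequency chosen by the governor, so the reading only says something about throttling if it is taken while the machine is under load.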

 

Some details about my hardware:

Bios Version: 1.32

Graphic Card: Intel, nvidia disabled

 

Has anyone contacted Lenovo support? What do they say about this issue?

Fanfold Paper
MTJ
Posts: 21
Registered: ‎09-28-2011
Location: Sweden

Re: Machine Check Exceptions with T520i (Linux)

I've set the option in BIOS to never throttle the CPUs, not even on battery, and I only use the 'ondemand' kernel module for speed-modulation. No daemon or module for throttling based on CPU-temp or power source. And certainly not because of some silly mce events ;-)

 

I got/get the mce-s both when only using the Intel integrated GPU, and now that I use the Optimus mode (running the Nvidia card for specific apps, otherwise switched off, automated by the https://launchpad.net/~mj-casalogic/+archive/ironhide/ version of GPU-control).

 

My T420s is using the 1.26 BIOS that came with the machine, and I won't upgrade until they fix the widely complained-about fan issue via a promised November BIOS release. Perhaps the mce-s will disappear with that, since both fan and mce are controlled from the EC part.

 

I'm not one for contacting support in any capacity unless something seriously disrupts normal usage. But you go ahead. It obviously is an issue if unwanted - and unnecessary - throttling occurs.

 

/MTJ

 

802.11n
Volker1
Posts: 334
Registered: ‎03-02-2010
Location: Dublin

Re: Machine Check Exceptions with T520i (Linux)

For the record, the W520 exhibits the same issues, plenty of MCEs in the log that seem to not have any impact on stability:

 

Oct 31 14:58:55 volker-laptop-two kernel: [57585.517378] CPU1: Package power limit notification (total events = 2)
Oct 31 14:58:55 volker-laptop-two kernel: [57585.517382] CPU3: Package power limit notification (total events = 2)
Oct 31 14:58:55 volker-laptop-two kernel: [57585.517386] CPU4: Package power limit notification (total events = 2)
Oct 31 14:58:55 volker-laptop-two kernel: [57585.517390] CPU6: Package power limit notification (total events = 2)
Oct 31 14:58:55 volker-laptop-two kernel: [57585.517394] CPU5: Package power limit notification (total events = 2)
Oct 31 14:58:55 volker-laptop-two kernel: [57585.517398] CPU7: Package power limit notification (total events = 2)
Oct 31 14:58:55 volker-laptop-two kernel: [57585.517402] CPU2: Package power limit notification (total events = 2)
Oct 31 14:58:55 volker-laptop-two kernel: [57585.517405] CPU0: Package power limit notification (total events = 2)
Oct 31 14:58:55 volker-laptop-two kernel: [57585.528410] CPU3: Package power limit normal
Oct 31 14:58:55 volker-laptop-two kernel: [57585.528414] CPU4: Package power limit normal
Oct 31 14:58:55 volker-laptop-two kernel: [57585.528416] CPU2: Package power limit normal
Oct 31 14:58:55 volker-laptop-two kernel: [57585.528419] CPU5: Package power limit normal
Oct 31 14:58:55 volker-laptop-two kernel: [57585.528422] CPU1: Package power limit normal
Oct 31 14:58:55 volker-laptop-two kernel: [57585.528425] CPU7: Package power limit normal
Oct 31 14:58:55 volker-laptop-two kernel: [57585.528428] CPU6: Package power limit normal
Oct 31 14:58:55 volker-laptop-two kernel: [57585.528430] CPU0: Package power limit normal
Oct 31 14:59:41 volker-laptop-two kernel: [57631.467155] [Hardware Error]: Machine check events logged
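
To see how long each power limit episode lasts, the kernel timestamps in lines like these can be paired up per CPU. This is a rough sketch that assumes the message wording shown above; the function name `limit_durations` is made up, and the regular expression would need adjusting if a kernel phrases the messages differently.

```python
import re

# Matches the "[seconds.micros] CPUn: Package power limit ..." lines above.
LINE = re.compile(
    r"\[\s*(\d+\.\d+)\] CPU(\d+): Package power limit (notification|normal)"
)


def limit_durations(lines):
    """Seconds from each 'notification' to the next 'normal', per CPU."""
    started = {}    # cpu -> timestamp of the pending notification
    durations = {}  # cpu -> list of episode lengths in seconds
    for line in lines:
        match = LINE.search(line)
        if not match:
            continue
        stamp = float(match.group(1))
        cpu = int(match.group(2))
        if match.group(3) == "notification":
            started[cpu] = stamp
        elif cpu in started:
            durations.setdefault(cpu, []).append(stamp - started.pop(cpu))
    return durations


# Two of the CPU1 lines from the log above, copied verbatim.
SAMPLE = [
    "Oct 31 14:58:55 volker-laptop-two kernel: [57585.517378] CPU1: Package power limit notification (total events = 2)",
    "Oct 31 14:58:55 volker-laptop-two kernel: [57585.528422] CPU1: Package power limit normal",
]

if __name__ == "__main__":
    for cpu, spans in limit_durations(SAMPLE).items():
        print(f"CPU{cpu}:", [f"{s * 1000:.1f} ms" for s in spans])
```

For the CPU1 pair above, the two timestamps are about 11 ms apart.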

Paper Tape
andre_lenovo
Posts: 3
Registered: ‎10-31-2011
Location: Germany

Re: Machine Check Exceptions with T520i (Linux)


MTJ wrote: "My T420s is using the 1.26 BIOS that came with the machine, and I won't upgrade until they fix the widely complained-about fan issue via a promised November BIOS release. Perhaps the mce-s will disappear with that, since both fan and mce are controlled from the EC part."


Today I tried the November BIOS. It does not fix the MCE issues!

Blue Screen Again
900Trophy
Posts: 4
Registered: ‎10-26-2011
Location: United States

Re: Machine Check Exceptions with T520i (Linux)

I'm on 1.33 (T520) and still get MCE.

These are unambiguously hardware/BIOS/EC related and not the fault of the O/S.

 

Lenovo - please look into and fix these issues.