10-07-2009 12:26 PM - edited 10-07-2009 12:27 PM
Trying to resolve the nature of a video corruption error, which is to say I'm trying to determine (if possible) if it is caused by software, drivers, the video board, the vram, the bus... or what?
The issue only shows up when running a very graphics-intensive OpenGL-based animation, and has not been seen to occur unless the app has been running constantly for at least 8 (and in most cases 24+) hours. It has been observed on four different A31p's, each configured the same as the others (BIOS/software/the works).
Initial suspect was the app, but the fact that the video corruption affects the entire screen (even the windows taskbar) rather than just the application windows made me think it might be otherwise.
Mind you, I'm not looking for help in the way of "try this...and if that doesn't work try this...etc." I need to pin down exactly what is wrong and an exact fix for it (if available) -- hopefully without having to try a dozen different things (each taking from 8 to 24+ hours to attempt!)
A rather odd additional clue appeared on one of the machines: after showing the corruption following a weekend run, it was left in that state for two more days. When restarted, the shell icon cache had been corrupted. I can't find any info on exactly how the icon cache is used; it may well be loaded from disk into some small segment of VRAM, so I guess that might explain that additional problem.
I have some screenshots that would probably be helpful, but don't see a way to attach them here. Can e-mail them to anyone interested, though.
10-07-2009 12:34 PM
bcoryell, welcome to the forum,
It would be helpful if you could post which ThinkPad and operating system you have. Even better, post the Type and Model Number, which will help members to help you. If it ends in CTO, please provide some details about the ThinkPad, especially which graphics it has. Please do NOT post your s/n.
If you want to post pictures, it's preferred that you host them externally and provide links in your post. Flickr is one possible host.
10-07-2009 12:45 PM
Sorry, most of that info was in there, but not highlighted...
As I said, all 4 machines are identical...
Windows XP SP2
BIOS and ECP versions a couple back from current, don't have my notes with me just now but can post later.
ATI Video driver version 8.133.2-050525a-024243C-IBM (shows as 188.8.131.5246 in Windows XP dev manager)
Started a test with 2 of the machines this morning after loading the latest BIOS/ECP/video updates...but I really need to isolate what causes and what fixes (if possible) the problem; they don't like the shotgun approach here.
10-07-2009 05:15 PM
Your machines have a GPU issue that is extremely common on A31p units, and comes in two flavours:
a) Detachment of the actual card from the motherboard. This is the lesser problem, and can be solved by re-flowing/re-balling the solder in the GPU area.
b) VRAM corruption. No known cure. Equals motherboard replacement.
You can run PC-Doctor tests and see whether you get the card or VRAM to fail them, which may or may not happen.
One fairly foolproof way to determine which issue you're dealing with is to apply pressure on the GPU while the machine is booting and see if this clears it. Given the location of the card on these units, you'll have to do it by inserting a couple of "post-it" sheets under the heatsink. If the picture clears, your GPU is detaching. If it doesn't after you've stuck in as many sheets as you can possibly fit, your VRAM, and with it your motherboard, are terminally ill.
Hope this helps.
Good luck and keep us posted.
10-08-2009 07:32 AM
I was thinking heat stress on either the GPU or VRAM (or both) due to the heavy graphics load, but here's the thing that seemed odd to me - in each case, a simple restart of Windows clears the problem up, and if I then restart the app doing the same heavy graphics load right away, it might run for days without showing the trouble. Hence I thought that didn't seem likely to be an overheating issue. Maybe intermittent failure in the VRAM? Does a Windows restart cause a reset of all video memory?
Aside from the test I'm running right now, with the latest BIOS, ECP, and Video updates, the machines have been running with BIOS v1.05 and ECP v1.01
10-08-2009 09:25 AM
I've never mentioned overheating...not in this case, anyway.
Here's the story: the solder under the GPU, which is not really cooled properly on the A31p, goes through a heating and cooling cycle every time the machine is turned on and off. Over the course of years (the youngest A31p will likely be six years old this year) it weakens, and the GPU starts detaching. If you can stress it out quickly enough, and it doesn't clear immediately, that's likely the problem.
What you're describing sounds more like a VRAM issue, though, or a driver-related one, if you're very lucky.
Try these and let us know if they've done anything for you:
10-08-2009 09:28 AM
Correction to earlier post, and an interesting turn of events...
All the machines that have shown this transient video corruption are NU4 models, not NU1. I've been running a stress test on an NU1 for almost a week with no problem, so it may be specific to the NU4.
10-08-2009 10:45 AM
They all share the same motherboard and GPU, so the model number difference is not an issue here.