On Sunday 24 June 2007, Tony Nelson wrote:
> At 12:47 AM -0700 6/24/07, Konstantin Svist wrote:
> >I think you're missing the point here.
> >ECC ram will not guard against failures, it will simply reduce the
> >probability of a failure. In other words, it just prolongs the inevitable.
> > ...
>
> No. ECC RAM does guard against RAM failures; that is exactly what it is
> for and what it does. Without ECC failures are undetected and produce bad
> data. ECC turns those into detected failures with good data. ECC prevents
> almost all RAM data errors, and allows detection of faulty RAM, allowing it
> to be replaced before total or unrecoverable failures, while preventing
> transient soft errors from accumulating as bad data.

I think you're overestimating what ECC can do. In the worst case, 3 or
more bit errors in the same word, standard ECC algorithms can actually
correct the wrong way. On large memory systems you unfortunately see that
every now and then: you get corrected ECC errors logged and your app still
crashes or returns bad data.

Anyway, if the application really is all that important, ECC is a good
idea; it will give you at least a logging ability for your memory
subsystem. However, in the end there is no way of getting around the fact
that you have to run your app twice and see if it returns the same data
both times.

Peter.
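
To make the "corrects the wrong way" case above concrete, here is a minimal sketch. It is not what any particular memory controller does: it assumes a toy extended Hamming(8,4) code as a stand-in for the wider SECDED (single error correct, double error detect) codes used on real ECC memory, and the names `encode`, `decode` and `flip` are made up for the illustration. One flipped bit is repaired, two are flagged, and three can be reported as "corrected" while the data comes back wrong.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming (SECDED) codeword.

    Index 0 holds the overall parity bit; indices 1..7 hold a Hamming(7,4)
    word with parity bits at positions 1, 2 and 4.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2
    return code


def flip(code, positions):
    """Return a copy of the codeword with the given bit positions inverted."""
    out = list(code)
    for pos in positions:
        out[pos] ^= 1
    return out


def decode(code):
    """Return (status, nibble); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    overall = sum(code) % 2  # 0 when the overall parity is consistent
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips: the decoder assumes exactly one and "fixes" it.
        # A syndrome of 0 here means it blames the overall parity bit itself.
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Even number of flips with a non-zero syndrome: double-bit error.
        return "uncorrectable", None
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble


if __name__ == "__main__":
    word = encode(10)
    print(decode(word))                   # ('ok', 10)
    print(decode(flip(word, [5])))        # ('corrected', 10)
    print(decode(flip(word, [3, 5])))     # ('uncorrectable', None)
    print(decode(flip(word, [3, 5, 6])))  # ('corrected', 13): logged as fixed, data wrong
```

The last case is the one that matters here: the log shows a corrected error, yet the value handed back to the application is 13 rather than 10. Nothing at the ECC level can tell the difference, which is why comparing the results of two independent runs remains the final check.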