[SunRescue] Logging memory errors
Paul Khoury
rescue at sunhelp.org
Tue Dec 5 02:17:24 CST 2000
On Mon, 04 Dec 2000 21:12:14 -0800, Paul Theodoropoulos wrote:
>That's controlled by your /etc/syslog.conf. Just specify that you
>want whatever was sent to console (should also be listed in your
>syslog.conf) to go to a specific logfile. I have it go to
>/var/adm/messages.
>
>I've been getting the following in my messages log on one of my
>e4500's for months now -
>
>Dec 4 20:40:31 e4500a unix: CPU0 CE Error: AFSR
>0x00000000.00100000 AFAR 0x00000000.7f755c10 UDBH MemMod Board 0
>J3800
>Dec 4 20:40:31 e4500a unix: Syndrome 0xf8 Size 3 Offset 0 UPA
>MID 0
>Dec 4 20:40:31 e4500a unix: Softerror: Persistent ECC Memory Error
>Dec 4 20:40:31 e4500a unix: Corrected MemMod Board 0 J3800
>Dec 4 20:40:31 e4500a unix: ECC Data Bit 11 was corrected
>
>Haven't had time to swap out the module. Just keeps running and
>running, doesn't bat an eyelash.
>
>I refuse to use anything but SPARC running Solaris for core
>infrastructure. Nothing is as reliable.
>
I agree. I was contemplating shutting down the machine and swapping in
known-good RAM, but why should I when the machine has been running for 68 days? =)
How are the memory errors handled, BTW? Does Solaris just map around them in real time?
I'm sure Linux would have a fit if it encountered that.
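For anyone who wants to see how often a given module turns up before deciding whether to pull it, here is a rough Python sketch that tallies the "Corrected MemMod" lines. It assumes the messages file uses exactly the format quoted above; the function name count_corrected and the sample list are made up for illustration, and other hardware or Solaris releases may word these lines differently.

```python
import re
from collections import Counter

# Pattern based on the "Corrected MemMod Board 0 J3800" line quoted above.
# Assumption: the real log uses this same wording on a single line.
CORRECTED = re.compile(r"Corrected\s+MemMod\s+Board\s+(\d+)\s+(J\d+)")


def count_corrected(lines):
    """Tally corrected ECC errors per (board number, module label) pair."""
    counts = Counter()
    for line in lines:
        match = CORRECTED.search(line)
        if match:
            counts[(int(match.group(1)), match.group(2))] += 1
    return counts


# Sample lines copied from the quoted log, unwrapped.
sample = [
    "Dec 4 20:40:31 e4500a unix: Softerror: Persistent ECC Memory Error",
    "Dec 4 20:40:31 e4500a unix: Corrected MemMod Board 0 J3800",
    "Dec 4 20:40:31 e4500a unix: ECC Data Bit 11 was corrected",
]

for (board, module), n in sorted(count_corrected(sample).items()):
    print(f"board {board} module {module}: {n} corrected error(s)")
```

To run it against a real log, pass an open file instead of the sample list, e.g. count_corrected(open("/var/adm/messages")), assuming that is where syslog.conf sends these messages.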
Paul