On Mon, May 23, 2005 at 04:48:29PM -0700, Peter J. Stieber wrote:

> PJS = Peter J. Stieber
> PJS>> I was under the impression the test kernel had
> PJS>> some type of debug messages in it someone
> PJS>> would be interested in. Was I wrong about that?
>
> DJ = Dave Jones
> DJ> No, you are correct. But nothing triggered with the latest builds.
>
> PJS>> I guess you're saying I should go ahead and update
> PJS>> and see what happens?
>
> DJ> There's a number of other fixes in there which may have
> DJ> caused the problem to go into hiding.. I can't reproduce
> DJ> it at all any more, and some others who were seeing it
> DJ> haven't seen it recently either.
>
> This morning I ran Memtest-86 v3.2 on the machine in question. I let it
> run for a little over 6 hours. It made 7 passes of the memory tests and
> had no errors.
>
> Next I updated the kernel to 2.6.11-1.27_FC3smp. The problem happened
> pretty quickly for me:
>
> May 23 16:34:36 maggie kernel: collect2:9582: mm/memory.c:98: bad pmd ffff8100190fe000(0000000000401b80).
> May 23 16:34:36 maggie kernel: collect2:9582: mm/memory.c:98: bad pmd ffff8100190fe008(000000000000000b).
> May 23 16:34:36 maggie kernel: collect2:9582: mm/memory.c:98: bad pmd ffff8100190fe010(0000000000000220).
> May 23 16:34:36 maggie kernel: collect2:9582: mm/memory.c:98: bad pmd ffff8100190fe018(000000000000000c).
> May 23 16:34:36 maggie kernel: collect2:9582: mm/memory.c:98: bad pmd ffff8100190fe020(0000000000000220).
> May 23 16:34:36 maggie kernel: collect2:9582: mm/memory.c:98: bad pmd ffff8100190fe028(000000000000000d).
> May 23 16:34:36 maggie kernel: collect2:9582: mm/memory.c:98: bad pmd ffff8100190fe030(00000000000001f7).
> May 23 16:34:36 maggie kernel: collect2:9582: mm/memory.c:98: bad pmd ffff8100190fe038(000000000000000e).
> May 23 16:34:36 maggie kernel: collect2:9582: mm/memory.c:98: bad pmd ffff8100190fe040(00000000000001f7).
> May 23 16:34:36 maggie kernel: collect2:9582: mm/memory.c:98: bad pmd ffff8100190fe048(0000000000000017).
> May 23 16:34:36 maggie kernel: collect2:9582: mm/memory.c:98: bad pmd ffff8100190fe058(000000000000000f).
> May 23 16:34:36 maggie kernel: collect2:9582: mm/memory.c:98: bad pmd ffff8100190fe060(00007ffffffff081).
> May 23 16:34:36 maggie kernel: collect2:9582: mm/memory.c:98: bad pmd ffff8100190fe080(0034365f36387800).
>
> The collect2 command is coming from the build sequence I described in
> earlier emails. If I try it a second time it runs successfully. I've
> seen sh cause it too.
>
> Note that 34365f363878 in the last line = 46_68x in ASCII, which is
> x86_64 in reverse.
>
> Is there anything else I should be looking for?
>
> I'd be willing to try any debug kernel to help find the problem.

Give the test kernel at http://people.redhat.com/davej/kernels/Fedora/
a shot (-28_FC3). That should give slightly different output.

		Dave
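
For anyone wanting to check the ASCII observation in the quoted message, here is a minimal Python sketch. It assumes the value printed in parentheses is a 64-bit word stored little-endian, as on x86_64. The helper name decode_word is hypothetical and not part of any kernel tool.

```python
def decode_word(value: int, byteorder: str = "little", width: int = 8) -> str:
    """Return the printable ASCII characters found in a 64-bit word.

    byteorder="little" gives the bytes in the order they sit in memory
    on x86_64; byteorder="big" gives them in the order the hex digits
    are printed in the log line.
    """
    raw = value.to_bytes(width, byteorder=byteorder)
    # Keep only printable ASCII; NUL padding bytes are dropped.
    return "".join(chr(b) for b in raw if 32 <= b < 127)


# Value from the last "bad pmd" line in the quoted log.
last_value = 0x0034365F36387800

print(decode_word(last_value, "big"))     # order as printed in the log
print(decode_word(last_value, "little"))  # order as stored in memory
```

Read in printed order the bytes spell 46_68x; read in memory order they spell x86_64 with a NUL byte on each side, which is consistent with string data rather than a page-table entry sitting at that address.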