Re: 2.6.22-rc6 bad page error

On Sat, 7 Jul 2007, Bob Tracy wrote:
> Michal Piotrowski wrote:
> > On 05/07/07, Bob Tracy <[email protected]> wrote:
> > > This happened on an Alpha computer (433au) running 2.6.22-rc6.
> > 
> > Can you reproduce this bug on 2.6.22-rc7-git6?

My guess is that you'll find the answer is yes.

> > 
> > >  The
> > > error was triggered when I started up a NX KDE session, which I've
> > > done successfully on all occasions prior to this one.  In other words,
> > > I wasn't doing anything out of the ordinary...  The machine is still up
> > > and running as I type this, and can't/won't be rebooted for a few hours
> > > if there's any other troubleshooting information I need to retrieve that
> > > might be helpful.
> 
> ACK on the suggestion to try -rc7-git6, but I'll probably wait until
> Monday to give it a try.  The client side of my NX "test fixture" is
> at my day job location :-).

Fair enough: and I'm expecting this to be a regression we introduced in
2.6.15, rather than recently in 2.6.22 (now, that's better isn't it ;-?)

Helpful bad page state messages from your original post:

>Bad page state in process 'gnome-terminal'
>page:fffffc0000a407f0 flags:0x000000000000000c mapping:0000000000000000 mapcount:1 count:1
>Bad page state in process 'syslogd'
>page:fffffc0000a40828 flags:0x000000000000000c mapping:0000000000000000 mapcount:1 count:1

And very helpful info from your followup:

> Same type of "stuff" happening with 2.6.22-rc7.  I've managed to narrow
> it down somewhat.  The page corruption happens only when I've got the sound
> driver loaded: might be a DMA kind of thing that's killing me.  Anyone got
> any ideas how I might narrow it down further?  The sound modules loaded
> include ALSA's "snd_es18xx" with all the various OSS-compatibility modules.

I think sound/isa/es18xx.c's
	snd_pcm_lib_preallocate_pages_for_all(pcm, SNDRV_DMA_TYPE_DEV,
will take it to sound/core/memalloc.c's
	res = dma_alloc_coherent(dev, PAGE_SIZE << pg, dma, gfp_flags);
where we've carefully included __GFP_COMP in gfp_flags to avoid this
kind of problem (replacing the pre-2.6.15 use of PageReserved).
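
To make the size/order arithmetic in that call concrete, here is a small
userspace sketch (not kernel code).  It assumes Alpha's 8KB pages; the
names model_get_order and MODEL_PAGE_SHIFT are made up for illustration,
and the loop only mirrors what I understand the generic get_order() to do.
Any result above 0 is a multi-page allocation, which is where __GFP_COMP
matters.

```c
#include <assert.h>

/* Assumption: 8KB pages, as on Alpha (page shift of 13). */
#define MODEL_PAGE_SHIFT 13
#define MODEL_PAGE_SIZE  (1UL << MODEL_PAGE_SHIFT)

/*
 * Hypothetical userspace model of get_order(): the smallest order such
 * that (MODEL_PAGE_SIZE << order) >= size.
 */
static int model_get_order(unsigned long size)
{
	int order = 0;

	assert(size > 0);
	/* Number of whole pages needed, minus one. */
	size = (size - 1) >> MODEL_PAGE_SHIFT;
	/* Count the doublings needed to cover that many pages. */
	while (size) {
		order++;
		size >>= 1;
	}
	return order;
}
```

So a request of MODEL_PAGE_SIZE << 3 bytes comes back as order 3, and
anything up to one page is order 0.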

> At the risk of making an inflammatory comment, I don't recall this kind of
> thing happening when I was using the equivalent OSS driver.  N.B.: the ALSA
> driver only recently (in the past few months) became usable on the Alpha,
> and I made the switch to ALSA as soon as it was feasible to do so.  I may
> try going back to the OSS driver (if it's still available) to see if the
> problem goes away.  That would at least confirm my suspicions as to where
> the problem lies.

No, don't blame ALSA for this.  Blame me or Nick for removing the
special PageReserved usage, or Alpha for ignoring our gfp_flags:

  #define dma_alloc_coherent(dev, size, addr, gfp)	\
		pci_alloc_consistent(alpha_gendev_to_pci(dev), size, addr)
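
The effect of that macro can be shown with a userspace model (again not
kernel code).  Everything prefixed model_ or MODEL_ is hypothetical, and
the flag values are arbitrary stand-ins rather than the real gfp bits.
The point is only that the fourth macro argument never reaches the
allocator, so a caller's __GFP_COMP is silently lost.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int gfp_t;

/* Arbitrary stand-in values, NOT the kernel's real flag bits. */
#define MODEL_GFP_ATOMIC 0x20u
#define MODEL_GFP_KERNEL 0xd0u
#define MODEL_GFP_COMP   0x4000u

static gfp_t model_last_gfp;		/* flags the allocator really used */
static char model_buffer[8192];		/* pretend DMA memory */

/* Models the unpatched pci_alloc_consistent(): gfp is hard-coded. */
static void *model_pci_alloc_consistent(void *pdev, size_t size,
					unsigned long *dma_addr)
{
	gfp_t gfp = MODEL_GFP_ATOMIC;

	(void)pdev;
	assert(size > 0 && size <= sizeof(model_buffer));
	model_last_gfp = gfp;
	*dma_addr = 0x1000UL;		/* fake bus address */
	return model_buffer;
}

/* Same shape as the Alpha macro: the gfp argument is simply dropped. */
#define model_dma_alloc_coherent(dev, size, addr, gfp)	\
	model_pci_alloc_consistent((dev), (size), (addr))
```

Calling model_dma_alloc_coherent() with MODEL_GFP_KERNEL | MODEL_GFP_COMP
leaves model_last_gfp equal to MODEL_GFP_ATOMIC, with no COMP bit set.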

When you get a chance, please would you try the patch below?  It's not
ideal: trying to satisfy a high-order allocation with GFP_ATOMIC is
difficult once you get beyond system startup, but apparently that part
worked for you before; and I don't think it'll do any harm - there were
reasons we couldn't just add __GFP_COMP behaviour everywhere, but I don't
think those could apply to the Alpha pci_alloc_consistent.

It would be a lot better if an Alpha (CONFIG_PCI=y) dma_alloc_coherent
were written to respect the gfp flags passed in (masking off __GFP_DMA
and __GFP_HIGHMEM? other arches seem to do that).  And I wonder if its
pci_alloc_consistent couldn't use GFP_KERNEL rather than GFP_ATOMIC;
other arches seem to manage that.  But I've no Alpha nor Alpha
experience, so I wouldn't dare to offer more than this one-liner.
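
Purely to illustrate the masking idea, and emphatically not as a patch,
here is a userspace sketch.  The function name and all MODEL_ values are
hypothetical stand-ins, not the kernel's real flag bits: the caller's
flags are kept (so a COMP bit survives) and only the zone modifiers are
stripped.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int gfp_t;

/* Arbitrary stand-in values, NOT the kernel's real flag bits. */
#define MODEL_GFP_DMA     0x01u
#define MODEL_GFP_HIGHMEM 0x02u
#define MODEL_GFP_WAIT    0x10u
#define MODEL_GFP_COMP    0x4000u

/*
 * Hypothetical helper: the flags a flag-respecting allocator would use.
 * Zone modifiers are masked off; everything else the caller asked for,
 * including the COMP bit, is passed through unchanged.
 */
static gfp_t model_effective_gfp(size_t size, gfp_t gfp)
{
	assert(size > 0);
	gfp &= ~(MODEL_GFP_DMA | MODEL_GFP_HIGHMEM);
	return gfp;
}
```

With that, a caller passing MODEL_GFP_WAIT | MODEL_GFP_COMP |
MODEL_GFP_DMA would end up allocating with MODEL_GFP_WAIT |
MODEL_GFP_COMP.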

Thanks for reporting: please let us all know whether this
patch does fix your problem: I may be guessing wrong.

Hugh

--- 2.6.22-rc7/arch/alpha/kernel/pci_iommu.c	2007-06-05 06:19:19.000000000 +0100
+++ linux/arch/alpha/kernel/pci_iommu.c	2007-07-07 15:00:04.000000000 +0100
@@ -402,7 +402,7 @@ pci_alloc_consistent(struct pci_dev *pde
 {
 	void *cpu_addr;
 	long order = get_order(size);
-	gfp_t gfp = GFP_ATOMIC;
+	gfp_t gfp = GFP_ATOMIC|__GFP_COMP;
 
 try_again:
 	cpu_addr = (void *)__get_free_pages(gfp, order);