Re: + proc-dont-lock-task_structs-indefinitely-cpuset-fix-2.patch added to -mm tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



John, Jack,

  Adding you to this one.  We've seen this before, and
  I wasted Andrew's and others time chasing it again.

  I speculate at the bottom of this message that I
  should add a panic on the kzalloc that fails if one
  has 1024 CPUS, SPINLOCK debug, and just slightly larger
  data structures with a new patch than we had before.

  Feel free to comment on whether such a panic, or other
  remedy would be desirable or not.

===

Andrew wrote:
> OK.   This is awful.   I cannot see it.

Crap - I should have recognized this problem a day ago.

I wasted your time.  Sorry.

The extra data pushed the size of our (big SN2) sysfs_cpus[] array
past the point that it could be kzalloc'd.

The initial failure is in the file:

    arch/ia64/kernel/topology.c

function:

    topology_init

line:

    sysfs_cpus = kzalloc(sizeof(struct ia64_cpu) * NR_CPUS, GFP_KERNEL);

With our large NR_CPUS of 1024, and the additional cost of
the CONFIG_DEBUG_SPINLOCK* debug stuff, and the little bit of
additional data added by this patch, that kzalloc() fails.

The final collapse occurs in the file:

    fs/sysfs/group.c

function:

    sysfs_create_group

line:

    BUG_ON(!kobj || !kobj->dentry);

where kobj->dentry points to 0x58.

The offset of dentry in struct kobject is 0x50, and the offset of that
kobj in the containing struct sys_dev is another 0x8 bytes, resulting
in the failed reference:

    Unable to handle kernel NULL pointer dereference (address 0000000000000058)

The drivers/base/cpu.c array:

    static struct sys_device *cpu_sys_devices[NR_CPUS];

is never filled in, as a result of the above kzalloc() failure, causing
the routine get_cpu_sysdev() in drivers/base/cpu.c to return a NULL
pointer (unknowingly).

I should stare at the code between this point of initial failure and
the point that the house of cards finally collapsed and see if
something should have squeaked sooner.

Though I'm not a guru in this code, so I'm saying a little prayer that
someone else will have a more useful suggestion.

I've added a couple of SGI wizards in this area to the cc list.

I suspect that the short term solution is to proceed without
prejudice to the patch that triggered this:

  gregkh-driver-allow-sysfs-attribute-files-to-be-pollable.patch

while I look at some way, if just a stop gap measure, to complain
earlier in the boot, closer to the scene of the original crime,
so that others hitting this won't waste more time.

Perhaps failing that first kzalloc should cause a complaint,
if not a panic.  It would seem that the system is beyond repair
if that kzalloc fails.  And since the system hasn't even finished
booting yet, and is for sure trying to boot some larger than tried
before configuration, might just as well announce ones death boldly.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <[email protected]> 1.925.600.0401
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux