Re: [git pull] scheduler updates for v2.6.24

Ingo Molnar wrote:
> * Andrew Morton <[email protected]> wrote:
> 
>> On Mon, 15 Oct 2007 16:17:23 +0200
>> Ingo Molnar <[email protected]> wrote:
>>
>>> Linus, please pull the latest scheduler git tree from:
>>>
>>>    git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched.git
>> Did Paul Jackson's crash get fixed?
> 
> yes - that crash was a showstopper that was holding up the pull request 
> for 2 days. Paul bisected it down to the culprit and the fix was to do 
> this in wake_up_new_task():
> 
> -       if (!p->sched_class->task_new || !current->se.on_rq) {
> +       if (!p->sched_class->task_new || !current->se.on_rq || !rq->cfs.curr) {
> 
> (during early bootup the cfs_rq has no curr pointer yet.) It's not clear 
> why this race did not trigger earlier. (and the two checks can probably 
> be consolidated into a single "!rq->cfs.curr" condition.)
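
For anyone following the quoted patch, the guard can be modelled in user space. This is only a sketch: the struct layouts are hypothetical, heavily reduced stand-ins for the kernel's, `cur` stands in for the kernel's `current`, and must_skip_task_new() and demo_task_new() are names made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for the kernel structures named in the
 * patch. Only the fields that the patched condition reads are modelled. */
struct rq;
struct task_struct;

struct sched_entity {
    int on_rq;                       /* non-zero while queued on a runqueue */
};

struct sched_class {
    void (*task_new)(struct rq *rq, struct task_struct *p);
};

struct task_struct {
    const struct sched_class *sched_class;
    struct sched_entity se;
};

struct cfs_rq {
    struct sched_entity *curr;       /* NULL during early bootup */
};

struct rq {
    struct cfs_rq cfs;
};

/* Illustrative no-op hook so that a class with task_new set can be built. */
void demo_task_new(struct rq *rq, struct task_struct *p)
{
    (void)rq;
    (void)p;
}

/* Mirrors the patched condition: returns true when the class-specific
 * task_new() hook must not be called for the new task p. */
bool must_skip_task_new(const struct task_struct *p,
                        const struct task_struct *cur,
                        const struct rq *rq)
{
    return !p->sched_class->task_new   /* class provides no hook       */
        || !cur->se.on_rq              /* parent is not on a runqueue  */
        || !rq->cfs.curr;              /* cfs_rq has no curr yet       */
}
```

With rq->cfs.curr set to NULL the function returns true regardless of the other two fields, which is the early-bootup case the patch adds.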

This may not be related, but my box now dies after this merge.

When the box is mostly idle I get maybe 6 hours of uptime; when I do some work (compiling etc.) it freezes at random.

I was finally able to capture the Oops:

...

[15692.917111] BUG: unable to handle kernel NULL pointer dereference at virtual address 00000044
[15692.917159]  printing eip:
[15692.917174] c0111f90
[15692.917185] *pde = 00000000
[15692.917200] Oops: 0000 [#1]
[15692.917208] PREEMPT SMP
[15692.917240] Modules linked in: fuse netconsole configfs pc87360 hwmon_vid eeprom adm1021 uhci_hcd sr_mod shpchp pci_hotplug ohci_hcd iTCO_wdt iTCO_vendor_support intel_agp i82860_edac i2c_i801 ehci_hcd usbcore edac_core cdrom agpgart 3c59x mii ext4dev jbd2 capability commoncap loop lp parport_pc parport evdev
[15692.917623] CPU:    0
[15692.917625] EIP:    0060:[<c0111f90>]    Not tainted VLI
[15692.917629] EFLAGS: 00010046   (2.6.23-g65a6ec0d #330)
[15692.917661] EIP is at pick_next_task_fair+0x1f/0x2d
[15692.917672] eax: c150a7b8   ebx: 00000000   ecx: 00000000   edx: 00000000
[15692.917689] esi: c1507a48   edi: 00000000   ebp: 00eaaf7a   esp: cb1fdf14
[15692.917701] ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: 0068
[15692.917715] Process sed (pid: 28999, ti=cb1fc000 task=cfdc3500 task.ti=cb1fc000)
[15692.917725] Stack: c02f8268 c02ef7b5 00000002 cb1fdf58 cb1fdf50 00000000 c0400f38 c0403780
[15692.917833]        cfdc3500 cfdc3634 c150a780 00000000 c011a8e7 00000000 c1077aa0 000000ff
[15692.917942]        00000000 00000000 00000000 cb1fdf8c 00000010 cfdc3500 cb1fdf8c c011ace5
[15692.918048] Call Trace:
[15692.918072]  [<c02ef7b5>] schedule+0x321/0x58f
[15692.918109]  [<c011a8e7>] do_exit+0x293/0x6c6
[15692.918143]  [<c011ace5>] do_exit+0x691/0x6c6
[15692.918169]  [<c011ad87>] sys_exit_group+0x0/0xd
[15692.918195]  [<c01026e6>] sysenter_past_esp+0x5f/0x85
[15692.918232]  =======================
[15692.918244] Code: 8b 53 28 89 43 34 89 53 38 5b 5e c3 53 31 d2 83 78 40 00 74 20 83 c0 38 8b 50 20 31 db 85 d2 74 0a 8d 5a f8 89 da e8 a9 ff ff ff <8b> 43 44 85 c0 75 e6 8d 53 d0 89 d0 5b c3 57 56 53 89 c6 89 d7
[15692.918981] EIP: [<c0111f90>] pick_next_task_fair+0x1f/0x2d SS:ESP 0068:cb1fdf14

...
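
A note on reading the Oops above: the byte marked <8b> in the Code: line is where EIP points. On i386 the bytes 8b 43 44 encode a load of the form mov 0x44(%ebx),%eax, and the register dump shows ebx = 00000000, which matches the fault address 00000044. The sketch below pulls the marked bytes out of a Code: line and decodes only this one instruction form. The function names are hypothetical and this is not a general disassembler.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Copies up to max bytes, starting at the <xx>-marked byte of an Oops
 * "Code:" line, into out. Returns the number of bytes copied, or 0 when
 * the line carries no '<' marker. */
size_t faulting_bytes(const char *code_line, unsigned char *out, size_t max)
{
    const char *p = strchr(code_line, '<');
    size_t n = 0;

    if (!p)
        return 0;
    p++;                                  /* step over '<' */
    while (n < max && *p) {
        char *end;
        unsigned long v = strtoul(p, &end, 16);

        if (end == p)                     /* no hex digits left */
            break;
        out[n++] = (unsigned char)v;
        p = end;
        while (*p == '>' || *p == ' ')    /* step over '>' and separators */
            p++;
    }
    return n;
}

/* Decodes "mov r32, [r32 + disp8]": opcode 0x8b, ModRM with mod == 01 and
 * no SIB byte. Stores the destination register number, the base register
 * number and the signed displacement. Returns 1 on success, 0 when the
 * bytes are any other instruction form. */
int decode_mov_disp8(const unsigned char *b, size_t n,
                     unsigned *reg, unsigned *rm, int *disp)
{
    if (n < 3 || b[0] != 0x8b)
        return 0;
    if ((b[1] >> 6) != 1)                 /* mod must be 01 (disp8) */
        return 0;
    if ((b[1] & 7) == 4)                  /* rm == 100 means a SIB byte follows */
        return 0;
    *reg = (b[1] >> 3) & 7;               /* 0 = eax, 1 = ecx, 2 = edx, 3 = ebx */
    *rm = b[1] & 7;
    *disp = (int)(signed char)b[2];
    return 1;
}
```

Fed the Code: line from this Oops, the decoder reports destination register 0 (eax), base register 3 (ebx) and displacement 0x44, i.e. a read through a NULL base pointer at offset 0x44.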

After that the box is dead and needs a hard reset.

Interestingly, when I compile the kernel with debugging enabled I don't hit this (or maybe it just takes longer to trigger?).

Config, lspci, dmesg, hardware specs, the Oops message, and the top output from when it Oopsed are here:


http://194.231.229.228/lara/

> 
> 	Ingo

Regards,

Gabriel