Paul Jackson wrote:
>
> I share your preference for not nesting these semaphores.
>
> The other choice I am aware of would be for the hotplug code to be less
> cpuset-friendly. In the move_task_off_dead_cpu() code, at the point it
> says "No more Mr. Nice Guy", instead of looking for the nearest
> enclosing cpuset that has something online, which is what
> cpuset_cpus_allowed() does, we could just take any damn cpu that
> was online.
>
> Something along the lines of the following fix:
>
> --- pj/kernel.old/sched.c 2005-05-11 13:00:17.000000000 -0700
> +++ pj/kernel.new/sched.c 2005-05-11 13:02:24.000000000 -0700
> @@ -4229,7 +4229,7 @@ static void move_task_off_dead_cpu(int d
>
>  	/* No more Mr. Nice Guy. */
>  	if (dest_cpu == NR_CPUS) {
> -		tsk->cpus_allowed = cpuset_cpus_allowed(tsk);
> +		tsk->cpus_allowed = cpu_online_map;
>  		dest_cpu = any_online_cpu(tsk->cpus_allowed);
Well, CPU_MASK_ALL rather than cpu_online_map, I would think. That was
the behavior before the cpuset merge, anyway. It might be the best
short-term solution; more below...
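For concreteness, that substitution would turn the hunk above into
something like this (a sketch only, not tested):

	/* No more Mr. Nice Guy. */
	if (dest_cpu == NR_CPUS) {
		tsk->cpus_allowed = CPU_MASK_ALL;
		dest_cpu = any_online_cpu(tsk->cpus_allowed);
		/* ... rest of the block unchanged ... */
	}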
> So what we'd really like to do would be to first fall back to all the
> cpus allowed in the specified task's cpuset (no walking the cpuset
> hierarchy), and see if any of those cpus are still online to receive
> this orphan task. Unless someone has botched the system configuration
> and taken offline all the cpus in a cpuset, this should yield up a cpu
> that is still both allowed and online. If that fails, then to heck with
> honoring cpuset placement - just take the first online cpu we can find.
>
> This is doable without holding cpuset_sem. We can look at the current
> task's cpuset without cpuset_sem, just with the task lock.
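For concreteness, that two-stage fallback might look roughly like the
sketch below. It assumes a hypothetical helper exported from
kernel/cpuset.c, here called cpuset_task_cpus_allowed_nowalk(), that
returns the task's own cpuset's cpus_allowed without walking up the
hierarchy and without taking cpuset_sem (untested):

	/* No more Mr. Nice Guy. */
	if (dest_cpu == NR_CPUS) {
		/*
		 * First try the cpus allowed in the task's own cpuset,
		 * with no cpuset hierarchy walk and no cpuset_sem.
		 * cpuset_task_cpus_allowed_nowalk() is hypothetical.
		 */
		tsk->cpus_allowed = cpuset_task_cpus_allowed_nowalk(tsk);
		dest_cpu = any_online_cpu(tsk->cpus_allowed);

		/* Every cpu in the cpuset is offline: take anything. */
		if (dest_cpu == NR_CPUS) {
			tsk->cpus_allowed = CPU_MASK_ALL;
			dest_cpu = any_online_cpu(tsk->cpus_allowed);
		}
		/* ... rest of the block unchanged ... */
	}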
Yes, looking only at the task's own cpuset seems doable, but your patch
doesn't lock the task itself (unless I'm misreading patches again).
However, have a look at the comment above task_lock() in sched.h:
/*
 * Protects ->fs, ->files, ->mm, ->ptrace, ->group_info, ->comm, keyring
 * subscriptions and synchronises with wait4(). Also used in procfs.
 *
 * Nests both inside and outside of read_lock(&tasklist_lock).
 * It must not be nested with write_lock_irq(&tasklist_lock),
 * neither inside nor outside.
 */
static inline void task_lock(struct task_struct *p)
{
	spin_lock(&p->alloc_lock);
}
Unfortunately, move_task_off_dead_cpu() is called from migrate_live_tasks()
while the latter holds write_lock_irq(&tasklist_lock). So we can't use
task_lock() in this context, assuming the comment is valid. Right?
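For reference, the caller looks roughly like this (paraphrased from
current kernel/sched.c, so details may differ), with the write lock held
across the whole thread walk:

static void migrate_live_tasks(int src_cpu)
{
	struct task_struct *tsk, *t;

	write_lock_irq(&tasklist_lock);

	do_each_thread(t, tsk) {
		if (tsk == current)
			continue;

		/* move_task_off_dead_cpu() runs entirely under the write lock */
		if (task_cpu(tsk) == src_cpu)
			move_task_off_dead_cpu(src_cpu, tsk);
	} while_each_thread(t, tsk);

	write_unlock_irq(&tasklist_lock);
}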
Nathan