Drew Moseley (on Wed, 17 May 2006 11:34:57 -0700) wrote:
>On Tuesday 16 May 2006 9:56 pm, Keith Owens wrote:
>> Drew Moseley (on Tue, 16 May 2006 19:43:05 -0700) wrote:
>...
>> >
>> >I modified the i386/kernel/smp.c file to be functionally equivalent to the
>> >x86_64/kernel/smp.c file to address this issue.
>> >
>...
>>
>> NAK this patch. There is a real potential deadlock if you call
>> smp_call_function() with interrupts disabled, it has caused kernel
>> hangs in the past. This patch removes the check for the potential
>> deadlock and will allow bad code to creep back into the kernel. See
>> http://lkml.org/lkml/2004/5/2/116. I find it quite disturbing that
>> x86_64 has removed that check :(
>
>So presumably the warn_on() call should be added back to the x86_64 arch?
Yes. smp_call_function() in i386 has to stay the way it is, and x86_64
has to revert to be the same as i386 and all the other architectures,
including the WARN_ON(irqs_disabled()).
That still leaves the panic and sysrq cases, and anywhere else that
needs to do an emergency stop or reboot of the system from a context
where interrupts are disabled. I do not like the idea of reenabling
interrupts just to suppress a warning from smp_call_function(). By
definition, this is an unusual state and reenabling interrupts runs
significant risks of not doing what the caller wants.
The cleanest solution is the way that alpha, arm and ia64 do it.
smp_send_stop() sends IPI_CPU_STOP to all cpus and does not go through
smp_call_function(). That avoids all the problems of calling
smp_send_stop() in an emergency. If it works for ia64 with 1,024 cpus
in a single system image then it must be good :)
Alternatively smp_send_stop() could pass a special wait value into
smp_call_function() to say that interrupts being disabled is expected.
smp_call_function() would then skip its lock and it would skip waiting
for all the other cpus to respond, or at least timeout after a short
period and continue. That would avoid the deadlock. Like this (i386,
not even compiled, let alone tested).
/* wait == 0, do not wait for the function to complete on other cpus. wait ==
* 1, wait. wait < 0, do not take locks and do not wait forever, not irqsafe.
*/
int smp_call_function (void (*func) (void *info), void *info, int nonatomic,
int wait)
{
struct call_data_struct data;
int cpus, timeout, irqsafe = wait >= 0;
cpus = num_online_cpus() - 1;
timeout = cpus + 1;
if (irqsafe) {
/* Holding any lock stops cpus from going down. */
spin_lock(&call_lock);
if (!cpus) {
spin_unlock(&call_lock);
return 0;
}
/* Can deadlock when called with interrupts disabled */
WARN_ON(irqs_disabled());
} else {
wait = 0;
}
data.func = func;
data.info = info;
atomic_set(&data.started, 0);
data.wait = wait;
if (wait)
atomic_set(&data.finished, 0);
call_data = &data;
mb();
/* Send a message to all other CPUs and wait for them to respond */
send_IPI_allbutself(CALL_FUNCTION_VECTOR);
/* Wait for response */
while (atomic_read(&data.started) != cpus) {
cpu_relax();
if (!irqsafe) {
mdelay(1);
if (timeout-- == 0)
break;
}
}
if (wait)
while (atomic_read(&data.finished) != cpus)
cpu_relax();
if (irqsafe)
spin_unlock(&call_lock);
return 0;
}
void smp_send_stop(void)
{
smp_call_function(stop_this_cpu, NULL, 1, irqs_disabled() ? -1 : 0);
local_irq_disable();
disable_local_APIC();
local_irq_enable();
}
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]