Re: 2.6.16.18 kernel freezes while pppd is exiting

On Thu, 2006-06-08 at 14:09 -0400, Chuck Ebbert wrote:
> Very infrequently I get kernel freezes while pppd is exiting.

> [1410445.728958] Pid: 887, comm:             sendmail
> [1410445.743307] EIP: 0060:[<c03b29f8>] CPU: 1
> [1410445.755837] EIP is at lock_kernel+0x18/0x30
...
> [1410462.415500] Pid: 22020, comm:                 pppd
> [1410462.430365] EIP: 0060:[<c015eaae>] CPU: 0
> [1410462.442913] EIP is at kfree+0x4e/0x70
...
> pppd seems to be looping here while holding the BKL:
> 
> static void tty_buffer_free_all(struct tty_struct *tty)
> {
>         struct tty_buffer *thead;
>         while((thead = tty->buf.head) != NULL) {
>                 tty->buf.head = thead->next;
>                 kfree(thead);
>         }
>         while((thead = tty->buf.free) != NULL) {
>                 tty->buf.free = thead->next;
> ====>           kfree(thead);
>         }
>         tty->buf.tail = NULL;
> }
> 
> I did alt-sysrq-p over and over and all I got was basically these two
> traces -- CPU 1 in lock_kernel() and CPU 0 in kfree().

It looks like the free list is corrupt.

In drivers/char/tty_io.c, flush_to_ldisc processes
buffers and frees them:

static void flush_to_ldisc(void *private_)
{
...
	spin_lock_irqsave(&tty->buf.lock, flags);
	while((tbuf = tty->buf.head) != NULL) {
		while ((count = tbuf->commit - tbuf->read) != 0) {
			char_buf = tbuf->char_buf_ptr + tbuf->read;
			flag_buf = tbuf->flag_buf_ptr + tbuf->read;
			tbuf->read += count;
			spin_unlock_irqrestore(&tty->buf.lock, flags);
			disc->receive_buf(tty, char_buf, flag_buf, count);
			spin_lock_irqsave(&tty->buf.lock, flags);
		}
		if (tbuf->active)
			break;
		tty->buf.head = tbuf->next;
		if (tty->buf.head == NULL)
			tty->buf.tail = NULL;
		tty_buffer_free(tty, tbuf);
	}
	spin_unlock_irqrestore(&tty->buf.lock, flags);
...
}

If two copies of flush_to_ldisc run simultaneously on different
CPUs, the free list can be corrupted. One copy reads tbuf from the
list head, then drops the list lock to pass tbuf to disc->receive_buf.
While it is in receive_buf, the other copy can take the lock and get a
pointer to the same buf from tty->buf.head. Both end up freeing the
same buf, corrupting the list.
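
To illustrate why a double free would match the kfree() loop Chuck
saw, here is a minimal user-space sketch. It assumes the free path
pushes the buffer onto the head of a singly linked free list; struct
buf, free_list_push and list_terminates are hypothetical names for
illustration, not kernel code.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct tty_buffer; only the link matters here. */
struct buf {
	struct buf *next;
};

/* Push b onto the head of a singly linked free list (assumed behaviour). */
static void free_list_push(struct buf **free_list, struct buf *b)
{
	b->next = *free_list;
	*free_list = b;
}

/*
 * Walk the list for at most limit steps.
 * Returns 1 if the end (NULL) is reached, 0 if the walk is still going.
 */
static int list_terminates(const struct buf *head, int limit)
{
	while (head != NULL && limit-- > 0)
		head = head->next;
	return head == NULL;
}
```

Pushing the same buf twice in a row makes its next pointer refer to
itself, so a loop like the one in tty_buffer_free_all would keep
handing the same pointer to kfree() and never reach NULL.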

The following should correct the double free by forcing a re-read of
the list head after passing tbuf to receive_buf, so a buf is only
freed when it has just been read from the head under the lock. I'm
posting now for quick feedback (hi Alan). I'm going to implement and
test this before posting a patch (possibly tomorrow).

	spin_lock_irqsave(&tty->buf.lock, flags);
	while((tbuf = tty->buf.head) != NULL) {
		if ((count = tbuf->commit - tbuf->read) == 0) {
			if (tbuf->active)
				break;
			tty->buf.head = tbuf->next;
			if (tty->buf.head == NULL)
				tty->buf.tail = NULL;
			tty_buffer_free(tty, tbuf);
			continue;
		}
		while ((count = tbuf->commit - tbuf->read) != 0) {
			char_buf = tbuf->char_buf_ptr + tbuf->read;
			flag_buf = tbuf->flag_buf_ptr + tbuf->read;
			tbuf->read += count;
			spin_unlock_irqrestore(&tty->buf.lock, flags);
			disc->receive_buf(tty, char_buf, flag_buf, count);
			spin_lock_irqsave(&tty->buf.lock, flags);
		}
	}
	spin_unlock_irqrestore(&tty->buf.lock, flags);
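
The difference between the two loop shapes can be sketched in a
single-threaded user-space model. This is only an illustration under
stated assumptions: the second CPU is simulated by a callback that
runs a complete flush at the point where the lock would be dropped,
the active flag, the tail pointer and the real spinlock are left out,
and all names (struct buf, struct queue, flush_old, flush_new, ...)
are hypothetical.

```c
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct buf {
	struct buf *next;
	int read, commit;	/* data between read and commit is pending */
	int frees;		/* how many times the buffer was "freed" */
};

struct queue {
	struct buf *head;
};

/* Stands in for receive_buf(); runs where the lock would be dropped. */
typedef void (*receive_fn)(struct queue *q);

static void unlink_and_free(struct queue *q, struct buf *b)
{
	q->head = b->next;
	b->frees++;
}

/* Original loop shape: b is reused after the "unlocked" call. */
static void flush_old(struct queue *q, receive_fn receive)
{
	struct buf *b;

	while ((b = q->head) != NULL) {
		while (b->commit - b->read != 0) {
			b->read = b->commit;
			if (receive)
				receive(q);	/* lock would be dropped here */
		}
		unlink_and_free(q, b);	/* b may be stale by now */
	}
}

/* Proposed loop shape: only an empty buf just read from head is freed. */
static void flush_new(struct queue *q, receive_fn receive)
{
	struct buf *b;

	while ((b = q->head) != NULL) {
		if (b->commit - b->read == 0) {
			unlink_and_free(q, b);
			continue;
		}
		while (b->commit - b->read != 0) {
			b->read = b->commit;
			if (receive)
				receive(q);	/* lock would be dropped here */
		}
	}
}

/* Simulate the second CPU: a complete flush while the first is "unlocked". */
static void second_flush_old(struct queue *q) { flush_old(q, NULL); }
static void second_flush_new(struct queue *q) { flush_new(q, NULL); }
```

With a single buf holding pending data, flush_old(&q, second_flush_old)
counts two frees for that buf, while flush_new(&q, second_flush_new)
counts one. The model says nothing about other races in the real
code; it only shows the effect of re-reading the head.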

