Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Dan Bonachea <[email protected]> wrote:
>
> Hi - I believe I've discovered a thread-safety bug in the Linux 2.6.x kernel 
>  implementation of write(2).
> 
>  The small C program below the problem - in a nutshell, if multiple threads 
>  write() to STDOUT_FILENO, and stdout has been redirected to a file, then some 
>  of the output lines get lost. The actual result is non-deterministic (even in 
>  a "correct" run) - however the expected correct behavior is 10 lines of output 
>  (in some non-deterministic order). However, the test program reproducibly 
>  generates some lost output (less than 10 lines of total output) on every run 
>  where the output is redirected to a new file. This appears to be a violation 
>  of the POSIX spec (POSIX 1003.1-2001:2.9.1 requires thread-safety of write()). 
> 
> 
>  The problem does not appear to occur if output goes to the console, or is 
>  redirected to append to an existing file, only when stdout is redirected to a 
>  new file.

This comes up occasionally.  Is very simple:

asmlinkage ssize_t sys_write(unsigned int fd, const char __user * buf, size_t count)
{
	struct file *file;
	ssize_t ret = -EBADF;
	int fput_needed;

	file = fget_light(fd, &fput_needed);
	if (file) {
		loff_t pos = file_pos_read(file);
		ret = vfs_write(file, buf, count, &pos);
		file_pos_write(file, pos);
		fput_light(file, fput_needed);
	}

	return ret;
}

we can have multiple threads in vfs_write() using the same `pos'.

Locking for file.f_pos is generally file->f_dentry->d_inode->i_mutex.  We
could use that if we were to restructure the code a lot.  Or we could add a
new lock to `struct file'.

Or we could do nothing, because a) the application is going to produce
inderterminate output anyway and b) because it only affects silly testcases
and not real-world apps.

OK, there _might_ be a real-world case: threads appending logging
information to a flat file.  Trivially workable-around with a userspace
lock, or by switching to stdio (same thing).

Yes, really we should fix it.  But it's not worth adding more overhead to
do so.  So the fix would involve widespread (but simple) change, to draw
that f_pos update inside i_mutex.

(We could pseudo-fix it by updating f_pos _before_ entering vfs_write(), too).
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux