At 09:46 PM 4/12/2006, Andrew Morton wrote:
Locking for file.f_pos is generally file->f_dentry->d_inode->i_mutex. We
could use that if we were to restructure the code a lot. Or we could add a
new lock to `struct file'.
Or we could do nothing, because a) the application is going to produce
inderterminate output anyway and b) because it only affects silly testcases
and not real-world apps.
OK, there _might_ be a real-world case: threads appending logging
information to a flat file. Trivially workable-around with a userspace
lock, or by switching to stdio (same thing).
Yes, really we should fix it. But it's not worth adding more overhead to
do so. So the fix would involve widespread (but simple) change, to draw
that f_pos update inside i_mutex.
Hi Andrew - thanks for the detailed response.
I don't know enough about the kernel implementation to comment on your
proposed fixes.
However, I should clarify that this problem definitely affects more than just
"silly testcases", and the fact that a program generates non-deterministically
ordered output does not necessarily make it erroneous, invalid or unuseful.
This problem arose in the parallel runtime system for a scientific language
compiler (nearly a million lines of code total - definitely a "real-world"
program) - the example code is merely a pared-down demonstration of the
problem. In parallel scientific computing, it's very common for many threads
to be writing to stdout (usually for monitoring purposes) and it's expected
and normal for output from separate threads to be arbitrarily interleaved, but
it's *not* ok for output to be lost entirely. This is essentially equivalent
to the real-world example you gave of many threads logging to a file.
We've worked around the problem in Linux 2.6 by adding locking at user-level
around our writes, as you suggest, although this of course penalizes our
performance on kernels that already correctly implement the thread-safety
required by the POSIX spec. In any case it seemed like a problem that we
should report, to be good open-source citizens - especially given that it
appears to be a regression with respect to the Linux 2.4 kernel. How you
choose to handle the report is of course your decision.
Thanks for your time.
Dan
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]