On Tue, 2006-10-17 at 18:35 -0400, Dave Jones wrote:
> On Wed, Oct 18, 2006 at 12:05:14AM +0200, Alfredo Ferrari wrote:
> > Seriously, I believe this is a big issue. Let me summarize:
> >
> > a) there was a kernel update for FC5
> > b) this kernel has a known bug which could result in corrupting
> >    ext3 filesystems with 1k block size under heavy load
>
> It doesn't corrupt filesystems, it crashes instantly when the bug is hit.
>
> > c) ... nevertheless it has been pushed out with no special warning
> > d) practically all /boot partitions are ext3 1k (anaconda generated)
> > e) many partitions on old machines upgraded from previous versions are
> >    ext3 1k as well
>
> /boot partitions don't see anywhere near the sustained IO that is needed
> to hit this bug. It takes _hours_ of insane amounts of IO to hit it.
> It should be noted that I was the only person to ever see this.
> No bugzilla reports. No upstream reports. This is a real corner case
> scenario, as usually filesystems that see that kind of IO want the higher
> throughput that a larger blocksize brings.
>

Who in the world has a large amount of IO on /boot? Since that is
usually a separate filesystem and is usually only 100 MB in size, it is
IME basically a static filesystem that only changes when the kernel is
updated.

I can easily see why that bug has not been encountered in the past.

> > What was the rationale for releasing an official kernel update under such
> > dangerous conditions? Just "anaconda doesn't generate 1k partitions (not
> > true BTW)"? I still believe Linux is not (yet) Windows and if features are
> > in the system (like 1k blocksize partitions) people can use them if
> > they feel appropriate and they must work. Or perhaps there was a rush to
> > push this 2.6.18 kernel out to get some extra guinea pigs to find all
> > residual bugs? But this could be fair for the FC6 betas, not for FC5 where
> > people are expecting reasonable stability, and in any case no
> > life-threatening issue like a (known) filesystem corruption bug.
>
> That code hasn't changed in months, so the 2.6.17 kernel in FC5 likely
> was already affected by the same bug, and yet despite this, no-one was
> hitting it because of the pathological circumstances needed to hit it.
>
> > Now how long do we have to wait before we have an update for FC5 fixing
> > this critical issue? Or do we have to manually rollback kernels on all
> > machines?
>
> I'm already working on the next update.
>
> Dave
>
> --
> http://www.codemonkey.org.uk
>