Re: [PATCH] Use of getblk differs between locations

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 10 Oct 2005, Mikulas Patocka wrote:
> On Mon, 10 Oct 2005, Glauber de Oliveira Costa wrote:
> > On Mon, Oct 10, 2005 at 10:20:07PM +0100, Anton Altaparmakov wrote:
> > > On Mon, 10 Oct 2005, Glauber de Oliveira Costa wrote:
> > > > I've just noticed that the use of sb_getblk differs between locations
> > > > inside the kernel. To be precise, in some locations there are tests
> > > > against its return value, and in some places there are not.
> > > > 
> > > > According to the comments in __getblk definition, the tests are not
> > > > necessary, as the function always return a buffer_head (maybe a wrong
> > > > one),
> > > 
> > > If you had read the source code rather than just the comments you would
> > > have seen that this is not true.  It can return NULL (see
> > > fs/buffer.c::__getblk_slow()).  Certainly I would prefer to keep the
> > > checks in NTFS, please.  They may only be good for catching bugs but I
> > > like catching bugs rather than segfaulting due to a NULL dereference.
> 
> The check should be rather a BUG() than dump_stack() and return NULL --- I
> think it's not right to write code to recover from programming errors.

Why programming errors?  It could be faulty memory or other corruption, 
perhaps even caused by a different driver altogether (e.g. I found a bug 
in ntfs last week which caused it to memset() to zero a random location in 
memory of a random size and it caused a lot of strange effects like my 
shell suddenly exiting and me being left on the login prompt...).  Also it 
could be that the function one day changes and it can return NULL.  It is 
far safer to do checking than to make assumptions about not being able to 
return NULL.

> Filesystem drivers are supposed to pass correct blocksize to getblk(). ---
> even for users it's better to crash, because user whose machine has locked up
> on BUG() will report bug more likely than user whose machine has written stack
> dump into log and corrupted filesystem --- by the time he discovers the
> corruption and mesage he might not even remember what triggered it.
> 
> As comment in buffer.c says, getblk will deadlock if the machine is out of
> memory. It is questionable whether to deadlock or return NULL and corrupt
> filesystem in this case --- deadlock is probably better.

What do you mean corrupt filesystem?  If a filesystem is written so badly 
that it will cause corruption when a NULL is returned somewhere, I 
certainly don't want to have anything to do with it.

Going BUG() is generally a bad thing if the error can be recovered from.  
Certainly all my code attempts to recover from all error conditions it can 
possibly encounter.

I would much rather see NULL and then handle the error gracefully with an 
error message than go BUG().  You can then still umount and remove the fs 
module and everything works fine (you may need an fsck you may not depends 
on how good your error handling is).  If you do a BUG() you are guaranteed 
to cause corruption...  I only use BUG() when something really cannot 
happen unless there is a bug in which case I want to know it...

Best regards,

	Anton
-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux