Re: [PATCH] fat: Editions to support fat_fallocate()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Thanks Steve, comments below.

On 12/22/07, Steven Cavanagh <[email protected]> wrote:
> From: Steven Cavanagh <[email protected]>
>
> Added support for fallocate for a msdos fat driver. This allows
> preallocation of clusters to an inode before writes to reduce
> file fragmentation

Not technically true.  This doesn't make any guarantees about
fragmentation (even if in general it works pretty well).  To really
reduce fragmentation, then cluster allocation needs to be made smarter
so it goes looking for contiguous clusters when allocating, but that's
a task for a separate patch.  Please adjust your description.

Also, please briefly describe the testing that you've performed.

More comments below.

> Signed-off-by: Steven.Cavanagh <[email protected]>
> ---
>
>  fs/fat/file.c |   45 +++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 45 insertions(+), 0 deletions(-)
>
> diff --git a/fs/fat/file.c b/fs/fat/file.c
> index 69a83b5..f753c6a 100644
> --- a/fs/fat/file.c
> +++ b/fs/fat/file.c
> @@ -15,6 +15,7 @@ #include <linux/buffer_head.h>
>  #include <linux/writeback.h>
>  #include <linux/backing-dev.h>
>  #include <linux/blkdev.h>
> +#include <linux/falloc.h>
>
>  int fat_generic_ioctl(struct inode *inode, struct file *filp,
>                       unsigned int cmd, unsigned long arg)
> @@ -312,8 +313,52 @@ int fat_getattr(struct vfsmount *mnt, st
>  }
>  EXPORT_SYMBOL_GPL(fat_getattr);
>
> +/*
> + * preallocate space for a file. This implements fat fallocate inode
> + * operation, which gets called from sys_fallocate system call. User
> + * space requests len bytes at offset.
> + */
> +long fat_fallocate(struct inode *inode, int mode, loff_t offset, loff_t len)
> +{
> +       int ret = 0;
> +       loff_t filesize = inode->i_size;
> +
> +       /* preallocation to directories is currently not supported */
> +       if (S_ISDIR(inode->i_mode)) {
> +               printk(KERN_ERR
> +               "fat_fallocate(): Directory prealloc not supported\n");

This printk is probably not needed, or at least make it a pr_debug()

> +               return -ENODEV;
> +       }
> +
> +       if ((offset + len) <= MSDOS_I(inode)->mmu_private) {
> +               printk(KERN_INFO
> +                       "fat_fallocate():Blocks already allocated\n");

Drop the printk() here.  It will cause a write to the system log
everytime userspace does an unnecessary fat_fallocate().  That becomes
a performance hit which I don't think we want.

> +                       return 0;
> +       }
> +
> +       if ((offset + len) > MSDOS_I(inode)->mmu_private) {
> +
> +               mutex_lock(&inode->i_mutex);
> +               ret = fat_cont_expand(inode, (offset + len));
> +               if (ret) {
> +                       printk(KERN_ERR
> +                               "fat_fallocate():fat_cont_expand() error\n");
> +                       mutex_unlock(&inode->i_mutex);
> +                       return ret;
> +               }
> +               mutex_unlock(&inode->i_mutex);
> +       }
> +       if (mode & FALLOC_FL_KEEP_SIZE) {
> +               mutex_lock(&inode->i_mutex);
> +               i_size_write(inode, filesize);
> +               mutex_unlock(&inode->i_mutex);
> +       }

Race condition.  The file is increased in size and then returned to
the original size if FALLOC_FL_KEEP_SIZE is set, but the lock is
dropped then reobtained between increasing the size and restoring to
the original.  Another file operation can occur between the two
operations.  Badness!

However, digging further, when FALLOC_FL_KEEP_SIZE is set, I don't
think fat_cont_expand() has the behaviour that we want to implement.
When that flag is set, I think we simply want to add clusters
associated with the file to the FAT.  We don't want to clear them or
map them into the page cache yet (that should be done when the
filesize is increased for real).

I believe a call to fat_allocate_clusters() is all that is needed in
this case.  Hirofumi, please correct me if I'm wrong.

Cheers,
g.

-- 
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
[email protected]
(403) 399-0195
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux