On Wednesday 02 November 2005 17:12, Badari Pulavarty wrote:
> Hi Andrew & Andrea,
>
> Here is the updated patch with name change again :(
> Hopefully this would be final. (MADV_REMOVE).
>
> BTW, I am not sure if we need to hold i_sem and i_allocsem
> all the way ? I wanted to be safe - but this may be overkill ?
While looking into this, I probably found another problem, a race with
install_page(), which doesn't use the seqlock-style check we use for
everything else (aka do_no_page) but simply assumes a page is valid if its
index is below the current file size.
This is clearly "truncate" specific, and is already racy. Suppose I truncate a
file and reduce its size, and then re-extend it, the page which I previously
fetched from the cache is invalid. The current install_page code generates
corruption.
In fact the page is fetched from the caller of install_page and passed to it.
This affects anybody using MAP_POPULATE or using remap_file_pages.
> + /* XXX - Do we need both i_sem and i_allocsem all the way ? */
> + down(&inode->i_sem);
> + down_write(&inode->i_alloc_sem);
> + unmap_mapping_range(mapping, offset, (end - offset), 1);
In my opinion, as already said, unmap_mapping_range can be called without
these two locks, as it operates only on mappings for the file.
However currently it's called with these locks held in vmtruncate, but I think
the locks are held in that case only because we need to truncate the file,
and are hold in excess also across this call.
Instead, we need to protect against concurrent faults on the mapping (not
against concurrent mmaps)...and that is done through (struct address_space*)
mapping->truncate_count.
=====
Finally, there is MAP_POPULATE and other pre-faulting, i.e. install_page (no,
this is not peculiar to VM_NONLINEAR, even if some code is shared), but
install_page checks explicitly for truncation; the problem is that the check
is rather bogus, compared to the rest of checks:
/*
* This page may have been truncated. Tell the
* caller about it.
*/
err = -EINVAL;
inode = vma->vm_file->f_mapping->host;
size = (i_size_read(inode) + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
if (!page->mapping || page->index >= size)
goto err_unlock;
I remember there being a BUG_ON and Linus fixing it up.
It should be converted to the normal checks used for the rest
(->truncate_count based - see do_no_page()).
To do so, the caller (*_populate) needs to call again *_getpage, if
install_page detects a race, but it must also save and pass the
truncate_count.
So, we probably want need to cleanup and join {filemap,shmem}_populate
together, because the only real difference between them is the function
called to lookup the page from disk ({shmem,filemap}_getpage).
So, we should replace struct vm_operations_struct "populate" method with a
"getpage" method, by using the shmem_getpage prototype, which is better
engineered, see my comment:
page = filemap_getpage(file, pgoff, nonblock);
/* XXX: This is wrong, a filesystem I/O error may have happened. Fix
that as
* done in shmem_populate calling shmem_getpage */
if (!page && !nonblock)
return -ENOMEM;
> + truncate_inode_pages_range(mapping, offset, end);
> + inode->i_op->truncate_range(inode, offset, end);
> + up_write(&inode->i_alloc_sem);
> + up(&inode->i_sem);
--
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade
___________________________________
Yahoo! Mail: gratis 1GB per i messaggi e allegati da 10MB
http://mail.yahoo.it
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]