Re: Readahead — Linux Kernel

On Mon, 26 Sep 2005, Andrew Morton wrote:

> Alan Stern <stern@rowland.harvard.edu> wrote:
> >
> >  Can somebody please tell me where the code is that performs optimistic
> >  readahead when a process does sequential reads on a block device?
> 
> mm/readahead.c:__do_page_cache_readahead() is the main one.  Use
> dump_stack() to be sure.
> 
> >  And can someone explain why those readahead calls are allowed to extend 
> >  beyond the end of the device?
> 
> It has a check in there for reads past the blockdev mapping's i_size. 
> Maybe i_size is wrong, or maybe the code is wrong, or maybe it's a
> different caller.

Thanks for the tip.  The problem I was chasing down was the system's
attempts to read beyond the end of a CD.  It turns out the cause is
partly in the block layer and partly in the cdrom drivers.

Here's what happened.  CDs have 2048-byte blocks, and I've got a disc 
containing nothing but a single data track (written with cdrecord) of 
326533 blocks.  (The original .iso was 326518 blocks long and I added 15 
blocks of padding.)

Oddly enough, the values recorded in the disc's Table Of Contents indicate
that the track is 326535 blocks.  Maybe this is normal for cdrecord or for
CDROMs in general -- I don't know.  Anyway, the cdrom drivers believe this
value and report a capacity that is 2 blocks too high.

When I try using dd with bs=2048 to read the very last actual block,
number 326532, the block layer of course issues a read request for an
entire 4 KB memory page.  The drive returns the first 2 KB of data
successfully and reports an error reading the second 2 KB, which is beyond 
the actual end of the track.
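
To make the arithmetic above concrete, here is a small user-space sketch
(not kernel code) of which 2 KB blocks a page-sized read covers and how
many of them fall past a given capacity.  The helper names and constants
are hypothetical, chosen only for illustration; the block and page sizes
are the ones given in this message.

```c
#define CD_BLOCK_BYTES   2048UL  /* CD data block size, as stated above */
#define MODEL_PAGE_BYTES 4096UL  /* memory page size, as stated above */

/* Hypothetical helper: first block covered by the page that holds `block`. */
unsigned long page_first_block(unsigned long block)
{
	unsigned long blocks_per_page = MODEL_PAGE_BYTES / CD_BLOCK_BYTES;

	return block - (block % blocks_per_page);
}

/*
 * Hypothetical helper: how many blocks of that page lie at or beyond
 * `capacity` (a block count, so valid block numbers are 0..capacity-1).
 */
unsigned long blocks_past_end(unsigned long block, unsigned long capacity)
{
	unsigned long blocks_per_page = MODEL_PAGE_BYTES / CD_BLOCK_BYTES;
	unsigned long first = page_first_block(block);
	unsigned long past = 0;
	unsigned long b;

	for (b = first; b < first + blocks_per_page; b++)
		if (b >= capacity)
			past++;
	return past;
}
```

With the capacity of 326535 blocks taken from the TOC,
blocks_past_end(326532, 326535) is 0, so the page-sized read looks legal.
With the true size of 326533 blocks it is 1: the second half of the page
is block 326533, which does not exist.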

Now according to a comment in drivers/scsi/sr.c:

	/*
	 * The SCSI specification allows for the value
	 * returned by READ CAPACITY to be up to 75 2K
	 * sectors past the last readable block.
	 * Therefore, if we hit a medium error within the
	 * last 75 2K sectors, we decrease the saved size
	 * value.
	 */

The code to do this has some flaws, but I fixed them.  The result is that
the stored capacity is reduced to 326533 blocks, as it should be, the SCSI
driver calls end_that_request_chunk(req, 1, 2048), and then it requeues
the request in order to retry the remaining 2048 bytes.  This naturally
fails, and the driver calls end_that_request_chunk(req, 0, 2048).  The
upshot is that the dd process receives an error instead of getting the
2 KB of data as it should.
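
The fix-up rule from the quoted comment and the 2048-byte partial
completion can be modeled in user space roughly as follows.  This is only
a sketch under stated assumptions, not the code in drivers/scsi/sr.c: the
struct, the function names, and the exact boundary test (an error within
75 blocks of the reported end) are hypothetical and are one reading of
the comment.

```c
#include <stdbool.h>

#define CD_BLOCK_BYTES  2048UL  /* CD data block size */
#define CAPACITY_SLACK  75UL    /* slack allowed by the quoted comment */

/* Hypothetical stand-in for the driver's saved size, in 2 KB blocks. */
struct cd_model {
	unsigned long capacity;
};

/*
 * If a medium error hits within the last CAPACITY_SLACK blocks of the
 * reported capacity, shrink the capacity so that it ends just before
 * the failing block.  Returns true if the capacity was reduced.
 */
bool trim_capacity_on_medium_error(struct cd_model *cd,
				   unsigned long error_block)
{
	if (error_block < cd->capacity &&
	    cd->capacity - error_block <= CAPACITY_SLACK) {
		cd->capacity = error_block;
		return true;
	}
	return false;
}

/*
 * Bytes of a request covering blocks [first_block, first_block + nblocks)
 * that were transferred before the failing block.  Returns 0 if the error
 * block lies outside the request.
 */
unsigned long good_bytes_before_error(unsigned long first_block,
				      unsigned long nblocks,
				      unsigned long error_block)
{
	if (error_block < first_block || error_block >= first_block + nblocks)
		return 0;
	return (error_block - first_block) * CD_BLOCK_BYTES;
}
```

For the disc described here, a model with capacity 326535 and an error at
block 326533 ends up with capacity 326533, and a two-block request
starting at block 326532 has 2048 good bytes, matching the first
end_that_request_chunk() call above.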

The _next_ time I use dd to read that block, it works perfectly.  The 
block layer only tries to read 2048 bytes and there's no problem.

So evidently the block layer doesn't like it when a transfer only
partially succeeds, even though that part includes everything up to the
(new) end of the device.  Can this be fixed?  I wouldn't know where to
begin.

It's also worth noting that the IDE cdrom driver does not fix up the 
capacity as the SCSI driver does.  It would be a good idea to copy over 
the code -- I can probably handle that.

Alan Stern
