Ed Hill wrote:
> On Fri, 2005-12-30 at 16:40 -0600, Mike McCarty wrote:
>> Guy Fraser wrote:
>>> Finally we're back to the original post.
>> [snip]
>>> the cause of the initial posting. This forum is not well suited to
>>> discussing how files are allocated, because there are too many
>>> different file systems that use different algorithms to determine
>>> when to allocate space for a file in a fragment. In basic terms
>> Untrue in this context, as the OP specifically requested to find
>> a defragmenter for ext3. That's what led to the claim that
>> a defragmenter is not necessary for ext3, as it has some inherent
>> immunities to fragmentation.
> Hi Mike,
Hi! I preface this by saying that nothing in here is intended to
be or sound rude. Ok?
> Even if there is fragmentation, it simply DOES NOT MATTER if it doesn't
> result in a measurable performance hit. So, what benchmarks can you
I never said otherwise.
> cite that show us how fragmentation degrades performance on a Linux
> (specifically, ext3) filesystem?
> Or, can you create your own test? I mean this very sincerely. If you
> want to argue that something matters then you need to back it up with
I don't want to spend the time necessary to try to devise a test.
I possibly could, though it would take some study of the file system,
and might require mods to the file system. I don't know.
> some actual measurements. If fragmentation matters then you should be
> able to devise a test case that demonstrates it.
What I *did* claim is that ext3 is subject to fragmentation.
I don't recall stating that it was something I was particularly
concerned about. I responded to a claim which was demonstrably
false, and was being used as an argument to tell someone
asking a polite question that he shouldn't be asking the question.
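For what it's worth, a rough sketch of the kind of test Ed is asking for might look like the following. This is only a suggestion of the shape of such a test, not a real benchmark: the file name and sizes are made up, and it approximates fragmentation by reading the same file's blocks in a scattered order rather than by actually fragmenting the on-disk layout. A real test would need a file larger than RAM, and the page cache dropped between runs (as root, `echo 3 > /proc/sys/vm/drop_caches` on Linux), or else both orders just hit the cache — which is Ed's point about amortization.

```python
import os
import random
import time

FILE = "frag_test.bin"   # hypothetical scratch file name
BLOCK = 4096             # read in 4 KiB blocks
NBLOCKS = 1024           # 4 MiB here; a real test needs a file larger than RAM

# Create the scratch file, one block at a time.
with open(FILE, "wb") as f:
    for _ in range(NBLOCKS):
        f.write(os.urandom(BLOCK))

def timed_read(order):
    """Read every block of FILE in the given order; return (seconds, bytes)."""
    total = 0
    t0 = time.perf_counter()
    with open(FILE, "rb") as f:
        for i in order:
            f.seek(i * BLOCK)
            total += len(f.read(BLOCK))
    return time.perf_counter() - t0, total

seq = list(range(NBLOCKS))       # sequential order: lets read-ahead help
rnd = seq[:]
random.shuffle(rnd)              # scattered order: stands in for fragmentation

t_seq, n_seq = timed_read(seq)
t_rnd, n_rnd = timed_read(rnd)
print(f"sequential: {t_seq:.4f}s  scattered: {t_rnd:.4f}s  ({n_seq} bytes each)")
os.remove(FILE)
```

With a small file and a warm cache the two times come out nearly identical, which would actually support Ed's side of the argument; the interesting numbers only appear with cold caches and large files.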
>> Another question, which AFAIK remains unanswered, though posed
>> by Ed Hill, is just what is the performance degradation which
>> might be suffered. Unfortunately, that is completely dependent
>> on the use to which the file is put, and how often it is read.
> It's not another question. It's the only good reason for getting into
> this discussion.
It is not. When someone asks a question, politely, in a reasonable
forum, he deserves a reasonable answer, not an argument. YOU don't
know what all his reason for asking was. Perhaps he wants to compress
all the used space down to one end of the drive for purposes of
splitting the partition. What difference does it make? If he posed
a reasonable question, he deserves a reasonable answer. He does
not deserve being told that his question has no basis, because
the circumstance doesn't occur, when in fact it *does* occur.
>> Most (all today?) disc drives have read-ahead caching built into
>> the drive, so that reads to sequential sectors are quite a bit
>> faster than random reads, even when no seek is necessary.
> Yes, but such things only matter on the initial read from the disk. The
> Linux VFS+VM will in all likelihood obviate any need to repeatedly read
> blocks from a disk for frequently accessed files. So for commonly used
> blocks, the cost is in all likelihood amortized.
You seem to argue against points I don't make, and then don't respond to
the points I *do* make.
Perhaps I'm not being clear enough. I dunno. Truly, I'm getting
confused by this thread.
> Can you demonstrate that the *initial* read really costs more? And, if
> so, how much?
It matters, as I said, depending on what use the file gets. (I didn't
say how often it gets read sequentially, I said what use.) It also
depends on how large the file is. You apparently have not
actually written disc caching code. I have. In particular, I have
made some mistakes writing disc caching code. When the size of the
file is somewhat larger than the cache can hold, and the whole file
gets read repeatedly sequentially, then many caching algorithms
thrash badly. In particular, LRU often thrashes badly. It is a
theorem that *any* caching algorithm has a circumstance which causes it
to behave *worse* than no caching. (The particular circumstance
may not be sequential read, BTW.)
Which is why I said that it unfortunately depends on the use the
file gets. I just didn't get into gritty details, because I
didn't expect anyone to argue against a theorem of computer science.
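To make the LRU point concrete, here is a toy simulation (mine, purely for illustration — not taken from any filesystem's code): with a block cache of N entries and a file just one block larger than that, read repeatedly front to back, plain LRU misses on *every* access, doing all the bookkeeping work of a cache while delivering none of the benefit.

```python
from collections import OrderedDict

def lru_miss_rate(cache_size, file_blocks, passes):
    """Simulate an LRU block cache over repeated sequential reads of a
    file with `file_blocks` blocks; return the overall miss rate."""
    cache = OrderedDict()   # keys = block numbers, ordered least- to most-recent
    hits = misses = 0
    for _ in range(passes):
        for block in range(file_blocks):
            if block in cache:
                hits += 1
                cache.move_to_end(block)       # mark most recently used
            else:
                misses += 1
                if len(cache) >= cache_size:
                    cache.popitem(last=False)  # evict least recently used
                cache[block] = True
    return misses / (hits + misses)

# A file one block larger than the cache: every access evicts the block
# that will be wanted soonest, so LRU misses every single time.
print(lru_miss_rate(cache_size=100, file_blocks=101, passes=10))  # 1.0

# A file that fits in the cache: after the first pass, every read hits.
print(lru_miss_rate(cache_size=100, file_blocks=100, passes=10))  # 0.1
```

That 100% miss rate is the "sequential flooding" pathology: no caching at all would miss just as often, at lower cost, which is the sense in which a cache can behave worse than none.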
Mike
--
p="p=%c%s%c;main(){printf(p,34,p,34);}";main(){printf(p,34,p,34);}
This message made from 100% recycled bits.
You have found the bank of Larn.
I can explain it for you, but I can't understand it for you.
I speak only for myself, and I am unanimous in that!