Re: A little coding style nugget of joy

Andi Kleen wrote:
> Matt LaPlante <[email protected]> writes:
>
>> Since everyone loves random statistics, here are a few gems to give you a break from your busy day:
>>
>> Number of lines in the 2.6.22 Linux kernel source that include one or more trailing whitespaces: 135209
>> Bytes saved by removing said whitespace: 151809
>
> You don't actually save anything on disk on most file systems
> (essentially everything except reiserfs on current Linux)
> because all files are rounded to block size (normally 4K)
> Same in page cache.

This is a terrible assumption in general, i.e. whenever filesize % blocksize is close to uniformly distributed. If you remove one byte and the data is stored with blocksize B, then you either save zero bytes (with probability 1 - 1/B) or you save B bytes (with probability 1/B). The expected number of bytes saved is B * (1/B) = 1. Since expectation is linear, if you remove x bytes in total, the expected number of bytes saved is x, even if more than one byte is removed per file.
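
For anyone who wants to check the arithmetic, here is a rough sketch that computes the expectation exactly by averaging over every possible value of filesize % blocksize. The function names, the 4096-byte block size, and the choice of starting sizes are just assumptions for illustration; nothing here comes from a real tool.

```python
BLOCK_SIZE = 4096  # assumed block size, matching the "normally 4K" figure


def blocks_used(size, block_size=BLOCK_SIZE):
    """Number of whole blocks needed to store `size` bytes (ceiling division)."""
    return -(-size // block_size)


def expected_bytes_saved(removed, block_size=BLOCK_SIZE, base_blocks=8):
    """Average on-disk saving when `removed` bytes are cut from a file whose
    size modulo block_size is uniformly distributed.

    The average runs over block_size consecutive file sizes, so every
    remainder occurs exactly once. base_blocks only keeps the sizes large
    enough that `removed` bytes can actually be taken away.
    """
    start = base_blocks * block_size
    if removed > start:
        raise ValueError("removed must not exceed the smallest file size")
    total = 0
    for size in range(start, start + block_size):
        freed = blocks_used(size, block_size) - blocks_used(size - removed, block_size)
        total += freed * block_size
    return total / block_size


if __name__ == "__main__":
    for removed in (1, 16, 100, 5000):
        print(removed, expected_bytes_saved(removed))
```

Under the uniformity assumption the average comes out equal to the number of bytes removed, which is the linearity-of-expectation argument in numeric form.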

In my tree, about half of the files have size >= 4k, so the uniformity assumption is probably not _that_ far off the mark.

Alternatively, there are an average of about 16 bytes removed per file, and there are 11 files which are <= 16 bytes short of a 4k boundary, so it's not at all unreasonable that we'd save 40-50k.
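
If someone wants to measure this on their own tree rather than estimate it, something along these lines should do. This is only a sketch: it assumes 4096-byte blocks, LF line endings, that "trailing whitespace" means spaces and tabs at the end of a line, and that a file occupies exactly ceil(size / 4096) blocks (no tail packing, no sparse files). All names are made up for the example.

```python
import os

BLOCK_SIZE = 4096  # assumed filesystem block size


def trailing_whitespace_bytes(data):
    """Count spaces and tabs that sit at the end of a line in `data` (bytes)."""
    removed = 0
    for line in data.split(b"\n"):
        removed += len(line) - len(line.rstrip(b" \t"))
    return removed


def blocks_used(size, block_size=BLOCK_SIZE):
    """Number of whole blocks needed to store `size` bytes."""
    return -(-size // block_size)


def block_savings(size, removed, block_size=BLOCK_SIZE):
    """Bytes of block storage freed by shrinking a file by `removed` bytes."""
    return (blocks_used(size, block_size) - blocks_used(size - removed, block_size)) * block_size


def survey(root):
    """Walk `root`; return (bytes of whitespace removed, bytes of blocks freed)."""
    total_removed = 0
    total_saved = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as handle:
                data = handle.read()
            removed = trailing_whitespace_bytes(data)
            total_removed += removed
            total_saved += block_savings(len(data), removed)
    return total_removed, total_saved


if __name__ == "__main__":
    removed, saved = survey(".")
    print("whitespace bytes removed:", removed)
    print("block bytes saved:", saved)
```

Run it from the top of a source tree. It reads every regular file, including binaries and anything under .git, so a real survey would want to filter the file list first.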


> And in tar files bzip2/gzip is very good at compacting them.

That's true.

--Andy