Disk usage vs file size (was Re: gorged harddrive)

On Mon, 2008-04-07 at 19:56 -0500, Paul Johnson wrote:
> On Mon, Apr 7, 2008 at 4:58 PM, Patrick O'Callaghan
> <pocallaghan@xxxxxxxxx> wrote:
> > On Wed, 2008-04-02 at 23:13 +0800, Ed Greshko wrote:
> >
> >
> >  "du -b" acts differently from "du", and "du -k", and "du -m". The latter
> >  three all give the real disk usage, which for a sparse file will be low
> >  until the file fills up. "du -b" gives the *apparent* file size, which
> >  can indeed be larger than the total filesystem size.
> >
> >  So it all comes down to RTFM ...
> >
> 
> I've seen this same trouble with torrent downloads.  And I'm not quite
> sure about its practical meaning.
> 
> In particular, I wonder "How to kill the sparsely filled incomplete
> torrent files"?
> 
> I'm using the KDE program ktorrent.  When you start to download a
> bittorrent it creates a  bunch of file names and reserves all that
> space. When the bit torrent client closes, it leaves behind those file
> names even if they are not completed.   It seems to me that "du" gives
> the reserved space totals, not the actually filled in space. (I don't
> use the --apparent-size option).  Is there any way to know if the
> torrent files are downloaded completely?

By default Ktorrent creates a file of the right size by 1) creating the
file, 2) seeking to <size-1> bytes from the start, 3) writing a null
byte[1]. From the point of view of the system the file is now <size>
bytes long, but occupies only 1 byte of actual disk space (1 block in
fact, due to allocation policy). This is an example of a "sparse" file.
'du' shows the real disk usage, except with the '-b' option (equivalent
to '--apparent-size --block-size=1'), where it shows the apparent size.
'ls -l' also shows the apparent size. Here's a little program to
illustrate:

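        #include <fcntl.h>      /* creat() */
        #include <stdio.h>      /* perror() */
        #include <unistd.h>     /* lseek(), write(), close() */
        
        int main(void)
        {
            /* Create (or truncate) the demonstration file. */
            int fd = creat("alloc-file", 0644);
            if (fd < 0) {
                perror("alloc-file");
                return 1;
            }
            /* Move the file offset 1 GiB past the start without writing. */
            if (lseek(fd, 1024L * 1024L * 1024L, SEEK_SET) < 0) {
                perror("lseek()");
                return 2;
            }
            /* Write a single null byte, making the file 1 GiB + 1 bytes long. */
            if (write(fd, "", 1) != 1) {
                perror("write()");
                return 3;
            }
            close(fd);
            return 0;
        }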

To test it, save it as "alloc.c", then run "make alloc" followed by
"./alloc". Now look at the file "alloc-file" with various tools:

        # ls -l alloc-file
        -rw-r--r-- 1 poc poc 1073741825 2008-04-07 17:20 alloc-file
        # du -b alloc-file
        1073741825      alloc-file
        # du alloc-file
        12      alloc-file
        # du -k alloc-file
        12      alloc-file
        # du -m alloc-file
        1       alloc-file
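
The numbers in the listing above come from two fields that stat()
returns: st_size (the apparent size, as shown by 'ls -l' and 'du -b')
and st_blocks (the space actually allocated, as shown by plain 'du').
As a rough sketch, assuming a Linux system where st_blocks counts
512-byte units, a program could compare the two to guess whether a file
is sparse. The names file_sizes() and looks_sparse() are made up for
illustration, and the check is only a heuristic: it says nothing about
whether the data in the file is correct.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: report the apparent size and the allocated size
 * of a file, both in bytes. Returns 0 on success, -1 if stat() fails.
 */
int file_sizes(const char *path, long long *apparent, long long *allocated)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) < 0)
        return -1;
    *apparent = (long long)st.st_size;           /* what 'ls -l' reports */
    *allocated = (long long)st.st_blocks * 512;  /* assumes 512-byte units */
    return 0;
}

/*
 * Hypothetical helper: returns 1 if the file occupies less disk space
 * than its apparent size, 0 if not, and -1 on error.
 */
int looks_sparse(const char *path)
{
    long long apparent, allocated;

    if (file_sizes(path, &apparent, &allocated) < 0)
        return -1;
    return allocated < apparent;
}
```

On a filesystem that supports holes, looks_sparse("alloc-file") should
return 1 for the file created by the program above.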
        
Ktorrent also has an option to preallocate the file space, whereupon it
physically writes into each block at startup, so the space is really
reserved. In this case there's no difference between the apparent and
real size of the file. The only reason to do this is to reserve enough
space to hold the entire torrent in case you're worried about running
out part-way through.
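
To make the contrast with the sparse case concrete, here is a minimal
sketch of preallocation by writing, assuming nothing about how Ktorrent
really implements it. The function name preallocate_by_writing() is
invented for this example. It simply fills the file with zero bytes, so
that on most filesystems every block ends up allocated. POSIX also
offers posix_fallocate() for the same purpose, which is not shown here.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: create 'path' and write 'size' zero bytes into
 * it, one buffer at a time, so the space is written rather than left
 * as a hole. Returns 0 on success, -1 on error.
 */
int preallocate_by_writing(const char *path, long long size)
{
    char buf[4096];
    long long left = size;
    int fd;

    assert(path != NULL && size >= 0);
    memset(buf, 0, sizeof buf);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    while (left > 0) {
        /* Write a full buffer, or whatever remains for the last chunk. */
        size_t chunk = left < (long long)sizeof buf ? (size_t)left : sizeof buf;
        ssize_t n = write(fd, buf, chunk);
        if (n <= 0) {           /* error, or no progress (e.g. disk full) */
            close(fd);
            return -1;
        }
        left -= n;
    }
    return close(fd);
}
```

Unlike the lseek() trick, this fails straight away if the disk cannot
hold the whole file, which is the point of preallocating.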
        
> I'm guessing it is impossible to tell if a file is filled up without
> the bit torrent client itself to compare the files.    Sometimes the
> client itself can't tell.

The BT clients are supposed to checksum each segment as it's downloaded.
You can also do a global checksum to make sure everything is OK.

> I have the following problem with the KDE torrent client ktorrent.
> When ktorrent starts downloading a torrent, and then the system is
> turned off, then re-started, the ktorrent starts to download stuff
> again.  It downloads the working files to a temporary space, but when
> it finishes downloading into the temporary space and it tries to copy
> into the finished directory, then it mistakenly thinks that the files
> already exist and it asks for new file names.  In other words, even
> ktorrent is fooled by the "saved space" it marked out when it started
> previously.  If you give it new file names, it creates them side by
> side with the "apparent but really empty" files, and du shows the same
> information for them.  The only way to tell if the files are real is
> to try to load them in a program (in the case of music or video) or
> mount (in the case of iso files).

This doesn't happen to me. I can kill Ktorrent (or reboot my system),
start it up again, and it resumes where it left off. I don't know if
it's relevant, but I have "Automatically save downloads to:" set and
"Move completed downloads" and "Copy .torrent files to:" both unset.

poc

[1] Actually it's not 1, but I simplified.

