On 09/03/2010 04:40 PM, Cameron Simpson wrote:
> On 03Sep2010 14:21, JD <jd1008@xxxxxxxxx> wrote:
> | I have two mounted disks, both ext3, mounted
> | as
> | /sdb1
> | /sdc1
> |
> | On /sdb1 I have a directory, let's call it dirx.
> |
> | 1. rm -rf /sdc1/dirx
> |
> | 2. cd /sdb1
> | 3. tar cf - dirx | tar -C /sdc1 -xpf -
> |
> | Neither dir (/sdb1 nor /sdc1) is accessed by any programs other
> | than the tar program (and of course /sdb1 is the shell's CWD).
> | The shell's history file is in my home dir.
> |
> | After tar:
> |
> | 4. du -sk dirx /sdc1/dirx
> | 2904536 /sdc1/dirx
> | 2802124 dirx
> |
> | So, why this size inflation by about 100 MiB (102412 KiB)?
> |
> | I repeated the process twice. Same difference.
> |
> | Other dirs tarred in this way from sdb1 to sdc1 do not show this
> | discrepancy.
>
> There are three possible sources of discrepancies that I can think of:
> - different filesystem types
> - different directory packing
> - file fragmentation
>
> I presume we can discount the first one, since both are ext3.
>
> Directory packing normally is _better_ in a new directory; older directories
> can accumulate holes from file deletions. So the second one seems unlikely too.
> The way to check is to walk the trees with find and tally sizes with
> awk:
>
>   find /sdb1/dirx -type d -ls | awk '{sum += $7} END { print sum }'
>   find /sdc1/dirx -type d -ls | awk '{sum += $7} END { print sum }'
>
> The size difference seems too large for this anyway.
>
> That leaves file fragmentation. Does sdc1 have a lot of other data?
> Maybe complete MP3s won't fit into the gaps, and must be broken up more.
> Again, like new directories, there is normally less fragmentation in
> copied files, not more. And MP3s tend to be written in one go anyway, so
> the source files are probably not fragmented either.
>
> None of these choices seem likely to me.
>
> There is a final option which should not apply because these are different
> filesystems and also because your files are definitely copies: hard link
> counting. du notices hard links and correctly counts a file with several
> names only once. If you do this:
>
>   du -sk dir1 dir2
>
> and dir1 and dir2 have some files hard linked between them then du will
> not count the hardlinked files when it encounters them the second time,
> and you would then see "dir2" have a lower count than you might expect
> otherwise.
>
> The way to check this one is to run two separate dus:
>
>   du -sk dir1
>   du -sk dir2
>
> You can also scour your tree for hard links:
>
>   find /sdc1/dirx -type f -links +1 -ls
>
> though your tar copy should preserve the hard linking in your copy, and
> thus not change the totals.
>
> In short, several things are listed above that can produce different "on
> disc" sizes for copied data, and I don't really think any of them
> explain your results. But do some of the checks I suggest - if nothing
> else they may reveal more clues.
>
> | Dirx contains mp3's.
>
> "MP3s", please. There are no apostrophes in plurals!
>
> Cheers,

I believe it must be directory packing.
sdc1 is 83% full and sdb1 is 81% full.
There is also a high fragmentation of free space on both sdb1 and sdc1.

Thanx for the info!!!!
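
The checks suggested above (tallying directory sizes, then seeing whether the extra space sits in the files themselves) could be combined roughly as below. This is only a sketch: tally_tree is a made-up helper name, and it assumes GNU find, whose -printf supports %y (file type), %s (size in bytes) and %b (allocated 512-byte blocks).

```shell
#!/bin/sh
# tally_tree DIR  (hypothetical helper, not from the thread)
# Prints three totals for everything under DIR:
#   dir_bytes       sum of the sizes of the directories themselves
#   file_bytes      sum of the apparent sizes of regular files
#   file_alloc_kib  space actually allocated to regular files, in KiB
# Assumes GNU find (-printf with %y, %s, %b).
tally_tree() {
    find "$1" \( -type d -o -type f \) -printf '%y %s %b\n' |
    awk '
        $1 == "d" { dirbytes += $2 }
        $1 == "f" { filebytes += $2; fileblocks += $3 }
        END {
            # %b counts 512-byte blocks, so halve it to get KiB.
            printf "dir_bytes=%.0f file_bytes=%.0f file_alloc_kib=%.0f\n",
                   dirbytes, filebytes, fileblocks / 2
        }'
}
```

Running "tally_tree /sdb1/dirx" and "tally_tree /sdc1/dirx" and comparing the two lines should show where the difference lives: file_bytes ought to match for a faithful copy, so a gap in dir_bytes would point at directory packing, while a gap in file_alloc_kib would point at how the files are laid out on each disk.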