At 8:53 AM -0400 7/16/05, Matthew Miller wrote:
>On Sat, Jul 16, 2005 at 08:04:22AM -0400, fredex wrote:
>> Now, for performance reasons, it is often not a good idea to have
>> many thousands of files in a single directory. As the number of files
>> grows large the time it takes to access a file grows larger. I haven't
>> looked into this on any linux file system, but on other unixes I've
>> observed delays reaching up into the whole-second region when many
>> thousands of files are in a single directory.
>
>Shouldn't be a problem on a new install of ext3 on modern Linux. See
><http://lwn.net/Articles/11481/>.

I suppose the simple test is to try it. The table gives microseconds
per file: the wall-clock time to run a script that creates the files by
touching them, to ls the last few, and to rm the whole directory, each
divided by the number of files; the script is at the end of the
transcript.

#files   touch    ls     rm
------   -----   ----   ----
  1000    309     016    054
 10000    331     008    051
100000    326     007    357
200000    330     007   1065

I didn't want to try a million files. Creation and listing cost roughly
constant time per file, so they seem linear in the number of files, but
rm looks quadratic: its per-file cost itself grows with the file count
(357 us at 100000 files, 1065 us at 200000), which is what a linear
scan of the directory on every unlink would produce. I think this
indicates that in my current 2.6.12 FC3 kernel using ext3 the directory
data structure is still a list and not a tree. I expect that some sort
of directory hierarchy to limit the number of files per directory would
still be a win, along the lines sketched below.
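Something like this (untested; the 100-way split and the
bucket-by-modulo naming are my own choices, not anything the kernel
requires) would keep every directory down to MAXFILES/100 entries:

#!/bin/bash
# Spread MAXFILES files over 100 subdirectories instead of one
# flat directory, so each linear scan stays short.
TESTDIR="manyfiles"
MAXFILES=${1:-100000}
mkdir "$TESTDIR" && cd "$TESTDIR" || exit 1
# make the 100 buckets
for (( d=0; d<100; d=d+1 )) ; do mkdir "$d" ; done
# file i goes into bucket i mod 100
for (( i=0; i<MAXFILES; i=i+1 )) ; do touch "$(( i % 100 ))/$i" ; done

Lookup then costs two short scans instead of one long one; it's the
same trick proxy caches like squid use for their on-disk stores.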
Transcript follows. Max line width 114 chars.
---------- cut here ----------
[tonyn@localhost ~]$ time ./manyfilestest 1000

real    0m0.309s
user    0m0.066s
sys     0m0.237s
[tonyn@localhost ~]$ time ls manyfiles/99*
manyfiles/99   manyfiles/991  manyfiles/993  manyfiles/995  manyfiles/997  manyfiles/999
manyfiles/990  manyfiles/992  manyfiles/994  manyfiles/996  manyfiles/998

real    0m0.016s
user    0m0.007s
sys     0m0.004s
[tonyn@localhost ~]$ time rm -rf manyfiles

real    0m0.054s
user    0m0.000s
sys     0m0.049s
[tonyn@localhost ~]$ time ./manyfilestest 10000

real    0m3.312s
user    0m0.728s
sys     0m2.518s
[tonyn@localhost ~]$ time ls manyfiles/999*
manyfiles/999   manyfiles/9991  manyfiles/9993  manyfiles/9995  manyfiles/9997  manyfiles/9999
manyfiles/9990  manyfiles/9992  manyfiles/9994  manyfiles/9996  manyfiles/9998

real    0m0.075s
user    0m0.051s
sys     0m0.021s
[tonyn@localhost ~]$ time rm -rf manyfiles

real    0m0.519s
user    0m0.006s
sys     0m0.501s
[tonyn@localhost ~]$ time ./manyfilestest 100000

real    0m32.561s
user    0m7.494s
sys     0m24.285s
[tonyn@localhost ~]$ time ls manyfiles/9999*
manyfiles/9999   manyfiles/99991  manyfiles/99993  manyfiles/99995  manyfiles/99997  manyfiles/99999
manyfiles/99990  manyfiles/99992  manyfiles/99994  manyfiles/99996  manyfiles/99998

real    0m0.686s
user    0m0.513s
sys     0m0.162s
[tonyn@localhost ~]$ time rm -rf manyfiles

real    0m35.653s
user    0m0.082s
sys     0m5.561s
[tonyn@localhost ~]$ time ./manyfilestest 200000

real    1m6.031s
user    0m15.243s
sys     0m47.962s
[tonyn@localhost ~]$ time ls manyfiles/9999*
manyfiles/9999   manyfiles/99991  manyfiles/99993  manyfiles/99995  manyfiles/99997  manyfiles/99999
manyfiles/99990  manyfiles/99992  manyfiles/99994  manyfiles/99996  manyfiles/99998

real    0m1.459s
user    0m1.113s
sys     0m0.269s
[tonyn@localhost ~]$ time rm -rf manyfiles

real    3m32.851s
user    0m0.159s
sys     0m12.205s
[tonyn@localhost ~]$ cat manyfilestest
#!/bin/bash
# Make lots of files in a directory.
TESTDIR="manyfiles"
MAXFILES=${1:-100000}
mkdir $TESTDIR
cd $TESTDIR
for (( i=0; i*10<$MAXFILES; i=i+1 )) ; do touch ${i}0 ${i}1 ${i}2 ${i}3 ${i}4 ${i}5 ${i}6 ${i}7 ${i}8 ${i}9 ; done
[tonyn@localhost ~]$
---------- cut here ----------
____________________________________________________________________
TonyN.:'                       <mailto:tonynelson@xxxxxxxxxxxxxxxxx>
      '                              <http://www.georgeanelson.com/>
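P.S. The htree index from the LWN article is a per-filesystem ext3
feature flag, so it may just be switched off on my disk rather than
missing from the kernel. If I read the tune2fs and e2fsck man pages
correctly (untested here, and the device name is only an example):

tune2fs -l /dev/hda2 | grep features   # "dir_index" appears if the index is on
tune2fs -O dir_index /dev/hda2         # turn the feature on
e2fsck -fD /dev/hda2                   # rebuild existing directories (fs unmounted)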