On Wed, 2010-05-19 at 15:40 -0430, Patrick O'Callaghan wrote:
> On Wed, 2010-05-19 at 14:07 -0400, aragonx@xxxxxxxxxx wrote:
> > The data in the files is of the unstructured binary type. When I do a
> > search, I have _most_ of the file name. Enough to uniquely identify
> > it.
>
> So you don't need to look into the file to get a match? Sounds like the
> best procedure would just be to keep an index of all the filenames and
> update it when files are added/removed (assuming you have control over
> both of these processes). A simple database should be able to handle
> this easily, which is pretty much what you suggested yourself. In fact
> it looks so simple that a Berkeley DB file would do it, without needing
> all the fancy DB machinery or MySQL or Postgres. See for example "man
> DB_File".

Is there any reason not to use the already existing updatedb/locate
combo?

The Fedora updatedb seems to be based on mlocate, which as far as I know
uses the mtime of each directory to tell whether it has changed since the
last scan (the mtime of a directory changes when files are added to it
or deleted from it). Unchanged directories do not have to be read again,
so this should speed up runs unless a lot of directories change between
runs.

You can disable the default updatedb configuration and run it manually
(or in cron jobs), specifying one file system for each job. Let the jobs
run in parallel, each writing to its own database. Then set the
environment variable globally that tells locate where to look, so it
finds all the databases.

Look at the man pages for updatedb, updatedb.conf, locate and
mlocate.db. The last one is very optional.

-- 
birger
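
To make the directory-mtime idea concrete, here is a rough Python sketch of a filename index that only re-reads directories whose mtime differs from the value recorded on the previous run. It illustrates the principle only and is not how mlocate is actually implemented; the names update_index, scan_dir and locate and the in-memory dict layout are made up for this example.

```python
import os


def scan_dir(path):
    """Read one directory (non-recursively); return (names, subdirs)."""
    names, subdirs = [], []
    with os.scandir(path) as entries:
        for entry in entries:
            names.append(entry.name)
            if entry.is_dir(follow_symlinks=False):
                subdirs.append(entry.path)
    return sorted(names), sorted(subdirs)


def update_index(root, index):
    """Refresh `index` in place and return the directories that were re-read.

    `index` maps a directory path to (mtime_ns, names, subdirs).
    A directory is re-read only if its mtime differs from the cached one.
    """
    rescanned = []
    seen = set()
    stack = [root]
    while stack:
        directory = stack.pop()
        try:
            mtime = os.stat(directory).st_mtime_ns
        except OSError:
            continue  # directory vanished or is unreadable
        seen.add(directory)
        cached = index.get(directory)
        if cached is None or cached[0] != mtime:
            try:
                names, subdirs = scan_dir(directory)
            except OSError:
                continue
            index[directory] = (mtime, names, subdirs)
            rescanned.append(directory)
        # Always descend: a subdirectory can change without its parent changing.
        stack.extend(index[directory][2])
    # Forget directories that are no longer reachable from root.
    for stale in set(index) - seen:
        del index[stale]
    return rescanned


def locate(index, fragment):
    """Return full paths of all indexed names containing `fragment`."""
    return sorted(
        os.path.join(directory, name)
        for directory, (_mtime, names, _subdirs) in index.items()
        for name in names
        if fragment in name
    )
```

Calling update_index(root, index) twice in a row on an unchanged tree should return an empty list the second time, since every directory is only stat'ed and none is re-read. For the real tools, the per-file-system runs would, as far as I recall, use updatedb's --database-root and --output options, with the LOCATE_PATH environment variable listing the extra databases for locate; please verify against the man pages on your system.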