Dan Ladd wrote:
> Yeah this is the only email address i could find because I had a
> question about the fedora project. I didn't konw where to direct it.
> So here goes... I was wondering about the WinFS file management
> system. If something like that (meaning the built in database
> system) was going to be included in fedora or other RedHat operating
> systems or if something like is already in an operating system with
> RedHat? It seems like it is old technology because the AS/400 or now
> the iSeries has the built in database system to locate physical files
> that are stored in logical files. If you could direct me to where i
> can have that answered that would be awesome.

Oohh. *BIG* question.

Background: Unix invented much of the "everything is a file, stored in
a file tree, and the OS just sees a normal file as a collection of
bits" philosophy that is now more or less standard. Meanwhile, the
relational database became the standard way of storing smaller,
structured data.

*Lots* of computer scientists have been wondering whether this split
between ways of storing data is ideal. So there has been a lot of work
done looking at the best ways to store general-purpose files in a
database.

It turns out that this is a Hard Problem. Storing files as opaque
binary objects in a database isn't a problem; a lot of modern
filesystems effectively do this. The question is whether we can take
anything else from the database world.

Here you should understand that there is no agreed vision of how
things should work. This is the main point I want to make. So you will
have to work out what you want from a database filesystem, and see
what provides it.

The two big problems come under the headings of writing and reading.

Writing is relatively easy, since you can define the problem: It Would
Be Nice If Linux allowed multiple updates to one file or to many files
to be treated as a transaction. Even there, there is the unfortunate
detail of getting transactions to span filesystems.
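
To make the writing side concrete, here is a minimal Python sketch of the
usual userspace workaround for the simplest case: an all-or-nothing update
of one file on one filesystem. It is an illustration only, not something
Fedora or any particular filesystem provides, and the helper name
atomic_write is hypothetical.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the contents of `path` with `data` (bytes), all or nothing.

    Hypothetical helper, for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same filesystem as the target,
    # so create it in the target's own directory.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        # Within one filesystem, a POSIX rename is atomic: readers see
        # either the old contents or the new, never a mixture.
        os.replace(tmp_path, path)
    except BaseException:
        # Something failed: remove the temporary file and leave the
        # original untouched.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

This only covers a single file, and as written it does not preserve the
original file's permissions. A rename cannot cross a filesystem boundary,
and nothing comparable ties updates to several files together, which is
exactly the gap described above.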

Reading is more of a problem. Many file formats keep internal metadata
(author, image size, artist, etc.), and there is a demand to keep more
data against files (e.g. Access Control Lists). Many people think that
there should be a better way of finding all recordings of Vaughan
Williams' works than finding all MP3s, all OGGs, all Real Media, etc.
and running format-specific query programs against each file. (One
still has a problem if someone entered "Old Hundredth, arr. RVW", but
that can be ignored in the first few versions...) Maybe icons for a
file should be stored against the file.

This is the metadata problem, or rather, series of problems. One is,
simply, how do you present metadata under Unix-like systems? Solaris
has a special system call and program to access the metadata; Hans
Reiser is proposing to allow you to access each file as a directory
with the metadata available underneath (obviously, this isn't
practical with real directories).

The other big problem is how much metadata should move around with a
file. It's obvious that you want to be able to export files in an
existing format, which will drop any metadata that isn't already in
the format. (You still need to support existing filesystems, for
example on CDs.) Then when you're copying things around, some metadata
(the last user to modify the file) should change, while other metadata
(the first user to modify it) shouldn't. This means that something is
going to have to know a *lot* about the way that metadata works, which
means you are going to have a lot of per-filetype programming and/or a
lot of rigidity.

The main contender to "solve" both of these problems is the Reiser 4
filesystem. This is still very new, and has a number of problems:

* It's not fully debugged, and has a number of security and
  reliability problems.
* It stores metadata for a file by treating the file as a directory,
  and putting metadata as pseudo-files in that directory.
  That changes the way users and programs think about files, and will
  invalidate a lot of assumptions. See http://lwn.net/Articles/14035/ .

Other contenders include user-space plugins to the Gnome or KDE
virtual filesystems. These can reasonably be taught "this is an MP3,
this is an XML document", and retrieve the metadata on demand.

It still isn't clear how best to make this visible to non-technical
end users. It's largely those people who *aren't* happy with shell
scripting who would most benefit from easy ways to look for files with
Vaughan Williams recordings. (Those who are will probably have the
sense to put RVW in the pathname somewhere, and can use custom tools.)

On top of this, maybe some files should be word-indexed. It doesn't
make sense to do this for Ogg files, though. Microsoft's Find Fast has
long done word-indexing in userspace with a separate database, and
this does seem much better than putting the support in the kernel.

I don't know much about OS/400: it always sounded as though they
implemented the database first, and then created the entire OS and
related applications around the database. They had the advantage that
everything knew it was going to be working with a database, and
programs on OS/400 probably really want a database backend anyway.
Linux doesn't have that, and is a much more general-purpose OS.

I've also come across http://lwn.net/Articles/56923/ : you might want
to read that, too. Note that WinFS itself appears to be delayed until
the end of the decade.

Sorry for the length of this e-mail: there's more I could say, but
won't.

James.
-- 
E-mail address: james    | Examiner: How does an AC motor start?
@westexe.demon.co.uk     | Student: vrrrrrrrrrrRrRRRRRRR...
                         | Examiner: Stop! Stop!
                         | Student: RRRRRRRmmmmm.