On Mon, Nov 07, 2005 at 10:50:00AM -0600, Les Mikesell wrote:
> On Mon, 2005-11-07 at 09:05, Derek Martin wrote:
> > I'm not sure how dovecot works internally, but I know UW-IMAP makes
> > temporary copies of the whole mailbox when the user is deleting
> > stuff...
[snip]
> With mbox format there isn't much choice but to copy the whole
> file to make any change.

Now why would you say a thing like that? ;-)

On Mon, Nov 07, 2005 at 12:17:25PM -0500, Tony Nelson wrote:
> Mbox is good for using less disk space and often faster searching.

Indeed.  Though some users argue that it is more convenient to search
maildir, because you can use tools like grep and such.

> However, an mbox is just a large file, and deleting anything from the
> middle (though not the end) /requires/ making a copy of the file without
> the deleted messages.

OK, people keep chiming in with this misconception, so I gotta speak
up.  I'm sorry to say it, but this just isn't true...  It usually is
implemented this way for the sake of simplicity and a little added
security (in the sense of data assurance), but it is *not* required by
any means.  Mbox mailboxes can be rewritten in place, which eliminates
the need to make copies of the entire mailbox.  Though complex, message
deletes can be done in place, which potentially saves a great deal of
time.

I can think of 2 ways to implement expunging deleted messages without
making a copy of the mailbox, and without changing the order of the
messages.  One method uses memory-mapped I/O (MMIO) and the other uses
stream I/O, but the basic algorithm for both is the same.  I have
implemented the MMIO version in the past...  I may even still have the
code around, if you're curious.

The basic gist is to overwrite the deleted message with data from the
next undeleted message, moving all the subsequent data down as you go.
The advantage to doing it this way is that it is faster than making a
copy of the entire mailbox, in the common case.

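
To make the idea concrete, here is a rough sketch of the stream I/O
variant in Python.  It is not the code mentioned above, just an
illustration under some loud assumptions: every message starts with a
line beginning "From " (and such lines inside bodies are already
quoted), nobody else touches the file while it runs (no locking is
done), and the caller already knows which messages to drop.  The names
expunge_in_place, message_spans, deleted and chunk_size are made up for
this example.

```python
def message_spans(f):
    """Return (start, end) byte offsets for each message in an open mbox.

    Assumption: a message begins at every line that starts with b"From ".
    """
    f.seek(0)
    starts = []
    pos = 0
    for line in f:
        if line.startswith(b"From "):
            starts.append(pos)
        pos += len(line)
    # Each message ends where the next one starts; the last ends at EOF.
    return list(zip(starts, starts[1:] + [pos]))


def expunge_in_place(path, deleted, chunk_size=64 * 1024):
    """Remove the messages whose zero-based indexes are in `deleted`.

    Kept messages are slid down over the holes with seek/read/write,
    then the file is truncated.  No temporary copy is made.
    Returns the number of messages removed.
    """
    with open(path, "r+b") as f:
        spans = message_spans(f)
        if not spans:
            return 0
        write_pos = spans[0][0]
        removed = 0
        for index, (start, end) in enumerate(spans):
            if index in deleted:
                removed += 1          # leave a hole; later data fills it
                continue
            if start == write_pos:
                write_pos = end       # nothing deleted before this one yet
                continue
            # Move this message down. write_pos < start, so copying
            # front to back in chunks never clobbers unread data.
            read_pos = start
            while read_pos < end:
                f.seek(read_pos)
                chunk = f.read(min(chunk_size, end - read_pos))
                f.seek(write_pos)
                f.write(chunk)
                read_pos += len(chunk)
                write_pos += len(chunk)
        f.truncate(write_pos)
    return removed
```

Note that messages stored before the first deleted one are never read
or rewritten during the move phase; only the initial scan for "From "
lines touches them.  A crash in the middle of the loop leaves the
mailbox in a mixed state, which is exactly the risk described below.
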

Most people have a number of messages that they save in their
mailbox(es), some of which have large attachments, and most of the
deleting happens at the end of the mailbox.  Deleting messages in place
means you don't need to re-copy all that data that's going to hang
around.  You only need to move some data at the end of the file to
earlier parts of the file, and then truncate the file.  This can be
done with seek and write operations, or it can be done with MMIO, by
simply copying the memory.  MMIO should be faster, unless the host OS's
implementation of MMIO is broken.

The downside is that if the system crashes while you're doing this,
your mailbox is toast.  But that's par for the course with mbox anyway.
And of course it's complex to code, so the programmer has to be
careful, or your mailboxes are toast. :-D  Still, it's too bad it's not
implemented this way more often, cuz it is better, and makes the
"maildir deletes are faster than mbox" argument a lot less
compelling...

I can think of a third way to do it, if you don't care about
maintaining the message order of the mailbox.  You can pull messages
off the end of the mailbox, writing them to a temporary mailbox, and
deleting them from the original as you go.  This isn't as good as
rewriting in place because you're still rewriting the whole mailbox,
but at least you never use more space than the amount of mail you want
to keep.

I like mbox.  For most things that I do, it's just plain faster than
maildir, and for the rest it mostly doesn't matter.  The main reason I
use maildir is usually because locking is broken in the given
environment, or some other stupid thing like that.  Certain back-up
software also breaks Mutt's new mail detection on mbox files, so I
occasionally use maildir to work around that problem.

-- 
Derek D. Martin
http://www.pizzashack.org/
GPG Key ID: 0x81CFE75D