5 messages in net.sourceforge.lists.courier-maildrop:
Ronan Mullally, Oct 18, 2004 8:35 am
Jay Lee, Oct 18, 2004 8:11 pm
Ron Johnson, Oct 18, 2004 9:14 pm
Ronan Mullally, Oct 19, 2004 1:21 pm
Ron Johnson, Oct 19, 2004 6:52 pm
Subject: Re: [maildropl] Compressing live Maildir(++) directories
From: Ron Johnson (ron.@cox.net)
Date: Oct 19, 2004 6:52:34 pm
List: net.sourceforge.lists.courier-maildrop

On Tue, 2004-10-19 at 21:17 +0100, Ronan Mullally wrote:

I'm looking at a scenario for a large-scale system with hundreds of thousands of users. At that end of the scale, with the availability requirements that are being looked for, disk space is not particularly cheap (the disks may be, but the supporting infrastructure is not).

The bottom line is that none of the APIs (and thus none of the apps) that deal with Maildir have taken zlib into account. Thus, the only way is to use a compressed filesystem.

While the support infrastructure is expensive if you go SCSI, multi-TB NAS systems are pretty cheap.

I'm not looking to squeeze every last byte out of the storage system, but the potential to knock 20 - 30% off storage requirements could result in considerable savings over time.

There's obviously going to be a performance impact involved, but this would be carried by the front-end servers, which *are* trivially cheap nowadays. The potential may even exist to marginally reduce load on the back-end file servers because, depending on message size, less I/O is required to move compressed messages.

I envisage a system which could operate on both compressed and uncompressed message files. Small files, for example, needn't be compressed; large files with compressed attachments shouldn't be. Messages belonging to users who haven't logged in for 30 days could be.

-Ronan
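
The mixed scheme Ronan describes could be sketched roughly as follows. This is only an illustration in Python, not anything Courier or maildrop provides: the function names, the 4096-byte size threshold, and the 10% minimum-saving rule are all assumptions made for the example. Only the 30-day idle rule comes from the message above. It also assumes gzip as the compression format and relies on the fact that a plain mail file starts with header text, so it will not begin with the gzip magic bytes.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream
MIN_SIZE = 4096           # assumed threshold: leave small messages alone
MIN_SAVING = 0.10         # assumed: compress only if at least 10% is saved
IDLE_DAYS = 30            # from the thread: users idle for 30 days


def read_message(path):
    """Return the raw message bytes, whether or not the file is gzip-compressed."""
    with open(path, "rb") as f:
        data = f.read()
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def should_compress(data, days_since_login):
    """Apply the three rules from the thread to one uncompressed message."""
    if days_since_login < IDLE_DAYS:
        return False  # active user: leave the mailbox untouched
    if len(data) < MIN_SIZE:
        return False  # small file: not worth compressing
    # Messages dominated by already-compressed attachments barely shrink,
    # so a trial compression is used here to filter them out.
    compressed = gzip.compress(data)
    return len(compressed) <= len(data) * (1 - MIN_SAVING)
```

A reader built this way can serve both kinds of file from the same directory, so a background job could apply should_compress() to idle mailboxes without any flag day. It does not address how the other Maildir-aware programs would cope with the changed files, which is the objection raised earlier in the thread.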

On Mon, 18 Oct 2004, Ron Johnson wrote:

On Mon, 2004-10-18 at 23:11 -0400, Jay Lee wrote:

Ronan Mullally said:

Has anybody considered implementing compression within the Maildir backend of the Courier IMAP/maildrop?

Storage is cheaper than processing. If you're putting entire Maildir folders into a single compressed file, you've just thrown out most of the advantages of Maildir's format. If you're keeping it to per-file compression, your space saving isn't going to be a whole lot on your typical <4k email (to really save anything by compressing these files you'll need <4k cluster sizes for your filesystem, or suballocation, which is again CPU-intensive). Sure, this is doable, but for live, active data such as mail folders, compression just doesn't make a whole lot of sense these days.
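
Jay's point about cluster sizes can be checked with a little arithmetic. The Python sketch below is a simplified model, not a measurement: it assumes a filesystem that allocates whole clusters per file and does no suballocation or inlining of small files, and the function names and sample sizes are made up for the example.

```python
def allocated_bytes(size, cluster=4096):
    """Disk space consumed by a file of `size` bytes when only whole clusters are allocated."""
    clusters = max(1, -(-size // cluster))  # ceiling division, at least one cluster
    return clusters * cluster


def saving(original, compressed, cluster=4096):
    """Bytes of disk actually freed by replacing the original file with the compressed one."""
    return allocated_bytes(original, cluster) - allocated_bytes(compressed, cluster)


# A 3000-byte message that compresses to 1500 bytes:
print(saving(3000, 1500, cluster=4096))  # 0: both still occupy one 4k cluster
print(saving(3000, 1500, cluster=1024))  # 1024: three 1k clusters become two
```

Under this model, halving a sub-4k message frees nothing on 4k clusters, while a larger message (say 20000 bytes compressed to 8000) still frees whole clusters.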

Jay is right. Disk storage is trivially cheap nowadays.

"You're a good example of why some animals eat their young." Jim Samuels