Subject: revlog experiments
From: Chris Mason (mas@suse.com)
Date: Feb 16, 2006 4:20:41 pm
List: com.selenic.mercurial

Hello everyone,

I've been poking from time to time at making the revlogs faster and smaller. The basic theory is that having two files in .hg/data per source file wastes filesystem space due to partial blocks and inodes, and it also introduces extra seeks for reading the inodes and directory entries.

So, I've done two experiments. First, I changed revlog to delta against the parent revision. This makes for better overall compression at the cost of extra seeks while reconstructing the revision. Second, I changed filelog to create one revlog per directory.
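To make the first experiment concrete, here is a toy sketch of the two ways to pick a delta base. It is not the revlog code: `delta_size`, `linear_base`, `parent_base` and the sample history are all made up for illustration, the "delta" is just a count of new lines, and the periodic full snapshots that a real revlog stores are ignored.

```python
# Toy model: each revision is (parent_index, text). Two branches are
# interleaved in storage order, as happens when heads are committed in turn.
REVS = [
    (None, "a\nb\nc"),          # rev 0: root
    (0,    "a\nb\nc\nA1"),      # rev 1: branch A
    (0,    "a\nb\nc\nB1"),      # rev 2: branch B
    (1,    "a\nb\nc\nA1\nA2"),  # rev 3: branch A
    (2,    "a\nb\nc\nB1\nB2"),  # rev 4: branch B
]

def delta_size(base, text):
    """Crude stand-in for a binary delta: lines of `text` absent from `base`."""
    base_lines = set(base.splitlines())
    return sum(1 for line in text.splitlines() if line not in base_lines)

def linear_base(i, parent):
    """Delta against whatever revision was stored immediately before."""
    return i - 1 if i > 0 else None

def parent_base(i, parent):
    """Delta against the revision's parent in the history graph."""
    return parent

def total_delta_size(revs, pick_base):
    """Sum the delta sizes over all revisions for a given base policy."""
    total = 0
    for i, (parent, text) in enumerate(revs):
        base_idx = pick_base(i, parent)
        base = revs[base_idx][1] if base_idx is not None else ""
        total += delta_size(base, text)
    return total

def chain(revs, pick_base, i):
    """Revisions that must be read, in order, to rebuild revision `i`."""
    path = []
    while i is not None:
        path.append(i)
        i = pick_base(i, revs[i][0])
    return list(reversed(path))

if __name__ == "__main__":
    for name, policy in (("linear", linear_base), ("parent", parent_base)):
        print(name, total_delta_size(REVS, policy), chain(REVS, policy, 4))
```

In this toy history the parent policy stores fewer changed lines in total, but rebuilding rev 4 reads revs 0, 2 and 4, which are not adjacent in the file. That is the compression versus seeks tradeoff described above.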

This code is very quick and dirty, so I won't bother attaching it, but the results are worth talking about. Matt, how do you feel about (optionally) doing one revlog per directory? I still owe Matt an experiment with gzip flushing windows.

linear delta (vanilla hg code):
    Total number of files under .hg: 41128
    repo size: 330MB
    manifest size: 90MB
    Time to checkout (cold): 1m51s (31s user, 14s sys)
                     (hot):  40s (28s user, 8s sys)

delta against parent, no packing:
    Total number of files under .hg: 41128
    repo size: 285MB
    manifest size: 28MB
    Time to checkout (cold): 1m59s (32s user, 14s sys)
                     (hot):  41s (29s user, 10s sys)

parent delta, packing:
    Total number of files under .hg: 2426
    repo size: 173MB
    manifest size: 28MB
    Time to checkout (cold): 1m13s (35s user, 9s sys)
                     (hot):  45s (35s user, 9s sys)
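For anyone who wants the relative differences, the numbers above can be compared with a few lines of Python. The dictionary keys and the `pct_change` helper are just names picked for this sketch; the values are the ones reported above, with wall-clock times converted to seconds.

```python
# Reported results; times are wall-clock seconds (1m51s = 111, and so on).
RESULTS = {
    "linear delta": {
        "files": 41128, "repo_mb": 330, "manifest_mb": 90,
        "cold_s": 111, "hot_s": 40,
    },
    "parent delta, no packing": {
        "files": 41128, "repo_mb": 285, "manifest_mb": 28,
        "cold_s": 119, "hot_s": 41,
    },
    "parent delta, packing": {
        "files": 2426, "repo_mb": 173, "manifest_mb": 28,
        "cold_s": 73, "hot_s": 45,
    },
}

def pct_change(old, new):
    """Percentage change from old to new, rounded to one decimal place."""
    return round(100.0 * (new - old) / old, 1)

def compare(baseline, candidate):
    """Percentage change of every metric relative to the baseline run."""
    base = RESULTS[baseline]
    cand = RESULTS[candidate]
    return {key: pct_change(base[key], cand[key]) for key in base}

if __name__ == "__main__":
    for name in ("parent delta, no packing", "parent delta, packing"):
        print(name, compare("linear delta", name))
```

Negative values mean smaller or faster than the vanilla run. With packing the hot-cache checkout comes out slower even though the cold-cache checkout is faster.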

-chris