Subject: Re: [PATCH 02 of 11] scmutil: add filecache, a smart property-like decorator that compares stat info
From: Adrian Buehlmann (adr@cadifra.com)
Date: Jul 19, 2011 4:53:08 am
List: com.selenic.mercurial-devel

On 2011-07-19 12:23, Idan Kamara wrote:

On Tue, Jul 19, 2011 at 1:26 AM, Adrian Buehlmann <adr@cadifra.com> wrote:

On 2011-07-18 23:29, Matt Mackall wrote:

On Mon, 2011-07-18 at 22:32 +0200, Adrian Buehlmann wrote:

On 2011-07-18 22:12, Matt Mackall wrote:

On Sat, 2011-07-16 at 18:03 +0200, Adrian Buehlmann wrote:

On 2011-07-16 16:34, Idan Kamara wrote:

# HG changeset patch
# User Idan Kamara <idan@gmail.com>
# Date 1310227619 -10800
# Node ID b99305dd59279aec962e23da2a362e0d8b785965
# Parent d36f5aec2f9e4214fafe048bccd0bb47ac5f9c16
scmutil: add filecache, a smart property-like decorator that compares stat info

The idea is to be able to associate a file with a property, and to check that file's stat info for modifications whenever we decide it's important for the property to be up-to-date. Once the stat info changes, we recreate the object.

As a consequence, localrepo.invalidate() will become much less expensive in the case where nothing changed on-disk.

diff -r d36f5aec2f9e -r b99305dd5927 mercurial/scmutil.py
--- a/mercurial/scmutil.py Sat Jul 16 15:30:43 2011 +0300
+++ b/mercurial/scmutil.py Sat Jul 09 19:06:59 2011 +0300
@@ -709,3 +709,41 @@
         raise error.RequirementError(_("unknown repository format: "
             "requires features '%s' (upgrade Mercurial)") % "', '".join(missings))
     return requirements
+
+class filecache(object):
+    '''A property like decorator that tracks a file under .hg/ for updates.
+
+    Records stat info when called in _invalidatecache.
+
+    On subsequent calls, compares old stat info with new info, and recreates
+    the object when needed, updating the new stat info in _invalidatecache.'''
+    def __init__(self, path, instore=False):
+        self.path = path
+        self.instore = instore
+
+    def __call__(self, func):
+        self.func = func
+        self.name = func.__name__
+        return self
+
+    def __get__(self, obj, type=None):
+        path = self.instore and obj.sjoin(self.path) or obj.join(self.path)
+
+        if self.name in obj._invalidatecache:
+            cacheentry = obj._invalidatecache[self.name]
+            stat = util.stat(path)
+
+            if stat != cacheentry[1]:
+                cacheentry[1] = stat
+                result = cacheentry[0] = self.func(obj)
+            else:
+                result = cacheentry[0]
+        else:
+            # stat -before- reading so our cache doesn't lie if someone
+            # modifies between the time we read+stat it
+            stat = util.stat(path)
+            result = self.func(obj)
+            obj._invalidatecache[self.name] = [result, stat, path]
+
+        setattr(obj, self.name, result)
+        return result

What happens if the file's contents change without its mtime or size changing?

Excellent question. Answer: we lose.

We need to cache and compare -the whole stat result-. There's absolutely no reason not to here.

How does that solve the problem of missing a change that alters the file contents without changing size or mtime (and thus failing to call func again)?

We've got three buckets we can dump filesystems into:

- have subsecond timestamps (eg NTFS, Btrfs, ext4..): changes are detected by comparing timestamps
- have inodes (ext3, HFS+): changes made by non-append operations are made via atomic rename and result in timestamp changes
- neither (eg VFAT): similar issues (and solutions) to dirstate apply
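To make the "whole stat result" point concrete, here is a minimal sketch in modern Python 3, not the Python 2 code from the patch. The helper names `stat_key`, `rewrite_same_size_and_mtime` and `demo` are made up for illustration and are not part of Mercurial. The key deliberately leaves out atime, which is an assumption on my part: reads alone would otherwise invalidate the cache. The demo rewrites a file through an atomic rename and then restores the old mtime, which is the case a size+mtime comparison would miss.

```python
import os
import tempfile


def stat_key(path):
    """Return the stat fields compared for change detection (hypothetical helper)."""
    st = os.stat(path)
    # Assumption: atime is excluded because plain reads may update it.
    return (st.st_dev, st.st_ino, st.st_size, st.st_mtime_ns, st.st_ctime_ns)


def rewrite_same_size_and_mtime(path, data):
    """Replace path via atomic rename, then restore the previous mtime."""
    old = os.stat(path)
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path))
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    os.replace(tmp, path)  # atomic rename: path now refers to a new inode
    os.utime(path, ns=(old.st_atime_ns, old.st_mtime_ns))


def demo():
    """Return (size and mtime unchanged?, full key changed?)."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "dirstate")
        with open(path, "wb") as f:
            f.write(b"aaaa")
        before = stat_key(path)
        rewrite_same_size_and_mtime(path, b"bbbb")
        after = stat_key(path)
        same_size_mtime = (before[2], before[3]) == (after[2], after[3])
        return same_size_mtime, before != after
```

On a filesystem with inodes, the second element returned by `demo()` should be true even when size and mtime match, because the rename gives the path a new inode. On a filesystem in the "neither" bucket this sketch offers no such guarantee.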

And good luck with Windows shares.

Anyway, I don't think this will work. This code is trying to be too clever.

And discovering all the cases where it fails will be very hard.

The current plan is to trust the cache when we have inode info or subsecond precision.

If we have neither, we will (for now) always reread the file, so filesystems that don't provide that information won't gain anything for now.
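A rough sketch of that plan, again in Python 3 and under stated assumptions: `trustworthy` and `CachedFile` are hypothetical names, not Mercurial code, and the subsecond test is only a heuristic (a file on a subsecond filesystem can have an mtime that happens to fall on a whole second, in which case this sketch falls back on the inode check or rereads).

```python
import os
import tempfile
from types import SimpleNamespace

NS_PER_SEC = 1_000_000_000


def trustworthy(st):
    """Heuristic: does this stat result carry inode or subsecond info?"""
    has_inode = getattr(st, "st_ino", 0) != 0
    has_subsecond = st.st_mtime_ns % NS_PER_SEC != 0
    return has_inode or has_subsecond


class CachedFile:
    """Cache a loaded file, rereading unless the stat info can be trusted."""

    def __init__(self, path, load, trust=trustworthy):
        self.path = path
        self.load = load
        self.trust = trust
        self.entry = None  # (stat key, value) once loaded

    def get(self):
        # stat before reading, as the patch does
        st = os.stat(self.path)
        key = (st.st_dev, st.st_ino, st.st_size, st.st_mtime_ns, st.st_ctime_ns)
        if self.entry is not None and self.trust(st) and self.entry[0] == key:
            return self.entry[1]
        value = self.load(self.path)
        self.entry = (key, value)
        return value


def demo():
    """Return (reads with trusted stat info, reads with untrusted stat info)."""
    reads = []

    def load(path):
        reads.append(path)
        with open(path, "rb") as f:
            return f.read()

    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "f")
        with open(path, "wb") as f:
            f.write(b"data")
        trusted = CachedFile(path, load)
        trusted.get()
        trusted.get()
        n_trusted = len(reads)
        untrusted = CachedFile(path, load, trust=lambda st: False)
        untrusted.get()
        untrusted.get()
        return n_trusted, len(reads) - n_trusted
```

The `trust` parameter exists only so the fallback path can be exercised on a filesystem that does have inodes. A stand-in such as `SimpleNamespace(st_ino=0, st_mtime_ns=5 * NS_PER_SEC)` models the "neither" case for `trustworthy`.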

Ok. That sounds better.

On Windows, we might also use FileIndex

http://msdn.microsoft.com/en-us/library/aa363788(v=vs.85).aspx
http://hg.intevation.org/mercurial/crew/file/647071c6dfcf/mercurial/win32.py#l44

which resembles an inode. If the FileIndex has changed, we can infer that the file has changed.

Would be nice if we could eventually use subseconds inside .hg/dirstate as well (or in whatever file that higher resolution info would be saved).

Later on we can optimize it like Matt explained yesterday on IRC, if we also add the time the file was read to the equation.
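The IRC discussion is not reproduced in this thread, so the following is only one plausible reading of "adding the time the file was read to the equation", sketched in Python 3 with a hypothetical function name: a cached entry is only trusted if the file's mtime falls in an earlier timestamp tick than the moment we read it. Otherwise a later write within the same tick could leave the mtime unchanged.

```python
NS_PER_SEC = 1_000_000_000


def safe_to_trust(mtime_ns, read_time_ns, granularity_ns=NS_PER_SEC):
    """True if the file was last modified in an earlier tick than the read.

    granularity_ns is the assumed timestamp resolution of the filesystem
    (one second by default, as on filesystems without subsecond mtimes).
    """
    return mtime_ns // granularity_ns < read_time_ns // granularity_ns
```

For example, a file modified at 5.2 s and read at 5.9 s would not be trusted with one-second granularity, while the same file read at 6.1 s would be.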

Uh. More complicated tricks...