Subject:Re: Proposal: Allow gunfold for Data.Map, Data.IntMap, etc.
From:Edward Kmett (ekm@gmail.com)
Date:Aug 29, 2012 4:54:34 pm
List:org.haskell.libraries

On Wed, Aug 29, 2012 at 3:24 PM, Milan Straka <fo@ucw.cz> wrote:

Hi Edward,

I would like to propose improving the Data instances for a number of currently completely opaque data types in the containers package, by using virtual constructors.

The instance for Data.Map already uses fromList for gfoldl, it just stops there.

Extending it to be able to gunfold and mention the name of that constructor would enable generic traversal libraries like uniplate, etc. to work over the contents of the Map, rather than bailing out in fear or crashing at the sight of a mkNoRepType.

An example of the changes for Data.Map is shown below.

    instance (Data k, Data a, Ord k) => Data (Map k a) where
      gfoldl f z m  = z fromList `f` toList m
      toConstr _    = fromListConstr
      gunfold k z c = case constrIndex c of
        1 -> k (z fromList)
        _ -> error "gunfold"
      dataTypeOf _  = mapDataType
      dataCast2 f   = gcast2 f

    fromListConstr :: Constr
    fromListConstr = mkConstr mapDataType "fromList" [] Prefix

    mapDataType :: DataType
    mapDataType = mkDataType "Data.Map.Map" [fromListConstr]

I've used this approach for years on my own libraries to great effect.
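
To make the effect of the virtual constructor concrete, here is a minimal sketch of a generic traversal reaching inside a map. Everything named Wrapped, wrappedConstr, wrappedDataType, everywhere, mkT and rebuild is hypothetical and defined locally: the newtype only exists so the sketch does not depend on which Data instance the installed containers ships, and the two traversal helpers are hand-rolled stand-ins for what a library like syb or uniplate would provide.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Data
import qualified Data.Map as Map

-- Hypothetical wrapper around Map, carrying the proposed style of instance.
newtype Wrapped k a = Wrapped (Map.Map k a)
  deriving (Eq, Show)

instance (Data k, Data a, Ord k) => Data (Wrapped k a) where
  -- Present the map as the virtual constructor "fromList" applied to a list.
  gfoldl f z (Wrapped m) = z (Wrapped . Map.fromList) `f` Map.toList m
  toConstr _    = wrappedConstr
  gunfold k z c = case constrIndex c of
    1 -> k (z (Wrapped . Map.fromList))
    _ -> error "gunfold"
  dataTypeOf _  = wrappedDataType

wrappedConstr :: Constr
wrappedConstr = mkConstr wrappedDataType "fromList" [] Prefix

wrappedDataType :: DataType
wrappedDataType = mkDataType "Wrapped" [wrappedConstr]

-- Bottom-up generic rewrite, written by hand on top of gmapT.
everywhere :: Data a => (forall b. Data b => b -> b) -> a -> a
everywhere f = f . gmapT (everywhere f)

-- Lift a function on one type to a generic function that is id elsewhere.
mkT :: (Typeable a, Typeable b) => (b -> b) -> a -> a
mkT f = case cast f of
  Just g  -> g
  Nothing -> id

-- Rebuild a value from the virtual constructor via gunfold, feeding it
-- the given association list as the single constructor argument.
rebuild :: (Data k, Data a, Ord k) => [(k, a)] -> Maybe (Wrapped k a)
rebuild kvs = gunfold (\mf -> mf <*> cast kvs) Just wrappedConstr
```

With these definitions, everywhere (mkT ((+ 1) :: Int -> Int)) applied to a Wrapped String Int increments every value stored in the map, because gfoldl exposes the association list to the traversal, and rebuild shows gunfold constructing a map from the "fromList" constructor alone.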

+1 here.

I am not very familiar with the Data instances -- is it true that the parameter of the `fromList` in the Data instance will often be sorted (i.e., result of `toList` or `filter . toList`)? If so, we could use fromMaybeAscList which would look like

    fromMaybeAscList list
      | isDistinctAsc list = fromDistinctAscList list
      | otherwise          = fromList list

There is a big gain in using a linear-time fromDistinctAscList over O(N log N) fromList, but there is a linear-time check and the list must be kept around until isDistinctAsc finishes.
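
For reference, a self-contained version of that idea might look like the following. The message does not define isDistinctAsc, so the definition here is an assumption: it simply checks that the keys are strictly increasing.

```haskell
import qualified Data.Map as Map

-- Assumed helper (not defined in the thread): True when the keys are
-- strictly increasing, i.e. sorted with no duplicates.
isDistinctAsc :: Ord k => [(k, a)] -> Bool
isDistinctAsc kvs = and (zipWith (<) ks (drop 1 ks))
  where ks = map fst kvs

-- Use the linear-time builder when the precondition holds,
-- otherwise fall back to the general O(N log N) fromList.
fromMaybeAscList :: Ord k => [(k, a)] -> Map.Map k a
fromMaybeAscList kvs
  | isDistinctAsc kvs = Map.fromDistinctAscList kvs
  | otherwise         = Map.fromList kvs
```

The result should always agree with Map.fromList; only the construction cost differs. The whole list stays live while the check runs, which is the retention cost mentioned above.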

The users of Data.Data could in theory do anything they want to the keys, but I do confess for most scenarios they'll come back to you ordered.

Hrmm. A more nuanced fromList construction could definitely help, though I suppose that could apply in the general case as well.

We should be able to fuse this "try to construct linearly, but fall back on N-log-N" version of fromList in one pass even for normal uses of fromList.

E.g., assume that you are constructing a sorted tree until you find a key out of order, then take the tree you've built so far and union it appropriately with the more slowly constructed fromList of the remainder. That way you don't have to retain the storage for both the list and the map, and we only force the list once.
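
A rough sketch of that strategy, written against the public Data.Map API rather than fused into the internals as suggested, so it only approximates the idea. The names ascPrefix and fromListAscFirst are hypothetical. The remainder goes on the left of the left-biased union so that later duplicates win, matching fromList.

```haskell
import qualified Data.Map as Map

-- Split off the longest strictly ascending prefix of the input,
-- returning it together with the untouched remainder.
ascPrefix :: Ord k => [(k, a)] -> ([(k, a)], [(k, a)])
ascPrefix (x : rest@(y : _))
  | fst x < fst y = let (p, r) = ascPrefix rest in (x : p, r)
ascPrefix (x : rest) = ([x], rest)
ascPrefix []         = ([], [])

-- Build the ascending prefix in linear time, build the remainder with
-- the general fromList, and combine. Map.union prefers its left
-- argument, so entries from the remainder override the prefix.
fromListAscFirst :: Ord k => [(k, a)] -> Map.Map k a
fromListAscFirst kvs =
  let (prefix, rest) = ascPrefix kvs
  in Map.union (Map.fromList rest) (Map.fromDistinctAscList prefix)
```

For fully sorted input the remainder is empty and only the linear path runs; for input that goes out of order early, this degrades toward plain fromList.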

-Edward