Here's a tidbit which will explain part of it:
- Suppose that you have 1 million documents indexed by Google Desktop
- Of these, exactly D = 100,000 have the word "DAVE" in them.
- When you search, you ask for "the first 100 documents with DAVE".
- Google Desktop (GDS) doesn't actually know how many documents in
total contain DAVE, so it attempts to estimate it. Determining the
exact value of D would require iterating through all 1 million
documents, which is wasteful, so GDS extrapolates instead.
- As such, it may guess D = 200,000 after the first 100 results, then
123,000 after 200, 80,000 after 300, 103,000 after 500, and so on.
- In general, the higher your offset, the more accurate the estimated
total will be.
- Eventually it will settle on 100,000, which is the correct number.
If you assume that "documents that are EMAILs" are equivalent to
"documents containing the word DAVE", then this may explain part of the
above situation.
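
To make the extrapolation concrete, here is a minimal sketch in Python.
It is not GDS's actual algorithm, which I don't know; it assumes the
simplest possible estimator, namely "hit rate seen so far, scaled up to
the whole index". The names estimate_total and running_estimates are
made up for illustration, and the index is shrunk to 10,000 documents
with 1,000 matches so it runs quickly.

```python
import random


def estimate_total(matches_found, docs_scanned, total_docs):
    """Extrapolate the total number of matches from the hit rate so far.

    Assumed estimator (not GDS's real one): matches / scanned * total.
    """
    if docs_scanned == 0:
        return 0
    return round(matches_found / docs_scanned * total_docs)


def running_estimates(flags, page_size):
    """Scan the index in order and record an estimate per page of results.

    flags[i] is True when document i contains the search term.
    Returns a list of (matches_found, estimated_total) pairs, one for
    each time another page_size matches has been collected.
    """
    total_docs = len(flags)
    found = 0
    estimates = []
    for scanned, is_match in enumerate(flags, start=1):
        if is_match:
            found += 1
            if found % page_size == 0:
                estimates.append(
                    (found, estimate_total(found, scanned, total_docs))
                )
    return estimates


# Hypothetical scaled-down index: 10,000 documents, 1,000 contain "DAVE".
rng = random.Random(0)  # fixed seed so the run is repeatable
flags = [True] * 1000 + [False] * 9000
rng.shuffle(flags)

estimates = running_estimates(flags, page_size=100)

if __name__ == "__main__":
    for found, guess in estimates:
        print(f"after {found:4d} results: estimated total = {guess}")
```

Each estimate only uses the part of the index scanned so far, which is
why early guesses can be well off and later ones drift toward the true
count. The estimate is exact only once the whole index has been scanned.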