| From | Sent On | Attachments |
|---|---|---|
| Leonard Mada | Jun 7, 2007 2:14 pm | |
| Patrick Durusau | Jun 7, 2007 3:53 pm | |
| Eike Rathke | Jun 8, 2007 9:39 am | |
| Leonard Mada | Jun 8, 2007 5:10 pm | |
| Patrick Durusau | Jun 9, 2007 4:11 am | |
| Leonard Mada | Jun 9, 2007 12:33 pm | |
| David A. Wheeler | Jun 10, 2007 5:17 pm |
| Subject: | DISTINCT Values | |
|---|---|---|
| From: | Leonard Mada (disc...@gmx.net) | |
| Date: | Jun 7, 2007 2:14:18 pm | |
| List: | org.oasis-open.lists.office-comment | |
I strongly miss a function to return the number of DISTINCT values existent in a given cell range. To my knowledge, this functionality is missing in every spreadsheet application, although most of the research involves such analysis.
I would be further interested to perform some operations using these distinct values.
Functions: DISTINCT( 'cell_range', AS.TEXT = TRUE, IGNORE.CASE = TRUE) returns number of distinct strings in the cell range
DISTINCT( 'cell_range', AS.TEXT = FALSE, TOLERANCE = 0) returns number of different values DISTINCT( 'cell_range', AS.TEXT = FALSE, TOLERANCE = 0.5, ORDER = "ASCENDING") returns number of different values; values within TOLERANCE are considered EQUAL - values are ranked first using the specified ORDER - IF( x[i] is within TOLERANCE of x[i-1]), the 2 values are considered equal DISTINCT( 'cell_range', AS.TEXT = FALSE, TOLERANCE = '5%', ORDER = "ASCENDING") returns number of different values; values within TOLERANCE are considered EQUAL - values are ranked first using the specified ORDER - IF( x[i] is within TOLERANCE of x[i-1]), the 2 values are considered equal - the absolute value for TOLERANCE is computed as x[i-1] * TOLERANCE
DISTINCT( 'cell_range', AS.TEXT = FALSE, 'cell_range2') returns number of different values within 'cell_range' - 'cell_range2' describes the bounds/intervals used for splitting the initial data (i.e. for splitting 'cell_range') - the values within 'cell_range2' are ranked - IF any value from 'cell_range' < MIN('cell_range2') => this is first group - any value from 'cell_range' is within RANKED(cell_range2)[1] RANKED(cell_range2)[2] => next group - ... - any value from 'cell_range' > MAX('cell_range2') => this is the last group
Of course, DISTINCT() is just one aspect of the analysis. Actually, I am more interested in doing specific calculations based on these distinct groups (like SUM(), COUNT(), ...). I will describe such calculations in a later post.
Sincerely,
Leonard Mada





