From | Sent On | Attachments |
---|---|---|
cesa...@gmail.com | May 16, 2008 4:58 pm | |
Iain Sproat | May 17, 2008 4:14 am | |
Andrew Kirillov | May 17, 2008 8:54 am | |
Iain Sproat | May 17, 2008 9:10 am | |
cesa...@gmail.com | May 17, 2008 11:21 am | |

Subject: | Re: AForge.Math | |
---|---|---|
From: | Iain Sproat (iain...@gmail.com) | |
Date: | May 17, 2008 9:10:03 am | |
List: | com.googlegroups.aforge | |

Ah! That makes perfect sense now :) Is there a class which works with the vector/raw data representations?
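The thread does not name such a class. One possible bridge, sketched here purely as an assumption and not as part of AForge.Math, is to count raw non-negative integer samples into a histogram first, so that histogram-based statistics can then be applied. `HistogramBuilderSketch`, `FromSamples`, and `maxValue` are hypothetical names used only for illustration.

```csharp
using System;

public static class HistogramBuilderSketch
{
    // Hypothetical helper (not an AForge.Math API): counts how often each
    // value in the range [0, maxValue] occurs in the raw samples.
    public static int[] FromSamples(int[] samples, int maxValue)
    {
        int[] histogram = new int[maxValue + 1];

        foreach (int sample in samples)
        {
            if (sample < 0 || sample > maxValue)
                throw new ArgumentOutOfRangeException(
                    "samples", "Each sample must lie in [0, maxValue].");

            // index = observed value, entry = number of hits
            histogram[sample]++;
        }

        return histogram;
    }
}
```

For example, the raw samples { 0, 0, 1, 3, 3, 3 } with `maxValue` 4 would give the histogram { 2, 1, 0, 3, 0 }.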

I will have a closer look at the documentation, and see if I can make it more obvious.

Thanks for your help.

Regards

On Sat, May 17, 2008 at 4:54 PM, Andrew Kirillov <andr...@gmail.com> wrote:

Hello,

I thought it was clear from the documentation ... it looks like it is not, so I will clarify again. The Statistics class works with a histogram, not with a vector. This is the key to understanding it. If you provide an integer array like { 10, 5, 8, 3, 4 }, it means that value #0 occurred 10 times, value #1 occurred 5 times, etc. So these values represent frequencies of occurrence. Since we know the total number of tests (occurrences of any value), we can calculate probabilities. So, mean and median are calculated for the values from 0 to 4 using their probabilities.
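A minimal sketch of the histogram interpretation described above, written for illustration and not copied from the AForge source. The class name `HistogramStatsSketch` is hypothetical; the median loop follows the logic of the code quoted further down in this thread.

```csharp
using System;

public static class HistogramStatsSketch
{
    // Mean of the values 0..n-1, each weighted by its number of hits.
    public static double Mean(int[] histogram)
    {
        long weightedSum = 0, total = 0;

        for (int i = 0; i < histogram.Length; i++)
        {
            weightedSum += (long)i * histogram[i];
            total += histogram[i];
        }

        // An empty or all-zero histogram has no meaningful mean; return 0 here.
        return total == 0 ? 0.0 : (double)weightedSum / total;
    }

    // Median: the first value whose cumulative hit count reaches half the total.
    public static int Median(int[] histogram)
    {
        long total = 0;
        foreach (int hits in histogram)
            total += hits;

        long halfTotal = total / 2;
        long running = 0;
        int median = 0;

        for (; median < histogram.Length; median++)
        {
            running += histogram[median];
            if (running >= halfTotal)
                break;
        }

        return median;
    }
}
```

For the histogram { 10, 5, 8, 3, 4 } there are 30 hits in total, the weighted sum is 0·10 + 1·5 + 2·8 + 3·3 + 4·4 = 46, so the mean is 46/30 ≈ 1.53, and the cumulative count reaches 15 at index 1, so the median is 1.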

The same is stated in the documentation ...

With best regards, Andrew

On May 17, 2:14 pm, "Iain Sproat" <iain...@gmail.com> wrote:

Cesar,

I have been creating some unit tests, and have recently started on the Math library. I also ran into the same issue/misunderstanding with the Statistics class.

It is certainly hard to follow exactly what these methods are expected to throw under certain inputs, and as I do not routinely work with statistics I think my understanding might be incorrect.

E.g., in the Mean method, if you pass in 5 integers all with the same value {100, 100, 100, 100, 100}, the method returns 2. With regard to probabilities, I am uncertain what exactly the 2 represents: a 200% probability of the mean occurring?
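Worked out under the histogram interpretation stated in the method's remarks (array indexes are the values, array entries are hit counts), the 2 is the mean index rather than a probability. Here $h_i$ is simply notation for the entry at index $i$:

```latex
\[
\text{mean}
  = \frac{\sum_{i=0}^{4} i \cdot h_i}{\sum_{i=0}^{4} h_i}
  = \frac{100\,(0 + 1 + 2 + 3 + 4)}{5 \cdot 100}
  = \frac{1000}{500}
  = 2
\]
```

Each of the values 0 to 4 was hit equally often, so the mean lands on the middle value, 2.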

I will write a test for the median and post it here in due course.

Regards

Iain

On Sat, May 17, 2008 at 12:58 AM, cesa...@gmail.com < cesa...@gmail.com> wrote:

Hi Andrew,

As I told you some months ago, I'm extending your Math package to support Principal Component Analysis and a few other statistical functions and mathematical applications.

However, in the Statistics class, I didn't quite understand what the static method Median(double[] values) does. I thought it computed the statistical median from a vector of doubles, but looking at the code,

[code]
int total = 0, n = values.Length;

// for all values
for (int i = 0; i < n; i++)
{
    // accumulate total
    total += values[i];
}

int halfTotal = total / 2;
int median = 0, v = 0;

// find median value
for (; median < n; median++)
{
    v += values[median];
    if (v >= halfTotal)
        break;
}

return median;
[/code]

It looks very different from what I was expecting, i.e., something like:

[code]
int n = values.Length;

Array.Sort(values);

if ((n % 2) == 1)
{
    return values[n / 2]; // if N is odd: the middle element
}
else
{
    return (values[(n / 2) - 1] + values[n / 2]) / 2; // if N is even: average of the two middle elements
}
[/code]

Is that because you are dealing with "probabilities" inside the array values, as you said in the "remarks" field of your XML comment?

<remarks>The input array is treated as histogram, i.e. its indexes are treated as values of stochastic function, but array values are treated as "probabilities" (total amount of hits).</remarks>

Thanks, Cesar