Subject: [math] Re: bootstrap confidence intervals
From: Piotr Kochański (pi@uw.edu.pl)
Date: Jan 19, 2004 4:03:12 am
List: org.apache.commons.dev
Attachments:
bootstrap.zip - 7k

Phil Steitz wrote:

Excellent points. We would appreciate any other comments (or patches :-) that you have on the code or algorithms in [math].

Calculating confidence intervals with the bootstrap is a bit complicated, so I started with the standard error (SE). In principle, one can build a bootstrap CI from such an SE, but this is often not good enough in practice.
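
For reference, the SE-based interval mentioned above is usually written as follows. This is the textbook form, not something taken from the attached classes; the symbols (B resamples, replicate values of the statistic, their average, and a standard normal quantile z) are notation introduced here for illustration.

```latex
\widehat{\mathrm{se}}_B
  = \sqrt{\frac{1}{B-1}\sum_{b=1}^{B}\left(\hat\theta^{*}_{b} - \bar\theta^{*}\right)^{2}},
\qquad
\bar\theta^{*} = \frac{1}{B}\sum_{b=1}^{B}\hat\theta^{*}_{b},
\qquad
\text{CI} = \hat\theta \pm z_{1-\alpha/2}\,\widehat{\mathrm{se}}_B
```

The interval is symmetric around the estimate and relies on the statistic being roughly normally distributed, which is one reason it can be inadequate.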

To calculate the CI properly, a fast algorithm for the normal distribution function is needed. It is not present in commons-math (as far as I can remember, I sent some code that does this to the dev mailing list).
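
As an illustration of what such a routine might look like, here is a minimal sketch of the standard normal distribution function. It is not the code sent to the list earlier; it uses the well-known polynomial approximation from Abramowitz and Stegun (formula 26.2.17), whose absolute error is below 7.5e-8. The class and method names are hypothetical.

```java
public class NormalCdfSketch {

    // Constants of Abramowitz & Stegun formula 26.2.17.
    private static final double P = 0.2316419;
    private static final double[] B = {
        0.319381530, -0.356563782, 1.781477937, -1.821255978, 1.330274429
    };

    /** Approximates the standard normal distribution function at x. */
    public static double cdf(double x) {
        if (x < 0) {
            // The formula holds for x >= 0; use symmetry for negative arguments.
            return 1.0 - cdf(-x);
        }
        double t = 1.0 / (1.0 + P * x);
        // Horner evaluation of b1*t + b2*t^2 + ... + b5*t^5.
        double poly = 0.0;
        for (int i = B.length - 1; i >= 0; i--) {
            poly = (poly + B[i]) * t;
        }
        double density = Math.exp(-0.5 * x * x) / Math.sqrt(2.0 * Math.PI);
        return 1.0 - density * poly;
    }

    public static void main(String[] args) {
        System.out.println("cdf(0)    = " + cdf(0.0));
        System.out.println("cdf(1.96) = " + cdf(1.96));
    }
}
```

The approximation needs only one exponential and a short polynomial, so it is cheap to call many times; a more accurate method would be needed if more than about seven decimal places matter.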

I attach classes that calculate the SE using the bootstrap. If you are interested in adding this code, please let me know; I will then have to clean up the code, add javadoc and write tests. Any information about package organization, etc. would be helpful as well.

For now I have put everything into the o.a.c.math.stat.bootstrap package.

A short description follows:

The Test class shows how to use the bootstrap to calculate the standard error.

The Bootstrap class does the actual resampling; it also provides a method that returns an array with the values of a given statistic calculated for every sample.

StandardError is an interface common to all ways of calculating the standard error. MeanStandardError and BootstrapStandardError implement this interface to calculate the SE in a particular situation.

The usage is the following:

    StandardError bse = new BootstrapStandardError();
    // B is presumably the number of bootstrap resamples
    ((BootstrapStandardError) bse).setB(200);

    bse.setStat(new Mean());
    System.out.println("se for the mean (bootstrap): " + bse.getStandardError(valSmall));
    bse.setStat(new Median());
    System.out.println("se for the median (bootstrap): " + bse.getStandardError(valSmall));

    StandardError se = new MeanStandardError();
    se.setStat(new Mean());
    System.out.println("se for the mean (standard formula): " + se.getStandardError(valSmall));

    System.out.println("---");
    // this requires patching the Mean class
    Mean m = new Mean();
    m.setStandardError(bse);
    System.out.println("se for the mean (bootstrap thru Mean): " + m.getStandardError(valSmall));
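
For readers without the attachment, the resampling idea described above can be sketched in a self-contained way. This is not the attached code: the class name BootstrapSketch, its methods, and the sample values are hypothetical, and the statistic is passed as a plain function instead of the Mean and Median classes used in the real example.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.ToDoubleFunction;

public class BootstrapSketch {

    /** Draws one resample of the same length as data, with replacement. */
    public static double[] resample(double[] data, Random rng) {
        double[] sample = new double[data.length];
        for (int i = 0; i < sample.length; i++) {
            sample[i] = data[rng.nextInt(data.length)];
        }
        return sample;
    }

    /** Evaluates the statistic on b resamples and returns the b replicate values. */
    public static double[] replicates(double[] data, ToDoubleFunction<double[]> stat,
                                      int b, Random rng) {
        double[] values = new double[b];
        for (int i = 0; i < b; i++) {
            values[i] = stat.applyAsDouble(resample(data, rng));
        }
        return values;
    }

    /** Bootstrap standard error: sample standard deviation of the replicates. */
    public static double standardError(double[] data, ToDoubleFunction<double[]> stat,
                                       int b, Random rng) {
        double[] values = replicates(data, stat, b, rng);
        double center = mean(values);
        double sumSq = 0.0;
        for (double v : values) {
            sumSq += (v - center) * (v - center);
        }
        return Math.sqrt(sumSq / (b - 1));
    }

    public static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        return sum / x.length;
    }

    public static double median(double[] x) {
        double[] sorted = x.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return n % 2 == 1 ? sorted[n / 2] : 0.5 * (sorted[n / 2 - 1] + sorted[n / 2]);
    }

    public static void main(String[] args) {
        // Made-up illustrative values, not data from the attachment.
        double[] valSmall = {2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9};
        Random rng = new Random(42L);
        System.out.println("se for the mean (bootstrap): "
                + standardError(valSmall, BootstrapSketch::mean, 200, rng));
        System.out.println("se for the median (bootstrap): "
                + standardError(valSmall, BootstrapSketch::median, 200, rng));
    }
}
```

Because the statistic is just a function of the resampled array, the same code works for the mean, the median, or any other statistic, which is the main attraction of the approach.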

I would add the confidence interval calculation code in a similar way, if that is OK with you.

Regards

Piotr Kochanski

--- Piotr Kochański <pi@uw.edu.pl> wrote:

Another problem with bootstrap confidence intervals is that they are non-parametric and therefore inevitably provide less power in statistical tests than any parametric method.

Some people may be disappointed that it is harder to obtain significant results, so I think the usual calculation method, based on the normal distribution assumption, should also be provided (although it can be done only for a few statistics).
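
For the mean, the usual normal-theory calculation referred to above might look like the following minimal sketch. The class and method names are hypothetical, and the quantile z is supplied by the caller (1.96 corresponds to roughly 95% coverage under normality).

```java
public class NormalTheoryMeanSketch {

    public static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        return sum / x.length;
    }

    /** Standard error of the mean by the usual formula s / sqrt(n). */
    public static double standardErrorOfMean(double[] x) {
        int n = x.length;
        double center = mean(x);
        double sumSq = 0.0;
        for (double v : x) {
            sumSq += (v - center) * (v - center);
        }
        double s = Math.sqrt(sumSq / (n - 1)); // sample standard deviation
        return s / Math.sqrt(n);
    }

    /** Returns {lower, upper} = mean -/+ z * SE. */
    public static double[] interval(double[] x, double z) {
        double center = mean(x);
        double halfWidth = z * standardErrorOfMean(x);
        return new double[] {center - halfWidth, center + halfWidth};
    }

    public static void main(String[] args) {
        double[] data = {1.0, 2.0, 3.0, 4.0, 5.0}; // made-up illustrative values
        double[] ci = interval(data, 1.96);
        System.out.println("95% CI for the mean: [" + ci[0] + ", " + ci[1] + "]");
    }
}
```

For small samples a Student t quantile would normally replace z; that is left out here to keep the sketch short.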

There is also a "parametric" bootstrap, but it cannot be programmed in a generic way that is applicable in any situation.

Still, the bootstrap is a very safe solution, given its distribution independence and the ease of applying it to any statistic.

Piotr Kochanski (pi@uw.edu.pl)