From | Sent On | Attachments |
---|---|---|

pste...@apache.org | Jan 13, 2004 11:36 pm | |

Mark R. Diggory | Jan 14, 2004 6:26 am | |

Phil Steitz | Jan 14, 2004 1:54 pm | |

Mark R. Diggory | Jan 14, 2004 3:18 pm | |

pste...@apache.org | Jan 14, 2004 9:18 pm | |

Piotr Kochański | Jan 16, 2004 1:00 am | |

Phil Steitz | Jan 16, 2004 2:36 pm | |

Piotr Kochański | Jan 19, 2004 4:03 am | .zip |

Phil Steitz | Jan 19, 2004 10:47 pm | |

Subject: | [math] Re: bootstrap confidence intervals | |
---|---|---|

From: | Piotr Kochański (pi...@uw.edu.pl) | |

Date: | Jan 19, 2004 4:03:34 am | |

List: | org.apache.commons.dev | |

Attachments: | bootstrap.zip - 7k |

Phil Steitz wrote:

Excellent points. We would appreciate any other comments (or patches :-) that you have on the code or algorithms in [math].

Calculating confidence intervals with the bootstrap is a bit complicated, so I started with the standard error (SE). One can build a bootstrap CI from such an SE, but in practice this is often not good enough.
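As a rough illustration of the SE-based interval mentioned above, here is a minimal sketch. It is not the attached code: the names `SeIntervalSketch`, `standardError` and `normalInterval` are hypothetical, the replicate values are made up, and the interval assumes the statistic is approximately normally distributed.

```java
public class SeIntervalSketch {

    /** Sample standard deviation (n - 1 denominator) of the bootstrap replicates. */
    public static double standardError(double[] replicates) {
        double mean = 0.0;
        for (double v : replicates) {
            mean += v;
        }
        mean /= replicates.length;
        double sumSq = 0.0;
        for (double v : replicates) {
            sumSq += (v - mean) * (v - mean);
        }
        return Math.sqrt(sumSq / (replicates.length - 1));
    }

    /** Returns {estimate - z * se, estimate + z * se}; z = 1.96 gives a nominal 95% level. */
    public static double[] normalInterval(double estimate, double se, double z) {
        return new double[] {estimate - z * se, estimate + z * se};
    }

    public static void main(String[] args) {
        // Made-up replicate values and point estimate, for illustration only.
        double[] replicates = {3.1, 3.4, 2.9, 3.6, 3.2, 3.0, 3.5, 3.3};
        double se = standardError(replicates);
        double[] ci = normalInterval(3.25, se, 1.96);
        System.out.println("se = " + se + ", ci = [" + ci[0] + ", " + ci[1] + "]");
    }
}
```

The interval is symmetric around the estimate by construction, which is one reason it can be a poor fit when the statistic has a skewed sampling distribution.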

To calculate CIs properly, a fast algorithm for evaluating the normal distribution function is needed, and commons-math does not have one (as far as I remember, I sent some code doing this to the dev mailing list).
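For reference, one common way to evaluate the standard normal distribution function quickly is the polynomial approximation from Abramowitz and Stegun (formula 26.2.17, absolute error below 7.5e-8). The sketch below is an assumption about what such a routine could look like, not the code sent to the list; the class name `NormalCdfSketch` is hypothetical.

```java
public class NormalCdfSketch {

    // Constants of Abramowitz and Stegun, formula 26.2.17.
    private static final double P = 0.2316419;
    private static final double[] B = {
        0.319381530, -0.356563782, 1.781477937, -1.821255978, 1.330274429
    };

    /** Approximates the standard normal cumulative distribution function. */
    public static double cdf(double x) {
        if (x < 0.0) {
            return 1.0 - cdf(-x); // symmetry of the normal distribution
        }
        double t = 1.0 / (1.0 + P * x);
        // Horner evaluation of b1*t + b2*t^2 + ... + b5*t^5.
        double poly = 0.0;
        for (int i = B.length - 1; i >= 0; i--) {
            poly = (poly + B[i]) * t;
        }
        double density = Math.exp(-0.5 * x * x) / Math.sqrt(2.0 * Math.PI);
        return 1.0 - density * poly;
    }

    public static void main(String[] args) {
        // Should print a value close to 0.975.
        System.out.println(cdf(1.96));
    }
}
```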

I attach classes that calculate the SE using the bootstrap. If you are interested in adding this code, please let me know; I will then have to clean up the code and Javadoc and write tests. Any information about package organization, etc. would be helpful as well.

For now I have put everything into the o.a.c.math.stat.bootstrap package.

A short description follows:

The Test class shows how to use the bootstrap to calculate the standard error.

The Bootstrap class does the actual resampling. It also provides a method that returns an array holding the value of a given statistic calculated for every sample.
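The resampling step described above can be sketched as follows. This is a minimal illustration, not the Bootstrap class from the attachment: the names `BootstrapSketch`, `resample` and `replicates` are hypothetical, and a plain function stands in for the statistic object.

```java
import java.util.Random;
import java.util.function.ToDoubleFunction;

public class BootstrapSketch {

    /** Draws data.length values from data, uniformly and with replacement. */
    public static double[] resample(double[] data, Random rng) {
        double[] sample = new double[data.length];
        for (int i = 0; i < sample.length; i++) {
            sample[i] = data[rng.nextInt(data.length)];
        }
        return sample;
    }

    /** Evaluates stat on b independent resamples and returns the b values. */
    public static double[] replicates(double[] data, int b,
                                      ToDoubleFunction<double[]> stat, Random rng) {
        double[] values = new double[b];
        for (int i = 0; i < b; i++) {
            values[i] = stat.applyAsDouble(resample(data, rng));
        }
        return values;
    }

    /** Arithmetic mean, used here as an example statistic. */
    public static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        return sum / x.length;
    }

    public static void main(String[] args) {
        // Made-up data, for illustration only.
        double[] data = {2.1, 3.4, 1.9, 5.0, 4.2, 3.3};
        double[] means = replicates(data, 200, BootstrapSketch::mean, new Random(42L));
        System.out.println("replicates computed: " + means.length);
    }
}
```

The standard deviation of the returned array is the usual bootstrap estimate of the standard error of the statistic.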

StandardError is an interface common to all ways of calculating the standard error. MeanStandardError and BootstrapStandardError implement this interface to calculate the SE in a particular situation.

The usage is the following:

    StandardError bse = new BootstrapStandardError();
    ((BootstrapStandardError) bse).setB(200);

    bse.setStat(new Mean());
    System.out.println("se for the mean (bootstrap): " + bse.getStandardError(valSmall));
    bse.setStat(new Median());
    System.out.println("se for the median (bootstrap): " + bse.getStandardError(valSmall));

    StandardError se = new MeanStandardError();
    se.setStat(new Mean());
    System.out.println("se for the mean (standard formula): " + se.getStandardError(valSmall));

    System.out.println("---");
    // this requires patching the Mean class
    Mean m = new Mean();
    m.setStandardError(bse);
    System.out.println("se for the mean (bootstrap thru Mean): " + m.getStandardError(valSmall));

I would add the confidence interval calculation code in a similar way, if this is OK with you.

Regards

Piotr Kochanski

--- Piotr Kochański <pi...@uw.edu.pl> wrote:

Another problem with bootstrap confidence intervals is that they are non-parametric, so in statistical tests they inevitably provide less power than a parametric method whose assumptions hold.

Some people may be disappointed that significant results are harder to obtain, so I think the usual calculation method, based on the normal distribution assumption, should also be provided (though it can be done for only a few statistics).

There is also a "parametric" bootstrap, but it cannot be programmed in a generic way that applies to any situation.
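To illustrate why the parametric variant is tied to a particular model, here is a minimal sketch that assumes the data are normally distributed: it fits a mean and standard deviation, then simulates new samples from the fitted distribution instead of resampling the data. The names `ParametricBootstrapSketch` and `normalReplicates` are hypothetical, and a different model would need different fitting and simulation code.

```java
import java.util.Random;
import java.util.function.ToDoubleFunction;

public class ParametricBootstrapSketch {

    /**
     * Fits a normal model to data, then evaluates stat on b samples
     * of size data.length simulated from the fitted model.
     */
    public static double[] normalReplicates(double[] data, int b,
                                            ToDoubleFunction<double[]> stat, Random rng) {
        int n = data.length;
        double mean = 0.0;
        for (double v : data) {
            mean += v;
        }
        mean /= n;
        double sumSq = 0.0;
        for (double v : data) {
            sumSq += (v - mean) * (v - mean);
        }
        double sd = Math.sqrt(sumSq / (n - 1));

        double[] values = new double[b];
        for (int i = 0; i < b; i++) {
            double[] sample = new double[n];
            for (int j = 0; j < n; j++) {
                sample[j] = mean + sd * rng.nextGaussian();
            }
            values[i] = stat.applyAsDouble(sample);
        }
        return values;
    }

    public static void main(String[] args) {
        // Made-up data, for illustration only.
        double[] data = {2.1, 3.4, 1.9, 5.0, 4.2, 3.3};
        ToDoubleFunction<double[]> mean = x -> {
            double sum = 0.0;
            for (double v : x) {
                sum += v;
            }
            return sum / x.length;
        };
        double[] reps = normalReplicates(data, 200, mean, new Random(7L));
        System.out.println("replicates computed: " + reps.length);
    }
}
```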

Still, the bootstrap is a very safe solution, given its distribution independence and the ease of applying it to any statistic.

Piotr Kochanski (pi...@uw.edu.pl)
