Bootstrapping is a statistical resampling method for estimating the accuracy of sample estimates.
Such resampling methods repeatedly draw (re)samples from a given sample, and evaluate the same statistics for each of them.
Thereby, the distributions of these statistics can be estimated, including their mean and variance.
Bootstrapping is a resampling method with replacement. Thus, a resample may contain the same data point of the original sample multiple times.
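The general idea can be illustrated with a minimal sketch in Python. It is not taken from HouseKeepR; the function name bootstrap_means, the example values, and the choice of the mean as the statistic are assumptions made purely for illustration.

```python
import random
import statistics


def bootstrap_means(sample, n_replications=1000, seed=0):
    """Return the mean of each bootstrap resample drawn from `sample`."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_replications):
        # Draw as many points as the original sample has, WITH replacement,
        # so the same data point can appear several times in one resample.
        resample = rng.choices(sample, k=len(sample))
        means.append(statistics.mean(resample))
    return means


# Made-up example values (not real expression data).
sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8]
means = bootstrap_means(sample)

# The spread of the resampled means estimates how accurate the sample mean is.
boot_mean = statistics.mean(means)
boot_se = statistics.stdev(means)
print(f"bootstrap mean: {boot_mean:.3f}, bootstrap standard error: {boot_se:.3f}")
```

The standard deviation of the resampled means serves as an estimate of the standard error of the original sample mean.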
Bootstrapping in HouseKeepR
HouseKeepR uses bootstrapping to identify the house-keeping genes with the highest and at the same time most stable expression in condition versus control samples.
Here, each bootstrap resample is a set of your selected data sets that may contain the same data set multiple times.
For each such bootstrap resample (set of data sets), genes are ranked by high average expression and low expression variance.
Finally, an overall ranking of genes is produced by sorting genes by smallest average rank across resamples. Ties are broken by the smallest rank variance across resamples.
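The ranking procedure described above could be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not HouseKeepR's actual implementation: each data set is reduced to a single expression value per gene, and the two criteria (high average expression, low variance) are combined by summing their ranks, which is an assumption because the text does not say how they are combined. All names and values are hypothetical.

```python
import random
import statistics


def rank_positions(values, reverse=False):
    """Map each key to its 1-based position after sorting by value.

    Ties keep the insertion order of `values` (sorted() is stable).
    """
    ordered = sorted(values, key=values.get, reverse=reverse)
    return {key: pos for pos, key in enumerate(ordered, start=1)}


def rank_genes_in_resample(expr, dataset_ids):
    """Rank genes within one bootstrap resample of data sets."""
    means = {g: statistics.mean([expr[g][d] for d in dataset_ids]) for g in expr}
    variances = {g: statistics.pvariance([expr[g][d] for d in dataset_ids]) for g in expr}
    by_mean = rank_positions(means, reverse=True)  # high expression first
    by_var = rank_positions(variances)             # low variance first
    # ASSUMPTION: the two criteria are combined by summing their ranks.
    combined = {g: by_mean[g] + by_var[g] for g in expr}
    return rank_positions(combined)


def bootstrap_gene_ranking(expr, n_datasets, sample_size=None,
                           n_replications=100, seed=0):
    """Sort genes by average rank across resamples, then by rank variance."""
    rng = random.Random(seed)
    sample_size = sample_size or n_datasets  # default: number of data sets
    ranks = {g: [] for g in expr}
    for _ in range(n_replications):
        # A resample is a set of data sets drawn with replacement.
        resample = rng.choices(range(n_datasets), k=sample_size)
        for gene, rank in rank_genes_in_resample(expr, resample).items():
            ranks[gene].append(rank)
    return sorted(
        expr,
        key=lambda g: (statistics.mean(ranks[g]), statistics.pvariance(ranks[g])),
    )


# Made-up expression values: one value per gene and data set.
expr = {
    "GENE_A": [10.0, 10.1, 9.9, 10.0],  # high and stable
    "GENE_B": [10.0, 2.0, 15.0, 6.0],   # high on average but unstable
    "GENE_C": [1.0, 1.5, 0.5, 1.2],     # low expression
}
ranking = bootstrap_gene_ranking(expr, n_datasets=4)
print(ranking)
```

In this toy example, the gene with high and stable expression ends up at the top of the final ranking.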
Bootstrap sample size
The number of data sets contained in each bootstrap resample can be specified.
We strongly recommend leaving this value at the default (the number of selected data sets).
Bootstrap replications
The more bootstrap replications are performed, i.e. the more bootstrap resamples are drawn, the more accurately distributions of sample statistics can be estimated.
Choosing lower values will greatly reduce running time, but will also lower the statistical accuracy of the results.
We strongly recommend not setting this value lower than the default (100).
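The effect of the number of replications can be illustrated with a small Python experiment. It is a sketch on simulated data, unrelated to HouseKeepR's code: it repeats a bootstrap estimate of a standard error 30 times for each number of replications and reports how much the estimates vary between runs. The function name bootstrap_se and all parameter values are assumptions chosen for illustration.

```python
import random
import statistics


def bootstrap_se(sample, n_replications, rng):
    """Bootstrap estimate of the standard error of the sample mean."""
    means = [
        statistics.fmean(rng.choices(sample, k=len(sample)))
        for _ in range(n_replications)
    ]
    return statistics.stdev(means)


rng = random.Random(42)
# Simulated sample of 20 values (not real expression data).
sample = [rng.gauss(10.0, 2.0) for _ in range(20)]

spread = {}
for n_replications in (10, 100, 1000):
    # Repeat the whole bootstrap 30 times to see how stable its result is.
    estimates = [bootstrap_se(sample, n_replications, rng) for _ in range(30)]
    spread[n_replications] = statistics.stdev(estimates)
    print(f"{n_replications:>5} replications: "
          f"run-to-run spread of the estimate = {spread[n_replications]:.4f}")
```

With few replications, repeated runs disagree noticeably with each other; with more replications, the estimates become more reproducible at the cost of longer running time.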