[R] Matrix resampling (bootstraps)

Ernesto Jardim ernesto at ipimar.pt
Tue Sep 9 11:49:07 CEST 2003


On Tue, 2003-09-09 at 05:11, Hector L. Ayala-del-Rio wrote:
> Dear all,
>     I am trying to generate bootstrap replicate matrices (rows=samples, 
> columns=species, sampling with replacement) from a matrix dataset, but I do 
> not know how to do it in R.  I have tried boot() and bootstrap(), but they 
> require a statistic, which in my case is cluster analysis (generating 
> bootstrap values for a cluster analysis is a topic that has been mentioned 
> previously in this list).  I have been trying to use sample() and matrix() 
> to generate the replicate matrix but they seem to generate a single vector 
> rather than the entire matrix.  What I want is to resample the entire 
> matrix, but by resampling different columns (species).  In that way, the 
> bootstrap values will give me an idea of how similar the samples are.  Any 
> ideas will be very very helpful.  An example of that data matrix is below.
> 
> Thanks
> 
> Hector
> 
>    X36C X40C X58C X60C X62C X66C X77C X92C X95C X96C X107C X109C X116C
> 26Y        0    0    0   59  919  351  128    0  104  214     0     0     0
> C-0        0    0    0  368 1343 1826  211    0  253  352     0     0     0
> C-50       0    0    0  211 1032 1701   50    0   54   56     0     0     0
> C-90      64    0   65  260  769  876    0    0   87    0     0    91    96
> C-127-1    0    0  127  149  364 3990    0    0    0    0     0     0     0
> C-164      0    0    0   68  179 2373    0    0  105    0     0     0     0
> C-198      0    0    0   89  327 1458  314    0  209  298     0     0     0
> C-226      0    0    0    0  206  858    0    0  363  304     0     0     0
> C-268      0    0    0   75  270  629    0    0  107    0     0     0     0
> C-294-C   54    0    0  112  379  753    0  220  823  325     0     0     0
> C-310      0    0    0    0  116  305    0  396 1049  355     0     0     0
> C-357-2   96    0    0  445  201  405    0  114 2265    0   178    99   125
> C-375     90    0   56  231  385  817    0  211 2776    0    57    79   106
> C-399    110    0   50  563 1060 1244    0  414 2933    0    54   107   123
> C-414     64    0    0  197  408  825    0  111 1875    0     0    82   104
> C-428     63    0    0   80  100  695    0  162 2374    0   481   132   369
> C-434      0    0    0  269  261 1689    0 2923 3496    0     0     0     0
> C-454     77    0    0  257  170  963    0  377 3984    0     0    90    96
> C-465      0    0    0  234  406  860    0  428 1601    0     0     0     0
> C-479    111    0    0  349  297 1538   51  494 3753    0    75   102    95
> 

Hi Hector,

I'm not sure I've understood your problem; it would help to describe your
data more fully so that people can understand it.

I think you should try the boot function. It already has a lot of
analyses programmed that are extremely useful. Your statistic must be
the result of an R function applied to your dataset; just be careful to
ensure that the result of your function always has the same dimension,
otherwise boot will fail. Regarding the species issue, what I understand
is that you want to bootstrap the observations of each species
independently and then compute the statistic. You can do that by using
the "strata" argument in boot. Change the matrix to a data frame with
columns for species, samples and observations, and tell boot that
species is the strata.
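
A minimal sketch of both ideas, using only a small subset of the posted
matrix (4 samples, 6 species). The object and function names (counts,
dist_stat, mean_stat) are made up for illustration, and Euclidean
distances and per-species means stand in for whatever statistic your
cluster analysis actually needs. Part 1 resamples species (columns) by
putting them in rows, since boot resamples rows; part 2 is the long data
frame with species as strata.

```r
library(boot)  # recommended package distributed with R

# Subset of the posted data: rows = samples, columns = species
counts <- matrix(c( 59,  919,  351, 128, 104, 214,
                   368, 1343, 1826, 211, 253, 352,
                   211, 1032, 1701,  50,  54,  56,
                   260,  769,  876,   0,  87,   0),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(c("26Y", "C-0", "C-50", "C-90"),
                                 c("X60C", "X62C", "X66C",
                                   "X77C", "X95C", "X96C")))

## Part 1: resample species, keep samples fixed
# boot() resamples rows, so species go in the rows
species_rows <- t(counts)

# Statistic: distances between samples on the resampled species.
# The result always has choose(4, 2) = 6 elements, so its dimension
# does not change between replicates.
dist_stat <- function(d, i) {
  resampled <- t(d[i, , drop = FALSE])  # back to samples x species
  as.vector(dist(resampled))            # Euclidean by default (assumption)
}

set.seed(1)
b1 <- boot(species_rows, dist_stat, R = 200)

## Part 2: long data frame with species as strata
long <- as.data.frame(as.table(counts))
names(long) <- c("sample", "species", "count")

# Placeholder statistic: mean count per species (one value per species)
mean_stat <- function(d, i) {
  as.vector(tapply(d$count[i], d$species[i], mean))
}

b2 <- boot(long, mean_stat, R = 200, strata = long$species)
```

Each row of b1$t holds the distances for one replicate, so it can be
reshaped and passed to hclust() if needed. If all you want is one
replicate matrix without boot, counts[, sample(ncol(counts), replace =
TRUE)] resamples the columns directly.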

Hope this helps

EJ
-- 
Ernesto Jardim <ernesto at ipimar.pt>
Biólogo Marinho/Marine Biologist
IPIMAR - Instituto Nacional de Investigação Agrária e das Pescas
IPIMAR - National Research Institute for Agriculture and Fisheries
Av. Brasilia, 1400-006
Lisboa, Portugal
Tel: +351 213 027 000
Fax: +351 213 015 948
http://ernesto.freezope.org



