[R] Subsampling out of site*abundance matrix

Jari Oksanen jari.oksanen at oulu.fi
Tue Feb 8 16:33:01 CET 2011


David Winsemius <dwinsemius <at> comcast.net> writes:

> 
> 
> On Feb 7, 2011, at 6:43 PM, B77S wrote:
> 
> >
> > So, after thinking about this a bit, I realized that the previous
> > solution wasn't exactly what I needed.  I really needed replacement=F
> > and to be able to choose any sample size (n.sample) less than or equal
> > to the site (row) with the lowest total abundance.
> 
> The reason I suggested ,  replace =FALSE,  is that I thought those  
> were population parameters. Furthermore, even if we think of them as  
> samples, it seems unlikely that they are the entire universe for  
> inference, since knowing such a universe would make statistics  
> superfluous. My advice is to consult a statistician before you set  
> replace=FALSE.
> 
I guess they are not population parameters if you ask for a *sub*sample:
then it must be a sample from a sample.

The problem with regarding them as population parameters is that many
(or most) species are missing from any sample, and then their estimated
frequencies are falsely zero. True replicate sampling should be
able to find species that do not occur in the sample, just as you would
if you resampled an adjacent plot in similar conditions in the wild.

That said, package vegan has function rrarefy (NB the initial 'rr') which gives
you random subsamples without replacement from abundance (count)
data. It is a sister function of rarefy (also in vegan, with one r) which gives
you the expected number of species when subsampling without replacement.
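
A minimal sketch of how this could look. The matrix 'comm' and its numbers
are made up for illustration, and 'subsample_row' is a hypothetical base-R
helper that only shows the idea of drawing individuals without replacement;
it is not meant to be vegan's implementation. The vegan calls are run only
if the package is installed.

```r
# Hypothetical site-by-species count matrix (made-up numbers)
comm <- matrix(c(10, 0, 5, 3,
                  2, 8, 0, 6,
                  7, 7, 1, 0),
               nrow = 3, byrow = TRUE,
               dimnames = list(paste0("site", 1:3), paste0("sp", 1:4)))

# Any size up to the smallest row total works for every site
n.sample <- min(rowSums(comm))

set.seed(1)  # only to make the random draws repeatable

# Base-R illustration (hypothetical helper): expand counts to individuals,
# draw n of them without replacement, and count them back per species
subsample_row <- function(counts, n) {
  individuals <- rep(seq_along(counts), times = counts)
  picked <- individuals[sample.int(length(individuals), n)]
  tabulate(picked, nbins = length(counts))
}
sub.base <- t(apply(comm, 1, subsample_row, n = n.sample))
dimnames(sub.base) <- dimnames(comm)

# The same task with vegan, if it is available
if (requireNamespace("vegan", quietly = TRUE)) {
  # one random subsample of n.sample individuals per site
  sub.vegan <- vegan::rrarefy(comm, sample = n.sample)
  # expected number of species in a subsample of that size
  exp.rich <- vegan::rarefy(comm, sample = n.sample)
}
```

Every row of the subsampled matrix sums to n.sample, and no count can
exceed the original one, because individuals are drawn without replacement.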

Cheers, Jari Oksanen


