[R] Resample with replacement to produce many rarefaction curves with same number of samples

Luisfo luisfo89 at yahoo.es
Thu Sep 8 12:55:32 CEST 2016


Hi Nick,

Yes, you are right. There's one small bug on my code.
The 'if' within the for-loop is wrong. Try it now with the code below.

rrarefy.custom <- function (x, sample, rep.param=F)
{
   if (!identical(all.equal(x, round(x)), TRUE))
     stop("function is meaningful only for integers (counts)")
   x <- as.matrix(x)
   if (ncol(x) == 1)
     x <- t(x)
   if (length(sample) > 1 && length(sample) != nrow(x))
     stop(gettextf("length of 'sample' and number of rows of 'x' do not 
match"))
   sample <- rep(sample, length = nrow(x))
   colnames(x) <- colnames(x, do.NULL = FALSE)
   nm <- colnames(x)
   if (!rep.param && any(rowSums(x) < sample))
     warning("Some row sums < 'sample' and are not rarefied")
   for (i in 1:nrow(x)) {
     if (!rep.param && sum(x[i, ]) <= sample[i])
       next
     row <- sample(rep(nm, times = x[i, ]), sample[i], replace = rep.param)
     row <- table(row)
     ind <- names(row)
     x[i, ] <- 0
     x[i, ind] <- row
   }
   x
}

I have test it now before pasting it.

Best,

*Luisfo Chiroque*
/PhD Student | PhD Candidate
IMDEA Networks Institute/
http://fourier.networks.imdea.org/people/~luis_nunez/ 
<http://fourier.networks.imdea.org/people/%7Eluis_nunez/>

On 09/07/2016 10:27 PM, Nick Pardikes wrote:
> Hey Luisfo,
>
> This looks great, however I still get the same plot as before (seen
> below). The output looks the same. Here is the figure that was
> generated from this code:
>
> rrarefy.custom <- function (x, sample, rep.param=F)
> {
>    if (!identical(all.equal(x, round(x)), TRUE))
>      stop("function is meaningful only for integers (counts)")
>    x <- as.matrix(x)
>    if (ncol(x) == 1)
>      x <- t(x)
>    if (length(sample) > 1 && length(sample) != nrow(x))
>      stop(gettextf("length of 'sample' and number of rows of 'x' do not match"))
>    sample <- rep(sample, length = nrow(x))
>    colnames(x) <- colnames(x, do.NULL = FALSE)
>    nm <- colnames(x)
>    if (!rep.param && any(rowSums(x) < sample))
>      warning("Some row sums < 'sample' and are not rarefied")
>    for (i in 1:nrow(x)) {
>      if (rep.param && sum(x[i, ]) <= sample[i])
>        next
>      row <- sample(rep(nm, times = x[i, ]), sample[i], replace = rep.param)
>      row <- table(row)
>      ind <- names(row)
>      x[i, ] <- 0
>      x[i, ind] <- row
>    }
>    x
> }
>
> raredata <- rarecurve(rrarefy.custom(netdata, sample=100,rep.param=T),
> label=F, col=rgb(0, 0, 1, 0.1))
>
> However, I like what you did to the rrarefy function to add the sample
> with replacement option.
>
> On Wed, Sep 7, 2016 at 8:17 AM, Luisfo <luisfo89 at yahoo.es> wrote:
>> Hi Nick,
>>
>> If you use the following
>>      raredata <- rarecurve(rrarefy(netdata, sample=100), label=F, col=rgb(0,
>> 0, 1, 0.1))
>> should work for any sample size, e.g. sample=100.
>> However, you will have a 'warning' if you don't have samples enough, because
>> it has not replacement.
>>
>> If you type 'rrarefy' on the R console (without brackets), or any other
>> function name, you will see the R code of the function.
>> rrarefy uses the function 'sample()' for sampling, but has no option for
>> replacement.
>> I did the following. I created my custom rrarefy function from the original.
>> rrarefy.custom <- function (x, sample, rep.param=F)
>> {
>>    if (!identical(all.equal(x, round(x)), TRUE))
>>      stop("function is meaningful only for integers (counts)")
>>    x <- as.matrix(x)
>>    if (ncol(x) == 1)
>>      x <- t(x)
>>    if (length(sample) > 1 && length(sample) != nrow(x))
>>      stop(gettextf("length of 'sample' and number of rows of 'x' do not
>> match"))
>>    sample <- rep(sample, length = nrow(x))
>>    colnames(x) <- colnames(x, do.NULL = FALSE)
>>    nm <- colnames(x)
>>    if (!rep.param && any(rowSums(x) < sample))
>>      warning("Some row sums < 'sample' and are not rarefied")
>>    for (i in 1:nrow(x)) {
>>      if (rep.param && sum(x[i, ]) <= sample[i])
>>        next
>>      row <- sample(rep(nm, times = x[i, ]), sample[i], replace = rep.param)
>>      row <- table(row)
>>      ind <- names(row)
>>      x[i, ] <- 0
>>      x[i, ind] <- row
>>    }
>>    x
>> }
>> You can check the differences with the original code if you type 'rrarefy'
>> on the R console.
>>
>> So now, if you type the following
>>      raredata <- rarecurve(rrarefy.custom(netdata, sample=100,rep.param=T),
>> label=F, col=rgb(0, 0, 1, 0.1))
>> you will have the desired behaviour.
>>
>> WARNING: I do not understand about rarefunction curves or communities in
>> your context. So, be careful when resampling. It might not be statistically
>> correct.
>>
>> Regards,
>> Luisfo Chiroque
>> PhD Student | PhD Candidate
>> IMDEA Networks Institute
>> http://fourier.networks.imdea.org/people/~luis_nunez/
>>
>> On 09/07/2016 12:07 AM, Nick Pardikes wrote:
>>
>> I am currently having difficulty producing a graph using rarecurve in the
>> vegan package. I have produced rarefaction curves (seen below) using the
>> following code.
>>
>>
>> library(vegan)
>>
>> myMat <- round(matrix(rlnorm(2000), 50)) #creates distribution of
>> communities
>>
>> netdata <- as.data.frame(myMat) #generates a matrix of communities (rows),
>> species (columns)
>>
>> raredata <- rarecurve(netdata, label=F, col=rgb(0, 0, 1, 0.1))  #uses
>> rarecurve to plot a rarefaction for each individual community (n=50)
>>
>>
>> However I would like to produce a graph in which all rarefaction curves end
>> at the same sample size. For example, in this graph it would be great to
>> extend the x-axis (sample size) to 100 and have all curves end at this
>> point. Is there any way to use rarecurve to resample a community (row) with
>> replacement the same number of times for all 50 communities? With
>> replacement is important because the communities differ greatly in their
>> size (number of species).
>>
>>
>> I understand that rarefaction is useful to compare communities with
>> different sample efforts, but I would still like to generate the figure. My
>> actual data has 5000 simulated communities that differ greatly in matrix
>> size and number of samples.
>>
>>
>> Thank you in advance for your help and suggestions.
>>
>>
>> Cheers,
>>
>> Nick
>>
>>
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
>


	[[alternative HTML version deleted]]



More information about the R-help mailing list