[R] Monte Carlo simulation for ratio and its CI

David L Carlson dc@r|@on @end|ng |rom t@mu@edu
Tue Mar 26 20:48:12 CET 2019


I second the vote on needing a tutorial. You need to learn about how R does things and get familiar with vectorization and the apply() family of functions. You defined dat but not dat1 in your code so I'll just use dat. First, to get the ratios:

(ratios <- colSums(dat[-3], na.rm=TRUE)/colSums(dat[3], na.rm=TRUE))
#  v1  v2 
# 1.2 0.8

Then create a function for the Monte Carlo simulation that generates a sample and computes the ratios. Finally, use the function with replicate() to generate the 100 samples:

nratios <- function(x) {
     sdat <- x[sample.int(18,16), ]
     colSums(sdat[-3], na.rm=TRUE)/colSums(sdat[3], na.rm=TRUE)
}

mcrat <- replicate(100, nratios(dat))
str(mcrat)
#  num [1:2, 1:100] 1 0.8 1.222 0.778 1.111 ...
#  - attr(*, "dimnames")=List of 2
#   ..$ : chr [1:2] "v1" "v2"
#  ..$ : NULL

100 values of ratio1 are stored as mcrat["v1", ] and 100 values of ratio2 are stored as mcrat["v2", ].

Now you can generate your summary statistics. 

----------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77843-4352



-----Original Message-----
From: R-help <r-help-bounces using r-project.org> On Behalf Of Jeff Newmiller
Sent: Tuesday, March 26, 2019 9:27 AM
To: r-help using r-project.org; Marna Wagley <marna.wagley using gmail.com>; Bert Gunter <bgunter.4567 using gmail.com>
Cc: r-help mailing list <r-help using r-project.org>
Subject: Re: [R] Monte Carlo simulation for ratio and its CI

Do you really not know how to use a for loop? The tutorial recommendation seems apropos...

On March 26, 2019 5:57:17 AM PDT, Marna Wagley <marna.wagley using gmail.com> wrote:
>Dear Bert,
>Thank you very much for the response.
>I did it manually but I could not put them in a loop so that I created
>the
>table manually with selecting the rows randomly several times. Here
>what I
>have done so far, please find it. I want to create the table 100 times
>and
>calculate its mean and CI from those 100 values. If anyone can give me
>some
>hint to make a loop, that would be great. I am very grateful with your
>help.
>Thanks,
>
>library(dplyr)
>
>library(plyr)
>
>dat<-structure(list(v1 = c(NA, TRUE, TRUE, TRUE, TRUE, TRUE, NA, TRUE,
>
>NA, NA, TRUE, TRUE, TRUE, TRUE, NA, NA, TRUE, TRUE), v2 = c(TRUE,
>
>NA, NA, NA, NA, TRUE, NA, NA, TRUE, TRUE, NA, TRUE, TRUE, NA,
>
>NA, TRUE, TRUE, NA), v3 = c(TRUE, TRUE, NA, TRUE, TRUE, NA, NA,
>
>TRUE, TRUE, NA, NA, TRUE, TRUE, TRUE, NA, NA, TRUE, NA)), .Names =
>c("v1",
>
>"v2", "v3"), class = "data.frame", row.names = c(NA, -18L))
>
>
>ratio1 <- with(dat, sum(v1,na.rm = TRUE)/sum(v3,na.rm=TRUE))
>
>ratio2 <- with(dat, sum(v2,na.rm = TRUE)/sum(v3,na.rm=TRUE))
>
>#
>
>A1<-sample_n(dat1, 16)# created a table with selecting a 16 sample size
>(rows)
>
>A1.ratio1<-with(A1, sum(v1,na.rm = TRUE)/sum(v3,na.rm=TRUE))
>
>A1.ratio2 <- with(A1, sum(v2,na.rm = TRUE)/sum(v3,na.rm=TRUE))
>
>A1.Table<-data.frame(Ratio1=A1.ratio1, Ratio2=A1.ratio2)
>
>#
>
>A2<-sample_n(dat1, 16)
>
>A2.ratio1<-with(A2, sum(v1,na.rm = TRUE)/sum(v3,na.rm=TRUE))
>
>A2.ratio2 <- with(A2, sum(v2,na.rm = TRUE)/sum(v3,na.rm=TRUE))
>
>A2.Table<-data.frame(Ratio1=A2.ratio1, Ratio2=A2.ratio2)
>
>#
>
>A3<-sample_n(dat1, 16)
>
>A3.ratio1<-with(A3, sum(v1,na.rm = TRUE)/sum(v3,na.rm=TRUE))
>
>A3.ratio2 <- with(A3, sum(v2,na.rm = TRUE)/sum(v3,na.rm=TRUE))
>
>A3.Table<-data.frame(Ratio1=A3.ratio1, Ratio2=A3.ratio2)
>
>#
>
>##..............
>
># I was thinking to repeat this procedure 100 times and calculate the
>ratio
>
>A100<-sample_n(dat1, 16)
>
>A100.ratio1<-with(A100, sum(v1,na.rm = TRUE)/sum(v3,na.rm=TRUE))
>
>A100.ratio2 <- with(A100, sum(v2,na.rm = TRUE)/sum(v3,na.rm=TRUE))
>
>A100.Table<-data.frame(Ratio1=A100.ratio1, Ratio2=A100.ratio2)
>
>#
>
>Tab<-rbind(A1.Table, A2.Table, A3.Table, A100.Table)
>
>
>#Compute the mean for each ratio
>
>Ratio1<-mean(Table1[,1])
>
>Ratio2<-mean(Table1[,2])
>
>
>summary <- ddply(subset(Tab), c(""),summarise,
>
>               N    = length(Tab),
>
>               mean.R1 = mean(Ratio1, na.rm=T),
>
>               median.R1=median(Ratio1, na.rm=T),
>
>               sd.R1   = sd(Ratio1, na.rm=T),
>
>               se.R1   = sd / sqrt(N),
>
>               LCI.95.R1=mean.R1-1.95*se.R1,
>
>               UCI.95.R1=mean.R1+1.95*se.R1,
>
>
>
>               mean.R2 = mean(Ratio2, na.rm=T),
>
>               median.R2=median(Ratio2, na.rm=T),
>
>               sd.R2   = sd(Ratio2, na.rm=T),
>
>               se.R2   = sd / sqrt(N),
>
>               LCI.95.R2=mean.R2-1.95*se.R2,
>
>               UCI.95.R2=mean.R2+1.95*se.R2
>
>               )
>
> summary
>
>
>
>On Mon, Mar 25, 2019 at 4:50 PM Bert Gunter <bgunter.4567 using gmail.com>
>wrote:
>
>>
>> > ratio1 <- with(dat, sum(v1,na.rm = TRUE)/sum(v3,na.rm=TRUE))
>> > ratio1
>> [1] 1.2
>>
>> It looks like you should spend some more time with an R tutorial or
>two.
>> This is basic stuff (if I understand what you wanted correctly).
>>
>> Also, this is not how a "confidence interval" should be calculated,
>but
>> that is another off topic discussion for which
>stats.stackexchange.com is
>> a more appropriate venue.
>>
>> Cheers,
>> Bert
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming
>along and
>> sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>>
>> On Mon, Mar 25, 2019 at 4:31 PM Marna Wagley <marna.wagley using gmail.com>
>> wrote:
>>
>>> Hi R User,
>>> I was trying to calculate ratios with confidence interval using
>Monte
>>> Carlo
>>> simulation but I could not figure it out.
>>> Here is the example of my data (see below), I want to calculate
>ratios
>>> (dat$v1/dat$v3 & dat$v2/dat$v3) and its confidence intervals using a
>100
>>> randomly selected data sets.
>>> Could you please give me your suggestions how I can estimate ratios
>with
>>> CI?
>>> I will be very grateful to you.
>>> Sincerely,
>>>
>>> MW
>>> ---
>>> dat<-structure(list(v1 = c(NA, TRUE, TRUE, TRUE, TRUE, TRUE, NA,
>TRUE,
>>>
>>> NA, NA, TRUE, TRUE, TRUE, TRUE, NA, NA, TRUE, TRUE), v2 = c(TRUE,
>>>
>>> NA, NA, NA, NA, TRUE, NA, NA, TRUE, TRUE, NA, TRUE, TRUE, NA,
>>>
>>> NA, TRUE, TRUE, NA), v3 = c(TRUE, TRUE, NA, TRUE, TRUE, NA, NA,
>>>
>>> TRUE, TRUE, NA, NA, TRUE, TRUE, TRUE, NA, NA, TRUE, NA)), .Names =
>c("v1",
>>>
>>> "v2", "v3"), class = "data.frame", row.names = c(NA, -18L))
>>>
>>>
>>> ratio1<-length(which(dat$v1 == "TRUE"))/length(which(dat$v3 ==
>"TRUE"))
>>>
>>> ratio2<-length(which(dat$v2 == "TRUE"))/length(which(dat$v3 ==
>"TRUE"))
>>>
>>>
>>> Thanks
>>>
>>>         [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>
>	[[alternative HTML version deleted]]
>
>______________________________________________
>R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.

______________________________________________
R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list