[R] Convergence in Monte Carlo Simulation

Mon Jun 15 14:19:38 CEST 2020

Dear Michael, 

So I shouldn't be setting the seed at all then since it is automatic? Or is the suggestion here that a new seed is chosen each time?

I think rather than having you guess at the problem (my apologies) I will post the entire of the code (with omissions where it is not directly impacting the problem at hand). Sometimes I hesitate to post huge blocks because it can be a bit daunting, but I realize in coding that even the smallest glitch can throw everything off.

Set.seed(123) <<<Placing the seed here leads to no variation at all in my simulations as noted previously 

clusterDef <- defDataAdd(varname = "u_3", dist = "normal", formula = 0, variance = 25.77) 
patDef <- defDataAdd(varname = "u_2", dist = "normal", formula = 0, variance = 120.62) 
patError <- defDataAdd(varname = "error", dist = "normal", formula = 0, variance = 38.35) 

...(Data definition code omitted)

setkey(patTm, id, cluster, period)

#Define outcome y 
outDef <- defDataAdd(varname = "y", formula = "17.87 + 5.0*Ijt - 5.42*I(period == 1) - 5.72*I(period == 2) - 7.03*I(period == 3) - 6.13*I(period == 4) - 9.13*I(period == 5) + u_3 + u_2 + error", dist = "normal")

patTm <- addColumns(outDef, patTm)

powercrosssw <- function(nclus, clsize) {

	set.seed() < not sure if placing it the function rather than at the top is appropriate to generate a new and independent dataset for each of the 1000 iterations

Regarding the convergence issue, it seems that what you are saying is I have it all set up correctly (i.e. it will iterate until 1000 iterations converge). I do get this rather peculiar error though in some cases:

Error in lme.formula(y ~ factor(period) + factor(Ijt), data = patTm, random = ~1 |  : 
  nlminb problem, convergence error code = 1
  message = false convergence (8)

I am not quite sure what the problem is there.

Edward

On 2020-06-14, 10:16 AM, "Michael Dewey" <lists using dewey.myzen.co.uk> wrote:

    Dear Edward

    Every time you call your function powercrosssw() it resets the seed so 
    you must be calling it multiple times in some way.

    Michael

    On 14/06/2020 13:57, Phat Chau wrote:
    > Thank you Michael.
    > 
    > I will clarify some more. The function in the first part of the code that I posted generates the simulated dataset for a cluster randomized trial from the simstudy package.
    > 
    > I am not quite clear what you mean by placing it outside the loop. So the goal here is to create n = 1000 independent datasets with different (randomly drawn values from the specified normal distributions not shown) for all of the parameters. What I have tried to do is place the seed at the very top of all my code in the past, but what that does is it leads to the creation of a single dataset that gets repeated over and over n = 1000 times. Hence, there ends up being no variability in the data (and power estimates from the p-values given the stated and required power).
    > 
    > Regarding the counter, is it correct in this instance that the loop will continue until n = 1000 iterations have successfully converged? I am not so concerned with counting failures.
    > 
    > Thank you.
    > Edward
    > 
    > On 2020-06-14, 6:46 AM, "Michael Dewey" <lists using dewey.myzen.co.uk> wrote:
    > 
    >      I am not 100% clear what your code is doing as it gets a bit wangled as
    >      you posted in HTML but here are a couple of thoughts.
    >      
    >      You need to set the seed outside any loops so it happens once and for all.
    >      
    >      I would test after trycatch and keep a separate count of failures and
    >      successes as the failure to converge must be meaningful about the
    >      scientific question whatever that is. At the moment your count appears
    >      to be in the correct place to count successes.
    >      
    >      Michael
    >      
    >      On 14/06/2020 02:50, Phat Chau wrote:
    >      > Hello,
    >      >
    >      > I put together the following code and am curious about its correctness. My first question relates to the Monte Carlo simulations – the goal is to continue to iterate until I get n = 1000 simulations where the model successfully converges. I am wondering if I coded it correctly below with the while loop. Is the idea that the counter increments by one only if “model” does not return a string?
    >      >
    >      > I would also like to know how I can create n = 1000 independent data sets. I think to do this, I would have to set a random number seed via set.seed() before the creation of each dataset. Where would I enter set.seed in the syntax below? Would it be in the function (as indicated in red)?
    >      >
    >      > powercrosssw <- function(nclus, clsize) {
    >      >
    >      >    set.seed()
    >      >
    >      >    cohortsw <- genData(nclus, id = "cluster")
    >      >    cohortsw <- addColumns(clusterDef, cohortsw)
    >      >    cohortswTm <- addPeriods(cohortsw, nPeriods = 8, idvars = "cluster", perName = "period")
    >      >    cohortstep <- trtStepWedge(cohortswTm, "cluster", nWaves = 4, lenWaves = 1, startPer = 1, grpName = "Ijt")
    >      >
    >      >    pat <- genCluster(cohortswTm, cLevelVar = "timeID", numIndsVar = clsize, level1ID = "id")
    >      >
    >      >    dx <- merge(pat[, .(cluster, period, id)], cohortstep, by = c("cluster", "period"))
    >      >    dx <- addColumns(patError, dx)
    >      >
    >      >    setkey(dx, id, cluster, period)
    >      >
    >      >    dx <- addColumns(outDef, dx)
    >      >
    >      >    return(dx)
    >      >
    >      > }
    >      >
    >      > i=1
    >      >
    >      > while (i < 1000) {
    >      >
    >      >    dx <- powercrosssw()
    >      >
    >      >    #Fit multi-level model to simulated dataset
    >      >    model5 <- tryCatch(lme(y ~ factor(period) + factor(Ijt), data = dx, random = ~1|cluster, method = "REML"),
    >      >                       warning = function(w) { "warning" }
    >      >    )
    >      >
    >      >    if (! is.character(model5)) {
    >      >
    >      >      coeff <- coef(summary(model5))["factor(Ijt)1", "Value"]
    >      >      pvalue <- coef(summary(model5))["factor(Ijt)1", "p-value"]
    >      >      error <- coef(summary(model5))["factor(Ijt)1", "Std.Error"]
    >      >      bresult <- c(bresult, coeff)
    >      >      presult <- c(presult, pvalue)
    >      >      eresult <- c(eresult, error)
    >      >
    >      >      i <- i + 1
    >      >    }
    >      > }
    >      >
    >      > Thank you so much.
    >      >
    >      >
    >      >
    >      > 	[[alternative HTML version deleted]]
    >      >
    >      > ______________________________________________
    >      > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
    >      > https://stat.ethz.ch/mailman/listinfo/r-help
    >      > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
    >      > and provide commented, minimal, self-contained, reproducible code.
    >      >
    >      >
    >      
    >      --
    >      Michael
    >      http://www.dewey.myzen.co.uk/home.html
    >      
    > 
    > 
    > 

    -- 
    Michael
    http://www.dewey.myzen.co.uk/home.html