[R] smooth.spline error while fitting bacterial growth curves with grofit

Jeffrey David Johnson jefdaj at berkeley.edu
Mon May 18 04:08:23 CEST 2015


Thanks, I think you're right. I removed the strains whose final OD was
below 0.2 since all the ones that clearly grew are above that, and
grofit produces fewer errors on the remaining 6. The error still happens
occasionally, but if I stick to 1000 bootstraps instead of 10000 it's
not often. Of course I won't rely on these numbers! I'll try again once
my current timecourse is done with 6 replicates per strain, and if
everything is still messy I'll rethink the experimental design.
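
The filtering step described above can be sketched roughly as follows. The layout is an assumption: one row per culture, three leading annotation columns (the layout grofit is understood to expect for its data frame), then one OD column per time point. The toy values and column names are made up for illustration and are not the attached data.

```r
# Toy data in the assumed layout: 3 annotation columns, then OD readings.
mydata <- data.frame(
  strain = c("A", "B", "C"),
  info   = c("rep1", "rep1", "rep1"),
  conc   = c(0, 0, 0),
  t0     = c(0.05, 0.05, 0.05),
  t1     = c(0.20, 0.06, 0.15),
  t2     = c(0.80, 0.07, 0.45)
)

od_cols  <- 4:ncol(mydata)                 # columns holding OD readings
final_od <- mydata[[max(od_cols)]]         # last reading of each culture
keep     <- final_od >= 0.2                # threshold from the message above

grew     <- mydata[keep, , drop = FALSE]   # cultures passed on to the fit
dropped  <- mydata[!keep, , drop = FALSE]  # kept aside so they can be reported
```

Keeping the dropped cultures in their own data frame, rather than discarding them silently, makes the selection visible when the results are written up.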

... Which brings up another question. Would it be better to estimate
growth parameters (mu, lambda, etc.) for each replicate and then take
the mean and standard deviation of those, or to average the growth data
first and calculate one set of parameters per strain? (Sorry if that's
very basic statistics.)
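
To make the two options in the question concrete, here is a minimal sketch in base R on simulated data. Everything in it is an assumption for illustration: the logistic curve, the noise level, the six replicates, and the use of the maximum slope of a `smooth.spline` fit as a stand-in for mu. It does not call grofit and does not settle which option is statistically preferable.

```r
set.seed(1)                                   # reproducible toy data
time <- seq(0, 24, by = 1)                    # hours (made up)

# Logistic growth curve with asymptote A, maximum slope mu, lag time lag.
logistic <- function(t, A, mu, lag) {
  A / (1 + exp(4 * mu / A * (lag - t) + 2))
}

# Six replicates of one strain: same true curve plus measurement noise.
# Result is a matrix with one column per replicate.
reps <- replicate(6, logistic(time, A = 1, mu = 0.15, lag = 4) +
                       rnorm(length(time), sd = 0.02))

# Maximum slope of a smoothing spline, evaluated on a fine grid.
max_slope <- function(t, y) {
  fit  <- smooth.spline(t, y)
  grid <- seq(min(t), max(t), length.out = 200)
  max(predict(fit, grid, deriv = 1)$y)
}

# Option 1: one estimate per replicate, then summarize across replicates.
mu_each <- apply(reps, 2, function(y) max_slope(time, y))
mu_mean <- mean(mu_each)
mu_sd   <- sd(mu_each)                        # spread between replicates

# Option 2: average the curves first, then fit once.
mu_pooled <- max_slope(time, rowMeans(reps))
```

Option 1 yields a between-replicate standard deviation directly. Option 2 yields a single number per strain, so any uncertainty estimate has to come from somewhere else, such as bootstrapping.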
Jeff

On Sun, 17 May 2015 11:42:27 -0700
Bert Gunter <gunter.berton at gene.com> wrote:

> 1. Very likely, you have insufficient data in some of your growth
> curves to do the fits using gcv. If you remove the curves where the
> bacteria didn't grow, things should work. Alternatively, there may
> well be ways of expressing the model that would allow pooling across
> cultures that didn't grow. (Sounds like a mixtures problem, actually:
> you are mixing cultures that grow with those that don't and need to
> determine the mixing proportion and the growth parameters of those
> that grew).
> 
> 2. HOWEVER, IF you remove the curves, you may very well be getting the
> wrong (biased) results -- i.e. your results will be irreproducible
> garbage, as you will only be taking data from cultures that grew well.
> I would **strongly** suggest you work with a local statistical expert
> to help you deal with these issues. I do not think you should trust
> remote advice from the internet on such complex data (including mine!)
> 
> Cheers,
> Bert
> 
> Bert Gunter
> Genentech Nonclinical Biostatistics
> (650) 467-7374
> 
> "Data is not information. Information is not knowledge. And knowledge
> is certainly not wisdom."
> Clifford Stoll
> 
> On Sun, May 17, 2015 at 10:42 AM, Jeffrey David Johnson
> <jefdaj at berkeley.edu> wrote:
> > I'm trying to use the grofit package to compare growth rates between
> > bacterial cultures, but I've come across a couple of glitches and
> > things I don't understand. I'm not sure if they're related to the
> > package or to a problem with my growth data, which is messy. Some
> > strains don't follow a proper logarithmic growth curve because they
> > died or didn't grow over the course of the experiment. I could remove
> > those, but that will get more time-consuming once I have more cultures
> > going.
> >
> > I've attached the 'time' matrix and 'data' data frame. This code should
> > fit the growth curves, but when I run it I get an error related to
> > `smooth.spline`:
> >
> > require(grofit)
> > mytime <- as.matrix(read.table('time.txt'))
> > mydata <- read.csv('data.csv')
> > dimnames(mytime) <- NULL
> > fits <- gcFit(mytime, mydata, grofit.control(
> >   interactive=FALSE, # don't ask if the graphs look OK
> >   nboot.gc=1000,     # number of bootstraps
> >   fit.opt="s"        # just do splines, no models
> > ))
> >
> > = 1. growth curve =================================
> > ----------------------------------------------------
> > = 2. growth curve =================================
> > ----------------------------------------------------
> > = 3. growth curve =================================
> > ----------------------------------------------------
> > Error in smooth.spline(time, data, spar = control$smooth.gc) :
> >   'tol' must be strictly positive and finite
> > Error in gcFitSpline(time.cur, data.cur, gcID, control.change) :
> >   object 'y.spl' not found
> >
> > That error usually occurs at some point, though I've run through all 17
> > successfully a couple of times. The documentation says:
> >
> >> smooth.gc: Parameter describing the smoothness of the spline fit;
> >> usually (not necessary) in (0;1]. Set ‘smooth.gc=NULL’ causes the
> >> program to query an optimal value via cross validation techniques.
> >> Note: This is partly experimental. In future improved implementations
> >> of the ‘smooth.spline’ function may lead to different results. See
> >> documentation of the R function ‘smooth.spline’ for further details.
> >> Especially for datasets with few data points the option ‘NULL’ might
> >> result in a too small smoothing parameter, which produces an error in
> >> ‘smooth.spline’. In that case the usage of a fixed value is
> >> recommended. Default: ‘NULL’.
> >
> > I tried setting different values (0.1, 0.5, 0.9, 1, 10) and they all
> > cause the same error. If instead I use the `gcBootSpline` function
> > directly, it gives a different error about the number of bootstraps
> > being 0, when it clearly isn't:
> >
> > fits <- gcBootSpline(mytime, mydata, grofit.control(nboot.gc=1000))
> >
> > Error in gcBootSpline(mytime, mydata, grofit.control(nboot.gc =
> > 1000)) : Number of bootstrap samples is zero! See grofit.control()
> >
> > Am I using these right? Is there something about the data that would
> > make it un-fittable?
> > Jeff

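One way the `'tol' must be strictly positive and finite` message quoted above can arise in base R is sketched below. `smooth.spline` has a `tol` argument whose default is `1e-6 * IQR(x)`, so an `x` vector whose interquartile range is zero makes the default `tol` zero. Whether this is what happens inside grofit (for example when time points are resampled with replacement during bootstrapping) is an assumption that has not been checked against the grofit source; the vectors are made up.

```r
# A time vector whose middle half is a single repeated value, as could
# happen if a small set of time points were resampled with replacement.
x <- c(1, 2, 3, 3, 3, 3, 3, 3, 4, 5)
y <- c(0.05, 0.07, 0.10, 0.11, 0.09, 0.10, 0.12, 0.10, 0.30, 0.55)

iqr_x <- IQR(x)   # zero, so the default tol = 1e-6 * IQR(x) is zero too

# Capture the error message instead of stopping the script.
msg <- tryCatch(
  {
    smooth.spline(x, y, spar = 0.5)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

If this is the mechanism, it would be consistent with the error appearing only occasionally and more often with more bootstrap samples, since each resample is another chance to draw a degenerate time vector.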

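The "Number of bootstrap samples is zero" message quoted above may come from argument matching rather than from the data. The sketch below assumes the signature is `gcBootSpline(time, data, gcID = "undefined", control = grofit.control())`, which should be checked against `?gcBootSpline`. Under that assumption, a control object passed as the third positional argument is matched to `gcID`, and `control` silently keeps its default. `mock_control` and `mock_boot` are hypothetical stand-ins written for this illustration; they are not grofit code.

```r
# Hypothetical stand-ins mirroring the assumed argument order. NOT grofit.
mock_control <- function(nboot.gc = 0) list(nboot.gc = nboot.gc)

mock_boot <- function(time, data, gcID = "undefined",
                      control = mock_control()) {
  if (control$nboot.gc == 0) stop("Number of bootstrap samples is zero!")
  control$nboot.gc
}

t1 <- 0:5                                     # made-up time points
y1 <- c(0.05, 0.06, 0.10, 0.25, 0.50, 0.60)   # made-up OD readings

# Third positional argument lands in gcID; control keeps its default.
by_position <- tryCatch(
  mock_boot(t1, y1, mock_control(nboot.gc = 1000)),
  error = function(e) conditionMessage(e)
)

# Naming the argument routes it to control as intended.
by_name <- mock_boot(t1, y1, control = mock_control(nboot.gc = 1000))
```

If the assumed signature is right, the corresponding change in the original call would be to write `control = grofit.control(nboot.gc = 1000)` explicitly.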