[R] smooth.spline error while fitting bacterial growth curves with grofit

Bert Gunter gunter.berton at gene.com
Mon May 18 16:44:17 CEST 2015


Your question is OFFTOPIC for this list. Post on a statistics list
like stats.stackexchange.com .

But both your proposals are wrong, though depending on your data and
purpose, they may be adequate. I suggest you consult with a local
statistician on the use of mixed effects models for repeated
measures/growth curves, or post a question on those topics there.
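
To illustrate why the two proposals need not agree, here is a minimal
sketch in R with made-up logistic curves. The function names and
parameter values are my own, chosen for illustration, and the
parametrization is the one I believe grofit uses for its logistic model.
Three replicates share the same maximum slope mu but differ in lag.
Averaging the per-replicate slopes recovers mu, while the slope of the
averaged curve is noticeably smaller, because averaging curves with
different lags flattens the mean curve. This is not a mixed-effects
analysis and says nothing about which estimate suits your data.

```r
# Logistic growth curve; A = plateau, mu = maximum slope, lambda = lag time
logistic <- function(t, A, mu, lambda) {
  A / (1 + exp(4 * mu / A * (lambda - t) + 2))
}

hours <- seq(0, 24, by = 0.1)  # time grid (made up)
lags  <- c(2, 5, 8)            # three replicates that differ only in lag

# One column per replicate
curves <- sapply(lags, function(l) logistic(hours, A = 1, mu = 0.2, lambda = l))

# Crude maximum slope by finite differences (no model fitting)
max_slope <- function(y, t) max(diff(y) / diff(t))

# Proposal 1: estimate per replicate, then average the estimates
mean_of_estimates <- mean(apply(curves, 2, max_slope, t = hours))

# Proposal 2: average the curves, then estimate once
estimate_of_mean <- max_slope(rowMeans(curves), hours)

c(mean_of_estimates = mean_of_estimates, estimate_of_mean = estimate_of_mean)
```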

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
Clifford Stoll




On Sun, May 17, 2015 at 7:08 PM, Jeffrey David Johnson
<jefdaj at berkeley.edu> wrote:
> Thanks, I think you're right. I removed the strains whose final OD was
> below 0.2 since all the ones that clearly grew are above that, and
> grofit produces fewer errors on the remaining 6. The error still happens
> occasionally, but if I stick to 1000 bootstraps instead of 10000 it's
> not often. Of course I won't rely on these numbers! I'll try again once
> my current timecourse is done with 6 replicates per strain, and if
> everything is still messy rethink the experimental design.
>
> ... Which brings up another question. Would it be better to estimate
> growth parameters (mu, lambda, etc.) for each replicate and then take
> the mean and standard deviation of those, or to average the growth data
> first and calculate one set of parameters per strain? (Sorry if that's
> very basic statistics)
> Jeff
>
> On Sun, 17 May 2015 11:42:27 -0700
> Bert Gunter <gunter.berton at gene.com> wrote:
>
>> 1. Very likely, you have insufficient data in some of your growth
>> curves to do the fits using gcv. If you remove the curves where the
>> bacteria didn't grow, things should work. Alternatively, there may
>> well be ways of expressing the model that would allow pooling across
>> cultures that didn't grow. (Sounds like a mixtures problem, actually:
>> you are mixing cultures that grow with those that don't and need to
>> determine the mixing proportion and the growth parameters of those
>> that grew).
>>
>> 2. HOWEVER, IF you remove the curves, you may very well be getting the
>> wrong (biased) results -- i.e. your results will be irreproducible
>> garbage, as you will only be taking data from cultures that grew well.
>> I would **strongly** suggest you work with a local statistical expert
>> to help you deal with these issues. I do not think you should trust
>> remote advice from the internet on such complex data (including mine!)
>>
>> Cheers,
>> Bert
>>
>> On Sun, May 17, 2015 at 10:42 AM, Jeffrey David Johnson
>> <jefdaj at berkeley.edu> wrote:
>> > I'm trying to use the grofit package to compare growth rates between
>> > bacterial cultures, but I've come across a couple glitches/things I
>> > don't understand. I'm not sure if they're related to the package or to a
>> > problem with my growth data, which is messy. Some strains don't follow
>> > a proper logarithmic growth curve because they died or didn't grow over
>> > the course of the experiment. I could remove those but it will get more
>> > time consuming once I have more cultures going.
>> >
>> > I've attached the 'time' matrix and 'data' data frame. This code should
>> > fit the growth curves, but when I run it I get an error related to
>> > `smooth.spline`:
>> >
>> > require(grofit)
>> > mytime <- as.matrix(read.table('time.txt'))
>> > mydata <- read.csv('data.csv')
>> > dimnames(mytime) <- NULL
>> > fits <- gcFit(mytime, mydata, grofit.control(
>> >   interactive=FALSE, # don't ask if the graphs look OK
>> >   nboot.gc=1000,     # number of bootstraps
>> >   fit.opt="s"        # just do splines, no models
>> > ))
>> >
>> > = 1. growth curve =================================
>> > ----------------------------------------------------
>> > = 2. growth curve =================================
>> > ----------------------------------------------------
>> > = 3. growth curve =================================
>> > ----------------------------------------------------
>> > Error in smooth.spline(time, data, spar = control$smooth.gc) :
>> >   'tol' must be strictly positive and finite
>> > Error in gcFitSpline(time.cur, data.cur, gcID, control.change) :
>> >   object 'y.spl' not found
>> >
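
One possible explanation for the intermittent failure, offered as a
guess rather than a diagnosis: smooth.spline() has a `tol` argument that
defaults to 1e-6 * IQR(x). If the bootstrap resamples time points with
replacement (an assumption about grofit's internals that I have not
checked against its source), a resample in which the middle half of the
time values are identical has an IQR of 0, so tol is 0 and the call
stops with this message. Changing the smoothing parameter would not help
in that case, because spar does not affect tol. A minimal base R sketch
with made-up numbers:

```r
# Made-up time points and OD readings, for illustration only
time <- c(0, 2, 4, 6, 8, 10, 12, 14)
od   <- c(0.05, 0.06, 0.10, 0.20, 0.40, 0.60, 0.70, 0.72)

# A hand-picked resample (with replacement) where most draws hit one time point
idx <- c(1, 4, 4, 4, 4, 4, 4, 8)
IQR(time[idx])   # 0, so the default tol = 1e-6 * IQR(x) is also 0

# Capture the error message instead of stopping
msg <- tryCatch({
  smooth.spline(time[idx], od[idx])
  "no error"
}, error = function(e) conditionMessage(e))
msg

# Rough estimate of how often a random resample of these times has IQR 0
set.seed(1)
zero_iqr <- replicate(10000, IQR(sample(time, replace = TRUE)) == 0)
mean(zero_iqr)
```

If this is the cause, the failure rate per resample is small, which
would fit with the error showing up only some of the time and more often
with more bootstraps.
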
>> > That error usually occurs at some point, though I've run through all 17
>> > successfully a couple times. The documentation says:
>> >
>> >> smooth.gc: Parameter describing the smoothness of the spline fit;
>> >> usually (not necessary) in (0;1]. Set ‘smooth.gc=NULL’ causes the
>> >> program to query an optimal value via cross validation techniques.
>> >> Note: This is partly experimental. In future improved implementations
>> >> of the ‘smooth.spline’ function may lead to different results. See
>> >> documentation of the R function ‘smooth.spline’ for further details.
>> >> Especially for datasets with few data points the option ‘NULL’ might
>> >> result in a too small smoothing parameter, which produces an error in
>> >> ‘smooth.spline’. In that case the usage of a fixed value is
>> >> recommended. Default: ‘NULL’.
>> >
>> > I tried setting different values (0.1, 0.5, 0.9, 1, 10) and they all
>> > cause the same error. If instead I use the `gcBootSpline` function
>> > directly, it gives a different error about the number of bootstraps
>> > being 0, when they clearly aren't:
>> >
>> > fits <- gcBootSpline(mytime, mydata, grofit.control(nboot.gc=1000))
>> >
>> > Error in gcBootSpline(mytime, mydata, grofit.control(nboot.gc =
>> > 1000)) : Number of bootstrap samples is zero! See grofit.control()
>> >
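
A guess at the cause, to be checked against ?gcBootSpline: as far as I
recall, its arguments are (time, data, gcID, control), so an unnamed
third argument is matched to gcID and control keeps its default, in
which nboot.gc is 0. I also believe the function fits one curve at a
time, taking a time vector and a data vector rather than the matrix and
data frame that gcFit expects. The stand-in below is hypothetical and
only mimics that argument order to show R's positional matching; it is
not grofit code.

```r
# Hypothetical stand-in with the argument order assumed above (not grofit code)
boot_like <- function(time, data, gcID = "undefined",
                      control = list(nboot.gc = 0)) {
  if (control$nboot.gc == 0) stop("Number of bootstrap samples is zero!")
  control$nboot.gc
}

ctl <- list(nboot.gc = 1000)

# Unnamed third argument is matched to gcID, so control keeps nboot.gc = 0
res_positional <- tryCatch(boot_like(1:10, runif(10), ctl),
                           error = function(e) conditionMessage(e))

# Naming the argument routes it to control
res_named <- boot_like(1:10, runif(10), control = ctl)

res_positional
res_named
```

With grofit itself the analogous call would presumably be
gcBootSpline(time_vec, od_vec, control = grofit.control(nboot.gc = 1000)),
where time_vec and od_vec are placeholder names for one curve's time
points and readings.
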
>> > Am I using these right? Is there something about the data that would
>> > make it un-fittable?
>> > Jeff


