[R] nls start values

Nick Fankhauser lists at nyk.ch
Fri Dec 16 18:25:32 CET 2011


Hi!

thanks a lot for this suggestion! I tried to implement it like this, and
it worked nicely.
I used the method suggested by Gabor Grothendieck for simplification:

frml <- gene_expression ~ sin(tpoints * afreq + phase) * amp + shift
gridfit <- nls2(frml, algorithm = "grid-search", data=gendat, start =
startdf)
sine_nls <- nls2(gridfit, data=gendat, start = gridfit, algorithm="port")

But I also tried the method of running the nls with start-values for
phase between 0 and 2pi in five steps and then choosing the one which
has the lowest Standardized Residual Sum of Squares. This worked even
better in some cases and I think it's also conceptually simpler.

Yours,
Nick

On 13/12/11 16:53, Hans W Borchers wrote:
> Niklaus Fankhauser <niklaus.fankhauser <at> cell.biol.ethz.ch> writes:
>
>   
>> I'm using nls to fit periodic gene-expression data to sine waves. I need
>> to set the upper and lower boundaries, because I do not want any
>> negative phase and amplitude solutions. This means that I have to use
>> the "port" algorithm. The problem is, that depending on what start value
>> I choose for phase, the fit works for some cases, but not for others.
>> In the example below, the fit works using phase=pi,  but not using
>> phase=0. But there are many examples which fit just fine using 0.
>>
>> Is there a comparable alternative to nls that is not so extremely
>> influenced by the start values?
>>
>>     
> Use package `nls2' to first search on a grid, and then apply `nls' again
> to identify the globally best point:
>
>     # Data for example fit
>     afreq <- 1 / (24 / 2 / pi)
>     tpoints <- c(0,0,0,2,2,2,4,4,4,6,6,6,8,8,8,12,12,12,
>                  14,14,14,16,16,16,18,18,18,20,20,20,24,24,24)
>     gene_expression <-
>     c(1.551383, 1.671742, 1.549499, 1.694480, 1.632436, 1.471568, 1.623381,
>       1.579361, 1.809394, 1.753223, 1.685918, 1.754968, 1.963069, 1.820690,
>       1.985159, 2.205064, 2.160308, 2.120189, 2.194758, 2.165993, 2.189981,
>       2.098671, 2.122207, 2.012621, 1.963610, 1.884184, 1.955160, 1.801175,
>       1.829686, 1.773260, 1.588768, 1.563774, 1.559192)
>     shift=mean(gene_expression) # y-axis (expression) shift
>
>     # Grid search
>     library("nls2")
>     frml <- gene_expression ~ sin(tpoints * afreq + phase) * amp + shift
>     startdf <- data.frame(phase=c(0, 2*pi), amp = c(0, 2))
>     nls2(frml, algorithm = "grid-search", start = startdf,
>                control = list(maxiter=200))
>
>     # Perfect fit
>     startvals <- list(phase = 4.4880, amp = 0.2857)
>     sine_nls <- nls(frml, start=startvals)
>     #  phase    amp 
>     # 4.3964 0.2931 
>     # residual sum-of-squares: 0.1378
>
> Maybe this can be done in one step.
> Hans Werner
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



More information about the R-help mailing list