[R] Identifying breakpoints/inflection points?

Bert Gunter gunter.berton at gene.com
Tue Apr 27 19:21:27 CEST 2010


... but you should be warned that this is an inherently difficult issue. You
are trying to estimate a second derivative from noisy data. The result is
likely to be **very** dependent on the fitting methods and parameters chosen
(e.g. "span" of a kernel smoother), even if the fit itself is fairly robust.

I suggest you try some sensitivity analyses and/or bootstrapping of your
results if the software does not already provide uncertainty estimates.

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
 
 
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Clint Bowman
Sent: Tuesday, April 27, 2010 9:32 AM
To: Charlotte Chang
Cc: r-help at r-project.org
Subject: Re: [R] Identifying breakpoints/inflection points?

Charlotte,

Try:

library(msProcess)  # you may have to install msProcess
year[peaks(birds.pr$fit)]

-- 
Clint Bowman			INTERNET:	clint at ecy.wa.gov
Air Quality Modeler		INTERNET:	clint at math.utah.edu
Department of Ecology		VOICE:		(360) 407-6815
PO Box 47600			FAX:		(360) 407-7534
Olympia, WA 98504-7600

On Mon, 26 Apr 2010, Charlotte Chang wrote:

> Hi Clint,
>
> Thank you for your help with the code. The span recommendation really
> improved the fit of my LOESS curve. I appreciate your thoughtful
> assistance!
>
> My remaining question is how could I go about identifying the
> inflection points for the LOESS curve? I was thinking about trying to
> find the 2nd derivative and then using the uniroot function.
>
> My code is here (but it's buggy and doesn't work):
>
> birds.lo<-loess.smooth(x,y,span=0.45)
> d2 <- function(x) {
> 	predict(birds.lo, x, deriv=2)$y
> }
> x<-year
> y<-piproute
>
>> d2(x)
> Error in predict(birds.lo, x, deriv = 2)$y :
>  $ operator is invalid for atomic vectors
>
> #Desired next step:
> uniroot(d2,c(7,10))
>
> Any ideas about this would be profoundly appreciated! I'm hitting a dead
end.
>
> Yours,
>
> Charlotte
>
> On Mon, Apr 26, 2010 at 3:32 PM, Clint Bowman <clint at ecy.wa.gov> wrote:
>> Charlotte,
>>
>> Try:
>>
>> birds.lo <- loess(piproute~year,span=.25)
>> # play with span to see your desired pattern
>> birds.pr<-predict(birds.lo, data.frame(year = seq(1967, 2009, 1)), se =
>> FALSE)
>> #
>> plot($year,birds.pr$fit,ylim=c(0,5))
>> par(new=T)
>> plot(year,birds.pr$fit,pch="+",col=2,ylim=c(0,5))
>>
>>
>> --
>> Clint Bowman                    INTERNET:       clint at ecy.wa.gov
>> Air Quality Modeler             INTERNET:       clint at math.utah.edu
>> Department of Ecology           VOICE:          (360) 407-6815
>> PO Box 47600                    FAX:            (360) 407-7534
>> Olympia, WA 98504-7600
>>
>> On Mon, 26 Apr 2010, Charlotte Chang wrote:
>>
>>> Hello!
>>> I have a dataset with the following two vectors:
>>>
>>>
>>>
year<-c(1967,1968,1969,1970,1971,1972,1973,1974,1975,1976,1977,1978,1979,198
0,1981,1982,1983,1984,1985,1986,1987,1988,1989,1990,1991,1992,1993,1994,1995
,1996,1997,1998,1999,2000,2001,2002,2003,2004,2005,2006,2007,2008,2009)
>>>
>>>
>>>
piproute<-c(0.733333333,0.945945946,1.886363636,1.607843137,4.245614035,3.17
5675676,2.169014085,2,2.136363636,2.65625,2.080645161,2.114754098,2.09090909
1,3.012195122,2.935897436,2.592105263,1.075757576,1.210526316,1,1.1875,1.903
614458,1.385542169,1.788990826,1.163793103,1.558558559,1.595238095,1.7583333
33,1.858267717,2.169117647,1.403225806,2.859375,3.236220472,2.054263566,3.85
4166667,1.812080537,2.708029197,2.75862069,2.625954198,4.540740741,3.6865671
64,2.8,2.968253968,3.517730496)
>>>
>>> Pipits is the response variable (it is the number of birds counted at
>>> each survey site in each year) and year is the independent variable.
>>> If you plot it in R (plot(year,piproute,pch=19)), you'll see that the
>>> relationship looks like a quintic polynomial.
>>>
>>> Initially I was trying to fit this curve using an iterative equation,
>>> but it's not working. I suspect that the curve-fitting equation itself
>>> is inappropriate (it's a modified version of the logistic growth
>>> equation). Now what I'd like to do is identify the 3 break/inflection
>>> points in the population trend. That way, I can make an argument that
>>> the break points corresponded to shifts in government policy with
>>> respect to land use management. I've been looking at the segmented
>>> package, and initially I looked at change.pt test in the circ.stats
>>> package (which is inappropriate b/c my data is not amenable to
>>> circular statistical analysis). Any ideas on what I could do would be
>>> appreciated!
>>>
>>> Thank you!
>>>
>>> -Charlotte
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>



More information about the R-help mailing list