[R] strucchange Fstats() example

Wed May 30 08:22:58 CEST 2012

On Tue, 29 May 2012, Mabille, Geraldine wrote:

<snip>

> In the second example, the authors state the presence of "at least" two 
> breakpoints. When plotting the F-statistics using the following code, we see 
> indeed two peaks in the F-statistics, that coincide with the dates given by 
> the authors: ca. 1973 and 1983, but when trying to add those breakpoints to 
> the time series, only one is taken into account

The breakpoints() method for "Fstats" objects can only extract a single 
breakpoint. The reason is that maximizing the F statistics is equivalent to 
minimizing the residual sum of squares of a model with a single breakpoint. If 
you want to estimate more than a single breakpoint, you need to minimize the 
corresponding segmented sums of squares. This can be done with the formula 
method of breakpoints(), see ?breakpoints.
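A minimal sketch of the difference between the two methods. It uses a simulated series with two mean shifts rather than the seatbelt data, so the series, the seed, and the object names (bp_fs, bp_full, bp2) are assumptions for illustration only:

```r
library(strucchange)

## Simulated series (an assumption, not the seatbelt data):
## the mean shifts after observations 50 and 100
set.seed(1)
y <- ts(c(rnorm(50, mean = 0), rnorm(50, mean = 5), rnorm(50, mean = 0)))

## F statistics: the "Fstats" method returns only the single argmax
fs <- Fstats(y ~ 1)
bp_fs <- breakpoints(fs)

## Formula method: minimizes the segmented RSS for up to 2 breaks
bp_full <- breakpoints(y ~ 1, breaks = 2)
summary(bp_full)

## Extract the 2-break partition and add it to the plot
bp2 <- breakpoints(bp_full, breaks = 2)
plot(y)
lines(bp2)
```

For a regression such as the seatbelt model, the same formula that was passed to Fstats() would go into breakpoints(), together with the data argument.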

More specifically: In your example with breakpoints(fs, breaks = 2), the 
breaks argument is simply ignored. The method for "Fstats" objects does not 
have a breaks argument, so the value is silently absorbed by ...
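This can be checked directly. The sketch below prepares the seatbelt series in the way I recall from the example on ?Fstats (treat the exact preparation steps as an assumption and compare with the manual page) and then calls the method with and without breaks:

```r
library(strucchange)

## Seatbelt data, prepared along the lines of the ?Fstats example
data("UKDriverDeaths")
seatbelt <- log10(UKDriverDeaths)
seatbelt <- cbind(seatbelt, lag(seatbelt, k = -1), lag(seatbelt, k = -12))
colnames(seatbelt) <- c("y", "ylag1", "ylag12")
seatbelt <- window(seatbelt, start = c(1970, 1), end = c(1984, 12))

fs <- Fstats(y ~ ylag1 + ylag12, data = seatbelt, from = 0.1)

## The "Fstats" method only has the arguments (obj, ...), so 'breaks'
## ends up in '...' and has no effect on the result
bp_a <- breakpoints(fs)
bp_b <- breakpoints(fs, breaks = 2)
identical(bp_a$breakpoints, bp_b$breakpoints)
```

Both calls return the same single breakpoint, which is why only one line appears in the plot.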

> We see that even though the F-statistics seem to show the existence of 2 
> breakpoints, only one is detected by the breakpoints() function. Does anyone 
> know how this is possible? I'm totally new to strucchange so it might well be 
> something obvious I'm missing here!

Please have a closer look at the package's documentation and the corresponding 
papers. See citation("strucchange") for the most important references and the 
corresponding manual pages for more details. For the breakpoints issue you 
should probably start by reading the CSDA paper.

> OTHER SIDE QUESTION: can strucchange be used if the y variable is binary???

Testing for breakpoints can be done with the function gefp(). See its manual 
pages for references and details. The manual page just has a Poisson GLM 
example but the corresponding papers (in Stat Neerl and CSDA) also have binary 
response examples.
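As a rough sketch of what such a test could look like for a binary response: the data below are simulated (an assumption, not taken from the papers), and an intercept-only logit model is used so that the test looks at changes in the success probability.

```r
library(strucchange)

## Simulated binary series (an assumption): the success probability
## changes from 0.2 to 0.8 after observation 100
set.seed(1)
d <- data.frame(y = c(rbinom(100, 1, 0.2), rbinom(100, 1, 0.8)))

## Score-based fluctuation process for an intercept-only logit model
scus <- gefp(y ~ 1, fit = glm, family = binomial, data = d)

plot(scus)           # process with boundaries for the default functional
st <- sctest(scus)   # default functional: maxBB (double maximum test)
st
```

Regressors can be added to the formula in the usual glm() way, and order.by can be used if the observations should be ordered by some variable other than their position in the data.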

If you have a binary response and just want to test whether the proportion of 
successes changes across "time" (or some other variable of interest), then 
maxstat_test() from package "coin" might be an interesting nonparametric 
alternative.
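A minimal sketch of that alternative, again on simulated data (the data frame d and its columns time and y are hypothetical names chosen for illustration):

```r
library(coin)

## Simulated binary response (an assumption): the success probability
## changes from 0.2 to 0.8 after time 100
set.seed(1)
d <- data.frame(
  time = 1:200,
  y = factor(c(rbinom(100, 1, 0.2), rbinom(100, 1, 0.8)))
)

## Maximally selected statistic over all cutpoints in 'time'
mt <- maxstat_test(y ~ time, data = d)
mt
pvalue(mt)
```

The printed result also reports the estimated cutpoint in time.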

hth,
Z