[Rd] Residual DF of NLS models (PR#14194)

Henrik Aalborg Nielsen han at enfor.dk
Tue Jan 26 14:21:15 CET 2010


OK - I do think the bug is pretty obvious when looking at the code (and
remembering that resid(object) may contain NA's). Anyway, I've attached
a script and the output (from 'R CMD BATCH --vanilla').

BR
Henrik


On Tue, 2010-01-26 at 12:04 +0000, Prof Brian Ripley wrote:

> Can we please have a reproducible example (as we did ask in the 
> FAQ, the posting guide ...).
> 
> On Tue, 26 Jan 2010, han at enfor.dk wrote:
> 
> > Full_Name: Henrik Aalborg Nielsen
> > Version: 2.10
> > OS: Linux (SLES 10 / openSUSE 11.1)
> > Submission from: (NULL) (77.66.63.89)
> >
> >
> > There seems to be a bug in df.residual.nls which is triggered when nls is called
> > with argument na.action = na.exclude; in that case 'resid(object)' will contain
> > NA-values which should be disregarded when counting the number of residuals:
> >
> > df.residual.nls <- function(object, ...) {
> >    w <- object$weights
> >    n <- if(!is.null(w)) sum(w != 0) else length(resid(object))
> >    n - length(coef(object))
> > }
> >
> > The bug cause the F-test of anova.nls to be wrong.
> >
> > Replace 'length(resid(object))' with 'sum(!is.na(resid(object)))' ?
> >
> > ... and thank you for producing this fantastic software!
> >
> > BR
> > Henrik
> >
> > ______________________________________________
> > R-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> 
-------------- next part --------------

R version 2.10.1 (2009-12-14)
Copyright (C) 2009 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> ## Example demonstating bug PR#14194 (based on the help-page of nls)
> 
> utils::data(muscle, package = "MASS")
> 
> musc.1 <- nls(Length ~ cbind(1, exp(-Conc/th)), muscle,
+               start = list(th=1), algorithm="plinear")
> 
> b <- coef(musc.1)
> musc.2 <- nls(Length ~ a[Strip] + b[Strip]*exp(-Conc/th),
+               muscle,
+               start = list(a=rep(b[2],21), b=rep(b[3],21), th=b[1]))
> 
> ## Adding data with response 'NA'
> muscle2 <- muscle
> muscle2[,"Length"] <- NA
> muscle2 <- rbind(muscle, muscle2)
> 
> musc2.1 <- nls(Length ~ cbind(1, exp(-Conc/th)), muscle2, na.action=na.exclude,
+               start = list(th=1), algorithm="plinear")
> 
> b <- coef(musc2.1)
> musc2.2 <- nls(Length ~ a[Strip] + b[Strip]*exp(-Conc/th),
+               muscle2, na.action=na.exclude,
+               start = list(a=rep(b[2],21), b=rep(b[3],21), th=b[1]))
> 
> ## Thes two ANOVA's should be identical:
> anova(musc.1, musc.2)
Analysis of Variance Table

Model 1: Length ~ cbind(1, exp(-Conc/th))
Model 2: Length ~ a[Strip] + b[Strip] * exp(-Conc/th)
  Res.Df Res.Sum Sq Df Sum Sq F value    Pr(>F)    
1     57    1241.09                                
2     17      21.13 40 1220.0  24.538 2.721e-09 ***
---
Signif. codes:  0 ???***??? 0.001 ???**??? 0.01 ???*??? 0.05 ???.??? 0.1 ??? ??? 1 
> anova(musc2.1, musc2.2)
Analysis of Variance Table

Model 1: Length ~ cbind(1, exp(-Conc/th))
Model 2: Length ~ a[Strip] + b[Strip] * exp(-Conc/th)
  Res.Df Res.Sum Sq Df Sum Sq F value    Pr(>F)    
1    117    1241.09                                
2     77      21.13 40 1220.0  111.14 < 2.2e-16 ***
---
Signif. codes:  0 ???***??? 0.001 ???**??? 0.01 ???*??? 0.05 ???.??? 0.1 ??? ??? 1 
> ## ... but the F-values are different.
> 
> ## DF:
> df.residual(musc.1)
[1] 57
> df.residual(musc2.1)
[1] 117
> length(residuals(musc2.1))
[1] 120
> sum(!is.na(residuals(musc2.1))) - length(coef(musc2.1))
[1] 57
> 
> ## Workaround:
> anova(update(musc2.1, na.action=na.omit), update(musc2.2, na.action=na.omit))
Analysis of Variance Table

Model 1: Length ~ cbind(1, exp(-Conc/th))
Model 2: Length ~ a[Strip] + b[Strip] * exp(-Conc/th)
  Res.Df Res.Sum Sq Df Sum Sq F value    Pr(>F)    
1     57    1241.09                                
2     17      21.13 40 1220.0  24.538 2.721e-09 ***
---
Signif. codes:  0 ???***??? 0.001 ???**??? 0.01 ???*??? 0.05 ???.??? 0.1 ??? ??? 1 
> 
> proc.time()
   user  system elapsed 
  0.376   0.028   0.398 


More information about the R-devel mailing list