[Rd] subset argument in nls() and possibly other functions

Balasubramanian Narasimhan n@r@@ @end|ng |rom @t@n|ord@edu
Thu Jul 15 00:16:17 CEST 2021


For the example provided below, the subsetting happens when nls() evaluates 
the call to stats::model.frame on line 583 of nls.R 
(https://github.com/wch/r-source/blob/e91be22f6f37644e5a0ba74a3dfe504a3a29e9f7/src/library/stats/R/nls.R#L583), 
which returns an appropriately subsetted data frame.
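
To make the mechanism concrete, here is a minimal sketch of the pattern as I
read it: capture the call, drop the arguments that model.frame() does not
know about, swap in stats::model.frame as the function, and evaluate the
result in the caller's frame. The helper name mf_sketch is hypothetical and
the body is a simplification for illustration, not the actual nls() code. It
assumes Cdata and Csubset are defined at top level, as in the example below.

```r
# Data from the quoted example
xdata <- c(-2, -1.64, -1.33, -0.7, 0, 0.45, 1.2, 1.64, 2.32, 2.9)
ydata <- c(0.699369, 0.700462, 0.695354, 1.03905, 1.97389,
           2.41143, 1.91091, 0.919576, -0.730975, -1.42001)
Cdata <- data.frame(xdata, ydata)
Csubset <- 1:8  # just the first 8 points

# Hypothetical helper (not part of stats) imitating the call rewriting
mf_sketch <- function(formula, data, subset, start) {
  mf <- match.call()
  # Keep only the variables found in the data (drops parameters p1, p2)
  vars <- intersect(all.vars(formula), names(data))
  # Replace the model formula by a one-sided linear formula in those variables
  mf$formula <- as.formula(paste("~", paste(vars, collapse = "+")),
                           env = environment(formula))
  mf$start <- NULL                       # model.frame() has no use for this
  mf[[1L]] <- quote(stats::model.frame)  # the call is now a model.frame() call
  eval.parent(mf)                        # 'subset' is applied in here
}

mf <- mf_sketch(ydata ~ p1*cos(p2*xdata) + p2*sin(p1*xdata),
                data = Cdata, subset = Csubset,
                start = list(p1 = 1, p2 = 0.2))
nrow(mf)  # number of rows that survive the subsetting
```

Because the subset argument is simply passed along inside the captured call,
searching the nls() sources for an explicit use of "subset" finds nothing;
the work is done by model.frame().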

-Naras

On 7/13/21 4:21 PM, J C Nash wrote:
> In mentoring and participating in a Google Summer of Code project "Improvements to nls()",
> I've not found examples of use of the "subset" argument in the call to nls(). Moreover,
> in searching through the source code for the various functions related to nls(), I can't
> seem to find where subset is used, but a simple example, included below, indicates it works.
> Three approaches all seem to give the same results.
>
> Can someone point to documentation or code so we can make sure we get our revised programs
> to work properly? The aim is to make them more maintainable and provide maintainer documentation,
> along with some improved functionality. We seem, for example, to already be able to offer
> analytic derivatives where they are feasible, and should be able to add Marquardt-Levenberg
> stabilization as an option.
>
> Note that this "subset" does not seem to be the "subset()" function of R.
>
> John Nash
>
> # CroucherSubset.R -- https://walkingrandomly.com/?p=5254
>
> xdata = c(-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9)
> ydata = c(0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001)
> Cform <- ydata ~ p1*cos(p2*xdata) + p2*sin(p1*xdata)
> Cstart<-list(p1=1,p2=0.2)
> Cdata<-data.frame(xdata, ydata)
> Csubset<-1:8 # just first 8 points
>
> # Original problem - no subset
> fit0 = nls(ydata ~ p1*cos(p2*xdata) + p2*sin(p1*xdata), data=Cdata, start=list(p1=1,p2=.2))
> summary(fit0)
>
> # via subset argument
> fit1 = nls(ydata ~ p1*cos(p2*xdata) + p2*sin(p1*xdata), data=Cdata, start=list(p1=1,p2=.2), subset=Csubset)
> summary(fit1)
>
> # via explicit subsetting
> Csdata <- Cdata[Csubset, ]
> Csdata
> fit2 = nls(ydata ~ p1*cos(p2*xdata) + p2*sin(p1*xdata), data=Csdata, start=list(p1=1,p2=.2))
> summary(fit2)
>
> # via weights -- seems to give correct observation count if zeros not recognized
> wts <- c(rep(1,8), rep(0,2))
> fit3 = nls(ydata ~ p1*cos(p2*xdata) + p2*sin(p1*xdata), data=Cdata, weights=wts, start=list(p1=1,p2=.2))
> summary(fit3)
>


