[R] Nonlinear Regression Parameter Shared Across Multiple DataSets

Keith Jewell k.jewell at campden.co.uk
Fri Oct 15 17:48:01 CEST 2010


Hi,

I've had to do something like that before. It seems to be a "feature" of nls 
(in R, but not as I recall in Splus) that it accepts a list with vector 
components as 'start' values, but flattens the result values to a single 
vector.
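
For illustration only, here is a minimal sketch of that behaviour on simulated
data. The data frame `dat`, the seed, and the "true" values are made up for
this example and are not from the original problem; the formula and start list
follow the ones quoted further down the thread.

```r
# Simulated data (hypothetical): three dose-response sets sharing one 'upper'
set.seed(1)
dat <- data.frame(X    = rep(seq(-10, -6, by = 0.5), times = 3),
                  dset = rep(1:3, each = 9))
trueLogEC50 <- c(-8.8, -8.4, -8.0)          # assumed values, one per data set
dat$Y <- 100 / (1 + 10^(dat$X - trueLogEC50[dat$dset])) +
  rnorm(nrow(dat), sd = 1)

# 'start' is a list whose LOGEC50 component is a vector of length 3
fit <- nls(Y ~ upper / (1 + 10^(X - LOGEC50[dset])), data = dat,
           start = list(upper = max(dat$Y), LOGEC50 = c(-8.5, -8.5, -8.5)))

# coef() comes back as one flat vector rather than a list like 'start';
# the thread reports the names as upper, LOGEC501, LOGEC502, LOGEC503
print(names(coef(fit)))
print(length(coef(fit)))
```

The point is only that the list structure supplied in 'start' is not preserved
in the fitted coefficients.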

I can't spend much time explaining, but here's a fragment of code that might 
get you started:
---------------------------
# Fit the nls, then restore the coefficient names lost by nls
val <- nls(formula = formulaIn, data = DataList, start = tCoefs,
           control = control)
CoefList <- list()                              # initialise CoefList
for (aName in names(tCoefs)) {                  # for each vector of coefficients
  tvec <- get(aName, envir = val$m$getEnv())    # get it from the nls environment
  names(tvec) <- names(tCoefs[[aName]])         # correct its names
  assign(aName, tvec, envir = val$m$getEnv())   # return it to nls
  CoefList[[aName]] <- tvec                     # store in CoefList
}
------------------------
As I recall:
tCoefs is a list of starting values, whose components may be vectors;
CoefList ends up as a named list of result values with the same structure.
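
A different route to the same kind of structure, sketched here under
assumptions, is to regroup the flat coefficient vector directly instead of
reading the nls environment. The numbers below are invented stand-ins for
coef(val), and the sketch assumes the flat vector is in the same order as
unlist(tCoefs).

```r
# Hypothetical starting values, shaped like tCoefs in the fragment above
tCoefs <- list(upper = 100, LOGEC50 = c(-8.5, -8.5, -8.5))

# Invented stand-in for coef(val): a flat vector with the flattened names
flat <- c(upper = 99.2, LOGEC501 = -8.8, LOGEC502 = -8.4, LOGEC503 = -8.0)

# Label each element with the component of tCoefs it belongs to
# (assumes 'flat' has the same order as unlist(tCoefs))
grp <- factor(rep(names(tCoefs), times = lengths(tCoefs)),
              levels = names(tCoefs))

# Split into a list with one entry per component of tCoefs
CoefList <- split(unname(flat), grp)

# Carry over any element names present in the starting values
CoefList <- Map(function(v, s) { names(v) <- names(s); v }, CoefList, tCoefs)

str(CoefList)
```

This only reshapes the values; unlike the fragment above, it does not write the
corrected names back into the fitted object.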

hth

Keith J

"Jared Blashka" <evilamarant7x at gmail.com> wrote in message 
news:AANLkTimPrOwM-mNE9N_Bu8tY=jYLiTQ4eWHPPH1HygAS at mail.gmail.com...
> Looking at the source for nlrob, it looks like it saves the coefficients
> from the results of running an nls and then passes those coefficients back
> into the next nls request. The issue that it's running into is that nls
> returns the coefficients as upper, LOGEC501, LOGEC502, and LOGEC503, rather
> than just upper and a vector named LOGEC50. Does anyone know a way to
> restructure the formula/start parameter so that coef returns a vector
> instead of each element individually? Right now, I've had to 'hack' nlrob so
> it recombines similarly named elements into a vector, but was wondering if
> there was a way to accomplish the end goal without those measures.
>
> Thanks,
> Jared
>
> On Wed, Oct 13, 2010 at 3:14 PM, Jared Blashka <evilamarant7x at gmail.com> wrote:
>
>> As an addendum to my question, I'm attempting to apply the solution to the
>> robust non-linear regression function nlrob from the robustbase package,
>> and it doesn't work in that situation. I'm getting
>>
>> allRobustFit <- nlrob(Y ~ (upper)/(1+10^(X-LOGEC50[dset])), data=all,
>>     start=list(upper=max(all$Y),LOGEC50=c(-8.5,-8.5,-8.5)))
>> Error in nls(formula, data = data, start = start, algorithm = algorithm, :
>>   parameters without starting value in 'data': LOGEC50
>>
>> I'm guessing this is because the nlrob function doesn't know what to do
>> with a vector for a start value. Am I correct, and is there another method
>> of using nlrob in the same way?
>>
>> Thanks,
>> Jared
>>
>> On Tue, Oct 12, 2010 at 8:58 AM, Jared Blashka <evilamarant7x at gmail.com> wrote:
>>
>>> Thanks so much! It works great.
>>>
>>> I had thought the way to do it relied on combining the data sets, but I
>>> couldn't figure out how to alter the formula to work with the combination.
>>>
>>> Jared
>>>
>>>
>>> On Tue, Oct 12, 2010 at 7:07 AM, Keith Jewell <k.jewell at campden.co.uk> wrote:
>>>
>>>>
>>>> "Jared Blashka" <evilamarant7x at gmail.com> wrote in message
>>>> news:AANLkTinFfMuDugqNkUDVr=FMf0wrRTsbjXJExuki_MRH at mail.gmail.com...
>>>> > I'm working with 3 different data sets and applying this non-linear
>>>> > regression formula to each of them.
>>>> >
>>>> > nls(Y ~ (upper)/(1+10^(X-LOGEC50)), data=std_no_outliers,
>>>> > start=list(upper=max(std_no_outliers$Y),LOGEC50=-8.5))
>>>> >
>>>> > Previously, all of the regressions were calculated in Prism, but I'd
>>>> > like to be able to automate the calculation process in a script, which
>>>> > is why I'm trying to move to R. The issue I'm running into is that
>>>> > previously, in Prism, I was able to calculate a shared value for a
>>>> > constraint so that all three data sets shared the same value, but have
>>>> > other constraints calculated separately. So Prism would figure out what
>>>> > single value for the constraint in question would work best across all
>>>> > three data sets. For my formula, each data set needs its own LOGEC50
>>>> > value, but the upper value should be the same across the 3 sets. Is
>>>> > there a way to do this within R, or with a package I'm not aware of, or
>>>> > will I need to write my own nls function to work with multiple data
>>>> > sets? I've got no idea where to start with that.
>>>> >
>>>> > Thanks,
>>>> > Jared
>>>> >
>>>> An approach which works for me (code below to illustrate principle, not
>>>> tried...)
>>>>
>>>> 1) combine all three "data sets" into one dataframe with a column (e.g.
>>>> dset) indicating data set (1, 2 or 3)
>>>>
>>>> 2) express your formula with upper as single valued and LOGEC50 as a
>>>> vector indexed by dset e.g.
>>>>     Y ~ upper/(1+10^(X-LOGEC50[dset]))
>>>>
>>>> 3) in the start list, make LOGEC50 a vector e.g. using -8.5 as start for
>>>> all three LOGEC50 values
>>>>   start = list(upper=max(std_no_outliers$Y), LOGEC50=c(-8.5, -8.5, -8.5))
>>>>
>>>> Hope that helps,
>>>>
>>>> Keith J
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>>
>>
>


