[R] [FORGED] Regression with factors ?

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Wed Jul 13 20:09:35 CEST 2016


The formula interface as used in lm and nls searches for separate 
coefficients for each variable. It will take someone more clever than I 
to figure out how to get the formula interface to treat two variables 
as instances of one factor.
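
To illustrate the point about separate coefficients, here is a minimal
sketch on made-up data (the data frame toy and its values are
hypothetical, not taken from this thread). The same three persons appear
in both columns, yet lm estimates one set of coefficients for p1 and an
unrelated set for p2:

```r
# Hypothetical toy data: persons "a", "b", "c" can appear in either column
toy <- data.frame( y  = c( 1.0, 2.5, 0.5, 3.0, 1.5, 2.0 )
                 , p1 = c( "a", "a", "b", "b", "c", "c" )
                 , p2 = c( "b", "c", "a", "c", "a", "b" )
                 , stringsAsFactors = FALSE
                 )

# Each column is converted to its own factor with its own contrasts
fit <- lm( y ~ p1 + p2, data = toy )

# Person "b" gets two unrelated coefficients, one per column
coefnames <- names( coef( fit ) )
print( coefnames )
```

Nothing in the formula ties the coefficient for "b" in p1 to the
coefficient for "b" in p2, which is why a shared per-person value has to
be set up by hand.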

However, R can do nonlinear optimization just fine:

##------------
# as if read in using read.csv( fname, as.is=TRUE )
dta <- data.frame( y = observed_data$y
                  , p1 = as.character( observed_data$p1 )
                  , p2 = as.character( observed_data$p2 )
                  , stringsAsFactors = FALSE
                  )

lvls <- with( dta, unique( c( p1, p2 ) ) )
dta$p1f <- factor( dta$p1, levels = lvls )
dta$p2f <- factor( dta$p2, levels = lvls )

idxvmult <- length( lvls ) + 1L
idxvoffs <- length( lvls ) + 2L

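# all values in a numeric vector
# x = c( valice, vbob, ..., vmult, voffs )
# predicted y for every row of dta, given the parameter vector x
calcY <- function( x ) {
   vmult <- x[ idxvmult ]
   voffs <- x[ idxvoffs ]
   # indexing by a factor uses its integer codes, so each row
   # picks up the value belonging to that person
   vp1 <- x[ dta$p1f ]
   vp2 <- x[ dta$p2f ]
   vmult * ( voffs - ( vp1 - vp2 )^2 )
}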

# objective for optim: sum of squared residuals
optfcn <- function( x ) {
   sum( ( dta$y - calcY( x ) ) ^ 2 )
}

oresult <- optim( par = rep( 1, idxvoffs ), optfcn)

result <- list( multiplier = oresult$par[ idxvmult ]
               , offset = oresult$par[ idxvoffs ]
               , values = data.frame( lvls = lvls
                                    , values = oresult$par[ seq_along( lvls ) ]
                                    )
               )
result

#---------
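
The code above assumes observed_data already exists. As a rough,
self-contained sketch for trying the approach, the following simulates
a small data set and runs the same kind of fit. The names, abilities,
multiplier, offset, noise level and seed are all made up for
illustration, and the starting values are an arbitrary choice:

```r
set.seed( 42 )  # arbitrary seed, only for reproducibility

# Hypothetical "true" parameters that the fit is supposed to reveal
persons <- data.frame( name = c( "alice", "bob", "carol", "dave" )
                     , ability = c( 1, 2, 3.5, 5 )
                     , stringsAsFactors = FALSE
                     )
true_mult <- 2
true_offs <- 30

# Every ordered pair of distinct persons, each observed three times
pairs <- expand.grid( p1 = persons$name, p2 = persons$name
                    , stringsAsFactors = FALSE )
pairs <- pairs[ pairs$p1 != pairs$p2, ]
observed_data <- pairs[ rep( seq_len( nrow( pairs ) ), times = 3 ), ]
ab1 <- persons$ability[ match( observed_data$p1, persons$name ) ]
ab2 <- persons$ability[ match( observed_data$p2, persons$name ) ]
observed_data$y <- true_mult * ( true_offs - ( ab1 - ab2 )^2 ) +
                   rnorm( nrow( observed_data ), sd = 0.5 )

# One shared set of levels for both columns
lvls <- unique( c( observed_data$p1, observed_data$p2 ) )
i1 <- as.integer( factor( observed_data$p1, levels = lvls ) )
i2 <- as.integer( factor( observed_data$p2, levels = lvls ) )
idxvmult <- length( lvls ) + 1L
idxvoffs <- length( lvls ) + 2L

# x = c( one value per person, vmult, voffs )
calcY <- function( x ) {
   x[ idxvmult ] * ( x[ idxvoffs ] - ( x[ i1 ] - x[ i2 ] )^2 )
}
optfcn <- function( x ) sum( ( observed_data$y - calcY( x ) )^2 )

# Distinct starting values per person (an arbitrary choice)
start <- c( seq_along( lvls ), 1, 1 )
oresult <- optim( par = start, fn = optfcn )

# Compare the objective before and after, and check the convergence code
c( start = optfcn( start ), fitted = oresult$value )
oresult$convergence
```

A nonzero oresult$convergence means optim stopped before meeting its
convergence criterion (for example by reaching the iteration limit), in
which case raising control = list( maxit = ... ) or trying another
method is worth a look.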

I highly recommend reading the help page for optim and the CRAN Task View 
on optimization [1].

[1] https://cran.r-project.org/web/views/Optimization.html
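
One property of the model vmult * ( voffs - ( vp1 - vp2 )^2 ) is worth
keeping in mind when reading the fitted values: only differences between
person values enter, so several different parameter sets give identical
predictions. The sketch below checks this numerically; the function
calc_y and all numbers in it are made up for illustration:

```r
# Stand-alone version of the model, with hypothetical inputs
calc_y <- function( v1, v2, vmult, voffs ) {
   vmult * ( voffs - ( v1 - v2 )^2 )
}

v1 <- c( 1, 2, 3.5 )
v2 <- c( 2, 5, 1 )
base <- calc_y( v1, v2, vmult = 2, voffs = 30 )

# Adding the same constant to every person value changes nothing
shifted <- calc_y( v1 + 10, v2 + 10, vmult = 2, voffs = 30 )

# Flipping the sign of every person value changes nothing
flipped <- calc_y( -v1, -v2, vmult = 2, voffs = 30 )

# Scaling person values by k is absorbed by vmult / k^2 and voffs * k^2
k <- 3
rescaled <- calc_y( k * v1, k * v2, vmult = 2 / k^2, voffs = 30 * k^2 )

same <- c( shifted  = isTRUE( all.equal( base, shifted ) )
         , flipped  = isTRUE( all.equal( base, flipped ) )
         , rescaled = isTRUE( all.equal( base, rescaled ) )
         )
print( same )
```

So the per-person values returned by optim are meaningful only up to a
shift, a sign flip and a rescaling, unless something is pinned down, for
instance by fixing one person's value and the multiplier.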

On Wed, 13 Jul 2016, stn021 wrote:

>> Is this what is intended?
>>
>>> observed_data$p1ab <- persons$ability[ match(observed_data$p1, persons$name) ]
>>> observed_data$p2ab <- persons$ability[ match(observed_data$p2, persons$name) ]
>
>
> Hello David,
>
> thank you for your answer.
>
>
> The code in my previous post was intended as an answer to the question
> in an earlier post about example-data, quote:
>
>>> Would you like me to make a complete example dataset with more records and noise ?
>> Yes. And preferably do it with R code.
>
> I should have re-stated this connection in the post.
>
>
> The code generates a matrix 'observed_data' which is the data the
> experimenter would get during the experiment.
>
> This matrix is output in the last line. All other output is only meant
> to document the generation-process.
>
> So the only thing visible to the experimenter before analysis is
> exactly that matrix 'observed_data'  (usually in the form of some
> written documentation which is later entered into statistical
> software). Everything before that last line simulates those unknown
> parameters that the experiment is supposed to reveal.
>
> The unknown parameters are specifically
> - the matrix 'persons'
> - and the variable 'multiplyer'
>
> Both are supposed to be revealed by the analysis. p1ab and p2ab would
> therefore depend on the unknown parameters and could not be added to
> 'observed_data' before the analysis.
>
> Sorry again for omitting the back-reference.
>
>
> I would like to know:
>
> - how to get R to use p1 and p2 as levels of the same factor
> (=persons) instead of levels of two different factors.
>
> - how to get R to multiply the numerical levels of factors during the
> search for the solution. Factors cannot be multiplied before running
> lm() or some other package because before the analysis their numerical
> values are not known.
>
>
> THX, Stefan
>

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k


