[Rd] suggested modification to the 'mle' documentation?

Fri Dec 7 16:29:45 CET 2007

Gabor Grothendieck wrote:
> On Dec 7, 2007 8:43 AM, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
>> On 12/7/2007 8:10 AM, Peter Dalgaard wrote:

>>>>
>>> This is at least cleaner than abusing the "fixed" argument. 

   Agreed.

>>> As you know,
>>> I have reservations, one of which is that it is not a given that I want
>>> it to behave just like other modeling functions, e.g. a likelihood
>>> function might refer to more than one data set, and/or data that are not
>>> structured in the traditional data frame format. The design needs more
>>> thought than just adding arguments.

  Fair enough.

>> We should allow more general things to be passed as data arguments in
>> cases where it makes sense.  For example a list with names or an
>> environment would be a reasonable way to pass data that doesn't fit into
>> a data frame.

  Well, my current design specifies a named list: I *think* (but am not
sure) it works gracefully with a data frame as well.  Hadn't thought of
environments -- I'm aiming this more at a lower-level user to whom that
wouldn't occur.  (But I hope it would be possible to design a system
that would be usable by intermediate users and still useful for experts.)

>>> I still prefer a design based on a plain likelihood function. Then we can
>>> discuss how to construct such a function so that  the data are
>>> incorporated in a flexible way.  

   My version still allows a plain likelihood function (I agree that
there will always be situations that are too complicated to encapsulate
as a formula).

>>> There are many ways to do this, I've
>>> shown one, here's another:
>>>
>>>> f <- function(lambda) -sum(dpois(x, lambda, log=TRUE))
>>>> d <- data.frame(x=rpois(10000, 12.34))
>>>> environment(f)<-evalq(environment(),d)
>> We really need to expand as.environment, so that it can convert data
>> frames into environments.  You should be able to say:
>>
>> environment(f) <- as.environment(d)
>>
>> and get the same result as
>>
>> environment(f)<-evalq(environment(),d)
>>
>> But I'd prefer to avoid the necessity for users to manipulate the
>> environment of a function.  

    HEAR, HEAR.

>> I think the pattern
>>
>> model( f, data=d )
>>
>> being implemented internally as
>>
>> environment(f) <- as.environment(d, parent = environment(f))
>>
>> is very nice and general.  It makes things like cross-validation,
>> bootstrapping, etc. conceptually cleaner:  keep the same
>> formula/function f, but manipulate the data and see what happens.
>> It does have problems when d is an environment that already has a
>> parent, but I think a reasonable meaning in that case would be to copy
>> its contents into a new environment with the new parent set.
>>

  OK.
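
  For concreteness, here is a minimal sketch of the model(f, data=d)
pattern described above. The helper name with_data is hypothetical, and
list2env() is used as a stand-in for the extended as.environment() being
proposed; this is an illustration under those assumptions, not how mle
is actually implemented.

```r
library(stats4)  # provides mle()

# Hypothetical helper (not part of stats4): returns a copy of f whose
# environment holds the components of `data`, with f's previous
# environment as the parent. `data` may be a data frame, a named list,
# or an environment (whose contents are copied into the new one).
with_data <- function(f, data) {
  environment(f) <- list2env(as.list(data), parent = environment(f))
  f
}

set.seed(1)
f <- function(lambda) -sum(dpois(x, lambda, log = TRUE))
d <- data.frame(x = rpois(100, 12.34))

# The user passes the function and the data; no explicit environment
# manipulation is needed at the call site.
fit <- mle(with_data(f, d), start = list(lambda = 10))
coef(fit)
```

  Because with_data() returns a modified copy, the same f can be reused
with resampled data, which is what makes bootstrapping and
cross-validation convenient under this pattern.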

>> Duncan Murdoch
> 
> Something close to that is already possible in proto, and it's cleaner in proto
> since the explicit environment manipulation is unnecessary: it occurs
> implicitly.
> 
> 1. In terms of data frame d from Peter Dalgaard's post, the code
> below is similar to my last post, but it replaces the explicit
> manipulation of f's environment with the creation of proto object
> p on the line marked ###.  That line converts d to an anonymous proto object
> containing the components of d, in this case just x, and then
> creates a child object p which can access x via delegation/inheritance.
> 
> library(proto)
> library(stats4)  # provides mle()
> set.seed(1)
> f <- function(lambda) -sum(dpois(x, lambda, log=TRUE))
> d <- data.frame(x=rpois(100, 12.34))
> p <- proto(as.proto(as.list(d)), f = f) ###
> mle(p[["f"]], start=list(lambda=10))
> 
> 2. Or the ### line could be replaced with the following line
> which places f and the components of d, in this case just x,
> directly into p:
> 
> p <- proto(f = f, envir = as.proto(as.list(d)))
> 
> again avoiding the explicit reset of environment(f) and the evalq.
> 
>>
>>>> mle(f, start=list(lambda=10))
>>> Call:
>>> mle(minuslogl = f, start = list(lambda = 10))
>>>
>>> Coefficients:
>>>  lambda
>>> 12.3402
>>>

 *** I still feel very strongly that end users shouldn't have
to deal with closures, environments, protos, etc. --  I want
mle to LOOK LIKE a standard modeling function if at all possible,
even if it can be used more creatively and flexibly by
those who know how. ***

>>> It is not at all an unlikely design to have mle() as a generic function
>>> which works on many kinds of objects, the default method being
>>> function(object,...) mle(minuslogl(object)) and minuslogl is an extractor
>>> function returning (tada!) the negative log likelihood function.

   Agreed.  This would work for formulas, too.
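
   A rough sketch of what that generic-plus-extractor design might look
like, using S3 for brevity. Everything here apart from stats4::mle is
hypothetical: the front end is called fit_mle to avoid masking mle, and
the pois_sample class is invented purely for illustration.

```r
library(stats4)  # provides mle()

# Hypothetical extractor generic: returns a negative log-likelihood function.
minuslogl <- function(object, ...) UseMethod("minuslogl")

# A plain function is taken to be a negative log-likelihood already.
minuslogl.function <- function(object, ...) object

# Hypothetical class holding a Poisson sample; the extractor builds a
# closure over the data, so the user never touches environments.
minuslogl.pois_sample <- function(object, ...) {
  x <- object$x
  function(lambda) -sum(dpois(x, lambda, log = TRUE))
}

# Hypothetical generic front end; the default method extracts the
# negative log-likelihood and hands it to stats4::mle.
fit_mle <- function(object, ...) UseMethod("fit_mle")
fit_mle.default <- function(object, ...) mle(minuslogl(object), ...)

set.seed(1)
s <- structure(list(x = rpois(100, 12.34)), class = "pois_sample")
fit <- fit_mle(s, start = list(lambda = 10))
coef(fit)
```

   A formula method for minuslogl would slot in the same way, without
changing the front end.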

  Have any of you guys looked at bbmle?  The evaluation stuff is
quite ugly, since I was groping around in the dark.  I would love
to clean it up in a way that made everyone happy (?) with it and
possibly allowed it to be merged back into mle.

   Ben
