[Rd] arr.ind argument to which.min and which.max

Martin Maechler maechler at stat.math.ethz.ch
Tue Jul 6 11:53:52 CEST 2010


>>>>> "HenrikB" == Henrik Bengtsson <hb at stat.berkeley.edu>
>>>>>     on Mon, 5 Jul 2010 22:53:59 +0200 writes:

    HenrikB> ...and, of course, just after sending it off I found out that from R
    HenrikB> v2.11.0 there is now an arrayInd() in the 'base' package doing exactly
    HenrikB> the same thing.  See help(arrayInd).

yes... and if you wondered *who* re-factored which() into its
internal part and the arrayInd() part .....
.....
had not known about the R.utils function.


[............]

    >> On Mon, Jul 5, 2010 at 8:27 PM, Patrick Burns <pburns at pburns.seanet.com> wrote:
    >>> On 05/07/2010 10:56, Martin Maechler wrote:
    >>>>>>>>> 
    >>>>>>>>> "PatB" == Patrick Burns<pburns at pburns.seanet.com>
    >>>>>>>>>     on Sun, 04 Jul 2010 09:43:44 +0100 writes:
    >>>> 
    >>>>     PatB>  Is there a reason that 'which.min' and
    >>>>     PatB>  'which.max' don't have an 'arr.ind'
    >>>>     PatB>  argument?
    >>>> 
    >>>> well,  help(which.min)  tells you that they really were aimed at
    >>>> doing their job *fast* for vectors.
    >>>> 
    >>>> Of course you are right and a generalization to arrays might be
    >>>> convenient at times.
    >>>> 
    >>>>     PatB>  The context in which I wanted that was
    >>>>     PatB>  a grid search optimization, which seems
    >>>>     PatB>  like it would be reasonably common to me.
    >>>> 
    >>>> well, as the author of these two functions, I can only say
    >>>> 
    >>>>       "patches are welcome!"
    >>>> 
    >>>> and I think it should be pretty simple, right ?
    >>>> You just have to do very simple remapping of the 1d index 'i' back
    >>>> to the array index, i.e., the same operation
    >>>> you need to transform seconds into days:hours:minutes:seconds
    >>>> {and yes, we old-timers may recall that APL had an operator (I
    >>>>   think "T-bar") to do that ...}
    >>> 
    >>> I think the exercise is just to copy the definition of
    >>> 'which' and add four characters.

Well, yes.  But then, one reason for refactoring 'which' into its
vector and arrayInd() part was that people could use arrayInd()
on its own.
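
For readers who have not looked inside arrayInd(): the remapping it performs
is the mixed-radix decomposition mentioned in the quoted text (the same
arithmetic as turning seconds into days:hours:minutes:seconds). The following
is only a minimal sketch under that description; the helper name idx_to_sub
is hypothetical and this is not the base R source.

```r
# Hypothetical helper (NOT the base R implementation): map 1-based linear
# indices into an array with dimensions 'dims' back to array subscripts.
idx_to_sub <- function(i, dims) {
  i0 <- i - 1L                       # switch to 0-based for the arithmetic
  out <- matrix(0L, nrow = length(i), ncol = length(dims))
  for (k in seq_along(dims)) {
    out[, k] <- i0 %% dims[k] + 1L   # "digit" for dimension k, back to 1-based
    i0 <- i0 %/% dims[k]             # carry the quotient to the next dimension
  }
  out
}

a <- array(seq_len(24), dim = c(2, 3, 4))
s <- idx_to_sub(17L, dim(a))
a[s]   # indexing by a one-row matrix selects the element at linear index 17
```

The first dimension varies fastest, matching R's column-major storage, which
is why the loop peels off dims[1] first.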

Wouldn't it make more sense to call

   arrayInd(which.min(mat), dim(mat))

instead of

   which.min(mat, arr.ind = TRUE)

in the spirit of modularity, maintainability, ... ?
Honestly, in my first reply I had forgotten about my own
arrayInd() modularization.... 
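
As an illustration of the arrayInd(which.min(mat), dim(mat)) idiom in the
grid-search setting PatB describes, here is a small sketch. The objective
function and the grid values are made up for the example; only arrayInd(),
which.min(), outer() and dim() are real base R functions.

```r
# Toy objective on a 2-d parameter grid (values chosen only for illustration)
a_grid <- seq(-2, 2, by = 0.5)
b_grid <- seq(0, 3, by = 0.5)
obj <- outer(a_grid, b_grid, function(a, b) (a - 1)^2 + (b - 2)^2)

# which.min() gives the linear index; arrayInd() maps it to (row, column)
pos <- arrayInd(which.min(obj), dim(obj))

# Translate the subscripts back to the parameter values on the grid
best <- c(a = a_grid[pos[1, 1]], b = b_grid[pos[1, 2]])
best
```

Note that which.min() reports only the first minimum, so ties on the grid
are not visible in 'pos'.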

    >>> If the order of the if condition were reversed, then
    >>> possibly the slight reduction in speed of 'which.min'
    >>> and 'which.max' would be more than made up for in the
    >>> slight increase in speed of 'which'.

thanks for the hint, but

      "increase in speed of 'which'"  -- really, can you measure that?

(I'll reverse the order anyway)

If we are interested in speed increase, we should add an option
to *not* work with dimnames at all (*) and if we have programmer
time left, we could take it to .Internal() and get a real
boost... not now though. 

(*) I'm doing that for now, *and* I would like to change the
    default behavior of arrayInd(), but of course *not* the
    default behavior of which(),
    to *not* attach dimnames to the result, by default.

  I.e., I'm proposing to add   'useNames = FALSE' as argument to
  arrayInd() but have  which() call arrayInd(..., useNames=TRUE).

  This is a backward-incompatible change in arrayInd() -- which has
  existed only since 2.11.0 anyway, so would seem ok, to me.

  Opinions ?
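
  To make the difference concrete, here is a sketch that assumes an R
  version in which arrayInd() accepts the 'useNames' argument proposed
  above (it will not run as written on 2.11.0). The matrix and its
  dimnames are invented for the example.

```r
m <- matrix(c(5, 2, 9, 1, 7, 3), nrow = 2,
            dimnames = list(c("r1", "r2"), c("A", "B", "C")))
i <- which.min(m)   # linear index of the smallest element

# Without names: a plain matrix of subscripts, no dimnames attached
plain <- arrayInd(i, dim(m))

# With names: the row label is looked up in dimnames(m), which is the
# behavior which(arr.ind = TRUE) would keep under the proposal
named <- arrayInd(i, dim(m), dimnames(m), useNames = TRUE)

plain
named
```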

--
Martin


