[Rd] head.matrix can return 1000s of columns -- limit to n or add new argument?

Michael Chirico m|ch@e|ch|r|co4 @end|ng |rom gm@||@com
Sun Sep 15 14:52:34 CEST 2019


Finally read your response in detail, Gabe. Looks great; I agree it's
quite intuitive, and I also agree with not recycling n.

Once the length(n) == length(dim(x)) behavior is enabled, I don't think
there's any need/desire to have head() do x[1:6,1:6] anymore. head(x, c(6,
6)) is quite clear for those familiar with head(x, 6), it would seem to me.
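
To make that concrete, here is a minimal sketch of the per-dimension
behavior, assuming n is never recycled. head_nd is a hypothetical name used
only for illustration; it is not an existing function, and it only mirrors
the idea in Gabe's prototype quoted below.

```r
# Hypothetical helper (illustration only, not an existing function): take the
# leading n[i] indices along dimension i. Dimensions beyond length(n) are
# kept whole, so n is never recycled.
head_nd <- function(x, n = 6L) {
    d <- dim(x)
    stopifnot(!is.null(d), length(n) <= length(d))
    idx <- lapply(seq_along(d), function(i) {
        if (i > length(n)) return(seq_len(d[i]))   # no recycling of n
        ni <- n[i]
        # negative n[i] means "all but the last |n[i]|" along dimension i;
        # positive n[i] is clamped to the extent of that dimension
        ni <- if (ni < 0L) max(d[i] + ni, 0L) else min(ni, d[i])
        seq_len(ni)
    })
    do.call(`[`, c(list(x), idx, list(drop = FALSE)))
}

x <- matrix(seq_len(200), nrow = 10, ncol = 20)
dim(head_nd(x))            # 6 20: a scalar n still limits rows only
dim(head_nd(x, c(6, 6)))   # 6 6
identical(head_nd(x, c(6, 6)), x[1:6, 1:6, drop = FALSE])   # TRUE

small <- matrix(1:12, nrow = 3)   # 3 x 4, fewer than 6 rows
dim(head_nd(small, c(6, 6)))      # 3 4: clamped to the available extent
```

The last call is where this differs from plain indexing: small[1:6, 1:6]
stops with a "subscript out of bounds" error, while the clamped version
returns whatever is there.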

Mike C

On Sat, Jul 13, 2019 at 8:35 AM Gabriel Becker <gabembecker using gmail.com>
wrote:

> Hi Michael and Abby,
>
> So one thing that could happen that would be backwards compatible (with
> the exception of something that was an error no longer being an error) is
> that head and tail could take vectors of length length(dim(x)) for n, rather
> than integers of length 1, with the default n = 6 being equivalent to
> n = c(6, dim(x)[2], <...>, dim(x)[k]), at least for the deprecation cycle, if
> not permanently. Not recycling n would be unexpected based on the behavior
> of many R functions, but it would preserve the current behavior while
> granting more fine-grained control to users who feel they need it.
>
> A rapidly thrown-together prototype of such a method for the head of a
> matrix case is as follows:
>
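> head2 = function(x, n = 6L, ...) {
>     # Build one index vector per dimension of x
>     indvecs = lapply(seq_along(dim(x)), function(i) {
>         # n is not recycled: dimensions beyond length(n) are kept whole
>         if(length(n) >= i) {
>             ni = n[i]
>         } else {
>             ni =  dim(x)[i]
>         }
>         # negative ni drops the last |ni| entries along dimension i
>         if(ni < 0L)
>             ni = max(dim(x)[i] + ni, 0L)
>         else
>             ni = min(ni, dim(x)[i])
>         seq_len(ni)
>     })
>     lstargs = c(list(x),indvecs, drop = FALSE)
>     do.call("[", lstargs)
> }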
>
>
> > mat = matrix(1:100, 10, 10)
> > head(mat)
>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> [1,]    1   11   21   31   41   51   61   71   81    91
> [2,]    2   12   22   32   42   52   62   72   82    92
> [3,]    3   13   23   33   43   53   63   73   83    93
> [4,]    4   14   24   34   44   54   64   74   84    94
> [5,]    5   15   25   35   45   55   65   75   85    95
> [6,]    6   16   26   36   46   56   66   76   86    96
>
> > head2(mat)
>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> [1,]    1   11   21   31   41   51   61   71   81    91
> [2,]    2   12   22   32   42   52   62   72   82    92
> [3,]    3   13   23   33   43   53   63   73   83    93
> [4,]    4   14   24   34   44   54   64   74   84    94
> [5,]    5   15   25   35   45   55   65   75   85    95
> [6,]    6   16   26   36   46   56   66   76   86    96
>
> > head2(mat, c(2, 3))
>      [,1] [,2] [,3]
> [1,]    1   11   21
> [2,]    2   12   22
>
> > head2(mat, c(2, -9))
>      [,1]
> [1,]    1
> [2,]    2
>
>
> Now one thing to keep in mind here is that I think we'd either a) have
> to make the non-recycling behavior permanent, or b) have head treat
> data.frames and matrices differently with respect to the subsets they grab
> (which strikes me as a Bad Plan (tm)).
>
> So I don't think the default behavior would ever be mat[1:6, 1:6], not
> because of backwards compatibility, but because, at least in my intuition,
> that is just not what head on a data.frame should do by default, and I
> think the behaviors for the basic rectangular datatypes should "stick
> together". I mean, also because of backwards compatibility, but that could
> in theory change across a long enough deprecation cycle, whereas the
> conceptually right thing to do with a data.frame probably won't.
>
> All of that said, is head(mat, c(6, 6)) really that much easier to
> type/better than just mat[1:6, 1:6, drop=FALSE] (I know this will behave
> differently if any of the dims of mat are less than 6, but if so why are
> you heading it in the first place ;) )? I don't really have a strong
> feeling on the answer to that.
>
> I'm happy to put together a patch for head.matrix, head.data.frame,
> tail.matrix and tail.data.frame, plus documentation, if people on R-core are
> interested in this.
>
> Note, as most here probably know, and as alluded to above, length(n) > 1
> for head or tail currently gives an error, so this would be an extension
> of the existing functionality in the mathematical extension sense, where
> all existing behavior would remain identical, but the support/valid
> parameter space would grow.
>
> Best,
> ~G
>
>
> On Fri, Jul 12, 2019 at 4:03 PM Abby Spurdle <spurdle.a using gmail.com> wrote:
>
>> > I assume there are lots of backwards-compatibility issues as well as
>> > valid use cases for this behavior, so I guess defaulting to M[1:6, 1:6]
>> > is out of the question.
>>
>> Agree.
>>
>> > Is there any scope for adding a new argument to head.matrix that would
>> > allow this flexibility?
>>
>> I agree with what you're trying to achieve.
>> However, I'm not sure this is as simple as you're suggesting.
>>
>> What if the user wants "head" in rows but "tail" in columns?
>> Or "head" in rows, and both "head" and "tail" in columns?
>> With head and tail alone, there's a combinatorial explosion.
>>
>> Also, when using tail on an unnamed matrix, it may be desirable to name
>> rows and columns.
>>
>> And all of this assumes standard matrix objects.
>> Add in matrix subclasses and related objects, and things get more
>> complex still.
>>
>> As I suggested in another thread a few days ago, I'm planning to write
>> an R package for matrices and matrix-like objects (possibly extending the
>> Matrix package), with an initial emphasis on subsetting, printing and
>> formatting.
>> So, I'm interested to hear more suggestions on this topic.
>>
>> ______________________________________________
>> R-devel using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
