[Rd] Patch to allow negative argument in head() and tail()

Vincent Goulet vincent.goulet at act.ulaval.ca
Wed Jul 26 17:45:53 CEST 2006


On Tuesday 18 July 2006 04:42, Martin Maechler wrote:
> >>>>> "Vincent" == Vincent Goulet <vincent.goulet at act.ulaval.ca>
> >>>>>     on Mon, 17 Jul 2006 15:03:34 -0400 writes:
>
>     Vincent> Dear developeRs (and other abuseRs ;-),
>
>     Vincent> I would like to contribute a patch against
>     Vincent> functions head() and tail() of package utils to
>     Vincent> allow for a negative 'n' argument. This allows to
>     Vincent> extract all but the first/last 'n'
>     Vincent> elements/rows/lines of an object, similar to the
>     Vincent> "drop" operator of APL. [1]
>
> Hmm, if you reread  Bill Venables proposal (URL below), you did
> something different : In Bill's (and my!) "book",
>
>   head would always give the *first* few entries and
>   tail would always give the *last*  few entries.
>
> That's different from APL's drop, but for a good reason:
> The words 'head' and 'tail' exactly suggest so.

I modified my patch to go along these lines. My brain must be wired 
differently, since this remains counterintuitive to me, but then, who am I to 
go against R Core Team members and Unix itself! ;-)
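
To make the convention concrete, here is a small illustration. It assumes an R 
version whose head() and tail() accept a negative 'n' under this convention; 
the vector 'x' and the variable names are made up for the example.

```r
x <- 1:10

# Positive n: the first / last n entries.
first3 <- head(x, 3)            # 1 2 3
last3  <- tail(x, 3)            # 8 9 10

# Negative n: head() still starts at the front and tail() still ends at
# the back; only the number of entries changes (all but n of them).
all_but_last3  <- head(x, -3)   # 1 2 3 4 5 6 7
all_but_first3 <- tail(x, -3)   # 4 5 6 7 8 9 10
```

So head(x, -n) drops from the end and tail(x, -n) drops from the start, which 
is the opposite of APL's drop.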

>     Vincent> I put the patched head.R and head.Rd files, along with diff
>     Vincent> files in http://vgoulet.act.ulaval.ca/pub/R/
>
>     Vincent> The differences were obtained against today's version of
>     Vincent> r-devel (more specifically revision 30277 of head.R and
>     Vincent> revision 30915 of head.Rd).
>
> That's good (to take the "current" sources for the diffs).

This time I just attached the result of 'svn diff' against revision 38701 of 
r-devel.

Please note that in head.default() and tail.default() I deleted the line

    if(length(dim(x)) == 1) array(ans, n, list(names(ans))) else ans

since I could not find any use for it. Perhaps it became obsolete with 
the following bug fix in v2.2.0 (from the release notes):

o	Subsetting a matrix or an array as a vector used to attempt to
	use the row names to name the result, even though the
	array might be longer than the row names.  Now this is only
	done for 1D arrays when it is done in all cases, even matrix
	indexing.  (Tidies up after the fix to PR#937.)

But then, even in 2.1.1:

> x <- array(1:10, 10, list(letters[1:10]))
> x
 a  b  c  d  e  f  g  h  i  j 
 1  2  3  4  5  6  7  8  9 10 
> x[1:3]
a b c 
1 2 3 
> identical(x[1:3], array(x[1:3], 3, list(names(x[1:3]))))
[1] TRUE

If the deleted line should remain, please (explain why and) tell me and I'll 
add it back.

>     Vincent> Some comments:
>
>     Vincent> - The current version of head() and tail() will
>     Vincent> accept a vector of length > 1 for argument 'n' but
>     Vincent> will silently use the smallest value. This became
>     Vincent> awkward to reproduce in my versions and did not
>     Vincent> seem interesting anyway.  Instead, I added an error
>     Vincent> message if length(n) > 1.
>
> that's ok in my view

Still there.
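
As a rough sketch of the behaviour described (not the code from the attached 
diff), the length check and the handling of a negative 'n' for plain vectors 
could look as follows. The names head_sketch() and tail_sketch() are 
hypothetical, and the error message wording is only an assumption.

```r
# Hypothetical helpers for plain vectors; not the patched utils functions.
head_sketch <- function(x, n = 6L) {
    if (length(n) != 1L) stop("'n' must be a single integer")
    len <- length(x)
    # Negative n means "all but the last |n|"; never go below zero.
    n <- if (n < 0L) max(len + n, 0L) else min(n, len)
    x[seq_len(n)]
}

tail_sketch <- function(x, n = 6L) {
    if (length(n) != 1L) stop("'n' must be a single integer")
    len <- length(x)
    # Negative n means "all but the first |n|"; never go below zero.
    n <- if (n < 0L) max(len + n, 0L) else min(n, len)
    x[seq.int(to = len, length.out = n)]
}
```

With these, head_sketch(1:10, -3) keeps the first seven entries, 
tail_sketch(1:10, -3) keeps the last seven, and a call such as 
head_sketch(1:10, c(2, 3)) stops with an error instead of silently using the 
smallest value.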

>     Vincent> - I used the word "scalar" in the aforementioned
>     Vincent> error message to mean a vector of length 1. Perhaps
>     Vincent> is this not the correct R terminology?
>
> indeed, it's rarely used in R terminology; for one reason
> because S (and hence R) does not differentiate between length-1
> vectors and scalars the way APL does.

Changed to "single integer", as found in another help page.

>     Vincent> - I added a 'addrownums = TRUE' argument to head() used
>     Vincent> when n < 0, similar to tail() with n > 0. This required to
>     Vincent> write separate methods for classes 'data.frame' and 'matrix'.
>
> seems not unreasonable {I did not yet look at your implementation there}

As mentioned in another message, this is no longer needed.

>     Vincent> - The 'function' methods are not modified.
>
>     Vincent> - In the man page, the 'function' method was not documented
>     Vincent> in the usage section. Done now.
>
> ok, though not necessary: The recommended approach is to only
>    document methods when they have ``surprising arguments'', i.e.,
>    arguments not in the generic function.
>
> In our case, 'n = 6' is not part of the generic, so strictly
> speaking *is* a "surprising argument".
> Probably it was not made part of the generic, since it's
> imaginable to have objects whose "head" is always of a fixed
> given size, and where specifying 'n' does not make sense.

For symmetry, I think it should be added.

>     Vincent> - I don't think the patch would break any existing code,
>     Vincent> except code using the (undocumented) "feature" mentioned in
>     Vincent> my first remark, above.

This should remain valid.

[...]

Best regards to all,

-- 
  Vincent Goulet, Associate Professor
  École d'actuariat
  Université Laval, Québec 
  Vincent.Goulet at act.ulaval.ca   http://vgoulet.act.ulaval.ca

