[Rd] str(<1d-array>)

Martin Maechler maechler at stat.math.ethz.ch
Mon Jan 26 11:56:24 CET 2009


>>>>> "MS" == Marc Schwartz <marc_schwartz at comcast.net>
>>>>>     on Fri, 23 Jan 2009 08:41:38 -0600 writes:

    MS> on 01/23/2009 07:36 AM Martin Maechler wrote:
    >>>>>>> "TP" == Tony Plate <tplate at acm.org>
    >>>>>>>     on Thu, 22 Jan 2009 11:01:21 -0700 writes:
    >> 
    TP> Martin Maechler wrote:
    >> >>>>>>> "TP" == Tony Plate <tplate at acm.org>
    >> >>>>>>>     on Fri, 16 Jan 2009 13:10:04 -0700 writes:
    >> >> 
    TP> Martin Maechler wrote:
    >> >> >>>>>>> "PatB" == Patrick Burns <pburns at pburns.seanet.com>
    >> >> >>>>>>>     on Tue, 13 Jan 2009 17:00:40 +0000 writes:
    >> >> >> 
    PatB> Henrik Bengtsson wrote:
    >> >> >> >> Hi.
    >> >> >> >> 
    >> >> >> >> On Mon, Jan 12, 2009 at 11:58 PM, Prof Brian
    >> >> >> >> Ripley <ripley at stats.ox.ac.uk> wrote:
    >> >> >> >> 
    >> >> >> >>> What you have is a one-dimensional array: they
    >> >> >> >>> crop up in R most often from table() in my
    >> >> >> >>> experience.
    >> >> >> >>> 
    >> >> >> >>> 
    >> >> >> >>>> f <- table(rpois(100, 4))
    >> >> >> >>>> str(f)
    >> >> >> >>> 'table' int [, 1:10] 2 6 18 21 13 16 13 4 3 4
    >> >> >> >>>  - attr(*, "dimnames")=List of 1
    >> >> >> >>>   ..$ : chr [1:10] "0" "1" "2" "3" ...
    >> >> >> >>> 
    >> >> >> >>> and yes, f is an atomic vector and yes, str()'s
    >> >> >> >>> notation is confusing here but if it did [1:10]
    >> >> >> >>> you would not know it was an array.  I recall
    >> >> >> >>> discussing this with Martin Maechler (str's
    >> >> >> >>> author) last century, and I've just checked
    >> >> >> >>> that R 2.0.0 did the same.
    >> >> >> >>> 
    >> >> >> >>> The place in which one-dimensional arrays
    >> >> >> >>> differ from normal vectors is how names are
    >> >> >> >>> handled: notice that my example has dimnames
    >> >> >> >>> not names, and ?names says
    >> >> >> >>> 
    >> >> >> >>> For a one-dimensional array the 'names'
    >> >> >> >>> attribute really is 'dimnames[[1]]'.
    >> >> >> >>> 
    >> >> >> >> 
    >> >> >> >> Thanks for this explanation.  One could then
    >> >> >> >> argue that [1:10,] is somewhat better than
    >> >> >> >> [,1:10], but that is just polish.
    >> >> >> 
    >> >> >> yes.  And honestly I don't remember anymore why I
    >> >> >> chose the "[,1:n]" notation.  It definitely was
    >> >> >> there already before R came into existence, as S
    >> >> >> also has had one-dimensional arrays, and I
    >> >> >> programmed the first version of str() in 1990.
    >> >> >> 
    PatB> Perhaps it could be:
    >> >> >> 
    PatB> [1:10(,)]
    >> 
    PatB> That is weird enough that it should not lead people to
    PatB> believe that it is a matrix.  But might prompt them a
    PatB> bit in that direction.
    >> >> >> 
    >> >> >> Well, str() was always aimed a bit at experienced S
    >> >> >> (and R) users, and I had always aimed somewhat to
    >> >> >> keep its output "compact".  I'm quite astonished
    >> >> >> that the OP didn't know about 1D arrays in spite of
    >> >> >> the many years he's been using R.
    >> >> >> Would a weird solution like the above have helped?
    >> >> >> 
    >> >> >> At the moment, I'd tend to keep it "as is" if only
    >> >> >> just for historical reminiscence, but I can be
    >> >> >> convinced to change the current "tendency" ...
    >> >> >> 
    >> >> >> Martin Maechler, ETH Zurich
    >> >> >> 
    TP> What about just including "(1d-array)", something like
    TP> this
    >> >> >> str(f)
    TP> 'table' int [1:10](1d array) 5 5 9 23 26 16 9 4 2 1
    TP>  - attr(*, "dimnames")=List of 1
    TP>   ..$ : chr [1:10] "0" "1" "2" "3" ...
    >> >> >> 
    TP> only 9 extra characters for a rare case, and much, much
    TP> less cryptic?
    >> >> 
    >> >> well,.. the next text request is to use "character"
    >> >> instead of "chr", only 6 extra characters ....
    >> >> 
    >> >> -> no way: str() has its very concise "style" and
    >> >> should keep that.
    >> >> 
    TP> Brevity is good, but clarity is important too.  The
    TP> output of str is usually decipherable, but not so much
    TP> in this case.  It's easy to dismiss suggestions like
    TP> replacing "chr" with "character" - the increase in
    TP> clarity would be minimal.  However, the potential
    TP> increase in clarity for a 1-d array is significant - the
    TP> decrease in brevity is at question here. Given the
    TP> rarity of the case it seems like a decent tradeoff to
    TP> add "(1d-array)" (one could even just write "(1d)").
    TP> 1-d arrays are sufficiently rare that no concise and
    TP> clear method of indicating them using brackets or other
    TP> symbols has arisen. You did say you "can be convinced to
    TP> change" it, but I won't attempt beyond this! :-)
    >> 
    >> well, "still can be .." .....
    >> 
    >> So you currently propose to replace "int [,1:10] 5 5 9 23
    >> 26 16 9 4 2 1" by "int [1:10](1d) 5 5 9 23 26 16 9 4 2 1"
    >> where Pat had "int [1:10(,)] 5 5 9 23 26 16 9 4 2 1"
    >> 
    >> Since the [.....] is where we specify the dimensionality
    >> of all arrays in str(), I'd like to try something where
    >> things remain inside "[....]" as with Pat's version or
    >> e.g., with
    >> 
    >> "int [1:10/1d] 5 5 9 23 26 16 9 4 2 1"
    >> 
    >> Opinions, further proposals ?

    MS> Recognizing that I am coming to this discussion quite
    MS> late, how about:

    MS>        int [1:10(1d)] 5 5 9 23 26 16 9 4 2 1

    MS> ?

    MS> I do think that any str() representation that includes a
    MS> ',' would continue to reinforce the current
    MS> misunderstandings pertaining to a 1d array.

    MS> Since using str() is a common response to posts on
    MS> r-help regarding how to access components of an object,
    MS> there will be naive users who would see something like
    MS> (using Prof. Ripley's example):

    >> str(f)
    MS>  'table' int [, 1:11] 1 9 15 21 15 17 13 5 1 2 ...
    MS>   - attr(*, "dimnames")=List of 1
    MS>    ..$ : chr [1:11] "0" "1" "2" "3" ...

    MS> and then think that they could do:

    >> f[, 1]
    MS> Error in f[, 1] : incorrect number of dimensions

    MS> which of course they cannot.

    MS> I think that the above change would help to reinforce
    MS> the notion that a 1d array can, for the most part, be
    MS> treated as an atomic vector.  However, as Prof. Ripley
    MS> has noted, there is a subtle difference in how
    MS> names/dimnames are treated. The use of '(1d)' in the
    MS> str() output would make it clear that this object is not
    MS> quite a simple atomic vector, but when indexing, can be
    MS> treated as such.

    MS> Regards,

    MS> Marc Schwartz

No further comments from the audience ...
In R-core tradition, I take this as 
"yes go ahead if you want"
and I'm about to implement the latest proposal from Marc 
for R-devel,

e.g.

> str(array(1:6, 6))
  int [1:6(1d)] 1 2 3 4 5 6

> str(array(LETTERS, 100))
 chr [1:100(1d)] "A" "B" "C" "D" "E" "F" "G" "H" "I" ...
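
For readers who want to check the points made in this thread, here is a minimal sketch using base R only. It is an illustration, not part of str() itself: the variable names (a, v, err, and so on) are made up for the example, and the last line assumes an R version in which the "(1d)" notation shown above is in effect.

```r
# Sketch: how a 1-d array differs from a plain vector (base R only).
a <- array(1:6, 6)   # 1-d array, as in the example above
v <- 1:6             # plain integer vector

# Only the array carries a 'dim' attribute.
is_arr <- c(array = is.array(a), vector = is.array(v))
a_dim  <- dim(a)

# Matrix-style indexing is an error on a 1-d array;
# capture the message instead of stopping.
err <- tryCatch(a[, 1], error = function(e) conditionMessage(e))

# Single-index subsetting works as it does for a vector.
second <- as.vector(a[2])

# names() on a 1-d array reads and writes dimnames[[1]].
names(a) <- letters[1:6]
same_names <- identical(names(a), dimnames(a)[[1]])

# Output of str(); the "(1d)" marker is assumed to be present
# only in R versions that include the change proposed here.
str_line <- capture.output(str(array(1:6, 6)))
```

So f[1] style indexing is fine on such an object, f[, 1] is not, and the names live in dimnames, which is what the '(1d)' marker is meant to hint at.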

-- 
Martin Maechler, ETH Zurich


