[Rd] several bugs (PR#918) lists and matrices

Thomas Lumley tlumley@u.washington.edu
Mon, 23 Apr 2001 12:54:17 -0700 (PDT)


On Mon, 23 Apr 2001, Rich Heiberger wrote:

> Robert's example
>   x<-matrix(1:6,nc=3)
>   x[[2,3]]<-NA
>   attr(x[[2,3]],"mv") <- "absent"
> loses the attribute in S-Plus as well as in R.  The only way I
> have been able to put attributes on individual elements is to make
> a matrix of lists.  Interestingly, this statement does print
>   print(attr(x[[2,3]],"mv") <- "absent")
> The value of the attribute doesn't vanish for the duration of the
> current statement.  The value is gone if we follow this with
>   attr(x[[2,3]],"mv")
> Should there be a warning message when the assignment is attempted?
>
> As to the fundamental question of putting dimensions on a list, that
> is asking the question backwards from how I am looking at it.  I have
> a two-dimensional array of data items for which one or more of the
> values is unknown.  The matrix is the natural structure for this data.
>
> How do I record the type of missingness for the missing datum?  I have
> chosen to place a list in the cell of the matrix.  Another option is
> to have a parallel matrix of missingness information, possibly
> attached as an attribute to the original data.  Another option is a
> sparse array of some form that is pointed to by access functions that
> are sensitive to the NA in the original data matrix.
>
> To me, the recursive structure is the most natural way to represent the
> information about missingness.

Ah. Ok.

This is a more difficult question.  You can't put a list into a matrix.
Matrices handle homogeneous data; they are vectors with a dimension
attribute.  Lists with an arbitrary dimension attribute are, as Rob
pointed out, an unimplemented bug. However, rectangular things with
arbitrary data in them do exist. They're called data frames.
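
To make the "vectors with a dimension attribute" point concrete, here is a
small sketch. The variable names are mine and purely illustrative. It builds
a matrix by setting dim on a plain vector, then shows that storing a string
in one cell coerces every cell:

```r
# A matrix is an atomic vector plus a "dim" attribute.
v <- 1:6
dim(v) <- c(2, 3)
same_as_matrix <- identical(v, matrix(1:6, nrow = 2))

# All cells share one type: storing a string coerces the whole matrix.
v[1, 1] <- "a"
cell_type <- typeof(v)
```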

As you will remember from Rob's DSC2001 talk, you can put potentially very
complicated objects into cells in a data frame. His example was
longitudinal data, where a cell might be the entire record of followup
measurements for a particular variable. Have a look at his paper in the
DSC proceedings at http://www.ci.tuwien.ac.at/Conferences/DSC-2001/

You could create an object of class "mvelement", say, which would be
either a number or a list, depending on whether the value was observed or
missing, and stick these in a data frame.
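
A rough sketch of what that might look like, assuming a current version of
R in which I() can protect a list column in data.frame(). The constructor
mvelement() and its fields "value" and "mv" are hypothetical names made up
for this illustration, not an existing API:

```r
# Hypothetical constructor: a missing cell is a classed list that carries
# the reason for missingness; observed cells stay plain numbers.
mvelement <- function(mv) {
  structure(list(value = NA, mv = mv), class = "mvelement")
}

# I() keeps the list as one column rather than splitting it into several.
d <- data.frame(id = 1:2, y = I(list(5, mvelement("absent"))))

observed <- d$y[[1]]     # the plain number stored in row 1
reason   <- d$y[[2]]$mv  # the missingness type stored in row 2
```

Code that walks the column can then test each cell with
inherits(cell, "mvelement") to decide whether it holds a value or a reason.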


Another approach would be to have a matrix of data, with NAs where
necessary, and have the missing value information as an attribute of that
matrix, but that's different from what you're trying to do.
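
For comparison, a minimal sketch of that second approach, reusing the
attribute name "mv" from the example quoted above. The parallel character
matrix is my own choice of representation, so treat it as one possibility
rather than a recommendation:

```r
x <- matrix(1:6, ncol = 3)
x[2, 3] <- NA

# Parallel matrix of missingness types, same shape as the data.
mv <- matrix(NA_character_, nrow = nrow(x), ncol = ncol(x))
mv[2, 3] <- "absent"

# Attach it to the data matrix as a whole, not to a single element.
attr(x, "mv") <- mv

why_missing <- attr(x, "mv")[2, 3]
```

One caveat with this layout: subsetting with x[i, j] returns a result
without the "mv" attribute, so access functions would have to subset both
matrices together.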

	-thomas

Thomas Lumley			Asst. Professor, Biostatistics
tlumley@u.washington.edu	University of Washington, Seattle

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._