[R] two questions for R beginners

Petr PIKAL petr.pikal at precheza.cz
Mon Mar 1 14:57:08 CET 2010


Hi

r-help-bounces at r-project.org wrote on 01.03.2010 13:03:24:

<snip>

> > 
> > I understand that a 2-dimensional rectangular matrix looks quite
> > similar to a data frame; however, it is only a vector with dimensions.
> > As such it can have items of only one type (numeric, character, ...).
> > And you can easily change the dimensions of a matrix.
> > 
> > matrix<-1:12
> > dim(matrix) <- c(2,6)
> > matrix
> > dim(matrix) <- c(2,2,3)
> > matrix
> > dim(matrix) <-NULL
> > matrix
> > 
> > So rectangular structure of printed matrix is a kind of coincidence
> > only, whereas rectangular structure of data frame is its main feature.
> > 
> > Regards
> > Petr
> >> 
> >> -- 
> >> Karl Ove Hufthammer
> 
> Petr, I think that could be confusing! The way I see it is that
> a matrix is a special case of an array, whose "dimension" attribute
> is of length 2 (number of "rows", number of "columns"); and "row"
> and "column" refer to the rectangular display which you see when
> R prints the matrix. And this, of course, derives directly from
> the historic rectangular view of a matrix when written down.
> 
> When you went from "dim(matrix)<-c(2,6)" to "dim(matrix)<-c(2,2,3)"
> you stripped it of its special title of "matrix" and cast it out
> into the motley mob of arrays (some of whom are matrices, but
> "matrix" no longer is).
> 
> So the "rectangular structure of printed matrix" is not a coincidence,
> but is its main feature!

OK. Point taken. However, I feel that the possibility of manipulating 
matrix/array dimensions by simply changing them, as I showed above, 
together with perceiving a matrix as a **vector with dimensions**, kept 
me, especially in the early days, from using matrices where a data frame 
was the right tool and vice versa. 
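
A small sketch of what happens to the object as its dimensions change may 
help here. The variable names (x, was_matrix and so on) are only for 
illustration and are not from the thread; the functions are base R.

```r
x <- 1:12                      # plain integer vector, no dim attribute

dim(x) <- c(2, 6)              # length-2 dim attribute
was_matrix <- is.matrix(x)     # TRUE: R now treats x as a matrix

dim(x) <- c(2, 2, 3)           # length-3 dim attribute
still_matrix <- is.matrix(x)   # FALSE: no longer a matrix
still_array  <- is.array(x)    # TRUE: but still an array

dim(x) <- NULL                 # drop the attribute again
same_data <- identical(x, 1:12)  # TRUE: the underlying data never changed
```

Only the dim attribute changes between the steps; the twelve values stay 
exactly where they were.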

Consider the confusing results of cbind and rbind for vectors of unequal 
mode. Far too often we see something like this

> cbind(1:2,letters[1:2])
     [,1] [,2]
[1,] "1"  "a" 
[2,] "2"  "b" 

instead of

> data.frame(1:2,letters[1:2])
  X1.2 letters.1.2.
1    1            a
2    2            b

and then a question about why the result does not behave as expected. Each 
type of object has features that are good for some kinds of 
manipulation/analysis/plotting but quite detrimental for others.
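
To make the difference concrete, here is a minimal check of what each call 
actually stores. The column names num and chr are my own choice for the 
example, and stringsAsFactors = FALSE is set explicitly so that the result 
does not depend on the R version.

```r
# cbind builds a matrix: one type for everything, so the numbers become strings
m <- cbind(1:2, letters[1:2])
m_type <- typeof(m)                 # "character"
m_numeric <- is.numeric(m[, 1])     # FALSE: the first column is now text

# data.frame keeps one type per column
df <- data.frame(num = 1:2, chr = letters[1:2], stringsAsFactors = FALSE)
df_classes <- sapply(df, class)     # num is "integer", chr is "character"
df_sum <- df$num + 1                # arithmetic still works on the numeric column
```

Arithmetic on m[, 1] would fail with a "non-numeric argument" error, which 
is usually the moment the question gets posted.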

Regards
Petr


> 
> To come back to Karl's query about why "$" works for a dataframe
> but not for a matrix, note that "$" is the extractor for getting
> a named component of a list. So, Karl, when you did
> 
>   d=head(iris[1:4])
> 
> you created a dataframe:
> 
>   str(d)
>   # 'data.frame':   6 obs. of  4 variables:
>   #  $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4
>   #  $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9
>   #  $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7
>   #  $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4
> 
> (with named components "Sepal.Length", ... , "Petal.Width"),
> and a dataframe is a special case of a general list. In a
> general list, the separate components can each be anything.
> In a dataframe, each component is a vector; the different
> vectors may be of different types (logical, numeric, ... )
> but of course the elements of any single vector must be
> of the same type; and, in a dataframe, all the vectors must
> have the same length (otherwise it is a general list, not
> a dataframe).
> 
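
The "list of equal-length vectors" description above can be checked 
directly. This is only a sketch using the built-in iris data; the names 
col_lengths and res are illustrative.

```r
d <- head(iris[1:4])

d_is_list <- is.list(d)                 # TRUE: a data frame is a list underneath
d_is_df   <- is.data.frame(d)           # TRUE
col_lengths <- unname(sapply(d, length))  # every component has the same length

# Components of unequal length (3 and 4) are rejected by data.frame()
res <- try(data.frame(C1 = c(1.1, 1.2, 1.3),
                      C2 = c(2.1, 2.2, 2.3, 2.4)),
           silent = TRUE)
rejected <- inherits(res, "try-error")  # TRUE
```
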
> So, when you print a dataframe, R chooses to display it
> as a rectangular structure. On the other hand, when you
> print a general list, R displays it quite differently:
> 
>   d
>   #   Sepal.Length Sepal.Width Petal.Length Petal.Width
>   # 1          5.1         3.5          1.4         0.2
>   # 2          4.9         3.0          1.4         0.2
>   # 3          4.7         3.2          1.3         0.2
>   # 4          4.6         3.1          1.5         0.2
>   # 5          5.0         3.6          1.4         0.2
>   # 6          5.4         3.9          1.7         0.4
> 
>   d3 <- list(C1=c(1.1,1.2,1.3), C2=c(2.1,2.2,2.3,2.4))
>   d3
>   # $C1
>   # [1] 1.1 1.2 1.3
>   # $C2
>   # [1] 2.1 2.2 2.3 2.4
> 
> Notice the similarity (though not identity) between the print
> of d3 and the output of str(d). There is a bit more hard-wired
> stuff built into a dataframe which makes it more than simply
> a "list with all components vectors of equal length". However,
> one could also say that "the rectangular structure is its
> main feature".
> 
> As to why "$" will not work on matrices: a matrix, as Petr
> points out, is a vector with a "dimensions" attribute which
> has length 2 (as opposed to a general array where the length
> of the dimensions attribute could be anything). Hence it is
> not a list of named components in the sense of "list".
> 
> Hence "$" will not work with a matrix, since "$" will not
> be able to find any list-components, which is basically what
> the error message
> 
>   d2$Sepal.Width
>   # Error in d2$Sepal.Width : $ operator is invalid for atomic vectors
> 
> is telling you: d2 is an atomic vector with a length-2 dimensions
> attribute. It has no list-type components for "$" to get its
> hands on.
> 
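
For completeness, a short sketch of the usual way around this. How d2 was 
created is not shown in this part of the thread, so as.matrix(d) below is 
an assumption on my part.

```r
d  <- head(iris[1:4])
d2 <- as.matrix(d)            # assumption: d2 is the matrix version of d

d2_atomic <- is.atomic(d2)    # TRUE: an atomic vector with a dim attribute
d2_list   <- is.list(d2)      # FALSE: nothing for "$" to extract

# "$" fails on the matrix ...
res <- try(d2$Sepal.Width, silent = TRUE)
dollar_failed <- inherits(res, "try-error")   # TRUE

# ... but indexing by column name works, and matches the data frame column
from_matrix <- unname(d2[, "Sepal.Width"])
from_frame  <- d$Sepal.Width
same_values <- identical(from_matrix, from_frame)
```
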
> Ted.
> 
> --------------------------------------------------------------------
> E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
> Fax-to-email: +44 (0)870 094 0861
> Date: 01-Mar-10                                       Time: 12:03:21
> ------------------------------ XFMail ------------------------------
> 


