[R] Data frame question

Claudia Beleites cbeleites at units.it
Fri Mar 12 22:43:14 CET 2010


apjaworski at mmm.com wrote:
> 
> Thanks for the quick reply.
> 
> No, I did not run into any problems so far.  I have been using the PLS 
> package and the modelling functions seem to work just fine.
> 
> In fact, even if I let data.frame convert the x matrix to separate
> columns, the "y ~ x" modeling syntax still seems to work fine.
> 
I don't see that behaviour:

library (pls)  # plsr () comes from the pls package
rm (x)  # make sure there is no leftover x in the workspace
mat <- matrix (1 : 9, 3)
df <- data.frame (y = 1 : 3, x = mat)
str (df)
df
coef (plsr (y ~ x, data = df, ncomp = 1)) # error
coef (plsr (y ~ x.1 + x.2 + x.3, data = df, ncomp = 1)) # works

df$x <- I (-mat)
str (df)
df
coef (plsr (y ~ x, data = df, ncomp = 1)) # works
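
One possible explanation for why "y ~ x" seemed to work for you (an
assumption on my part, I have not seen your session): if a variable is not
found in data, the modelling functions look it up in the environment of the
formula, so a leftover x in the workspace gets used silently. A minimal
sketch with model.frame () from base R instead of plsr (), so it does not
need pls:

```r
mat <- matrix (1 : 9, 3)
df  <- data.frame (y = 1 : 3, x = mat)  # columns are y, x.1, x.2, x.3
names (df)

x  <- mat                               # simulate a leftover x in the workspace
mf <- model.frame (y ~ x, data = df)    # no column "x" in df: x comes from the workspace
dim (mf$x)                              # the whole matrix was picked up

rm (x)                                  # without the leftover x the lookup fails
res <- try (model.frame (y ~ x, data = df), silent = TRUE)
inherits (res, "try-error")
```

If that is what happened, the fit used the workspace matrix, not the x.1,
x.2, ... columns of the data.frame.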

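As an aside, a sketch of another way to get a plain matrix column like in
yarn without going through I () and unclass (): build the data.frame without
the matrix and assign the matrix afterwards. The names dd, y and x are taken
from your example; I have not checked how pls itself built yarn.

```r
y  <- rnorm (10)
x  <- matrix (rnorm (150), nrow = 10)

dd <- data.frame (y = y)
dd$x <- x     # assignment with $<- stores the matrix as a single column
str (dd)
dim (dd)      # two variables, not 16
```
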
Claudia

PS: May I be curious: what kind of data do you analyze with PLS?


> Thanks again,
> 
> Andy
> 
> __________________________________
> Andy Jaworski
> 518-1-01
> Process Laboratory
> 3M Corporate Research Laboratory
> -----
> E-mail: apjaworski at mmm.com
> Tel:  (651) 733-6092
> Fax:  (651) 736-3122
> 
> 
> From: 	Claudia Beleites <cbeleites at units.it>
> To: 	apjaworski at mmm.com
> Cc: 	r-help at r-project.org
> Date: 	03/12/2010 02:13 PM
> Subject: 	Re: [R] Data frame question
> 
> 
> 
> 
> 
> Andy,
> 
> Did you run into any kind of trouble?
> I'm asking because I'm maintaining a package for spectroscopic data that
> heavily uses "I (spectra.matrix)" ...
> 
> However, once you have the matrix safe inside the data.frame, you can
> delete the "AsIs":
> 
>  > a <- matrix (1:9, 3)
>  > str (a)
>  int [1:3, 1:3] 1 2 3 4 5 6 7 8 9
>  > df <- data.frame (a = I (a))
>  > str (df)
> 'data.frame':   3 obs. of  1 variable:
>  $ a: 'AsIs' int [1:3, 1:3] 1 2 3 4 5 6 7 8 9
>  > df$a <- unclass (df$a)
>  > str (df)
> 'data.frame':   3 obs. of  1 variable:
>  $ a: int [1:3, 1:3] 1 2 3 4 5 6 7 8 9
>  > df$a
>      [,1] [,2] [,3]
> [1,]    1    4    7
> [2,]    2    5    8
> [3,]    3    6    9
>  > dim (df)
> [1] 3 1
> 
> However, I don't know whether something can now trigger a conversion to
> data.frame that the AsIs would have stopped.
> 
> Cheers,
> 
> Claudia
> 
> apjaworski at mmm.com wrote:
>  > Hi,
>  >
>  > I have the following question about creating data frames.  I want to
>  > create a data frame with 2 components: a vector and a matrix.
>  >
>  > Let me use a simple example:
>  >
>  > y <- rnorm(10)
>  > x <- matrix(rnorm(150), nrow=10)
>  >
>  > Now if I do
>  >
>  > dd <- data.frame(x=x, y=y)
>  >
>  > I get a data frame with 16 columns, but if, according to the
>  > documentation, I do
>  >
>  > dd <- data.frame(x=I(x), y=y)
>  >
>  > then str(dd) gives:
>  >
>  > 'data.frame':   10 obs. of  2 variables:
>  >  $ x: AsIs [1:10, 1:15] 0.700073.... -0.44371.... -0.46625.... 0.977337.... 0.509786.... ...
>  >  $ y: num  0.4676 -1.4343 -0.3671 0.0637 -0.231 ...
>  >
>  > This looks and works OK.
>  >
>  > Now, there exists a CRAN package called pls.  It has a yarn data set in it.
>  >
>  >> data(yarn)
>  >> str(yarn)
>  > 'data.frame':   28 obs. of  3 variables:
>  >  $ NIR    : num [1:28, 1:268] 3.07 3.07 3.08 3.08 3.1 ...
>  >   ..- attr(*, "dimnames")=List of 2
>  >   .. ..$ : NULL
>  >   .. ..$ : NULL
>  >  $ density: num  100 80.2 79.5 60.8 60 ...
>  >  $ train  : logi  TRUE TRUE TRUE TRUE TRUE TRUE ...
>  >
>  > This looks almost the same, except the matrix component in my example
>  > has the AsIs instead of num.
>  >
>  > Is this just some older behavior of the data.frame function producing
>  > this difference?  If not, how can I get my data frame (dd) to look like
>  > yarn?
>  >
>  > I read the help pages for data.frame and as.data.frame and found this
>  > paragraph
>  >
>  > If a list is supplied, each element is converted to a column in the data
>  > frame. Similarly, each column of a matrix is converted separately. This
>  > can be overridden if the object has a class which has a method for
>  > as.data.frame: two examples are matrices of class "model.matrix" (which
>  > are included as a single column) and list objects of class "POSIXlt"
>  > which are coerced to class "POSIXct".
>  >
>  > If I do
>  >
>  >> methods(as.data.frame)
>  >  [1] as.data.frame.aovproj*        as.data.frame.array
>  >  [3] as.data.frame.AsIs            as.data.frame.character
>  >  [5] as.data.frame.complex         as.data.frame.data.frame
>  >  [7] as.data.frame.Date            as.data.frame.default
>  >  [9] as.data.frame.difftime        as.data.frame.factor
>  > [11] as.data.frame.ftable*         as.data.frame.integer
>  > [13] as.data.frame.list            as.data.frame.logical
>  > [15] as.data.frame.logLik*         as.data.frame.matrix
>  > [17] as.data.frame.model.matrix    as.data.frame.numeric
>  > [19] as.data.frame.numeric_version as.data.frame.ordered
>  > [21] as.data.frame.POSIXct         as.data.frame.POSIXlt
>  > [23] as.data.frame.raw             as.data.frame.table
>  > [25] as.data.frame.ts              as.data.frame.vector
>  >
>  > so it looks like there is a matrix method for as.data.frame.  The
>  > question then is how can I override the default behavior for the matrix
>  > object (converting columns separately).
>  >
>  >
>  > Any hint will be appreciated,
>  >
>  > Andy
> 
> 
> -- 
> Claudia Beleites
> Dipartimento dei Materiali e delle Risorse Naturali
> Università degli Studi di Trieste
> Via Alfonso Valerio 6/a
> I-34127 Trieste
> 
> phone: +39 0 40 5 58-37 68
> email: cbeleites at units.it
> 
> 
> 

