[R] do glm with two data sets

Sundar Dorai-Raj sundar.dorai-raj at pdf.com
Thu Aug 18 18:41:19 CEST 2005



Hu, Ying (NIH/NCI) wrote:
> You are right. 
> # read the two data sets
> e <- as.matrix(read.table("file1.txt", header=TRUE,row.names=1))
> g <- as.matrix(read.table("file2.txt", header=TRUE,row.names=1))
> 
> # solution 2
> summary(glm(e[1,] ~ g[1,]))
> summary(glm(e[1,] ~ g[2,]))
> ...
> They work very well.
> 
> If I put it in the loop, such as
> 
> for (i in 1:50){
>   for (j in 1:50){
>      cat("file1 row:", i, "file2 row:", j, "\n")
>      print(summary(glm(e[i,] ~ g[j,])))
>   }
> } 
> 
> Why do I have to use "print" to print the results? Without "print":
> for (i in 1:50){
>   for (j in 1:50){
>      cat("file1 row:", i, "file2 row:", j, "\n")
>      summary(glm(e[i,] ~ g[j,]))
>   }
> }
> the results of glm are not shown.
> 

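This is FAQ 7.16: a result is printed automatically only when the
expression is evaluated at top level. Inside a for loop, a function, or
a source()d file you have to call print() explicitly.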

See http://cran.r-project.org/doc/FAQ/R-FAQ.html
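
A minimal sketch of the difference, using small made-up matrices in
place of the e and g read from file1.txt and file2.txt (the data and
the seed are assumptions for illustration only):

```r
# Made-up stand-ins for the e and g matrices in the thread.
set.seed(1)
e <- matrix(rnorm(12), nrow = 3)
g <- matrix(rnorm(12), nrow = 3)

# Evaluated at top level: the summary is auto-printed.
summary(glm(e[1, ] ~ g[1, ]))

# Evaluated inside a for loop: the summary is computed but not printed.
for (j in 1:3) summary(glm(e[1, ] ~ g[j, ]))

# With an explicit print() the summaries appear.
for (j in 1:3) print(summary(glm(e[1, ] ~ g[j, ])))
```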

--sundar
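
For a loop over every pair of rows, it may be more convenient to store
a few numbers from each fit than to print 2500 summaries. The sketch
below is one possible way to do that, not the approach anyone in this
thread proposed; the matrices are made up, and the names res, slope and
p.value are hypothetical.

```r
# Made-up stand-ins for the e and g matrices in the thread.
set.seed(1)
e <- matrix(rnorm(30), nrow = 3)  # 3 rows, 10 ids
g <- matrix(rnorm(20), nrow = 2)  # 2 rows, the same 10 ids

# One row of res per (i, j) pair of rows of e and g.
res <- expand.grid(i = seq_len(nrow(e)), j = seq_len(nrow(g)))
res$slope <- NA_real_
res$p.value <- NA_real_

for (k in seq_len(nrow(res))) {
  resp <- e[res$i[k], ]
  pred <- g[res$j[k], ]
  # Coefficient table of the default (gaussian) glm for this pair.
  co <- coef(summary(glm(resp ~ pred)))
  res$slope[k] <- co["pred", "Estimate"]
  res$p.value[k] <- co["pred", "Pr(>|t|)"]
}

res  # printed, because this line is evaluated at top level
```

Nothing inside the loop needs print() here, since the values are
assigned rather than displayed.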


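On the original error quoted further down: a single row taken from a
data frame is still a data frame, which is why as.matrix() was applied
to the read.table() results above. A small sketch of an alternative,
assuming the first three columns shown in the original post and
converting one row with unlist() (the data frames are typed in by hand
here rather than read from the files):

```r
# The first three columns shown in the original post, typed in by hand.
e <- data.frame(id1 = c(0, 0, 1), id2 = c(1, 1, 1), id3 = c(0, 1, -1),
                row.names = c("N1", "N2", "N3"))
g <- data.frame(id1 = c(1.22, 2.33, 1.56), id2 = c(1.34, 2.56, 1.99),
                id3 = c(2.44, 2.56, 1.46), row.names = c("G1", "G2", "G3"))

class(e[1, ])           # a one-row data frame, not a vector

resp <- unlist(e[1, ])  # named numeric vector (id1, id2, id3)
pred <- unlist(g[1, ])

fit <- glm(resp ~ pred)
coef(fit)
```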

> Thanks a lot.
> 
> Ying
>  
> 
> -----Original Message-----
> From: Gavin Simpson [mailto:gavin.simpson at ucl.ac.uk] 
> Sent: Thursday, August 18, 2005 11:00 AM
> To: Hu, Ying (NIH/NCI)
> Cc: Sundar Dorai-Raj; r-help at stat.math.ethz.ch
> Subject: RE: [R] do glm with two data sets
> 
> On Thu, 2005-08-18 at 10:38 -0400, Hu, Ying (NIH/NCI) wrote:
> 
>>Thanks for your help.
>>
>># read the two data sets
>>e <- as.matrix(read.table("file1.txt", header=TRUE,row.names=1))
>>g <- as.matrix(read.table("file2.txt", header=TRUE,row.names=1))
>># solution 
>>d1<-data.frame(g[1,], e[1,])
> 
> 
> This is redundant, as:
> 
> 
>>fit<-glm(e[1,] ~ g[1,], data=d1)
> 
> 
> and:
> 
> fit <- glm(e[1, ] ~ g[1, ])
> 
> are equivalent - you don't need data = d1 in this case, e.g.:
> 
> e <- matrix(c(0, 1, 0, 0, 1, 1, 1, 1, -1), ncol = 3, byrow = TRUE)
> e
> g <- matrix(c(1.22, 1.34, 2.44, 2.33, 2.56, 2.56, 1.56, 1.99, 1.46),
> ncol = 3, byrow = TRUE)
> g
> fit <- glm(e[1, ] ~ g[1, ])
> fit
> 
> works fine.
> 
> 
>>summary(fit)
>>
>>I am not sure that is the best solution.
> 
> 
> This seems a strange way of doing this. Why not:
> 
> pred <- g[1, ]
> resp <- e[1, ]
> fit <- glm(resp ~ pred)
> fit
> 
> and do your subsetting outside the glm call - makes things clearer, no?
> Unless you plan to fit many glm()s, one per row of your two matrices. If
> that is the case, then there are better ways of approaching this.
> 
> 
>>Thanks again,
>>
>>Ying
> 
> 
> HTH
> 
> G
> 
> 
>>-----Original Message-----
>>From: Gavin Simpson [mailto:gavin.simpson at ucl.ac.uk] 
>>Sent: Wednesday, August 17, 2005 7:01 PM
>>To: Sundar Dorai-Raj
>>Cc: Hu, Ying (NIH/NCI); r-help at stat.math.ethz.ch
>>Subject: Re: [R] do glm with two data sets
>>
>>On Wed, 2005-08-17 at 17:22 -0500, Sundar Dorai-Raj wrote:
>>
>>>Hu, Ying (NIH/NCI) wrote:
>>>
>>>>I have two data sets:
>>>>File1.txt: 
>>>>Name id1   id2   id3   ...
>>>>N1    0     1     0     ...
>>>>N2    0     1     1     ...
>>>>N3    1     1     -1    ...
>>>>...
>>>> 
>>>>File2.txt:
>>>>Group id1       id2       id3       ...
>>>>G1       1.22     1.34     2.44     ...
>>>>G2       2.33     2.56     2.56     ...
>>>>G3       1.56     1.99     1.46     ...
>>>>...
>>>>I would like to do:
>>>>x1<-c(0,1,0,...)
>>>>y1<-c(1.22,1.34, 2.44, ...)
>>>>z1<-data.frame(x1,y1)
>>>>summary(glm(y1~x1,data=z1))
>>>> 
>>>>But when I try the same thing by reading the data sets from the two
>>>>files:
>>>>e <- read.table("file1.txt", header=TRUE,row.names=1)
>>>>g <- read.table("file2.txt", header=TRUE,row.names=1)
>>>>e1<-exp[1,]
>>>>g1<-geno[1,]
>>>>d1<-data.frame(g, e)
>>>>summary(glm(e1 ~ g1, data=d1))
>>>> 
>>>>the error message is 
>>>>Error in model.frame(formula, rownames, variables, varnames, extras,
>>>>extranames,  : 
>>>>        invalid variable type
>>>>Execution halted
>>>> 
>>>>Thanks in advance,
>>>> 
>>>>Ying
>>
>>Hi Ying,
>>
>>That error message is likely caused by having a data.frame on the right
>>hand side (rhs) of the formula. You can't have a data.frame on the rhs
>>of a formula and g1 is still a data frame even if you only choose the
>>first row, e.g.:
>>
>>dat <- as.data.frame(matrix(100, 10, 10))
>>class(dat[1, ])
>>[1] "data.frame"
>>
>>You could try:
>>
>>glm(e1 ~ ., data=g1[1, ])
>>
>>and see if that works, but as Sundar notes, your post is a little
>>difficult to follow, so this may not do what you were trying to achieve.
>>
>>HTH
>>
>>Gav
>>
>>
>>>You have several inconsistencies in your example, so it will be 
>>>difficult to figure out what you are trying to accomplish.
>>>
>>> > e <- read.table("file1.txt", header=TRUE,row.names=1)
>>> > g <- read.table("file2.txt", header=TRUE,row.names=1)
>>> > e1<-exp[1,]
>>>
>>>What's "exp"? Also it's dangerous to use an R function as a variable 
>>>name. Most of the time R can tell the difference, but in some cases it 
>>>cannot.
>>>
>>> > g1<-geno[1,]
>>>
>>>What's "geno"?
>>>
>>> > d1<-data.frame(g, e)
>>>
>>>d1 is now e and g cbind'ed together?
>>>
>>> > summary(glm(e1 ~ g1, data=d1))
>>>
>>>Are "e1" and "g1" elements of "d1"? From what you've told us, I don't 
>>>know where the error is occurring. Also, if you are having errors, you 
>>>can more easily isolate the problem by doing:
>>>
>>>fit <- glm(e1 ~ g1, data = d1)
>>>summary(fit)
>>>
>>>This will at least tell you the problem is in your call to "glm" and
>>>not "summary.glm".
>>>
>>>--sundar
>>>
>>>P.S. Please (re-)read the POSTING GUIDE. Most of the time you will 
>>>figure out problems such as these on your own during the process of 
>>>creating a reproducible example.
>>>
>>>______________________________________________
>>>R-help at stat.math.ethz.ch mailing list
>>>https://stat.ethz.ch/mailman/listinfo/r-help
>>>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html



