[R] transformation of data.frame

Petr PIKAL petr.pikal at precheza.cz
Mon Jul 12 14:09:00 CEST 2010


Hi

Assa Yeroslaviz <frymor at gmail.com> wrote on 09.07.2010 13:25:43:

> Hello Petr,
> 
> sorry for the mix-up. Your example works perfectly fine.
> 
> The one from Søren has shown the mentioned error. But even after reading the
> columns as character
> 
> > go <- read.table("go.txt", header = TRUE,
> +                  colClasses = c("character", "character"))
> or
> > go <- read.table("go.txt", header= TRUE, as.is = 1)
> 
> 
> it didn't solve the problem.
> the command:
> gmt <- lapplyBy(~GO, data = go, FUN = function(uu) {
>   as.list(uu$GO[1], paste(uu$gen, collapse = " "))
> })
> 
> tries to convert my first column into integers and then adds 'NA's.
> 
> What I don't understand is why. 
> Can lapplyBy work only with integers?

I do not use the doBy package, so I cannot give you a definite explanation.
I do not believe that lapplyBy works with integers only. It is described as
a formula version of lapply.
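
For comparison, the same grouping can be done in base R without doBy. This is only a sketch: the inline text stands in for go.txt (I am assuming two whitespace-separated columns named GO and gen, as in your call), and I have not run it on your real file.

```r
# Inline stand-in for go.txt (assumed layout: two whitespace-separated columns).
txt <- "GO gen
0042787 gen2
0016070 gen2
0016070 gen3
0007409 Gen1
0007409 gen3
0006511 gen2
0006417 gen3
0016070 gen4
0006511 gen4"

# Read both columns as character so the leading zeros of the IDs survive.
go <- read.table(text = txt, header = TRUE,
                 colClasses = c("character", "character"))

# split() groups the genes by ID; sapply() pastes each group into one string.
genes <- sapply(split(go$gen, go$GO), paste, collapse = ":")

# Build the result and add the "GO:" prefix to the IDs.
gmt <- data.frame(GO = paste("GO:", names(genes), sep = ""),
                  gen = unname(genes),
                  stringsAsFactors = FALSE)
print(gmt)
```

Because the IDs stay character from the start, nothing here has to be coerced to a number, so no NAs can appear from coercion.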

This is what you did:
gmt <- lapplyBy(~GO, data = go, FUN = function(uu) {
  as.list(uu$GO[1], paste(uu$gen, collapse = " "))
})

and this is what Søren advised:
aa <- lapplyBy(~ID, data = ddd, FUN = function(uu) {
  list(uu$ID[1], paste(uu$gen, collapse = ":"))
})

so maybe
gmt <- lapplyBy(~GO, data = go, FUN = function(uu) {
  list(uu$GO[1], paste(uu$gen, collapse = " "))
})

gives you the desired result.
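
The difference between the two calls is that as.list() converts a single object and, for an ordinary vector, ignores any further arguments, whereas list() collects all of its arguments. A small illustration with made-up values (the ID and gene names below are only placeholders):

```r
id    <- "GO:0016070"               # made-up ID for illustration
genes <- c("gen2", "gen3", "gen4")  # made-up gene names

# as.list() converts only its first argument; the pasted genes are dropped.
a <- as.list(id, paste(genes, collapse = " "))
print(length(a))

# list() keeps both pieces.
b <- list(id, paste(genes, collapse = " "))
print(length(b))
```

So with as.list() each group returns only the ID and the pasted genes never reach the result.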

Regards
Petr


> 
> THX,
> 
> Assa
> 

> 2010/7/8 Petr PIKAL <petr.pikal at precheza.cz>
> Hi
> 
> r-help-bounces at r-project.org wrote on 08.07.2010 12:02:45:
> 
> > I don't understand it. When I run this example it works fine, but when
> > I add the "GO:" to the beginning of the first column (as in my
> > desired result table):
> > GO0042787
> > GO0016070
> > GO0016070
> >
> > I'm getting a list of warnings:
> > Warning messages:
> > 1: In storage.mode(xi) <- a$sm : NAs introduced by coercion
> > 2: In storage.mode(xi) <- a$sm : NAs introduced by coercion
> > ...
> > 9: In storage.mode(xi) <- a$sm : NAs introduced by coercion
> > 10: In storage.mode(xi) <- a$sm : NAs introduced by coercion

> Not sure what is wrong; it seems to me that your ID became a factor
> (see ?str).
> 
> Having your data in a data frame test with character columns,
> 
> test.ag<-aggregate(test$X.gen, list(test$ID),
>                    function(x) paste(x, collapse=":"))
> 
> I can make an aggregated data frame,
> 
> paste("GO",test.ag[,1], sep="")
> [1] "GO0006417" "GO0006511" "GO0007409" "GO0016070" "GO0042787"
> 
> and it is straightforward to add GO at the beginning.
> 
> I leave how to add this result to your aggregated data frame as an
> exercise.
> 
> Regards
> Petr
> 
> 
> >
> > What did I do wrong here?
> >
> > Assa
> >
> > On Thu, Jul 8, 2010 at 11:09, Søren Højsgaard
> > <Soren.Hojsgaard at agrsci.dk> wrote:
> >
> > > Like this?
> > >
> > > > library(doBy)
> > > > (ddd <- read.table("foo.txt",header=T))
> > >     ID  gen
> > > 1 42787 gen2
> > > 2 16070 gen2
> > > 3 16070 gen3
> > > 4  7409 Gen1
> > > 5  7409 gen3
> > > 6  6511 gen2
> > > 7  6417 gen3
> > > 8 16070 gen4
> > > 9  6511 gen4
> > > > aa<-lapplyBy(~ID, data=ddd,
> > > +   FUN=function(uu){
> > > +   list(uu$ID[1], paste(uu$gen, collapse=":"))
> > > + })
> > > >
> > > > do.call(rbind,aa)
> > >      [,1]  [,2]
> > > 42787 42787 "gen2"
> > > 16070 16070 "gen2:gen3:gen4"
> > > 7409  7409  "Gen1:gen3"
> > > 6511  6511  "gen2:gen4"
> > > 6417  6417  "gen3"
> > >
> > > Regards
> > > Søren
> > >
> > >
> > >
> > >
> > >
> > > -----Original message-----
> > > From: r-help-bounces at r-project.org
> > > [mailto:r-help-bounces at r-project.org] On behalf of Assa Yeroslaviz
> > > Sent: 8 July 2010 10:45
> > > To: r-help at stat.math.ethz.ch
> > > Subject: [R] transformation of data.frame
> > >
> > > Hello all R users,
> > >
> > > I have a problem transforming (or maybe better, regrouping) a
> > > data.frame. I have a big data.frame, which I would like to summarize
> > > according to a specific column.
> > >
> > > This is an example of my matrix:
> > > ID    gen
> > > 0042787    gen2
> > > 0016070    gen2
> > > 0016070    gen3
> > > 0007409    Gen1
> > > 0007409    gen3
> > > 0006511    gen2
> > > 0006417    gen3
> > > 0016070    gen4
> > > 0006511    gen4
> > >
> > > I want to rearrange the matrix according to column GO, so that it
> > > looks like this:
> > >
> > > GO:0042787     gen2
> > > GO:0016070    gen2  :  gen3  :  gen4
> > > GO:0007409    gen1  :  gen3
> > > GO:0006511    gen2  :  gen4
> > > GO:0006417    gen3
> > >
> > > I've tried it with the package doBy (lapplyBy and paste) but it just
> > > doesn't work out.
> > >
> > > I will be very happy for any suggestions you might have to help me.
> > >
> > > Thanks
> > >
> > > Assa
> > >
> > >         [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >


