[R] Convert many lines into a single line

arun smartpink111 at yahoo.com
Wed Dec 11 21:06:37 CET 2013


Hi,

Maybe this helps:

dat1 <- read.table(text="Sample  Genotype  Region
    sample1    A      Region1
    sample1    B      Region1
    sample1    A      Region1
    sample2    A      Region1
    sample2    A      Region1
    sample3    A      Region1
    sample4    B      Region1",sep="",header=TRUE,stringsAsFactors=FALSE)
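library(plyr)
# Within each Sample, set Genotype to "E" when more than one distinct genotype
# occurs; unique() then drops the duplicated rows.
unique(ddply(dat1, .(Sample), mutate,
             Genotype = if (length(unique(Genotype)) > 1) "E" else Genotype))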


dat2 <- read.table(text="Sample  Genotype  Region
    sample1    A      Region1
    sample1    B      Region1
    sample1    A      Region1
    sample2    A      Region1
    sample2    A      Region1
    sample3    A      Region1
    sample4    B      Region1
    sample1    A      Region2
    sample1    B      Region2
    sample1    A      Region2
    sample2    A      Region2
    sample2    A      Region2",sep="",header=TRUE,stringsAsFactors=FALSE)

# With several regions, group by Region and Sample together.
unique(ddply(dat2, .(Region, Sample), mutate,
             Genotype = if (length(unique(Genotype)) > 1) "E" else Genotype))

# or, without plyr (the grouping columns come first in the result)
aggregate(Genotype ~ ., data = dat2,
          FUN = function(x) if (length(unique(x)) > 1) "E" else unique(x))
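
If you want to keep the original column order and avoid plyr, something along these lines may also work. This is only a sketch: it rebuilds dat2 with data.frame() so that it runs on its own, and the names `collapsed` and `result` are made up for illustration.

```r
# Same data as dat2 above, rebuilt here so the snippet is self-contained.
dat2 <- data.frame(
  Sample   = c("sample1", "sample1", "sample1", "sample2", "sample2",
               "sample3", "sample4",
               "sample1", "sample1", "sample1", "sample2", "sample2"),
  Genotype = c("A", "B", "A", "A", "A", "A", "B", "A", "B", "A", "A", "A"),
  Region   = rep(c("Region1", "Region2"), times = c(7, 5)),
  stringsAsFactors = FALSE
)

# 'collapsed' and 'result' are illustrative names, not part of the original post.
collapsed <- dat2

# ave() applies FUN within each Region/Sample group and returns a vector
# aligned with the original rows.
collapsed$Genotype <- ave(dat2$Genotype, dat2$Region, dat2$Sample,
                          FUN = function(x) {
                            if (length(unique(x)) > 1) rep("E", length(x)) else x
                          })

# Drop the now-identical rows; the remaining rows keep their original order.
result <- unique(collapsed)
result
```

Because ave() works row by row, the columns stay in the order Sample, Genotype, Region, and any extra columns in the data are carried along (they would only be collapsed by unique() if they are also identical within a group).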



A.K.


I would like to transform this data: 

    Sample  Genotype  Region 
    sample1    A      Region1 
    sample1    B      Region1 
    sample1    A      Region1 
    sample2    A      Region1 
    sample2    A      Region1 
    sample3    A      Region1 
    sample4    B      Region1 

into this format, tagging samples that have more than one genotype with "E" and collapsing samples whose genotype is simply repeated into a single line:

    Sample  Genotype  Region   
    sample1    E      Region1 
    sample2    A      Region1 
    sample3    A      Region1 
    sample4    B      Region1 

I have one list with many regions (Region1 - Regionx). Is it possible to do this in R? Thanks a lot.


