[R] R help for creating expression data of Differentially expressed genes

arun smartpink111 at yahoo.com
Wed May 8 00:35:22 CEST 2013


Hi,
Assuming that "out_data.txt" is the output you expected:


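# Read the two input tables and the expected result
dat1 <- read.table("data1.txt", header = TRUE, stringsAsFactors = FALSE)
dat2 <- read.table("data2.txt", header = TRUE, stringsAsFactors = FALSE)
out_dat <- read.table("out_data.txt", header = TRUE, stringsAsFactors = FALSE)
# Keep the first four columns of dat1 and join them to dat2 on ID
out_dat2 <- merge(dat1[, 1:4], dat2, by = "ID")
identical(out_dat, out_dat2)
#[1] TRUE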
A.K.
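
Since the goal is a new file, the merged data frame can be written back out with write.table(). The following is only a minimal sketch, assuming tab-delimited output is acceptable; the toy data frames and the temporary file path are hypothetical stand-ins for data1.txt, data2.txt and the real output file.

```r
# Toy stand-ins for the two input files (hypothetical values)
dat1 <- data.frame(ID = c(1, 2, 3), Gene = c("a", "b", "c"),
                   stringsAsFactors = FALSE)
dat2 <- data.frame(ID = c(3, 1, 5), X1 = c(10.5, 20.1, 30.7),
                   X2 = c(0.1, 0.2, 0.3))

# Inner join on ID: only IDs present in both tables are kept
out <- merge(dat1, dat2, by = "ID")

# Write the result as a tab-delimited file, without row names or quotes
out_file <- tempfile(fileext = ".txt")   # hypothetical output path
write.table(out, out_file, sep = "\t", row.names = FALSE, quote = FALSE)

# Read it back to confirm the round trip
check <- read.table(out_file, header = TRUE, sep = "\t",
                    stringsAsFactors = FALSE)
print(check)
```

In real use, out_file would be replaced by the desired path, for example "out_data.txt".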





________________________________
From: Vivek Das <vd4mmind at gmail.com>
To: arun <smartpink111 at yahoo.com> 
Cc: R help <r-help at r-project.org> 
Sent: Tuesday, May 7, 2013 6:07 PM
Subject: Re: R help for creating expression data of Differentially expressed genes



Hi Arun,

My data sets are in the attached sample files. I hope they give a better idea of what I want to do with the two files and of the kind of script I am trying to write. I hope you can give me some suggestions. I am new to R, so I am having trouble working out which functions to use.

I would be grateful to anyone who can help me with this.



----------------------------------------------------------

Vivek Das
PhD Student in Computational Biology
Giuseppe Testa's Lab
European School of Molecular Medicine
IFOM-IEO Campus
Via Adamello, 16
Milan, Italy

emails: vivek.das at ieo.eu
            vchris_05 at yahoo.co.in
            vd4mmind at gmail.com



On Tue, May 7, 2013 at 10:36 PM, arun <smartpink111 at yahoo.com> wrote:

>Hi Vivek,
>
>Maybe this helps:
>set.seed(35)
>dat1 <- cbind(ID = 1:8, as.data.frame(matrix(sample(1:50, 8 * 7, replace = TRUE), ncol = 7)))
>
>set.seed(38)
>dat2 <- cbind(ID = sample(1:20, 8, replace = FALSE), as.data.frame(matrix(sample(1:50, 8 * 33, replace = TRUE), ncol = 33)))
>colnames(dat2)[-1] <- gsub("V", "X", colnames(dat2)[-1])
>merge(dat1[, 1:2], dat2[, 1:31], by = "ID")
>#  ID V1 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20
>#1  1 43 44  4 33 47 29 43 31 15  2  34  42   5  18  22  36  34  44   3  45   9
>#2  3 28  4 18 45 24  5 20 30 16 49  34  33   5  24  49  31  10  45  21  26  20
>#3  6  5 16  1  5  2 26  6 40 16 15  50  26  37  22  25  39  16  24  29  50  42
>#4  7 25 26 39 16 29  5 40 15 27 46  16  38  36  42   8   3  29   7  13  18  38
>#5  8 30  3 41 25 38 24 41 44 23  2  45  33  10  18  20  49  19  23  42  25   5
>#  X21 X22 X23 X24 X25 X26 X27 X28 X29 X30
>#1  14  27   3  21   6  44  33  42  10  29
>#2  48  13   8  47  18   9  23   9  44   3
>#3  25  14  31  19  14   6  26  13   6  49
>#4  43  28  15   6   9  19  43  21  41  21
>#5   1  27  18   3  42   5  16  39  46  47
>
>A.K.
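
Note that merge() performs an inner join by default, so IDs in the first table with no match in the second are dropped (above, only 5 of the 8 IDs survive). If every row of the first table must appear in the output, all.x = TRUE keeps them and fills the missing columns with NA. A small sketch on made-up data (the values and the Gene column are hypothetical):

```r
# Hypothetical toy data: ID 2 has no counterpart in dat2
dat1 <- data.frame(ID = 1:3, Gene = c("g1", "g2", "g3"),
                   stringsAsFactors = FALSE)
dat2 <- data.frame(ID = c(1L, 3L, 4L), X1 = c(5, 6, 7))

inner <- merge(dat1, dat2, by = "ID")                # default: matched IDs only
left  <- merge(dat1, dat2, by = "ID", all.x = TRUE)  # keep every row of dat1

print(inner)
print(left)

# IDs of dat1 that found no partner in dat2
unmatched <- setdiff(dat1$ID, dat2$ID)
print(unmatched)
```

Checking the unmatched IDs first is a quick way to see whether a shorter-than-expected result comes from missing IDs or from a mismatch in how the IDs are written in the two files.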
>
>
>
>----- Original Message -----
>
>From: Vivek Das <vd4mmind at gmail.com>
>To: arun <smartpink111 at yahoo.com>
>Cc:
>
>Sent: Tuesday, May 7, 2013 3:45 PM
>Subject: R help for creating expression data of Differentially expressed genes
>
>Hi Arun,
>
>I need some help with R scripting. I have two data files, one containing seven columns and the other containing 33. Both files have a unique identifier column, ID. I want to create another file that has the first two columns of the first file and 31 columns of the second file, matched on ID. The first file holds the gene IDs and gene names of around 500 genes, and the output file should contain all of those along with the other attributes. That is, I want every attribute from the second file for each ID in the first file, giving an output of 500 rows. I am new to R and am having trouble with the merge function. It would be great if you could help.
>
>Regards,
>Vivek
>
>
>On 7 May 2013, at 21:13, arun <smartpink111 at yahoo.com> wrote:
>
>> Hi Ye,
>>
>> For the NA in the ID column:
>>
>> dat1<- read.table(text="
>> ObsNumber     ID          Weight
>>      1                 0001         12
>>      2                 0001          13
>>      3                 0001           14
>>      4                  0002         16
>>       5                 0002         17
>>      6                   N/A          18
>> ",sep="",header=TRUE,colClasses=c("numeric","character","numeric"),na.strings="N/A")
>> unlist(lapply(split(dat1,dat1$ID),function(x) with(x,as.character(interaction(ID,seq_len(nrow(x)),sep="_")))),use.names=FALSE)
>> #[1] "0001_1" "0001_2" "0001_3" "0002_1" "0002_2"
>> A.K.
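
The split()-based code returns only five labels for six rows because split() discards the group whose ID is NA. One possible way to get a label such as "N/A_1" for those rows is to substitute a placeholder before counting within groups with ave(). This is a sketch only; the placeholder string "N/A" and the helper variable names are assumptions.

```r
dat1 <- data.frame(ObsNumber = 1:6,
                   ID = c("0001", "0001", "0001", "0002", "0002", NA),
                   Weight = c(12, 13, 14, 16, 17, 18),
                   stringsAsFactors = FALSE)

# Replace NA by a placeholder so that it forms a group of its own
id_label <- ifelse(is.na(dat1$ID), "N/A", dat1$ID)

# Running counter within each group, returned in the original row order
counter <- ave(seq_along(id_label), id_label, FUN = seq_along)

dat1$UniqueID <- paste(id_label, counter, sep = "_")
print(dat1)
```

Because the result has one label per row, it can be assigned to a column without the "replacement has fewer rows" error.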
>> ________________________________
>> From: Ye Lin <yelin at lbl.gov>
>> To: arun <smartpink111 at yahoo.com>
>> Cc: R help <r-help at r-project.org>
>> Sent: Tuesday, May 7, 2013 2:54 PM
>> Subject: Re: [R] create unique ID for each group
>>
>>
>>
>> Thanks, A.K. But I have NA values in the ID column, so when I apply the code it gives me an error saying the replacement has fewer rows than the data. Is there any way to return something like "N/A_1", in order, for rows where ID is N/A?
>>
>>
>>
>>
>>
>>
>> On Tue, May 7, 2013 at 11:17 AM, arun <smartpink111 at yahoo.com> wrote:
>>
>> Hi,
>>> Sorry, a mistake:
>>> dat1$UniqueID<-unlist(lapply(split(dat1,dat1$ID),function(x) with(x,as.character(interaction(ID,seq_len(nrow(x)),sep="_")))),use.names=FALSE)
>>> dat1
>>>  # ObsNumber   ID Weight UniqueID
>>> #1         1 0001     12   0001_1
>>> #2         2 0001     13   0001_2
>>> #3         3 0001     14   0001_3
>>> #4         4 0002     16   0002_1
>>> #5         5 0002     17   0002_2
>>>
>>> dat2$UniqueID<-unlist(lapply(split(dat2,dat2$ID),function(x) with(x,as.character(interaction(ID,seq_len(nrow(x)),sep="_")))),use.names=FALSE)
>>>
>>> A.K.
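
One caveat with the split()/unlist() approach: the labels come back grouped by ID, so assigning them to a column is only correct when the rows are already sorted by ID. A sketch of an alternative using ave(), which returns the counter in the original row order, followed by a merge on both keys. The shuffled toy data and the helper name add_unique_id are hypothetical.

```r
# Rows deliberately not sorted by ID (hypothetical data)
dat1 <- data.frame(ID = c("0002", "0001", "0002", "0001", "0001"),
                   Weight = c(16, 12, 17, 13, 14),
                   stringsAsFactors = FALSE)
dat2 <- data.frame(ID = c("0001", "0001", "0001", "0002", "0002"),
                   Height = c(3.2, 2.6, 3.2, 2.2, 2.6),
                   stringsAsFactors = FALSE)

# Helper (hypothetical name): "<ID>_<k>" for the k-th occurrence of each ID
add_unique_id <- function(d) {
  k <- ave(seq_along(d$ID), d$ID, FUN = seq_along)
  d$UniqueID <- paste(d$ID, k, sep = "_")
  d
}

dat1 <- add_unique_id(dat1)
dat2 <- add_unique_id(dat2)

# Pair the k-th Weight with the k-th Height of the same ID
res <- merge(dat1, dat2, by = c("ID", "UniqueID"))
print(res)
```

Merging on both ID and UniqueID avoids a duplicated ID column in the result.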
>>>
>>>
>>>
>>>
>>>
>>> ----- Original Message -----
>>>
>>> From: arun <smartpink111 at yahoo.com>
>>> To: Ye Lin <yelin at lbl.gov>
>>> Cc: R help <r-help at r-project.org>
>>> Sent: Tuesday, May 7, 2013 2:10 PM
>>> Subject: Re: [R] create unique ID for each group
>>>
>>>
>>>
>>> Hi,
>>>
>>> Try this:
>>> dat1<- read.table(text="
>>> ObsNumber     ID          Weight
>>>      1                 0001         12
>>>      2                 0001          13
>>>      3                 0001           14
>>>      4                  0002         16
>>>       5                 0002         17
>>> ",sep="",header=TRUE,colClasses=c("numeric","character","numeric"))
>>> dat2<- read.table(text="
>>> ID               Height
>>> 0001            3.2
>>> 0001             2.6
>>> 0001             3.2
>>> 0002             2.2
>>> 0002              2.6
>>> ",sep="",header=TRUE,colClasses=c("character","numeric"))
>>> dat1$UniqueID<-with(dat1,as.character(interaction(ID,ObsNumber,sep="_")))
>>> dat2$UniqueID<-with(dat2,as.character(interaction(ID,rownames(dat2),sep="_")))
>>> dat2
>>> #    ID Height UniqueID
>>> #1 0001    3.2   0001_1
>>> #2 0001    2.6   0001_2
>>> #3 0001    3.2   0001_3
>>> #4 0002    2.2   0002_4
>>> #5 0002    2.6   0002_5
>>> A.K.
>>>
>>>
>>>
>>> ----- Original Message -----
>>> From: Ye Lin <yelin at lbl.gov>
>>> To: R help <r-help at r-project.org>
>>> Cc:
>>> Sent: Tuesday, May 7, 2013 1:54 PM
>>> Subject: [R] create unique ID for each group
>>>
>>> Hey All,
>>>
>>> I have a dataset(dat1) like this:
>>>
>>> ObsNumber     ID          Weight
>>>      1                 0001         12
>>>      2                 0001          13
>>>      3                 0001           14
>>>      4                  0002         16
>>>       5                 0002         17
>>>
>>> And another dataset(dat2) like this:
>>>
>>> ID               Height
>>> 0001            3.2
>>> 0001             2.6
>>> 0001             3.2
>>> 0002             2.2
>>> 0002              2.6
>>>
>>> I want to merge dat1 and dat2 based on "ID", in order. I know "match" only
>>> returns the first match it finds, so I am thinking of creating a unique ID
>>> column in dat1 and dat2 and then merging. But I don't know how to do that
>>> so that it looks like this:
>>>
>>> dat1:
>>>
>>> ObsNumber     ID          Weight  UniqueID
>>>      1                 0001         12         0001_1
>>>      2                 0001          13        0001_2
>>>      3                 0001           14       0001_3
>>>      4                  0002         16         0002_1
>>>       5                 0002         17         0002_2
>>>
>>> dat2:
>>>
>>> ID               Height   UniqueID
>>> 0001            3.2          0001_1
>>> 0001             2.6         0001_2
>>> 0001             3.2         0001_3
>>> 0002             2.2         0002_1
>>> 0002              2.6        0002_2
>>>
>>> Or if it is possible to merge dat1 and dat2 by matching "ID" but return the
>>> match in order that would be great!
>>>
>>> Thanks for your help!
>>>
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>


