[BioC] Limma toptable output using write.table and column names

Ken Termiso jerk_alert at hotmail.com
Wed Feb 9 01:17:10 CET 2005


I apologize in advance if this is confusing...

When I use write.exprs (which, as I understand it, calls write.table) to 
write expression data to a text file, the header line of the output has one 
fewer column name than there are columns (the probe ID column does not get a 
name), and the remaining column names are shifted all the way to the left 
margin. When this text file is read into R using the command 
read.table(file="exprs.txt",header=TRUE), R converts the file into a data 
frame, and correctly displays the probeset IDs as the row labels.
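For reference, this round trip can be reproduced with a small sketch. It assumes write.table is called directly on a matrix whose row names are the probeset IDs; the object name `ex` and the temporary file are made up for illustration, and the values are copied from the example below. It relies on the documented read.table rule that when the header has one fewer field than the data lines, the first column is taken as the row names.

```r
# Hypothetical stand-in for the expression matrix (not real write.exprs output).
ex <- matrix(c(8.779289, 3.508310, 8.732751, 3.389342), nrow = 2,
             dimnames = list(c("1007_s_at", "1053_at"),
                             c("6187.CEL", "6188.CEL")))

f <- tempfile(fileext = ".txt")

# With row names written, the header line gets no entry for the row-name
# column, so it is one field shorter than the data lines.
write.table(ex, file = f, sep = "\t", quote = FALSE)

# read.table sees the short header and uses the first column as row names.
# check.names = FALSE keeps names such as "6187.CEL" unchanged.
back <- read.table(f, header = TRUE, sep = "\t", check.names = FALSE)
rownames(back)
```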

(the spacing may be a little off here, depending on the display font, but 
here you can see that the probeset name is the row label)
          6187.CEL 6188.CEL 6189.CEL 6190.CEL 6191.CEL 6192.CEL
1007_s_at 8.779289 8.732751 8.822360 8.743272 8.768605 8.813886
1053_at   3.508310 3.389342 3.434458 3.410836 3.373940 3.387063
117_at    3.139897 3.105285 3.114203 3.131865 3.073855 3.038960


However, with the limma toptables, each column has a name, including the 
probeset column ("ID"). When I write a toptable to a text file, and then read 
it back into R, R thinks that the probeset IDs are a column of data (since 
it is labelled with "ID"), and then adds row numbers to this data frame. 
This makes it difficult to do other operations (at least in my novice 
hands!!)

>tt[1:3,]
         ID            M        A           t   P.Value         B
1 1007_s_at -0.002879009 8.776694 -0.09459093 0.9999627 -6.721547
2   1053_at -0.053423214 3.417325 -1.60706334 0.9999627 -5.499340
3    117_at -0.038235209 3.100678 -1.42248721 0.9999627 -5.724391

If I open up the toptable text file in Excel, delete the "ID" column name, 
and do not shift the other names over, this is what happens:

>tt_spc[1:3,]
          X            M        A           t   P.Value         B
1 1007_s_at -0.002879009 8.776694 -0.09459093 0.9999627 -6.721547
2   1053_at -0.053423210 3.417325 -1.60706300 0.9999627 -5.499340
3    117_at -0.038235210 3.100678 -1.42248700 0.9999627 -5.724391

R silently replaced the now-empty "ID" column name with "X".
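The "X" most likely comes from the name checking that read.table applies by default (check.names = TRUE), which passes the header through make.names. A minimal sketch of that step in isolation, with a made-up header vector:

```r
# Hypothetical header as it would look after deleting "ID" in the spreadsheet:
# the first name is an empty string.
hdr <- c("", "M", "A")

# make.names turns an empty name into "X"; valid names are left alone.
fixed <- make.names(hdr, unique = TRUE)
fixed
```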


If I open the toptable file in Excel, delete the "ID" column name, shift the 
other column names one position to the left, and then open the text file in 
R, it looks perfect:

>tt_shft[1:3,]
                 M        A       t   P.Value         B
1007_s_at -0.00288 8.776694 -0.0946 0.9999627 -6.721547
1053_at   -0.05340 3.417325 -1.6100 0.9999627 -5.499340
117_at    -0.03820 3.100678 -1.4200 0.9999627 -5.724391


BUT, I don't want to have to edit each toptable file in Excel before 
re-opening it in R.

I also tried setting the column name to "", and also giving the toptable 
data frame a string of names without the ID, but neither one worked...in 
both cases R filled in an "NA" for the column name...

Is there any way for me to avoid having to edit the file in excel so that I 
can write it to a text file, read it back into R, and have it display the 
probeset names as the row labels???

I guess what I'm asking is this -- is there a way for me to modify the 
toptable data frame so that the "ID" column name is removed and R uses the 
"ID" column as the row labels??
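Two approaches that might do this without Excel are sketched here. This is only a sketch under assumptions: `tt` is a made-up stand-in for the toptable (three columns, values copied from the output above), the file names are temporary files, and the IDs are assumed to be unique, which row names require.

```r
# Hypothetical stand-in for a limma toptable.
tt <- data.frame(
  ID = c("1007_s_at", "1053_at", "117_at"),
  M  = c(-0.002879009, -0.053423214, -0.038235209),
  A  = c(8.776694, 3.417325, 3.100678),
  stringsAsFactors = FALSE
)

# Option 1: move the ID column into the row names before writing.
tt2 <- tt
rownames(tt2) <- tt2$ID   # fails if the IDs are not unique
tt2$ID <- NULL            # drop the now-redundant column
f1 <- tempfile(fileext = ".txt")
write.table(tt2, file = f1, sep = "\t", quote = FALSE)
back1 <- read.table(f1, header = TRUE, sep = "\t")

# Option 2: write the table as it is (without the row numbers) and tell
# read.table which column holds the row names.
f2 <- tempfile(fileext = ".txt")
write.table(tt, file = f2, sep = "\t", quote = FALSE, row.names = FALSE)
back2 <- read.table(f2, header = TRUE, sep = "\t", row.names = "ID")
```

Either way the probeset IDs should come back as the row labels. Option 2 leaves the data frame in R untouched, and row.names = 1 (the column position) could be used in place of the column name.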

Thanks in advance,
-Ken


