[R] importing files, columns "invade next column"

Tiago R Magalhaes tiago17 at socrates.Berkeley.EDU
Wed Jan 19 05:25:00 CET 2005


Dear R-listers:

I want to import a reasonably big file into a table (15797 rows x 257 
columns). The file is tab delimited with NA in every empty space. I 
have reproduced my read.table calls below. I have 
read the R-dataImportExport FAQ and still couldn't solve my problem. 
(I might have missed it, of course). I'm using R 2.0.1 on a Mac G4, 
OS X 10.3.7.

I can import the file, but one of the columns "invades the other", 
meaning that if there is an empty space marked as NA in the first 
column, it gets the value of the second column. I tried to import 
four different files (details below) and I think the problem is with 
the number of columns (with fewer columns it works).
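
One way to check whether every line really has the same number of fields is sketched below. It is only an illustration on a made-up three-column file, not the real data, and it assumes that passing an explicit sep="\t" is appropriate for a tab-delimited file; the calls above do not set sep, so read.table splits on any whitespace.

```r
# Build a tiny tab-delimited stand-in file (hypothetical data, not the real file).
tmp <- tempfile(fileext = ".txt")
writeLines(c("UNIQID\tAll.FB.Id\tSymbol",
             "NA\t10006\tp53",
             "NA\t10007\tGr94a"), tmp)

# Count the tab-separated fields on each line; every line should agree.
n_fields <- count.fields(tmp, sep = "\t", quote = "", comment.char = "")
print(table(n_fields))

# Read with an explicit tab separator instead of the default whitespace splitting.
dat <- read.table(tmp, header = TRUE, sep = "\t", quote = "",
                  comment.char = "", as.is = TRUE)
print(dat[1:2, ])
```

If table(n_fields) shows more than one distinct count on the real file, the lines with the odd counts would be the first place to look.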

workarounds:
a) I can separate my file into several files, import them and then 
make one file in R
b) try to learn basic commands in awk? perl?
any advice on this?
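
Workaround a) could look roughly like the sketch below. The two small files and their contents are made up for illustration, and the sketch assumes each piece holds the same rows in the same order, so the pieces can simply be bound side by side.

```r
# Two hypothetical pieces of one table, split by columns (made-up data).
part1 <- tempfile(fileext = ".txt")
part2 <- tempfile(fileext = ".txt")
writeLines(c("UNIQID\tAll.FB.Id", "NA\t10006", "NA\t10007"), part1)
writeLines(c("Symbol\tFB.gn", "p53\tFBgn0039044", "Gr94a\tFBgn0041225"), part2)

# Read every piece with the same settings.
pieces <- lapply(c(part1, part2), read.table,
                 header = TRUE, sep = "\t", as.is = TRUE)

# All pieces must have the same number of rows before binding.
stopifnot(length(unique(sapply(pieces, nrow))) == 1)

# Bind the pieces column-wise and label the rows a1, a2, ...
combined <- do.call(cbind, pieces)
rownames(combined) <- paste("a", seq_len(nrow(combined)), sep = "")
print(combined)
```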

Another question (much less important): I have a binary file in Splus 
for this object. I exported the object in Splus as it says in the FAQ 
(data.dump). But data.restore doesn't exist as a function. Is it 
because I'm using a Mac?

details of what I did:
##
a) importing a shorter version of my file (58 columns); I get the 
"invading" behaviour and a column of row.names that I don't 
understand where it comes from. (UNIQID should be empty and 10006 
should be in All.FB.Id.)

>  AllFBImpFields <- read.table('AllFBAllFieldsNAShorter.txt', fill=T, header=T,
+                              row.names=paste('a',1:15797, sep=''),
+                              as.is=T, nrows=15797)
>  AllFBImpFields[1:2,1:5]
    row.names UNIQID All.FB.Id All.FB.5 All.FB.4
a1      <NA>  10006      <NA>     <NA>     <NA>
a2      <NA>  10007      <NA>     <NA>     <NA>

##
b) Importing only 5 cols of the previous file. It works. There is no 
"invasion" and the col row.names is not inserted.

>  AllFB5Cols <- read.table('AllFB5Cols.txt', fill=T, header=T,
+                          row.names=paste('a',1:15797, sep=''),
+                          as.is=T, nrows=15797)
>  AllFB5Cols[1:2,1:5]
    UNIQID All.FB.Id Symbol       FB.gn CG.name
a1   <NA>     10006    p53 FBgn0039044 CG10873
a2   <NA>     10007  Gr94a FBgn0041225 CG31280

##
c) importing file with 4 rows, 58 columns; invasion behaviour and a 
warning that I don't get in a) although the file is the same for the 
first 4 rows

>  x4rowsAllCol <- read.table('AllFB4rowsAllCols.txt', fill=T, header=T,
+                            row.names=paste('a',1:4, sep=''),
+                            as.is=T, nrows=4)
Warning message:
incomplete final line found by readTableHeader on `AllFB4rowsAllCols.txt'
>  x4rowsAllCol[1:2,1:5]
    row.names UNIQID All.FB.Id All.FB.5 All.FB.4
a1        NA  10006        NA       NA       NA
a2        NA  10007        NA       NA       NA

##
d) importing file with 4 rows and 5 cols; result is like b) but gives 
the same warning as c)!
>  x4rows5cols <- read.table('AllFB4rows5cols.txt', fill=T, header=T,
+                      row.names=paste('a',1:4, sep=''),
+                      as.is=T, nrows=4)
Warning message:
incomplete final line found by readTableHeader on `AllFB4rows5cols.txt'
>  x4rows5cols[1:2,1:5]
    UNIQID All.FB.Id All.FB.5 All.FB.4 All.FB.3
a1     NA     10006       NA       NA       NA
a2     NA     10007       NA       NA       NA



