[R] Tab Separated File Reading Error

Dario Strbenac dstr7320 at uni.sydney.edu.au
Fri Oct 4 14:00:31 CEST 2013


Hello,

I have a seemingly simple problem: a tab-delimited file can't be read in.

> annoTranscripts <- read.table("matched.txt", sep = '\t', stringsAsFactors = FALSE)
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
  line 5933 did not have 12 elements

However, every line contains 11 tabs, i.e. 12 columns.

> lines <- readLines("matched.txt")
> tabsPosns <- gregexpr("\t", lines)
> table(sapply(tabsPosns, length))

    11 
367274 

> system("wc -l matched.txt")
367274 matched.txt
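
A possible cross-check, offered as a sketch rather than a diagnosis: the tab count above works on the raw text, whereas read.table() passes the file to scan(), which also interprets quote and comment characters. count.fields() accepts the same quote and comment.char settings, so it may show how many fields are seen on each line once those rules apply. The three-line demo file and its apostrophe are made up for illustration; nothing here is confirmed about matched.txt itself.

```r
# Hypothetical stand-in for matched.txt (3 columns instead of 12).
# The apostrophe on the second line is an assumed troublemaker.
demo <- tempfile(fileext = ".txt")
writeLines(c("id1\tgeneA\tplain text",
             "id2\tgeneB\t5' UTR",
             "id3\tgeneC\tplain text"), demo)

# Field counts under the default quote ("\"'") and comment ("#") rules,
# which are also the defaults of read.table().
fieldsDefault <- suppressWarnings(count.fields(demo, sep = "\t"))

# Field counts with quote and comment handling switched off, so only tabs matter.
fieldsNoQuote <- count.fields(demo, sep = "\t", quote = "", comment.char = "")

print(fieldsDefault)
print(fieldsNoQuote)

# Lines whose default count is NA or differs from the expected 3 columns
# (the expected value would be 12 for the real file).
suspect <- which(is.na(fieldsDefault) | fieldsDefault != 3)
print(suspect)
```

On the real file, the first entry of the equivalent `suspect` vector would be a natural place to start looking, since a quote opened on an earlier line can make a later, innocent line appear short.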

You can obtain the file from https://dl.dropboxusercontent.com/u/37992150/matched.txt

Line 5933 does not contain any comment or quote characters. What can you suggest?
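
One thing that might be worth trying, on the assumption (not verified against matched.txt) that a quote or comment character on some other line is confusing scan(): switch both off in the read.table() call. The demo file below is hypothetical and has 3 columns rather than 12.

```r
# Hypothetical demo file: one field holds an apostrophe, another a '#'.
demo <- tempfile(fileext = ".txt")
writeLines(c("id1\tgeneA\tplain text",
             "id2\tgeneB\t5' UTR",
             "id3\tgene#C\tplain text"), demo)

# quote = "" disables quote handling; comment.char = "" disables comments,
# so fields are split on tabs only.
annoDemo <- read.table(demo, sep = "\t", quote = "", comment.char = "",
                       stringsAsFactors = FALSE)

str(annoDemo)
```

For the real file, the equivalent change would be to add `quote = ""` and `comment.char = ""` to the original read.table() call.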

> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8    
 [5] LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8   
 [7] LC_PAPER=C                 LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

loaded via a namespace (and not attached):
[1] tools_3.0.1

--------------------------------------
Dario Strbenac
PhD Student
University of Sydney
Camperdown NSW 2050
Australia

