[R] read.table: how to ignore errors?

Rolf Turner rolf.turner at xtra.co.nz
Tue Jan 24 22:38:07 CET 2012


On 25/01/12 09:45, Sam Steingold wrote:
> I get this error from read.table():
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
>    line 234 did not have 8 elements
> The error is genuine (an extra field separator between 1st and 2nd element).
>
> 1. is there a way to see this bad line 234 from R without diving into the file?
>
> 2. is there a way to ignore the bad lines and get the data from the good
> lines only? (I do want to see the bad lines, but I don't want to stop all
> work until an issue that affects 1% of the data is resolved.)
>
> thanks.
>
> Oh, yeah, a reproducible example:
>
> read.csv from
> =====
> a,b
> 1,2
> 3,4
> 5,,6
> 7,8
> =====
> I want to be able to extract the data frame
>    a b
> 1 1 2
> 2 3 4
> 3 7 8
>
> and a list of strings of length 1 containing "5,,6".

Try:

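# Read the raw lines so that the malformed ones can be inspected later.
xxx <- readLines("<filename>")
# The first line holds the column names.
hhh <- scan(text=xxx[1],what="",sep=",",quiet=TRUE)
good <- list()
bad <- list()
for(i in seq_along(xxx)[-1]) {
     # Parse one line at a time; text= opens and closes its own connection.
     tmp <- read.csv(text=xxx[i],header=FALSE)
     if(ncol(tmp)==length(hhh)) good[[length(good)+1]] <- tmp else {
         # Keep the raw string of each bad line, as requested.
         bad[[length(bad)+1]] <- xxx[i]
     }
}
# Bind once at the end (rbind drops the names of a zero-row data frame,
# so the column names are set afterwards).
yyy <- do.call(rbind,good)
names(yyy) <- hhh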
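
If the file is large, parsing it one line at a time can be slow. A possible alternative, offered only as a sketch, is to count the fields on every line with count.fields() and parse the well-formed lines in a single read.csv() call. The temporary file below stands in for the real file, and the names raw, nf, ok, good and bad are made up for illustration. It assumes that no quoted field spans several lines (count.fields() returns NA for such lines, which are treated as bad here).

```r
# Write the example from the question to a temporary file
# (a stand-in for the real data file).
path <- tempfile(fileext = ".csv")
writeLines(c("a,b", "1,2", "3,4", "5,,6", "7,8"), path)

raw <- readLines(path)

# Number of comma-separated fields on each line, header included.
# The quote and comment settings mirror the read.csv() defaults, and
# blank lines are counted so that nf stays aligned with raw.
nf <- count.fields(path, sep = ",", quote = "\"",
                   blank.lines.skip = FALSE, comment.char = "")

# A line counts as good if it has as many fields as the header line.
ok <- !is.na(nf) & nf == nf[1]

# Raw text of the malformed lines, labelled with their line numbers.
bad <- raw[!ok]
names(bad) <- which(!ok)

# Parse only the well-formed lines, header included, in one call.
good <- read.csv(text = raw[ok])
```

Here bad keeps the offending lines as strings (so raw[234] or bad would answer the first question without opening the file), and good is the data frame built from the remaining lines.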

HTH

     cheers,

         Rolf Turner


