[R] count.fields inconsistent with read.table?

Sam Steingold sds at gnu.org
Fri Feb 24 06:58:13 CET 2012


Hi,

batch is a vector of lines returned by readLines() from a
newline-terminated file. Here is the relevant section:
=========================================================
AA	BB	CC	DD			EE	FF
GG	H

H	JJ	KK			LL	MM
=========================================================
As you can see, one record is corrupt: two CRLFs were inserted into it.
This is okay; I drop the bad lines, or at least I hope I do:

  conn <- textConnection(batch)
  field.counts <- count.fields(conn, sep="\t", comment.char="", quote="")
  close(conn)
  good <- field.counts == 8  # this should drop all bad lines
  if (!all(good))
    batch <- batch[good]
  conn <- textConnection(batch)
  ret <- read.table(conn, sep="\t", comment.char="", quote="")
  close(conn)

I get this error in read.table():

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
  line 7151 did not have 8 elements

How come?!
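
One possible explanation, offered as a sketch rather than a confirmed diagnosis: count.fields() defaults to blank.lines.skip = TRUE, so the empty line in the corrupt record produces no entry in field.counts. The result is then shorter than batch, the logical index is recycled, and batch[good] no longer lines up with the lines that were counted. The five-line batch and the helper count_fields below are hypothetical, made up to mimic the excerpt above.

```r
# Hypothetical five-line batch mimicking the excerpt above:
# one good line, a record broken in three pieces, one more good line.
batch <- c("AA\tBB\tCC\tDD\t\t\tEE\tFF",
           "GG\tH",
           "",
           "H\tJJ\tKK\t\t\tLL\tMM",
           "A\tB\tC\tD\tE\tF\tG\tH")

# Hypothetical helper: count tab-separated fields per line,
# with blank-line skipping switched on or off.
count_fields <- function(lines, skip.blank) {
  conn <- textConnection(lines)
  on.exit(close(conn))
  count.fields(conn, sep = "\t", quote = "", comment.char = "",
               blank.lines.skip = skip.blank)
}

default.counts <- count_fields(batch, TRUE)   # blank line gets no entry
aligned.counts <- count_fields(batch, FALSE)  # one entry per input line

# Only the second vector has the same length as batch,
# so only it is safe to use as an index into batch.
length(default.counts) == length(batch)
length(aligned.counts) == length(batch)

# Keep only the lines with exactly 8 fields.
good  <- !is.na(aligned.counts) & aligned.counts == 8
clean <- batch[good]
```

If this is what happens in the real data, passing blank.lines.skip = FALSE to count.fields() in the code above should keep field.counts aligned with batch.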

Also, is there some error recovery?
E.g., the code above is part of a function: is there a way to recover
batch after the error (without re-running the whole thing)?
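
To make the question concrete, here is a minimal sketch of the kind of recovery meant, using tryCatch() so that the raw lines are handed back when read.table() fails. The wrapper name parse_batch and the shape of the returned list are assumptions for illustration, not part of the original function.

```r
# Hypothetical wrapper: returns a data frame on success, or a list
# holding the error message and the untouched lines on failure.
parse_batch <- function(batch) {
  conn <- textConnection(batch)
  on.exit(close(conn))
  tryCatch(
    read.table(conn, sep = "\t", comment.char = "", quote = ""),
    error = function(e) list(error = conditionMessage(e), batch = batch)
  )
}

# A deliberately ragged input: 3 fields, then 2 fields.
bad <- c("a\tb\tc", "d\te")
res <- parse_batch(bad)

# On failure the original lines are still available for inspection.
if (!is.data.frame(res)) {
  message("read.table failed: ", res$error)
  recovered <- res$batch
}
```

For interactive work, setting options(error = recover) is another route: it lets you browse the frame in which the error occurred and look at batch there.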

Thanks!

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X 11.0.11004000
http://www.childpsy.net/ http://openvotingconsortium.org http://iris.org.il
http://www.PetitionOnline.com/tap12009/ http://dhimmi.com
Conscience is like a hamster: it is either asleep or gnawing.


