[R] read.csv and field containing single quotes

Benilton Carvalho beniltoncarvalho at gmail.com
Tue Mar 27 01:09:38 CEST 2012


I need to read in csv files, created by a 3rd party, in which some
fields contain a lone, unescaped double-quote character (as shown below).

"header1","header2","header3","header4"
"field1r1","field2r1","field3r1","field4r1"
"field1r2","field2r2","field3r2PartA), field3r2PartB Very" Long","field4r2"
"field1r3","field2r3","field3r3","field4r3"


read.csv(filename, quote="\"'", header=TRUE) won't read the file
represented above, unless the 3rd line has Very"" (the quote character
doubled) instead of Very" (the quote character on its own)... and this
is documented (scan() man page).
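
To illustrate the documented behaviour, here is a minimal sketch (the
inline data and the names `good` and `df` are made up for this example,
not taken from the real files): when the embedded quote is doubled,
read.csv should hand it back as a single quote character.

```r
## Hypothetical two-line csv in which the embedded quote is doubled.
good <- c('"h1","h2"',
          '"a","Very"" Long"')

## With a non-default separator, a doubled quote inside a quoted field
## is read as one literal quote character.
df <- read.csv(text = good, quote = "\"", stringsAsFactors = FALSE)
print(df$h2)
```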

Assuming that the creation of such csv files is something I'm not in a
position to interfere with, are there (preferably, "all in R")
suggestions on how to handle such a task?

For the moment, I'm using my poor man's solution (below), but any
tricks that would simplify this task would be great.

Thank you very much,

benilton


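parser <- function(fname, header=TRUE, stringsAsFactors=FALSE){
    txt <- readLines(fname)
    ## drop the quote that opens and the quote that closes each line
    txt <- gsub("^\"|\"$", "", txt)
    ## split on the "," sequence that separates fields
    txt <- strsplit(txt, "\",\"")
    ## double any quote left inside a field; bind rows into a character matrix
    txt <- do.call(rbind, lapply(txt, function(x) gsub("\"", "\"\"", x)))
    if (header){
        nms <- txt[1,]
        ## drop=FALSE keeps a matrix even when only one data row remains
        txt <- txt[-1, , drop=FALSE]
    }
    txt <- as.data.frame(txt, stringsAsFactors=stringsAsFactors)
    if (header) names(txt) <- nms
    txt
}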
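
A possible variant of the function above, as a sketch only: instead of
building the data frame by hand, repair the lines and pass them back to
read.csv, so that the doubled quotes are collapsed to a single quote
again and the usual type conversion applies. The name
`read_csv_stray_quotes` is made up for this example. It assumes that
every field is quoted, that no field contains the sequence "," and that
no field spans more than one line.

```r
## Hypothetical helper: escape stray quotes, then let read.csv parse.
read_csv_stray_quotes <- function(fname, ...) {
    txt <- readLines(fname)
    ## Strip the quote that opens and the quote that closes each line.
    body <- sub("^\"", "", sub("\"$", "", txt))
    ## Split on the literal "," sequence between fields.
    fields <- strsplit(body, "\",\"", fixed = TRUE)
    ## Double any quote left inside a field, re-wrap each field in
    ## quotes and join the fields of a line with commas again.
    fixed <- vapply(fields, function(x) {
        paste0("\"", gsub("\"", "\"\"", x, fixed = TRUE), "\"",
               collapse = ",")
    }, character(1))
    ## read.csv now sees well-formed, spreadsheet-style quoting.
    read.csv(text = fixed, quote = "\"", ...)
}
```

It would be called as, e.g., read_csv_stray_quotes(filename,
stringsAsFactors=FALSE); any further arguments go to read.csv.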


