[R] read.csv quotes within fields

David Winsemius dwinsemius at comcast.net
Fri Jan 25 22:16:23 CET 2013


On Jan 25, 2013, at 11:35 AM, Tim Howard wrote:

> Great point, your fix (quote="") works for the example I gave. Unfortunately, these text strings have commas in them as well(!).  Throw a few commas in any of the text strings and it breaks again.  Sorry about not including those in the example.
>  
> So, I need to incorporate commas *and* quotes with the escape character within a single string.

Well you need to have _some_ delimiter. At the moment it sounds as though you might end up using readLines() and strsplit( . , split="\\'\\,\\s\\").
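
A minimal sketch of that readLines()/strsplit() idea follows. It is not the exact pattern from the message above. It assumes every field is wrapped in double quotes, that the three-character sequence "," never occurs inside a field, and that internal quotes are escaped as \" (as in Tim's data). The sample lines, including the extra comma in row 2, and the names `lines`, `stripped`, `fields` and `df` are made up for illustration.

```r
# Sample lines in the backslash-escaped form described in the thread.
# In practice these would come from: lines <- readLines("test.csv")
lines <- c(
  '"","X1","X2","X3"',
  '"1","1","string one","another string"',
  '"2","2","quotes escaped 10\' 20\\" 5\' 30\\", \\"test string","final string"',
  '"3","3","third row","last \\" col"'
)

# Drop the quote that opens the first field and the one that closes the last.
stripped <- sub('^"', '', sub('"$', '', lines))

# Split on the literal delimiter "," (quote, comma, quote).
# Assumption: this sequence never appears inside a field.
fields <- strsplit(stripped, '","', fixed = TRUE)

# Turn each backslash-escaped quote back into a plain double quote.
fields <- lapply(fields, function(f) gsub('\\"', '"', f, fixed = TRUE))

# First element is the header; the rest are data rows.
df <- as.data.frame(do.call(rbind, fields[-1]), stringsAsFactors = FALSE)
names(df) <- make.names(fields[[1]], unique = TRUE)

print(df)
```

Note that strsplit() drops a trailing empty piece, so a row whose last field is empty would come out one element short and need padding before the rbind() step. All columns are returned as character; type.convert() could be applied afterwards if numeric columns are wanted.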

-- 
david.

>  
> Tim
>  
> 
> >>> David Winsemius <dwinsemius at comcast.net> 1/25/2013 2:27 PM >>>
> 
> On Jan 25, 2013, at 10:42 AM, Tim Howard wrote:
> 
> > All,
> > 
> > I have some csv files I am trying to import. I am finding that quotes inside strings are escaped in a way R doesn't expect for csv files. The problem only seems to rear its ugly head when there are an uneven number of internal quotes. I'll try to recreate the problem:
> > 
> > # set up a matrix, using escape-quote as the internal double quote mark.
> > 
> > x <- data.frame(matrix(data=c("1", "string one", "another string", "2", "quotes escaped 10' 20\" 5' 30\" \"test string", "final string", "3","third row","last \" col"),ncol = 3, byrow=TRUE))
> > 
> > > write.csv(x, "test.csv")
> > 
> > # NOTE that write.csv correctly created the three internal quotes ' " ' by using double quotes ' "" '. 
> > # here's what got written
> > 
> > "","X1","X2","X3"
> > "1","1","string one","another string"
> > "2","2","quotes escaped 10' 20"" 5' 30"" ""test string","final string"
> > "3","3","third row","last "" col"
> > 
> > # Importing test.csv works fine.
> > 
> > > read.csv("test.csv")
> >  X X1                                         X2             X3
> > 1 1  1                                 string one another string
> > 2 2  2 quotes escaped 10' 20" 5' 30" "test string   final string
> > 3 3  3                                  third row     last " col
> > # this looks good. 
> > # now, please go and open "test.csv" with a text editor and replace all the double quotes '""' with the 
> > # quote escaped ' \" ' as is found in my data set. Like this:
> > 
> > "","X1","X2","X3"
> > "1","1","string one","another string"
> > "2","2","quotes escaped 10' 20\" 5' 30\" \"test string","final string"
> > "3","3","third row","last \" col"
> 
> Use quote="":
> 
> > read.csv(text='"","X1","X2","X3"
> + "1","1","string one","another string"
> + "2","2","quotes escaped 10\' 20"" 5\' 30"" ""test string","final string"
> + "3","3","third row","last "" col"', sep=",", quote="")
> 
> Not ...., quote="\""
> 
> 
>   X.. X.X1.                                           X.X2.            X.X3.
> 1 "1"   "1"                                    "string one" "another string"
> 2 "2"   "2" "quotes escaped 10' 20"" 5' 30"" ""test string"   "final string"
> 3 "3"   "3"                                     "third row"    "last "" col"
> 
> You will then be depending entirely on commas to separate. 
> 
> (Needed to use escaped single quotes to illustrate from a command line.)
> 
> > 
> > # this breaks read.csv:
> > 
> > > read.csv("test.csv")
> >  X X1                                                                                    X2             X3
> > 1 1  1                                                                            string one another string
> > 2 2  2 quotes escaped 10' 20\\ 5' 30\\ \\test string,final string\n3,3,third row,last \\ col      
> > 
> > # we now have only two rows, with all the data captured in col2 row2
> > 
> > Any suggestions on how to fix this behavior? I've tried fiddling with quote="\"" to no avail, obviously. Interestingly, an even number of escaped quotes within a field is loaded correctly, which certainly threw me for a while!
> > 
> > Thank you in advance, 
> > Tim
> > 
> > 
> 
> David Winsemius
> Alameda, CA, USA
> 

David Winsemius
Alameda, CA, USA


