[Rd] what do you think about write.table(... qmethod = "excel")?

Paul Johnson pauljohn32 at gmail.com
Wed Sep 20 18:53:05 CEST 2017


On Tue, Sep 19, 2017 at 4:45 PM, Duncan Murdoch
<murdoch.duncan at gmail.com> wrote:
>
>
> That's true, but if that's what they want to do, and they're willing to pay
> to be able to write files that imitate Excel, then why not do what they ask?
>
> On the other hand, if they aren't willing to pay for the work, then you
> should lecture them on how silly their request is.
>
> In any case, base R functions should not include nonsense, so this is not
> something that should go into R.
>
> Duncan Murdoch
>

I understand.  This is a paying client, I'm going where the money goes.

Here's my current working example of a function that writes a CSV
exactly as Excel does. I've posted this into StackOverflow
(https://stackoverflow.com/questions/25743018/how-to-conditionally-remove-quotes-in-write-csv).
It is buried under a thread, maybe will not get much attention and
bug-checking. Oh, well, I can hope. Processes variables
column-by-column, I don't know how to do it differently.



##' Write CSV files with quotes same as MS Excel 2013 or newer
##'
##' R's write.csv inserts quotes around all elements in a character
##' vector (if quote = TRUE).  In contrast, MS Excel CSV export no
##' longer inserts quotation marks on all elements in character
##' variables, except when the cells include commas or quotation
##' marks.  This function generates CSV files that are, so far as we
##' know, exactly the same "quoted style" as MS Excel CSV export
##' files.
##'
##' This works by manually inserting quotation marks where necessary and
##' turning FALSE R's own method to insert quotation marks.
##' @param x a data frame
##' @param file character string for file name
##' @param row.names Default FALSE for row.names
##' @importFrom utils write.table
##' @return the return from write.table, using revised quotes
##' @export
##' @author Paul Johnson
##' @examples
##' set.seed(234)
##' x1 <- data.frame(x1 = c("a", "b,c", "b", "The \"Washington, DC\""),
##'       x2 = rnorm(4), stringsAsFactors = FALSE)
##' x1
##' fn <- tempfile(pattern = "testcsv", fileext = ".csv")
##' writeCSV(x1, file = fn)
##' readLines(fn)
##' x2 <- read.table(fn, sep = ",", header = TRUE, stringsAsFactors = FALSE)
##' all.equal(x1,x2)
writeCSV <- function(x, file, row.names = FALSE){
    xischar <- colnames(x)[sapply(x, is.character)]
    for(jj in xischar){
        x[ , jj] <- gsub('"', '""', x[ , jj], fixed = TRUE)
        needsquotes <- grep('[\",]', x[ ,jj])
        x[needsquotes, jj] <- paste0("\"", x[needsquotes, jj], "\"")
    }
    write.table(x, file = file, sep = ",", quote = FALSE,
                row.names = row.names)
}

Output:

>  set.seed(234)
>  x1 <- data.frame(x1 = c("a", "b,c", "b", "The \"Washington, DC\""),
+        x2 = rnorm(4), stringsAsFactors = FALSE)
>  x1
                    x1         x2
1                    a  0.6607697
2                  b,c -2.0529830
3                    b -1.4992061
4 The "Washington, DC"  1.4712331
>  fn <- tempfile(pattern = "testcsv", fileext = ".csv")
>  writeCSV(x1, file = fn)
>  readLines(fn)
[1] "x1,x2"
[2] "a,0.660769736644892"
[3] "\"b,c\",-2.052983003941"
[4] "b,-1.49920605110092"
[5] "\"The \"\"Washington, DC\"\"\",1.4712331168047"
>  x2 <- read.table(fn, sep = ",", header = TRUE, stringsAsFactors = FALSE)
>  all.equal(x1,x2)
[1] TRUE

I already see one problem, that I've got no special arrangement for
column names with commas or quotes. People who want column names with
those things are even more wrong than the people write a parser that
can't understand quotes on character variables.

-- 
Paul E. Johnson   http://pj.freefaculty.org
Director, Center for Research Methods and Data Analysis http://crmda.ku.edu

To write to me directly, please address me at pauljohn at ku.edu.



More information about the R-devel mailing list