[Rd] Improvements to write.arff (PR#12574)

martin at martincmartin.com martin at martincmartin.com
Wed Aug 20 22:45:10 CEST 2008


Full_Name: Martin C. Martin
Version: 2.7.1
OS: Ubuntu
Submission from: (NULL) (75.150.115.86)


The function write.arff, in the foreign library:

- Can produce relation names with invalid characters
- Doesn't use colnames() for attribute names when writing a matrix.

Here's a better version:

write.arff <- function (x, file, eol = "\n") 
{
    if (file == "") 
        file <- stdout()
    else if (is.character(file)) {
        file <- file(file, "w")
        on.exit(close(file))
    }
    if (!inherits(file, "connection")) 
        stop("Argument 'file' must be a character string or connection.")
    if (!is.data.frame(x) && !is.matrix(x)) 
        x <- data.frame(x)
    squote <- function(s) {
        ifelse(is.na(s), s, sprintf("'%s'", gsub("(['\\])", "\\\\\\1", 
            s)))
    }
    text <- paste('@relation "', make.names(deparse(substitute(x))), '"',
sep='')
    writeLines(text, file, sep = eol)
    for (name in colnames(x)) {
        text <- paste("@attribute", name)
        if (is.data.frame(x) && is.factor(x[[name]])) {
            lev <- squote(levels(x[[name]]))
            levels(x[[name]]) <- lev
            text <- paste(text, " {", paste(lev, collapse = ","), 
                "}", sep = "")
        }
        else if (is.character(x[,name])) {
            text <- paste(text, "string")
            x[,name] <- squote((x[,name]))
        }
        else if (inherits(x[,name], "POSIXt")) {
            text <- paste(text, "date \"yyyy-MM-dd hh:mm:ss\"")
            x[,name] <- squote(format(x[,name]))
        }
        else {
            text <- paste(text, "numeric")
        }
        writeLines(text, file, sep = eol)
    }
    writeLines("@data", file)
    write.table(x, file = file, na = "?", sep = ",", eol = eol, 
        quote = FALSE, row.names = FALSE, col.names = FALSE)
}



More information about the R-devel mailing list