[R] RWeka read.arff()

Christian Schulz ozric at web.de
Mon May 26 08:46:48 CEST 2003


Dear R-project and Weka users,

here is my first attempt at importing
ARFF files from the Weka machine learning package
into R.

http://www.cs.waikato.ac.nz/ml/weka/

In addition, Ko-Kang Kevin Wang
has started a web page for RWeka:

http://www.stat.auckland.ac.nz/~kwan022/RWeka/


Perhaps a regular expression expert knows
how to extract the second of three text blocks
that are separated by tabs or spaces?
The regular expression I use for the labels could certainly be better.

Many thanks & regards,
Christian


For example, the upper-cased attribute lines of iris.arff are:

[1] "@ATTRIBUTE SEPALLENGTH REAL"
[2] "@ATTRIBUTE SEPALWIDTH REAL"
[3] "@ATTRIBUTE PETALLENGTH REAL"
[4] "@ATTRIBUTE PETALWIDTH REAL"
[5] "@ATTRIBUTE CLASS {IRIS-SETOSA,IRIS-VERSICOLOR,IRIS-VIRGINICA}"
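
A minimal sketch of two ways to pull out the second block from lines like these. The vector name attr2 mirrors the function further down; labels_sub and labels_split are names made up for this illustration, and one separator has been changed to a tab to cover the tab-or-space case. Only base R functions are used.

```r
# Attribute lines in the style printed above; the second one uses a tab
# as its last separator (changed here purely for illustration)
attr2 <- c(
  "@ATTRIBUTE SEPALLENGTH REAL",
  "@ATTRIBUTE SEPALWIDTH\tREAL",
  "@ATTRIBUTE CLASS {IRIS-SETOSA,IRIS-VERSICOLOR,IRIS-VIRGINICA}"
)

# Variant 1: skip the first block and its separator, capture the second
# block, and replace the whole line with the captured group
labels_sub <- sub("^[^ \t]+[ \t]+([^ \t]+).*$", "\\1", attr2)

# Variant 2: split on runs of spaces/tabs and take the second piece
labels_split <- vapply(strsplit(attr2, "[ \t]+"),
                       function(p) p[2], character(1))

print(labels_sub)
print(labels_split)
```

Both variants assume that attribute names contain no spaces; quoted names with embedded blanks would need extra handling.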


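read.arff <- function(file = "", header = TRUE, ...) {
  # Read all lines; match keywords on an upper-cased copy
  # (header and ... are accepted but not used yet)
  x <- readLines(file)
  y <- toupper(x)
  # Line number of the @DATA marker; the data rows start right after it
  s <- grep("^@DATA", y)[1]
  data <- read.table(file = file, sep = ",", comment.char = "%", skip = s)
  # Attribute declarations have the form "@ATTRIBUTE <name> <type>"
  attr2 <- grep("^@ATTRIBUTE", y, value = TRUE)
  # Keep only the second tab- or space-separated block (the attribute name)
  labels <- sub("^@ATTRIBUTE[ \t]+([^ \t]+).*$", "\\1", attr2)
  names(data) <- labels
  return(data)
}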


[tests]
xyz <- read.arff("c:/project/Rweka/labor.arff")
str(xyz)
xyz <- read.arff("c:/project/Rweka/iris.arff")
str(xyz)
xyz <- read.arff("c:/project/Rweka/cpu.arff")
str(xyz)
xyz <- read.arff("c:/project/Rweka/adult.arff")
str(xyz)
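
The tests above depend on local copies of the Weka example files. As a rough, self-contained check, the same steps can be run inline on a tiny made-up file written to a temporary location. The file contents, the names tmp and toy, and the na.strings = "?" argument (ARFF marks missing values with "?") are assumptions added for this sketch and are not part of the function above.

```r
# Write a tiny, made-up ARFF file to a temporary location
tmp <- tempfile(fileext = ".arff")
writeLines(c(
  "% toy example, not a real Weka dataset",
  "@relation toy",
  "@attribute width real",
  "@attribute height real",
  "@attribute class {a,b}",
  "@data",
  "1.0,2.0,a",
  "3.5,?,b"
), tmp)

# Same steps as in read.arff, done inline
x <- readLines(tmp)
y <- toupper(x)
s <- grep("^@DATA", y)[1]   # line number of the @DATA marker

# Skip everything up to and including @DATA; treat "?" as missing
toy <- read.table(tmp, sep = ",", comment.char = "%",
                  skip = s, na.strings = "?")

# Column names come from the second block of each @ATTRIBUTE line
names(toy) <- sub("^@ATTRIBUTE[ \t]+([^ \t]+).*$", "\\1",
                  grep("^@ATTRIBUTE", y, value = TRUE))

str(toy)
unlink(tmp)
```

Without na.strings = "?", a numeric column containing "?" would be read as text rather than as numbers with NA.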



