[R] Questions on Random Forest

Wiener, Matthew matthew_wiener at merck.com
Mon Nov 24 17:10:42 CET 2003


It looks like img_and_label has only 2 columns, so when you take
img_and_label[,2] you have a vector left.  Even if that weren't the case,
you're going to need to pass in both the gray scale points and labels,
presumably in a data frame.  You've created an integer matrix below, so
you're just passing in an integer vector of labels.

You'll probably want something like

rf <- randomForest(label ~ image, data = image_and_label,
                   importance = TRUE, proximity = TRUE)

assuming that image_and_label is a data frame with elements image and label,
with label stored as a factor so that a classification forest is fit.
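
As a minimal sketch of that setup, the following builds such a data frame and
fits the forest. The data here are synthetic stand-ins (an assumption, not the
poster's bone.img), and proximity=TRUE is left out because the proximity
matrix is n by n, which would be very large for a full 512x512 image.

```r
library(randomForest)

set.seed(166)

# Synthetic stand-ins for the real gray levels and class labels (assumption).
n <- 600
label <- sample(0:2, n, replace = TRUE)
image <- round(rnorm(n, mean = c(50, 120, 200)[label + 1], sd = 10))

# The response must be a factor for classification; a numeric response
# would make randomForest fit a regression forest instead.
image_and_label <- data.frame(image = image, label = factor(label))

rf <- randomForest(label ~ image, data = image_and_label, importance = TRUE)
print(rf)
```

With the real data, image and label would be the two vectors read by readBin,
each of length 512*512.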


For the second question, see the documentation for the predict method for
random forests; for the third, the answer is yes, random forests can be used
with multiple variables.
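
A rough sketch of the prediction step, assuming a forest trained on a data
frame with columns image and label. The training data and the small 16x16
img3 below are made up for illustration; only randomForest() and predict()
come from the package.

```r
library(randomForest)

set.seed(166)

# Hypothetical training data: one gray-level predictor, three classes.
n <- 600
label <- sample(0:2, n, replace = TRUE)
train <- data.frame(
  image = rnorm(n, mean = c(50, 120, 200)[label + 1], sd = 10),
  label = factor(label)
)
rf <- randomForest(label ~ image, data = train)

# Hypothetical new image: a small 16x16 matrix of gray levels.
centers <- sample(c(50, 120, 200), 256, replace = TRUE)
img3 <- matrix(rnorm(256, mean = centers, sd = 10), nrow = 16)

# Column names in newdata must match the predictor names used in training.
newdata <- data.frame(image = as.vector(img3))
pred <- predict(rf, newdata = newdata)

# Put the predicted classes back into the image layout.
pred_img <- matrix(as.integer(as.character(pred)), nrow = nrow(img3))
```

For a color image the same pattern should carry over with one column per
channel and a formula such as label ~ red + green + blue (column names
assumed).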

There is an introduction to the randomForest package in volume 2, issue 3
of R News, the R newsletter (available in the documentation section of CRAN).

Hope this helps,

Matt Wiener

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Fucang Jia
Sent: Monday, November 24, 2003 10:31 AM
To: r-help at stat.math.ethz.ch
Subject: [R] Questions on Random Forest


Hi, everyone,

I am a newbie to R. I want to do image pixel classification with random
forest, but I do not have a clear understanding of random forests. Here are
some questions:

Take an image, for example of size 512x512, with only one variable,
the gray level. The histogram of the image looks like a Gaussian mixture
model, say with components (u1,sigma1), (u2,sigma2), (u3,sigma3). The image
has been classified by the K-means or EM algorithm, so the class label image
is also 512x512 and takes the values 0, 1, 2.

I read the binary image data as follows:

datafile <- file("bone.img","rb")
img <- readBin(datafile,size=2,what="integer",n=512*512,signed=FALSE)
img <- as.matrix(img)
close(datafile)

labelfile <- file("label.img","rb")
label <- readBin(labelfile,size=2,what="integer",n=512*512,signed=FALSE)
label <- as.matrix(label)
close(labelfile)

img_and_label <- c(img,label)  # binds the image data and class label
img_and_label <- as.matrix(img_and_label)
img_and_label <- array(img_and_label, dim=c(262144,2))


Random Forest needs a class label like "Species" in the iris data. I do not
know how to set a class label like "Species" for the img. So I ran the
command as follows:

set.seed(166)
rf <- randomForest(img_and_label[,2],data=image_and_label,importance=TRUE,
proximity=TRUE)

which outputs:

Error in if (n == 0) stop("data (x) has 0 rows") :
        argument is of length zero

Could anyone tell me what is wrong and how I can run the RF?

Secondly, if there is a new image, say img3 (its dimension is 512x512, too),
how can I use the former result to classify the new image?

Thirdly, can random forest work well if there is only one variable, say the
pixel gray level, or three variables, such as the red, green, and blue color
components of a true color image?

Thank you very much!

Best,

Fucang

========================================
Fucang Jia, Ph.D student
Institute of Computing Technology, Chinese Academy of Sciences
Post.Box 2704
Beijing, 100080
P.R.China
E-mail:fcjia at ict.ac.cn
