[R] interpretation of MDS plot in random forest

Massimo Bressan mbressan at arpa.veneto.it
Tue Dec 3 14:27:54 CET 2013


Here is an amended (more general) version:

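library(randomForest)
library(RColorBrewer)  # provides brewer.pal(), used below for the legend colours
set.seed(1)
data(iris)
iris.rf <- randomForest(Species ~ ., iris, proximity=TRUE, keep.forest=TRUE)

# MDSplot() draws the plot and invisibly returns the scaling result
x <- MDSplot(iris.rf, iris$Species)
# add legend, taking the class levels from the fitted forest
legend("topleft", legend=levels(iris.rf$predicted),
       fill=brewer.pal(length(levels(iris.rf$predicted)), "Set1"))
#str(x)
# need to identify points? label each one with its row name
text(x$points, labels=rownames(x$points), cex=0.5)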
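
On the earlier questions about the scaling coordinates and eigenvalues, here is a minimal sketch. It assumes (from my reading of the function, so please check the source) that MDSplot() runs classical MDS via cmdscale() on the dissimilarity 1 - proximity. In classical MDS the coordinates are eigenvectors of the double-centred squared-dissimilarity matrix, each scaled by the square root of its eigenvalue, so they are not eigenvectors of the proximity matrix itself. The name `mds` is just illustrative.

```r
library(randomForest)
set.seed(1)
iris.rf <- randomForest(Species ~ ., iris, proximity = TRUE)

# Assumption: this mirrors what MDSplot() does internally
mds <- cmdscale(1 - iris.rf$proximity, eig = TRUE, k = 2)

head(mds$points)  # scaling coordinates, one row per observation
head(mds$eig)     # leading eigenvalues from the scaling step
mds$GOF           # goodness of fit of the k = 2 representation
```

The two GOF values indicate how much of the structure the chosen k dimensions retain, which may help when deciding whether a larger k is worth plotting.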

bye

m


On 03/12/2013 12:15, mbressan at arpa.veneto.it wrote:
> sorry, in fact it was a trivial question!
>
> by just peeping into the function I've worked out this simple solution:
>
> library(RColorBrewer)  # needed for brewer.pal()
> MDSplot(iris.rf, iris$Species)
> legend("topleft", legend=levels(iris$Species), fill=brewer.pal(3, "Set1"))
>
> thank you
>
>> thanks andy
>>
>> it's a real honour for me to get a reply from you;
>> I'm still a bit far from a proper grasp of the purpose of the plot...
>>
>> may I ask you about a more technical (trivial) issue?
>> is it possible to add a legend to the MDS plot?
>> my problem is linking the point colours in the chart to the factor that was
>> used as the response to train the rf; how can I do that?
>>
>> best
>>
>> max
>>
>>> Yes, that's part of the intention anyway.  One can also use them to do
>>> clustering.
>>>
>>> Best,
>>> Andy
>>>
>>> -----Original Message-----
>>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>>> On Behalf Of Massimo Bressan
>>> Sent: Monday, December 02, 2013 6:34 AM
>>> To: r-help at r-project.org
>>> Subject: [R] interpretation of MDS plot in random forest
>>>
>>> Given this general example:
>>>
>>> set.seed(1)
>>>
>>> data(iris)
>>>
>>> iris.rf <- randomForest(Species ~ ., iris, proximity=TRUE,
>>> keep.forest=TRUE)
>>>
>>> #varImpPlot(iris.rf)
>>>
>>> #varUsed(iris.rf)
>>>
>>> MDSplot(iris.rf, iris$Species)
>>>
>>> I've been reading the documentation about random forest (to the best of my
>>> poor knowledge), but I'm having trouble with the correct interpretation of
>>> the MDS plot, and I hope someone can give me some clues.
>>>
>>> What is meant by "the scaling coordinates of the proximity matrix"?
>>>
>>>
>>> I think I understand that the objective here is to present the distance
>>> among species in a parsimonious, visual way (of lower dimensionality).
>>>
>>> Is there therefore a parallel with the principal components
>>> in a classical PCA?
>>>
>>> Are the scaling coordinates DIM1 and DIM2 the eigenvectors of the
>>> proximity matrix?
>>>
>>> If that is correct, how would you find the eigenvalues for those
>>> eigenvectors? And what do the eigenvalues represent?
>>>
>>>
>>> What do these two dimensions in the plot say about the different
>>> iris species? Their relative distance in terms of proximity within the
>>> space of DIM1 and DIM2?
>>>
>>> How should the k parameter (the number of dimensions for the scaling
>>> coordinates) be chosen?
>>>
>>> And finally how would you explain the plot in simple terms?
>>>
>>> Thank you for any feedback
>>> Best regards
>>>
>>
>
>


