[R] SVM Param Tuning with using SNOW package

raluca ucagui at hotmail.com
Wed Nov 18 13:09:34 CET 2009


Hi Charlie,


Yes, you are perfectly right, when I make the clusters I should put 2, not
10 (it remained 10 from previous trials with 10 slaves).

cl<- makeCluster(2, type="SOCK" ) 
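
Rather than hard-coding the node count, it could be read from the machine. This is only a minimal sketch, assuming the base parallel package is available (it is not used in the code in this message); n.nodes is a name introduced here for illustration.

```r
library(parallel)  # assumption: base R's parallel package is installed

# detectCores() reports the number of CPU cores, or NA if it cannot tell
n.nodes <- detectCores()

# Fall back to the two CPUs mentioned above when the count is unknown
if (is.na(n.nodes)) n.nodes <- 2

# The value would then replace the literal 2, e.g.
# cl <- makeCluster(n.nodes, type = "SOCK")
print(n.nodes)
```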

To tell the truth, I do not really understand what the 2nd parameter of
clusterApplyLB() has to be.

If the function sv.lin has just one parameter, sv.lin(c), where c is the
cost, how should I call clusterApplyLB?


 ? clusterApplyLB(cl, ?, sv.lin, c=cost1)  ?
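
For reference, the documented signature is clusterApplyLB(cl, x, fun, ...): x is the vector or list to distribute, each element of x is passed in turn as the first argument of fun, and anything in ... is passed unchanged to every call. The sketch below shows that calling convention with lapply as a serial stand-in, so no cluster is needed; toy.fun and its scale argument are hypothetical names used only for illustration.

```r
# Hypothetical function: its first argument receives one element of x
toy.fun <- function(cost, scale) cost * scale

cost1 <- seq(0.5, 30, length = 10)

# Serial equivalent of clusterApplyLB(cl, cost1, toy.fun, scale = 2):
# each element of cost1 becomes `cost`; scale = 2 goes to every call
res <- lapply(cost1, toy.fun, scale = 2)

length(res)  # one result per element of cost1
```

Read this way, the second argument for sv.lin would presumably be the vector of costs, with the function receiving a single cost per call.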



Below, I am providing a working example, using the gasoline data that comes
in the pls package.

Thank you for your time! 


library(e1071)
library(snow)
library(pls)

data(gasoline)

X=gasoline$NIR
Y=gasoline$octane

NR=10
cost1=seq(0.5,30, length=NR)


sv.lin <- function(c) {

  for (i in 1:NR) {

    ind=sample(1:60,50)
    gTest  <- data.frame(Y=I(Y[-ind]), X=I(X[-ind,]))
    gTrain <- data.frame(Y=I(Y[ind]), X=I(X[ind,]))

    svm.lin     <- svm(gTrain$X, gTrain$Y, kernel="linear", cost=c[i], cross=5)
    results.lin <- predict(svm.lin, gTest$X)

    e.test.lin  <- sqrt(sum((results.lin-gTest$Y)^2)/length(gTest$Y))

    return(e.test.lin)
  }
}


cl<- makeCluster(2, type="SOCK" ) 


clusterEvalQ(cl,library(e1071))


clusterExport(cl,c("NR","Y","X")) 


RMSEP<-clusterApplyLB(cl,?,sv.lin,c=cost1)

stopCluster(cl)
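
One way the ? placeholder could be filled in is sketched below. It rests on an assumption: the function is rewritten to take a single cost value, so that clusterApplyLB can hand one cost to each call. sv.lin.one is a hypothetical name for that variant, and the cross=5 argument is dropped here for brevity; this is a sketch, not a tested solution to the original problem.

```r
library(e1071)
library(snow)
library(pls)

data(gasoline)
X <- unclass(gasoline$NIR)   # plain numeric matrix, 60 rows
Y <- gasoline$octane

cost1 <- seq(0.5, 30, length = 10)

# Hypothetical single-cost variant of sv.lin: one random 50/10
# train/test split, returning the test RMSEP for the given cost
sv.lin.one <- function(cost) {
  ind  <- sample(1:60, 50)
  fit  <- svm(X[ind, ], Y[ind], kernel = "linear", cost = cost)
  pred <- predict(fit, X[-ind, ])
  sqrt(mean((pred - Y[-ind])^2))
}

cl <- makeCluster(2, type = "SOCK")
clusterEvalQ(cl, library(e1071))   # load e1071 on every worker
clusterExport(cl, c("X", "Y"))     # copy the data to every worker

# cost1 is the second argument: one task per cost value, each handed
# to whichever worker is free
RMSEP <- unlist(clusterApplyLB(cl, cost1, sv.lin.one))

stopCluster(cl)
```

With this layout the ten cost values are spread over the two workers instead of each worker looping over all of them.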





cls59 wrote:
> 
> 
> raluca wrote:
>> 
>> Hello,
>> 
>> This is the first time I am using the SNOW package, and I am trying to
>> tune the cost parameter for a linear SVM, where the cost (variable cost1)
>> takes 10 values between 0.5 and 30.
>> 
>> I have a large dataset and a pc which is not very powerful, so I need to
>> tune the parameters using both CPUs of the pc.
>> 
>> Somehow I cannot manage to do it. It seems that both CPUs are fitting the
>> model for the same values of cost1, I guess the first 5, but not for the
>> last 5.
>> 
>> Please, can anyone help me?
>> 
>> Here is the code:  
>> 
>> data <- data.frame(Y=I(Y),X=I(X))
>> data.X<-data$X
>> data.Y<-data$Y
>> 
>> 
> 
> 
> Helping you will be difficult as we're only three lines into your example
> and already I have no idea what the data you are using looks like. 
> Example code needs to be fully reproducible-- that means a small slice of
> representative data needs to be provided or faked using an appropriate
> random number generator.  
> 
> Some things did jump out at me about your approach and I've made some
> notes below.
> 
> 
> 
> raluca wrote:
>> 
>> NR=10
>> cost1=seq(0.5,30, length=NR)
>> 
>> sv.lin<- function(cl,c) {
>> 
>> for (i in 1:NR) {
>> 
>> ind=sample(1:414,276)
>> 
>> hogTest<-  data.frame(Y=I(data.Y[-ind]),X=I(data.X[-ind,])) 
>> hogTrain<- data.frame(Y=I(data.Y[ind]),X=I(data.X[ind,])) 
>> 
>> svm.lin <- svm(hogTrain$X,hogTrain$Y, kernel="linear",cost=c[i],
>>                cross=5)
>> results.lin   <- predict(svm.lin, hogTest$X)
>> 
>> e.test.lin     <- sqrt(sum((results.lin-hogTest$Y)^2)/length(hogTest$Y))
>> 
>> return(e.test.lin)
>> }
>> }
>> 
>> cl<- makeCluster(10, type="SOCK" ) 
>> 
> 
> 
> If your machine has two cores, why are you setting up a cluster with 10
> nodes?  Usually the number of nodes should equal the number of cores on
> your machine in order to keep things efficient.
> 
> 
> 
> raluca wrote:
>> 
>> 
>> clusterEvalQ(cl,library(e1071))
>> 
>> clusterExport(cl,c("data.X","data.Y","NR","cost1")) 
>> 
>> RMSEP<-clusterApplyLB(cl,cost1,sv.lin)
>> 
> 
> 
> Are you sure this evaluation even produces results? sv.lin() is a function
> you defined above that takes two parameters-- "cl" and "c".
> clusterApplyLB() will feed values of cost1 into sv.lin() for the argument
> "cl", but it has nothing to give for "c".  At the very least, it seems
> like you would need something like:
> 
>   RMSEP <- clusterApplyLB( cl, cost1, sv.lin, c = someVector )
> 
> 
> 
> raluca wrote:
>> 
>> 
>> stopCluster(cl)
>> 
>> 
> 
> 
> Sorry I can't be very helpful, but with no data and no apparent way to
> legally call sv.lin() the way you have it set up, I can't investigate the
> problem to see if I get the same results you described.  If you could
> provide a complete working example, then there's a better chance that
> someone on this list will be able to help you.
> 
> Good luck!
> 
> -Charlie
> 




