[R] Rare Cases and SOM

Manuel Gutierrez manuel_gutierrez_lopez at yahoo.es
Fri Feb 4 12:06:43 CET 2005


I am trying to understand how the SOM algorithm works,
using the SOM function from library(class).
I have a 1000*10 matrix and I want to be able to
summarize the different types of 10-element vectors (rows).
In my real-world case it is likely that most of the
1000 rows are of one kind and the rest of another (this is
an oversimplification).
Say for example:

InputA<-matrix(cos(1:10),nrow=900,ncol=10,byrow=TRUE)
InputB<-matrix(sin(5:14),nrow=100,ncol=10,byrow=TRUE)
Input<-rbind(InputA,InputB)

I thought that a small 3*3 grid would be enough to
extract the patterns from such a simple matrix:
library(class)
GridWidth <- 3
GridLength <- 3
gr <- somgrid(xdim = GridWidth, ydim = GridLength, topo = "hexagonal")
test.som <- SOM(Input, gr)
# One panel per grid unit, showing its reference vector
par(mfrow = c(GridLength, GridWidth))
for (i in 1:(GridWidth * GridLength))
  plot(test.som$codes[i, ], type = "l")
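
Besides plotting the reference vectors, it may help to see which grid unit each row is assigned to. The following is only a sketch: it rebuilds the example data, fits the same 3*3 map, and uses knn1() from library(class) to find the nearest reference vector for each row. The names pattern, unit and tab are made up for illustration, and the seed is arbitrary.

```r
library(class)

set.seed(1)  # arbitrary seed, only so the run is repeatable

# Same toy data as above: 900 "cos" rows and 100 "sin" rows
InputA <- matrix(cos(1:10), nrow = 900, ncol = 10, byrow = TRUE)
InputB <- matrix(sin(5:14), nrow = 100, ncol = 10, byrow = TRUE)
Input  <- rbind(InputA, InputB)
pattern <- rep(c("cos", "sin"), times = c(900, 100))  # illustrative labels

gr <- somgrid(xdim = 3, ydim = 3, topo = "hexagonal")
test.som <- SOM(Input, gr)

# Index (1..9) of the closest reference vector for every row
unit <- as.integer(knn1(test.som$codes, Input, factor(1:9)))

# Rows: pattern; columns: grid unit the rows were mapped to
tab <- table(pattern, unit)
print(tab)
```

If the sin rows share all their units with cos rows in this table, the map has not set aside a unit for the rare pattern.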

Only when I use a larger grid (for example 7*3) do I
get some representatives for the sin pattern.
I suspect this has to do with the initialization
of the grid: because the sin pattern is so rare, it is
unlikely to be chosen as an initial reference vector.
And because the rows used for training are also selected
at random, the sin rows are unlikely to be picked then either.
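
The two suspicions above can be put into numbers with a small sketch. It assumes what the help page for SOM describes: when init is missing, the initial reference vectors are sampled from the data without replacement, and rlen defaults to 10000. It further assumes that each training step draws one row uniformly at random. All variable names here are made up for illustration.

```r
n_common <- 900   # rows with the cos pattern
n_rare   <- 100   # rows with the sin pattern

# Probability that random initialization (sampling without replacement)
# picks no rare row at all, for a 3*3 and a 7*3 grid.
# dhyper(x, m, n, k): x rare rows among k draws, with m rare and n common rows.
p_no_rare_3x3 <- dhyper(0, n_rare, n_common, 3 * 3)
p_no_rare_7x3 <- dhyper(0, n_rare, n_common, 7 * 3)

# Expected number of training steps that present a rare row,
# assuming uniform random draws over rlen steps
rlen <- 10000
expected_rare_draws <- rlen * n_rare / (n_common + n_rare)

print(c(p_no_rare_3x3 = p_no_rare_3x3,
        p_no_rare_7x3 = p_no_rare_7x3,
        expected_rare_draws = expected_rare_draws))
```

Under these assumptions the chance of starting with no sin row is about 0.39 for the 3*3 grid and about 0.11 for the 7*3 grid, and roughly one training step in ten presents a sin row. So initialization alone may not explain the result, and the neighbourhood settings could be worth a look as well.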
I've also been trying to modify some of the other
parameters of SOM, but I would appreciate
some input to keep me going until I receive the
reference books from my bookstore.
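
For reference, this is a sketch of how the other SOM arguments (rlen, alpha, radii) can be set explicitly. The radii schedule below is an arbitrary choice for illustration, not a recommendation: it starts smaller than the default and ends below 1, so that late in training only the winning unit should be updated. Whether this recovers the sin pattern is not guaranteed and may depend on the seed. The names fit, sin_row and d_sin are made up.

```r
library(class)

set.seed(2)  # arbitrary seed, only so the run is repeatable

InputA <- matrix(cos(1:10), nrow = 900, ncol = 10, byrow = TRUE)
InputB <- matrix(sin(5:14), nrow = 100, ncol = 10, byrow = TRUE)
Input  <- rbind(InputA, InputB)

gr   <- somgrid(xdim = 3, ydim = 3, topo = "hexagonal")
rlen <- 10000

# Illustrative settings: default-like learning rate, smaller neighbourhood
fit <- SOM(Input, gr, rlen = rlen,
           alpha = seq(0.05, 0, length.out = rlen),
           radii = seq(2, 0.5, length.out = rlen))

# Euclidean distance from each reference vector to the sin pattern;
# a value near 0 means that unit represents the rare rows
sin_row <- sin(5:14)
d_sin <- sqrt(rowSums(sweep(fit$codes, 2, sin_row)^2))
print(round(d_sin, 3))
```

Comparing d_sin across a few seeds and radii schedules gives a rough idea of how sensitive the map is to these settings.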

Are my suspicions right?
Should I be using the SOM for my study or should I
look somewhere else?
NOTE: I have no prior knowledge of whether the
datasets I want to analyse will have rare cases or not
or where they will be located.
Thanks,
Manuel



