[R] Salient feature selection

Andy Weller weller at erdw.ethz.ch
Mon Jul 2 17:17:12 CEST 2007


I am relatively new to R. I am hoping that someone will be able to point 
me in the right direction and/or suggest a technique/package/reference 
that will help me with the following. I have:

a) Some explanatory variables (integer and real-valued) - these are 
"real world" physical descriptions, e.g. counts of features

b) Some response variables (integer and real-valued) - these are image 
analysis measurements (gray-value distributions, textural descriptors, 
etc.) of the same objects described in (a)

and I want to find out which variables from the two sets correlate best 
with each other - i.e. the salient features from BOTH sets (this is not 
for classification purposes).

For example, if (a) has 10 explanatory variables and (b) has 10 response 
variables, I want to test the complete set of explanatory variables 
against each individual response (or vice versa). So, explanatory 1-10 
with response 1, explanatory 1-10 with response 2, explanatory 1-10 with 
response 3, etc.
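
To make the comparison above concrete, here is a minimal sketch of the 
simplest version of it: every explanatory variable is paired with every 
response variable and the pairs are ranked by the absolute value of 
their Pearson correlation. It is written in Python using only the 
standard library, purely as an illustration. The variable names and the 
numbers are made up, and a pairwise screen like this is an assumption 
about what would be useful; it does not test the whole explanatory set 
jointly against a response, and it gives no confidence level.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rank_pairs(explanatory, response):
    """Return (|r|, r, explanatory name, response name), strongest first."""
    pairs = []
    for xname, x in explanatory.items():
        for yname, y in response.items():
            r = pearson(x, y)
            pairs.append((abs(r), r, xname, yname))
    return sorted(pairs, reverse=True)


# Hypothetical data: five objects, two variables in each set.
explanatory = {
    "count_a": [1, 2, 3, 4, 5],
    "count_b": [2, 1, 4, 3, 5],
}
response = {
    "mean_gray": [2.0, 4.1, 5.9, 8.2, 9.9],
    "texture": [5.0, 3.0, 4.0, 1.0, 2.0],
}

ranked = rank_pairs(explanatory, response)
for _, r, xname, yname in ranked:
    print(f"{xname} vs {yname}: r = {r:+.3f}")
```

Ranking by absolute value keeps strong negative relationships near the 
top of the list alongside strong positive ones.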

This should ultimately tell me which "real world" physical features are 
most strongly related to the image analysis measurements, together with 
a confidence level for each relationship.

I hope this makes sense?

I have used SPSS AnswerTree's "Exhaustive CHAID" before to select a 
subset of input features for a complete set of output features to aid 
the creation of artificial neural networks. I want to do a similar 
thing, but it is not important that ALL explanatory and response 
variables be used/selected.

I hope that I have been clear in my intentions and I look forward to 
your replies, Andy


