[R] Pre-model Variable Reduction
Frank E Harrell Jr
f.harrell at vanderbilt.edu
Tue Dec 9 14:08:31 CET 2008
> Hello All,
> I am trying to carry out variable reduction. I do not have information
> about the dependent variable, and have only the X variables as it
> In selecting variables I wish to keep, I have considered the following criteria.
> 1) Percentage of missing values in each column/variable
> 2) Variance of each variable, with a cut-off value.
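The two screening criteria above can be sketched in base R. This is only a minimal illustration: the toy data frame, its column names, and both cut-off values are made up, and raw variance depends on each variable's units, so a single cut-off only makes sense for variables on comparable scales.

```r
# Hypothetical toy data; column names and cut-offs are illustrative only.
set.seed(1)
X <- data.frame(
  a = rnorm(100),
  b = c(rep(NA, 60), rnorm(40)),   # 60% missing
  c = rep(5, 100),                 # constant, so zero variance
  s = factor(sample(c("TN", "NY", "CA"), 100, replace = TRUE))
)

max_missing  <- 0.5    # assumed cut-off: drop columns with > 50% missing
min_variance <- 1e-8   # assumed cut-off, applied to numeric columns only

# Criterion 1: fraction of missing values per column
frac_missing <- colMeans(is.na(X))

# Criterion 2: variance of each numeric column (factors are not screened here)
is_num  <- vapply(X, is.numeric, logical(1))
col_var <- vapply(X[is_num], var, numeric(1), na.rm = TRUE)
low_var <- names(col_var)[is.na(col_var) | col_var < min_variance]

# Keep columns that pass both screens
keep <- setdiff(names(X)[frac_missing <= max_missing], low_var)
X_reduced <- X[, keep, drop = FALSE]
names(X_reduced)
```

Categorical columns pass the variance screen untouched in this sketch; a screen on level frequencies would be a separate choice.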
> I recently came across Weka and found that there is an RWeka package
> which would allow me to make use of Weka through R.
> Weka provides a "Genetic search" variable reduction method, but I
> could not find its R code implementation in the RWeka Pdf file on
> I looked for other R packages that allow me to do variable reduction
> without considering a dependent variable. I came across 'dprep'
> package but it does not have a Windows implementation.
> Moreover, I have a dataset that contains continuous and categorical
> variables, some categorical variables having 3 levels, 10 levels and
> so on, up to a maximum of 50 levels (e.g. states in the USA).
> Any suggestions in this regard will be much appreciated.
> Thank you
> Harsh Singhal
> Decision Systems,
> Mu Sigma, Inc.
Take a look at the redun function in the Hmisc package, which does
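
A minimal sketch of calling redun, assuming the Hmisc package is installed. According to the Hmisc documentation, redun needs no dependent variable: it checks how well each variable can be predicted from the remaining ones and drops redundant variables in a stepwise fashion. The simulated data, the variable names, and the r2 = 0.9 threshold below are illustrative assumptions, not recommendations.

```r
library(Hmisc)

# Hypothetical simulated data: x3 is almost fully determined by x1 and x2
set.seed(2)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + x2 + rnorm(n, sd = 0.05)
g  <- factor(sample(c("a", "b", "c"), n, replace = TRUE))  # categorical predictor
d  <- data.frame(x1, x2, x3, g)

# One-sided formula: only X variables are listed.
# r2 is the R^2 above which a variable counts as redundant;
# nk is the number of spline knots used for continuous predictors.
r <- redun(~ x1 + x2 + x3 + g, data = d, r2 = 0.9, nk = 3)

print(r)   # summary of the redundancy analysis
r$In       # variables retained
r$Out      # variables judged redundant
```

Since x1, x2 and x3 are linearly tied together here, which of the three is dropped depends on the fitted R^2 values; once one is removed, the other two should no longer be predictable from the rest.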
Frank E Harrell Jr
Professor and Chair, Department of Biostatistics
School of Medicine, Vanderbilt University