[R] Variable shortlisting for the logistic regression

useR milicic.marko at gmail.com
Thu Oct 16 21:22:26 CEST 2008


Let's try to bring this discussion back again, after Frank's
very funny remark!

What I'm doing at the moment is:

1. I split the dataset in two (development and holdout).
2. I fit a single-predictor logistic model for every variable and
collect the following stats:

DMaxDeriv=modelD$stats[2]
DModelLR=modelD$stats[3]
DP=modelD$stats[5]
DC=modelD$stats[6]
DDxy=modelD$stats[7]
DGamma=modelD$stats[8]
DTau=modelD$stats[9]
DR2=modelD$stats[10]
DBrier=modelD$stats[11]

HMaxDeriv=modelH$stats[2]
HModelLR=modelH$stats[3]
HP=modelH$stats[5]
HC=modelH$stats[6]
HDxy=modelH$stats[7]
HGamma=modelH$stats[8]
HTau=modelH$stats[9]
HR2=modelH$stats[10]
HBrier=modelH$stats[11]

where D is the prefix for stats on the development sample and H is the
prefix for stats derived from the holdout sample.



3. Now I keep the factors with Somers' D greater than 0.3 whose relative
change on the holdout sample is smaller than 5%.
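
For illustration, the three steps above could be sketched roughly as
follows. This is only a sketch under assumptions, not the code actually
used: it relies on base R's glm() instead of whatever fitting function
produced the $stats vectors above, it computes Somers' Dxy from the
rank-based concordance index, and it scores the development-sample fit
on the holdout sample rather than refitting there. The helper names
somers_dxy and screen_predictors, the 50/50 split and the simulated
data are all hypothetical.

```r
# Hypothetical helper: Somers' Dxy between a 0/1 outcome y and a score,
# obtained from the concordance index C as Dxy = 2 * (C - 0.5).
somers_dxy <- function(score, y) {
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  r  <- rank(score)  # midranks, so ties count as half-concordant
  c_index <- (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
  2 * (c_index - 0.5)
}

# Hypothetical helper: screen every column of `dat` except the outcome.
screen_predictors <- function(dat, outcome, min_dxy = 0.3,
                              max_rel_change = 0.05, dev_fraction = 0.5) {
  # Step 1: split into development and holdout samples (assumed random).
  dev_idx <- sample(nrow(dat), size = floor(dev_fraction * nrow(dat)))
  dev     <- dat[dev_idx, , drop = FALSE]
  holdout <- dat[-dev_idx, , drop = FALSE]

  predictors <- setdiff(names(dat), outcome)
  res <- lapply(predictors, function(v) {
    # Step 2: single-predictor logistic model on the development sample.
    fit <- glm(reformulate(v, response = outcome), data = dev,
               family = binomial)
    d_dxy <- somers_dxy(predict(fit, newdata = dev), dev[[outcome]])
    h_dxy <- somers_dxy(predict(fit, newdata = holdout), holdout[[outcome]])
    data.frame(variable = v, DDxy = d_dxy, HDxy = h_dxy,
               rel_change = abs(h_dxy - d_dxy) / abs(d_dxy))
  })
  res <- do.call(rbind, res)

  # Step 3: keep variables passing both thresholds.
  res$keep <- res$DDxy > min_dxy & res$rel_change < max_rel_change
  res
}

# Usage on simulated data: x1 is predictive, x2 is pure noise.
set.seed(1)
n <- 2000
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(1.5 * dat$x1))
res <- screen_predictors(dat, "y")
print(res)
```

Whether the 0.3 and 5% thresholds are sensible is exactly the open
question; the sketch only makes the procedure concrete.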


Any comments are very welcome.

On Oct 14, 2:48 pm, John Kane <jrkrid... at yahoo.ca> wrote:
> --- On Mon, 10/13/08, David Scott <d.sc... at auckland.ac.nz> wrote:
>
>
>
> > From: David Scott <d.sc... at auckland.ac.nz>
> > Subject: Re: [R] Variable shortlisting for the logistic regression
> > To: "Frank E Harrell Jr" <f.harr... at vanderbilt.edu>
> > Cc: r-h... at r-project.org
> > Received: Monday, October 13, 2008, 6:32 PM
> > On Mon, 13 Oct 2008, Frank E Harrell Jr wrote:
>
> > > useR wrote:
> > >> Hi R helpers,
>
> > >> One rather statistical question?
>
> > >> What would be the best strategy to shortlist thousands of
> > >> continuous variables automatically using R, as preparation
> > >> for logistic regression modeling?
>
> > >> Thanks
>
> > > The easiest approach is to use a random number generator.
> > > Frank
>
> > Got a laugh from me Frank!
>
> > Can I nominate it for a fortune?
>
> > David
>
> Seconded.
>
> ______________________________________________
> R-h... at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.