[R] Factoring a variable

Steve Lianoglou mailinglist.honeypot at gmail.com
Thu Jun 17 22:57:36 CEST 2010


On Thu, Jun 17, 2010 at 3:45 PM, Noah Silverman <noahsilverman at ucla.edu> wrote:
> Hi,
> I have a dataset where the results are coded ("yes", "no").  We want to
> do some machine learning with SVM to predict the "yes" outcome.
> My problem is that if I just use the as.factor function to convert, then
> it reverses the levels.
> ----------------------
> x <- c("no", "no", "no", "yes", "yes", "no", "no")
>  as.factor(x)
> [1] no  no  no  yes yes no  no
> Levels: no yes
> ----------------------
> The SVM function (in the e1071 package) sees "no" as the first label and
> treats that as the positive outcome.
> The problem arises when we look at the decision values of the
> predictions.  Everything is gauged as values for "no".
> So, is there a way to force R to use my specified order when converting
> to factors?
> I've tried as.factor(x, levels=c("yes", "no")) but that throws errors
> about unused arguments.
> Any help?

Yes. Look more closely at the error message you're getting from your
call to `as.factor`, and at the help for as.factor: it has no "levels"
argument. `factor` does, so call that directly:

R> x <- c("no", "no", "no", "yes", "yes", "no", "no")
R> factor(x, levels=c('yes', 'no'))
[1] no  no  no  yes yes no  no
Levels: yes no
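
If it helps, here is a small sketch of two ways to end up with "yes" as
the first level, plus a check of the integer codes behind the factor.
The names f1 and f2 are just placeholders of mine, and I haven't checked
how any particular SVM routine uses the level order, so treat this as an
illustration of the factor mechanics only:

```r
x <- c("no", "no", "no", "yes", "yes", "no", "no")

# Option 1: fix the level order when the factor is created.
f1 <- factor(x, levels = c("yes", "no"))
levels(f1)

# Option 2: reorder an existing factor; relevel() moves the
# chosen reference level to the front.
f2 <- relevel(as.factor(x), ref = "yes")
levels(f2)

# The underlying integer codes follow the level order,
# so "yes" is now coded 1 and "no" is coded 2.
as.integer(f1)
```

Both approaches leave the data values untouched; only the order of the
levels (and therefore the integer codes) changes.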


Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
