[Rd] Relevel confusing with numeric value

peter dalgaard pdalgd using gmail.com
Tue Oct 2 18:22:03 CEST 2018


In a word, no. It is behaving as documented, and adding a warning would just confuse others who have been using the feature as intended.

This belongs in the same bin as "as.integer(f) vs. as.integer(as.character(f))" and "x[f] vs. x[as.character(f)]".
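
A minimal sketch of those cases, together with the relevel() call from the quoted message, using made-up objects f, x and h. In each pair, the first call works on the internal integer codes (or positions) and the second on the level labels:

```r
# Hypothetical data: a factor whose labels look like numbers
f <- factor(c("10", "20", "30", "20"))   # levels "10" "20" "30"
x <- c("30" = "thirty", "20" = "twenty", "10" = "ten")

as.integer(f)                 # codes:  1 2 3 2
as.integer(as.character(f))   # labels: 10 20 30 20

x[f]                 # by position 1, 2, 3, 2: thirty twenty ten twenty
x[as.character(f)]   # by name:                ten twenty thirty twenty

# relevel(): a numeric ref is a position, a character ref is a label
h <- factor(1:10, levels = -10:20)
levels(relevel(h, 15))[1]     # "4"  (the 15th level)
levels(relevel(h, "15"))[1]   # "15" (the level labelled "15")
```

Passing as.character(ref) is therefore the way to ask for a level by its label when the labels happen to look like numbers.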

-pd


> On 2 Oct 2018, at 17:18 , Emil Bode <emil.bode using dans.knaw.nl> wrote:
> 
> Something that bit me:
> The function relevel takes a factor and a reference level to be promoted to the first place.
> If “ref” is a character string, that level is promoted; if it is numeric, the “ref”-th level is promoted.
> This turns out to be very confusing if you have a factor with numeric-looking levels (e.g. when reading in a csv with some dirty numeric columns and stringsAsFactors = TRUE).
> For example:
> 
> set.seed(1)
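> # stringsAsFactors = TRUE is spelled out because it is no longer the default as of R 4.0.0
> test <- data.frame(n = sample(c(1:100, letters[1:10]), size = 90), stringsAsFactors = TRUE)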
> test$n <- relevel(test$n, 50)
> print(levels(test$n))
> 
> gives “62” as the first level.
> 
> Could we make something like this an error, or at least issue a warning?
> This matters all the more because some other functions coerce automatically: factor(…, levels=1:100) and levels(test$n) <- 1:100 work fine.
> So this is perhaps the most confusing case: relevel(factor(1:10, levels = -10:20), 15) gives “4” as the first level.
> 
> For now I’ve thought of two possible implementations that could be inserted in stats::relevel.factor(), just before is.character(ref):
> 
> if(is.numeric(ref) && ref %in% lev)
>    warning('Provided numeric reference, note that this will promote the ', ref, 'th value, not level with value "', ref, '"!')
> 
> or
> 
> if(is.numeric(ref) && any(!is.na(suppressWarnings(as.numeric(lev)))))
>    warning('Provided numeric reference, note that this will promote the ', ref, 'th value, not level with value "', ref, '"!')
> 
> 
> Best regards,
> Emil Bode
> 
> Data-analyst
> 
> +31 6 43 83 89 33
> emil.bode using dans.knaw.nl
> 
> DANS: Netherlands Institute for Permanent Access to Digital Research Resources
> Anna van Saksenlaan 51 | 2593 HW Den Haag | +31 70 349 44 50 | info using dans.knaw.nl | dans.knaw.nl
> DANS is an institute of the Dutch Academy KNAW (http://knaw.nl/nl) and funding organisation NWO (http://www.nwo.nl/).

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes using cbs.dk  Priv: PDalgd using gmail.com


