[R] Warning: as.numeric reorders factor data

Frank E Harrell Jr fharrell at virginia.edu
Sun Dec 8 16:07:03 CET 2002


On Sun, 08 Dec 2002 10:03:54 -0500
Bud Gibson <fpgibson at umich.edu> wrote:

> Recently, I was using aggregate() to develop averages by trial for an 
> experiment I was running.  Trials were indicated as ordinal numbers for 
> each subject.  aggregate() turned trial into factors during the 
> aggregation process.  I then wanted to create a scatter plot of subject 
> performance by trial, so I applied as.numeric to the (now) factor 
> variable trial.  as.numeric reordered the trial indicator creating some 
> (at first) incomprehensible results.
> 
> Investigation revealed that aggregate must first be interpreting trial 
> as a character and then turning it into a factor.  The behavior I 
> observed is reproducible from the following transcript using R 1.6.1 on 
> Red Hat Linux 7.3.
> 
>  > test <- as.factor(as.character(c(1,2,3,4,5,6,7,8,9,10,11)))
>  > test
>   [1] 1  2  3  4  5  6  7  8  9  10 11
> Levels: 1 10 11 2 3 4 5 6 7 8 9
>  > as.numeric(test)
>   [1]  1  4  5  6  7  8  9 10 11  2  3
> 
> It strikes me that as.numeric should *never* reorder the vector it is 
> working on.  There is this workaround for the problem:
> 
>  > as.numeric(as.character(test))
>   [1]  1  2  3  4  5  6  7  8  9 10 11
> 
> However, I should not have to know about the internals of aggregate to 
> be able to use its results.
> 
> Bud Gibson
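
For reference, the transcript above can be reproduced with the minimal base R sketch below. It is an illustration only, and the variable names codes, values and values2 are made up for it. as.numeric() does not reorder a factor. It returns each element's internal integer code, which is the position of its label in levels(test), and those levels were sorted as character strings ("1", "10", "11", "2", ...).

```r
# Same construction as in the quoted transcript: the levels are sorted as strings.
test <- as.factor(as.character(1:11))

# as.numeric() on a factor returns the internal integer codes,
# i.e. each element's position in levels(test), not the label it prints as.
codes <- as.numeric(test)

# To recover the labels as numbers, convert the levels once and index by the factor.
values <- as.numeric(levels(test))[test]

# Equivalent to the workaround in the quoted message.
values2 <- as.numeric(as.character(test))

print(codes)
print(values)
```

Indexing the converted levels by the factor converts each distinct level only once, which may matter for long vectors; as.numeric(as.character(test)) gives the same values.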

One reason the summarize function in the Hmisc library (http://hesweb1.med.virginia.edu/biostat/s/Hmisc.html) exists is that it preserves the nature of the stratification variables.  summarize produces data frames that are like the original data, except that the response variables are replaced by scalar or vector statistical summaries.
-- 
Frank E Harrell Jr              Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine  http://hesweb1.med.virginia.edu/biostat



