# as.numeric(<factor>) [Difference R/S]

Martin Maechler <maechler@stat.math.ethz.ch>
Tue, 20 Jan 1998 09:37:53 +0100

```
From R-core; this should interest most R-devel'ers (to some extent):

Since 0.60,  the semantics of   as.numeric(<factor>)  has changed,
e.g.

R> as.integer(factor(c("A","BB")))
[1] NA NA
R> as.integer(factor(c(100,40,100)))
[1] 100  40 100

whereas older R and S:

S> as.integer(factor(c("A","BB")))
[1] 1 2
S> as.integer(factor(c(100,40,100)))
[1] 2 1 2

-------------------------------------
as explained by Ross, below :

>>>>> "KH" == Kurt Hornik <Kurt.Hornik@ci.tuwien.ac.at> writes:

>>>>> Ross Ihaka writes:
KH>> From hornik@ci.tuwien.ac.at Mon Jan 19 22:52 NZD 1998 Subject:
KH>> Difference R/S
KH>>
KH>> Andreas just pointed me to the following:
KH>>
KH>> v <- as.factor(c("Age","Number","Age"))
KH>> as.numeric(v)
KH>>
KH>> gives
KH>>
KH>> [1] 1 2 1
KH>>
KH>> in S+ and
KH>>
KH>> [1] NA NA NA
KH>>
KH>> Bug/feature/intentional?
KH>>
KH>> Of course, R makes more sense because as.numeric("Age") gives NA in
KH>> both R and S+ ...
KH>>
KH>> Or, should we have as.numeric() return the codes on a non-numeric
KH>> factor?

Ross> At present R (implicitly) computes as.numeric(x) for x a factor as

Ross> 	as.numeric(as.character(x))

Ross> and S computes

Ross> 	codes(x)

Ross> I mistakenly thought that S does what I have implemented for R.
Ross> Thomas first objected to the difference and then said he quite liked
Ross> it.

Ross> I quite like the present semantics, but it is easy to change if
Ross> others have different preferences.

KH> I personally think that the current R approach makes more sense,
KH> too.  If we all agree on it, I would like to add the difference to
KH> the FAQ, so that it is (well) documented.

Later, I started to discover how much S code uses
as.numeric(ff)
just to extract the factor codes (in {1:M}) from a factor.

This led me (and Peter Dalgaard, I think) to the conclusion that
- yes, the present R behavior may be ``cleaner'' than S's
- no, it is a pain to keep it, because it breaks S code too often.

However, as you see, we haven't agreed yet on the topic.
I think we should agree ASAP, since it involves code in several places
(outside R base).

Martin

```
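The two semantics discussed in the thread can be written out explicitly so that code does not depend on what `as.numeric(<factor>)` happens to do. The following is only a minimal sketch: the helper names `factor_codes` and `factor_values` are hypothetical (they are not part of R or S), and `unclass()` stands in for the S function `codes()` mentioned by Ross, on the assumption that a factor is stored as integer codes with a `levels` attribute.

```r
# Hypothetical helpers, not part of R or S; names chosen for illustration only.

# S-style semantics: the position of each element in levels(f), i.e. values in 1:M.
factor_codes <- function(f) as.integer(unclass(f))

# Label-based semantics: convert the labels themselves, giving NA for
# non-numeric labels (the coercion warning is suppressed here).
factor_values <- function(f) suppressWarnings(as.numeric(as.character(f)))

f_chr <- factor(c("A", "BB"))
f_num <- factor(c(100, 40, 100))

factor_codes(f_chr)    # 1 2
factor_values(f_chr)   # NA NA
factor_codes(f_num)    # 2 1 2   (levels sort as 40, 100)
factor_values(f_num)   # 100 40 100
```

Spelling out the intended meaning this way keeps ported S code and new R code unambiguous, whichever behaviour `as.numeric()` ends up with.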