[R] Replacing NA values in one column of a data.frame

Bert Gunter gunter.berton at gene.com
Tue Aug 18 17:27:02 CEST 2009



-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Steve Lianoglou
Sent: Tuesday, August 18, 2009 7:25 AM
To: John Kane
Cc: r-help at r-project.org; Steve Murray
Subject: Re: [R] Replacing NA values in one column of a data.frame


... **But do NOT do this.** Unless you are using R to prepare your data
for some other statistical analysis system (most unwise), there is **never**
any reason in R to replace NA's with nonsense numerical codes -- and a great
many reasons NOT to do this, to wit:

1) NA is a special value in R (actually several, one per atomic type, e.g.
NA_integer_, NA_real_, NA_character_) with extensive built-in machinery to
handle it properly, often automatically, in summary and fitting functions.
Recoding to a nonsense numeric defeats all this careful machinery.

2) You are just begging for trouble by using nonsense numerics: if you
forget your coding, or someone who is not aware of it uses your data,
voilà! -- you have just guaranteed that data analyses will be partial or
complete garbage.
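
To make the two points concrete, here is a minimal sketch. The data frame,
its column names and the sentinel -999 are all made up for illustration;
they are not taken from the original poster's data.

```r
# Hypothetical data: one missing measurement in a numeric column.
df <- data.frame(id = 1:5, x = c(2.5, NA, 4.0, 3.5, 5.0))

# Point 1: NA propagates by default, so a missing value cannot be overlooked.
mean_default <- mean(df$x)              # NA

# Dropping the missing value is an explicit, visible choice.
mean_narm <- mean(df$x, na.rm = TRUE)   # 3.75

# Fitting functions apply na.action (na.omit by default) on their own.
fit <- lm(x ~ id, data = df)
n_used <- nobs(fit)                     # 4 rows used, the NA row is dropped

# Point 2: recoding NA to a sentinel (here -999, an arbitrary choice)
# silently corrupts the same summary, with no warning at all.
bad <- df
bad$x[is.na(bad$x)] <- -999
mean_bad <- mean(bad$x)                 # -196.8
```

Nothing in the last step signals a problem; only someone who remembers the
coding would know to exclude -999 before computing anything.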

Finally -- a request to well-meaning helpeRs: Just because you CAN do
something in R statistically or programmatically does not mean you should.
When someone requests something that you believe is the wrong thing to do, I
think it within the bounds of both R etiquette and professional practice to
tell them NOT to do it and explain why not. Of course, sometimes such
admonitions are themselves ill-advised (as this may be), but an open,
courteous, professional exchange on the issue will itself be informative to
useRs.

Cheers to all,

Bert

Bert Gunter
Genentech Nonclinical Biostatistics



