[R] how to handle missing values
gisar at nus.edu.sg
Wed Jul 16 12:56:58 CEST 2003
This really depends on what you want to do. I will try to give some
pointers.
1. Coding the missing values
You definitely do not need to delete observations before loading them
into R.
By default, the string "NA" and blank fields in numeric columns are
read as NA when you load the data using read.delim(). You can adjust
the na.strings argument of read.delim() to change this default behaviour.
Make sure the coding is correct before you proceed. For example, you can
use is.na() to see which values R treats as missing.
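A minimal sketch of that check, assuming a tab-delimited file in which missing values were coded as -999 or left blank. The inline data and the -999 code are made up for illustration; `text=` is used only to avoid needing a file on disk.

```r
# Hypothetical tab-delimited data; -999 and blank fields stand for "missing".
raw <- "id\tage\tweight\n1\t34\t70.5\n2\t-999\t82.1\n3\t51\t\n4\tNA\t64.0\n"

# na.strings lists every string that should be read as NA.
dat <- read.delim(text = raw, na.strings = c("NA", "-999", ""))

is.na(dat)            # logical matrix: TRUE where R sees a missing value
colSums(is.na(dat))   # number of missing values per column
```

If the counts do not match what you expect from the raw file, the missing-value code is probably not listed in na.strings.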
2. Performing calculations with missing values (na.rm and na.action)
X <- c(1, 2, 3, NA, 4)
sum(X, na.rm = TRUE)  # gives 10; without na.rm = TRUE the result is NA
See ?na.action for more interesting details.
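For model fitting, which is what the original question asks about, the following sketch may help. It uses the built-in airquality data set (my choice for illustration, not from the question) and shows that lm() drops incomplete rows by itself under the usual default setting of options("na.action").

```r
# Built-in data set with missing values in Ozone and Solar.R.
data(airquality)

# options("na.action") is normally "na.omit", so rows with an NA in any
# model variable are dropped before fitting.
fit <- lm(Ozone ~ Solar.R + Wind, data = airquality)
nobs(fit)            # rows actually used in the fit
nrow(airquality)     # rows in the data

# na.exclude drops the same rows for fitting, but pads residuals and
# fitted values with NA so they line up with the original rows.
fit2 <- lm(Ozone ~ Solar.R + Wind, data = airquality,
           na.action = na.exclude)
length(residuals(fit2))   # same as nrow(airquality)
```

Using na.action = na.fail instead makes the call stop with an error when missing values are present, which can be useful as a safety check.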
Some algorithms can handle missing values automatically. In the
classification context, for example, rpart can handle missing values.
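A small sketch of this with rpart, assuming the rpart package is installed (it is a recommended package that normally ships with R). It again uses the built-in airquality data, so it fits a regression tree rather than a classification tree, but missing predictors are handled the same way.

```r
library(rpart)   # recommended package, normally installed with R

data(airquality)

# Rows with a missing response (Ozone) are dropped, but rows that are
# only missing a predictor (here Solar.R) are kept and handled through
# surrogate splits.
tree <- rpart(Ozone ~ Solar.R + Wind + Temp, data = airquality)

# Prediction also works for a row with a missing predictor.
new_obs <- data.frame(Solar.R = NA_real_, Wind = 10, Temp = 80)
predict(tree, newdata = new_obs)
```

See ?rpart.control (arguments usesurrogate and maxsurrogate) for how surrogate splits are chosen.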
3. Missing value imputation
There are many imputation methods (e.g. in the EMV, e1071, Hmisc, norm,
permax and pamr packages). The type of imputation depends on your
application, area of research and type of missingness (e.g. missing
completely at random, or missing/observed at random).
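To show the basic idea only, here is a deliberately naive sketch of median imputation in base R. It is not what any of the packages above do, the helper impute_median is a made-up name, and it is only defensible when values are missing completely at random.

```r
data(airquality)

# Replace each NA in a numeric vector with the median of the observed values.
impute_median <- function(x) {
  x[is.na(x)] <- median(x, na.rm = TRUE)
  x
}

imputed <- airquality
imputed[] <- lapply(airquality, impute_median)

colSums(is.na(airquality))  # missing values per column before imputation
colSums(is.na(imputed))     # all zero after imputation
```

Single-value imputation like this understates the variability in the data, so for real analyses one of the dedicated packages is usually the better choice.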
From: Tor A Strand [mailto:Tor.Strand at cih.uib.no]
Sent: Wednesday, July 16, 2003 6:12 PM
Subject: [R] how to handle missing values
This group impresses me, so far I have been helped with all my questions
within 24 hours. Thanks.
Therefore another one.
I am used to programs (such as STATA) where observations with missing
values that are included in a model are simply ignored in the analysis.
So far I have not been able to figure out how to deal with missing
values in R and have solved the problem by deleting observations with
missing values before loading them into R.
Can anyone give me a hint on how to do this in a simpler way?
Dr. Tor A Strand
Centre for International Health
University of Bergen