[R] how to introduce missing data for complete data

MacQueen, Don macqueen1 at llnl.gov
Mon Nov 11 21:02:31 CET 2013


Here's a suggestion.

The sample() function draws a random sample from a set of values. See
  ?sample
The set you want to take a random sample from is the rows of your data.
Represent the rows by their row numbers.
To get a vector of row numbers, you can use the seq() function. See
  ?seq

Let's suppose your data is in a data frame named 'mydat', and you want to
introduce 10 instances of missing data.

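nr <- nrow(mydat)                        # number of rows in the data
set.to.missing <- sample(seq(nr), 10)    # 10 distinct row numbers, chosen at random
mydat$Amount[set.to.missing] <- NA       # set Amount to missing in those rows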
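
If you want something that runs on its own, here is a minimal sketch of the same steps. The data frame is made up to resemble the rainfall data in the original post (the column name StnID, the 20 rows, and the Amount values are assumptions, not the poster's real data), and the names n.missing and mydat.complete are only for illustration. set.seed() is optional and is there just so the same rows are picked on every run.

```r
# Made-up stand-in for the poster's rainfall data (hypothetical values)
mydat <- data.frame(
  StnID  = rep(48603, 20),
  Year   = rep(71, 20),
  Mth    = rep(1, 20),
  Day    = 1:20,
  Amount = c(1, 0.5, 1.3, 0.8, 0, 0, 0, 2.1, 0, 0.4,
             0, 0, 3.2, 1.1, 0, 0.7, 0, 0, 0.2, 0)
)

set.seed(42)                      # optional: makes the random choice repeatable
n.missing <- 10                   # how many values to set to missing
nr <- nrow(mydat)

# Keep a copy of the complete data so estimates can be compared with it later
mydat.complete <- mydat

# Pick n.missing distinct row numbers; sample() draws without replacement by default
set.to.missing <- sample(seq_len(nr), n.missing)
mydat$Amount[set.to.missing] <- NA

sum(is.na(mydat$Amount))          # 10 values in Amount are now missing
```

seq_len(nr) is used here in place of seq(nr); the two give the same row numbers whenever nr is at least 1. Keeping the untouched copy is one way to check later how close the estimated values come to the real ones.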


A simplified example of the core idea is:

> foo <- seq(10)
> foo
 [1]  1  2  3  4  5  6  7  8  9 10
> foo[3] <- NA
> foo
 [1]  1  2 NA  4  5  6  7  8  9 10
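
The same indexing works with several positions at once, which is what the sample() step relies on. A short sketch, with the positions 2, 5 and 9 chosen arbitrarily for illustration, that also shows how to find the missing entries afterwards:

```r
foo <- seq(10)
foo[c(2, 5, 9)] <- NA       # assign NA to several positions in one step
foo                         # positions 2, 5 and 9 now print as NA
which(is.na(foo))           # positions that are missing: 2 5 9
mean(foo)                   # NA, because missing values propagate
mean(foo, na.rm = TRUE)     # mean of the 7 values that remain
```

Most summary functions return NA when any input is missing, so na.rm = TRUE (or an explicit is.na() check) is usually needed once missing values have been introduced.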


-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 11/10/13 10:31 PM, "dila radi" <dilaradi21 at gmail.com> wrote:

>Hi,
>
>I'm a new R user. In my research I use rainfall data and I'm interested in
>estimating missing data. I would like to use the Normal Ratio Method to
>estimate missing data. My problem is, how do I introduce missing data
>randomly within my complete set of data?
>
>
>Stn ID  Year  Mth  Day  Amount
>48603   71    1    1    1
>48603   71    1    2    0.5
>48603   71    1    3    1.3
>48603   71    1    4    0.8
>48603   71    1    5    0
>48603   71    1    6    0
>48603   71    1    7    0
>...
>
>Thank you so much for your attention and help.
>
>Regards,
>Dila
>
