[R] insert and count missing data

Gabor Grothendieck ggrothendieck at gmail.com
Wed Jun 3 05:34:02 CEST 2009


Try this:

> Lines <- "No     Year     month rain
+ 1398 1985    10 104.2
+ 1399 1985    11 138.0
+ 1400 1985    12 120.4
+ 1401 1986     1  12.6
+ 1402 1986     2  19.4
+ 1403 1986     3   1.0
+ 1404 1986     4  58.8
+ 1405 1986     5  98.4
+ 1406 1986     6  56.6
+ 1407 1986     7 280.4
+ 1408 1986     8 128.2
+ 1409 1986     9 100.0
+ 1410 1986    10 166.0
+ 1411 1986    12  68.1
+ 1412 1987     2  46.0
+ 1413 1987     3  35.0
+ 1414 1987     4  75.0
+ 1415 1987     5  90.8
+ 1416 1987     6 189.0
+ 1417 1987     7 110.6
+ 1418 1987     8  87.2
+ 1419 1987     9  50.0
+ 1420 1987    10  41.8
+ 1421 1987    11  64.0
+ 1422 1987    12  75.6
+ 1423 1988     1  34.6
+ 1424 1988     2  36.0
+ 1425 1988     3  65.6
+ 1426 1988     4  40.0
+ 1427 1988     5 239.8
+ 1428 1988     6 150.8
+ 1429 1988     7 125.8
+ 1430 1988     8  64.4
+ 1431 1988     9  86.0
+ 1432 1988    10  54.0
+ 1433 1988    11 153.4
+ 1434 1988    12 212.0
+ 1435 1989     1  19.6
+ 1436 1989     2  17.4
+ 1437 1989     3 144.6
+ 1438 1989     4 143.8
+ 1439 1989     5 197.4"
>
> DF <- read.table(textConnection(Lines), header = TRUE)
>
> library(zoo)
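> # zoo series of No and rain, indexed by year-month
> z <- zoo(as.matrix(DF[c(1, 4)]), as.yearmon(DF$Year + (DF$month-1)/12))
> # as.ts() puts the series on a regular monthly grid, so absent months become NA rows
> zz <- as.zoo(as.ts(z))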
> head(zz, 20)
         No   rain
1985(10) 1398 104.2
1985(11) 1399 138.0
1985(12) 1400 120.4
1986(1)  1401 12.6
1986(2)  1402 19.4
1986(3)  1403 1.0
1986(4)  1404 58.8
1986(5)  1405 98.4
1986(6)  1406 56.6
1986(7)  1407 280.4
1986(8)  1408 128.2
1986(9)  1409 100.0
1986(10) 1410 166.0
1986(11) <NA> <NA>
1986(12) 1411 68.1
1987(1)  <NA> <NA>
1987(2)  1412 46.0
1987(3)  1413 35.0
1987(4)  1414 75.0
1987(5)  1415 90.8

> # complete rows as fraction of all rows
> sum(complete.cases(zz)) / nrow(zz)
[1] 0.9545455
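
Since the question asks for the percentage of missing data rather than the
complete fraction, here is a minimal sketch of the complement. It uses a small
toy series (four rows picked from the data above, rain only) and assumes zoo is
installed; the name pct_missing is just illustrative.

```r
library(zoo)

# Toy series: 1986-11 and 1987-01 are absent from the input
DF <- data.frame(Year  = c(1986, 1986, 1987, 1987),
                 month = c(10, 12, 2, 3),
                 rain  = c(166.0, 68.1, 46.0, 35.0))

z  <- zoo(DF$rain, as.yearmon(DF$Year + (DF$month - 1) / 12))
zz <- as.zoo(as.ts(z))   # regular monthly grid; absent months become NA

# Share of months on the grid that have no rain value, in percent
pct_missing <- 100 * mean(is.na(coredata(zz)))
print(pct_missing)
```

For a multi-column object such as the zz built above, 100 * mean(!complete.cases(zz))
gives the percentage of rows with at least one NA.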

To find out more about the zoo package, read the three vignettes
(PDF documents) and the help files included with it.
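
If zoo is not available, the same gap filling can be sketched in base R by
merging the data against a full grid of months. This is only an illustration
on toy data, not part of the solution above; the names idx, grid, filled and
pct_missing are made up for the example.

```r
# Toy data: 1986-11 and 1987-01 are absent
DF <- data.frame(Year  = c(1986, 1986, 1987, 1987),
                 month = c(10, 12, 2, 3),
                 rain  = c(166.0, 68.1, 46.0, 35.0))

# Running month number, so consecutive months differ by exactly 1
idx      <- DF$Year * 12 + (DF$month - 1)
full_idx <- seq(min(idx), max(idx))

# Every month between the first and last observation
grid <- data.frame(Year  = full_idx %/% 12,
                   month = full_idx %% 12 + 1)

# Left join: months that are not in DF get NA for rain
filled <- merge(grid, DF, by = c("Year", "month"), all.x = TRUE)
filled <- filled[order(filled$Year, filled$month), ]

# Percentage of months without a rain value
pct_missing <- 100 * mean(is.na(filled$rain))
print(filled)
print(pct_missing)
```

The grid only spans the first to the last observed month, so months missing
before the first row or after the last row are not counted.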


On Tue, Jun 2, 2009 at 11:06 PM, Roslina Zakaria <zroslina at yahoo.com> wrote:
> Hi R-users,
>
> Some months are missing from my data. How do I insert the missing months and fill the rain amount with NA?  Then I would like to count the percentage of missing data.
>
> No     Year     month rain
> 1398 1985    10 104.2
> 1399 1985    11 138.0
> 1400 1985    12 120.4
> 1401 1986     1  12.6
> 1402 1986     2  19.4
> 1403 1986     3   1.0
> 1404 1986     4  58.8
> 1405 1986     5  98.4
> 1406 1986     6  56.6
> 1407 1986     7 280.4
> 1408 1986     8 128.2
> 1409 1986     9 100.0
> 1410 1986    10 166.0
> 1411 1986    12  68.1
> 1412 1987     2  46.0
> 1413 1987     3  35.0
> 1414 1987     4  75.0
> 1415 1987     5  90.8
> 1416 1987     6 189.0
> 1417 1987     7 110.6
> 1418 1987     8  87.2
> 1419 1987     9  50.0
> 1420 1987    10  41.8
> 1421 1987    11  64.0
> 1422 1987    12  75.6
> 1423 1988     1  34.6
> 1424 1988     2  36.0
> 1425 1988     3  65.6
> 1426 1988     4  40.0
> 1427 1988     5 239.8
> 1428 1988     6 150.8
> 1429 1988     7 125.8
> 1430 1988     8  64.4
> 1431 1988     9  86.0
> 1432 1988    10  54.0
> 1433 1988    11 153.4
> 1434 1988    12 212.0
> 1435 1989     1  19.6
> 1436 1989     2  17.4
> 1437 1989     3 144.6
> 1438 1989     4 143.8
> 1439 1989     5 197.4
> ...
>
> Then I would like to count the percentage of missing data.  I think I can use something like:
>
> apply(Matrix, 1, function(x) sum(is.na(x))) / ncol(Matrix) * 100
>
>
> Or  rowMeans(is.na(Matrix))*100 --question asked in the R forum before.
>
>
>
> Thank you so much for any help given.



