[R] Replace missing value within group with non-missing value

Rainer Schuermann rainer.schuermann at gmx.net
Sat Apr 6 18:38:32 CEST 2013


Probably not very R-ish, but it works if I understand your question correctly (your data in a data frame called "x"):

# replace NA with 0
x$mth <- ifelse( is.na( x$mth ), 0, x$mth )

# loop through observation numbers and replace 0 with the month no
for( i in unique( x$obs ) ) x$mth[ x$obs == i ] <- max( x$mth[ x$obs == i ] ) 
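
A loop-free variant of the same idea, as a rough sketch only: base R's ave() applies a function within groups and returns a vector aligned with the original rows. Grouping on both dn and obs is an assumption taken from the original question, which says a choice set is identified by doctor and observation number; the toy data and the helper name fill_group are made up for illustration. Groups without any observed month stay NA here instead of becoming 0.

```r
# Small stand-in for the posted data: two doctors whose obs numbers overlap
x <- data.frame(
  dn  = c(4, 4, 4, 5, 5, 5, 5, 5),
  obs = c(1, 1, 1, 1, 1, 2, 2, 2),
  mth = c(NA, 487, NA, 490, NA, NA, NA, NA)
)

# fill_group (hypothetical helper): copy the first non-missing value in a
# group to every row of that group; groups that are all NA are left as they are
fill_group <- function(v) {
  obsd <- v[!is.na(v)]
  if (length(obsd) == 0) v else rep(obsd[1], length(v))
}

# ave() applies fill_group within each dn/obs combination
x$mth <- ave(x$mth, x$dn, x$obs, FUN = fill_group)
x
```

Because the grouping uses dn as well as obs, observation 1 of doctor 4 and observation 1 of doctor 5 are filled separately.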

Rgds,
Rainer


> x
   dn obs choice br mth
1   4   1      0  1 487
2   4   1      0  2 487
3   4   1      0  3 487
4   4   1      0  4 487
5   4   1      0  5 487
6   4   1      1  6 487
7   4   2      0  1 488
8   4   2      0  2 488
9   4   2      1  3 488
10  4   2      0  4 488
11  4   2      0  5 488
12  4   2      0  6 488
13  4   3      0  1 488
14  4   3      0  2 488
15  4   3      0  3 488
16  4   3      0  4 488
17  4   3      0  5 488
18  4   3      1  6 488
19  4   4      0  1 489
20  4   4      0  2 489
21  4   4      1  3 489
22  4   4      0  4 489
23  4   4      0  5 489
24  4   4      0  6 489
25  4   5      0  1 489
26  4   5      0  2 489
27  4   5      0  3 489
28  4   5      0  4 489
29  4   5      0  5 489
30  4   5      1  6 489
31  4   6      0  1 489
32  4   6      0  2 489
33  4   6      0  3 489
34  4   6      0  4 489
35  4   6      0  5 489
36  4   6      1  6 489
37  4   7      0  1 490
38  4   7      0  2 490
39  4   7      0  3 490
40  4   7      0  4 490
41  4   7      0  5 490
42  4   7      1  6 490
43  4   8      0  1 491
44  4   8      0  2 491
45  4   8      0  3 491
46  4   8      0  4 491
47  4   8      0  5 491
48  4   8      1  6 491
49  4   9      0  1   0
50  4   9      0  2   0
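
Rows 49 and 50 end up with mth 0 because obs 9 has no non-missing month in this 50-row sample (presumably the sample cuts that group off), so the maximum of the group is just the placeholder. If that matters, the placeholder could be turned back into NA afterwards. A minimal sketch on made-up data, assuming 0 is never a legitimate month value:

```r
# Toy result in the same shape as the output above: obs 9 had no observed mth
x <- data.frame(obs = c(8, 8, 9, 9), mth = c(491, 491, 0, 0))

# 0 was only a stand-in for "missing", so convert it back to NA
x$mth[x$mth == 0] <- NA
x
```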




On Saturday 06 April 2013 16:16:16 Leask, Graham wrote:
> Hi Rui,
> 
> Data as follows
> 
> structure(list(dn = c(4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 
> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 
> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4), obs = c(1, 1, 
> 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 
> 4, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 8, 8, 
> 8, 8, 8, 8, 9, 9), choice = c(0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 
> 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 
> 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0), br = c(1, 
> 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 
> 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1, 
> 2, 3, 4, 5, 6, 1, 2), mth = c(NA, NA, NA, NA, NA, 487, NA, NA, 
> 488, NA, NA, NA, NA, NA, NA, NA, NA, 488, NA, NA, 489, NA, NA, 
> NA, NA, NA, NA, NA, NA, 489, NA, NA, NA, NA, NA, 489, NA, NA, 
> NA, NA, NA, 490, NA, NA, NA, NA, NA, 491, NA, NA)), .Names = c("dn", 
> "obs", "choice", "br", "mth"), row.names = c("1", "2", "3", "4", 
> "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", 
> "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", 
> "27", "28", "29", "30", "31", "32", "33", "34", "35", "36", "37", 
> "38", "39", "40", "41", "42", "43", "44", "45", "46", "47", "48", 
> "49", "50"), class = "data.frame")
> 
> Best wishes
> 
> 
> Graham
> 
> -----Original Message-----
> From: Rui Barradas [mailto:ruipbarradas at sapo.pt] 
> Sent: 06 April 2013 16:32
> To: Leask, Graham
> Cc: r-help at r-project.org
> Subject: Re: [R] Replace missing value within group with non-missing value
> 
> Hello,
> 
> Can't you post a data example? If your dataset is named 'dat' use
> 
> dput(head(dat, 50))  # paste the output of this in a post
> 
> 
> Rui Barradas
> 
> On 06-04-2013 15:34, Leask, Graham wrote:
> > Hi Rui,
> >
> > Thank you for your suggestion which is very much appreciated. Unfortunately running this code produces the following error.
> >
> > error in '$<-.data.frame' ('*tmp*', "mth", value = NA_real_) :
> >      replacement has 1 rows, data has 0
> >
> > I'm sure there must be an elegant solution to this problem?
> >
> > Best wishes
> >
> >
> >
> > Graham
> >
> > On 6 Apr 2013, at 12:15, "Rui Barradas" <ruipbarradas at sapo.pt> wrote:
> >
> >> Hello,
> >>
> >> That's not a very good way of posting your data; preferably paste the output of ?dput in a post.
> >> Something along the lines of the following might do what you want.
> >> It seems that the groups are established by the 'dn' and 'obs' numbers.
> >> If so, try
> >>
> >>
> >> # Make up some data
> >> dat <- data.frame(dn = 4, obs = rep(1:5, each = 6), mth = NA)
> >> dat$mth[6] <- 487
> >> dat$mth[9] <- 488
> >> dat$mth[18] <- 488
> >> dat$mth[21] <- 489
> >> dat$mth[30] <- 489
> >>
> >>
> >> sp <- split(dat, list(dat$dn, dat$obs))
> >> names(sp) <- NULL
> >> tmp <- lapply(sp, function(x){
> >>         idx <- which(!is.na(x$mth))[1]
> >>         x$mth <- x$mth[idx]
> >>         x
> >>     })
> >> do.call(rbind, tmp)
> >>
> >>
> >> Hope this helps,
> >>
> >> Rui Barradas
> >>
> >>
> >> On 06-04-2013 11:33, Leask, Graham wrote:
> >>> Dear List members
> >>>
> >>> I have a large dataset organised in choice groups; see the sample below.
> >>>
> >>>       +-------------------------------------------------------------------------------------------------+
> >>>       | dn   obs   choice      acid   br                 date       cdate   situat~n   mth   year   set |
> >>>       |-------------------------------------------------------------------------------------------------|
> >>>    1. |  4     1        0     LOSEC    1                    .           .                .      .     1 |
> >>>    2. |  4     1        0    NEXIUM    2                    .           .                .      .     1 |
> >>>    3. |  4     1        0    PARIET    3                    .           .                .      .     1 |
> >>>    4. |  4     1        0   PROTIUM    4                    .           .                .      .     1 |
> >>>    5. |  4     1        0    ZANTAC    5                    .           .                .      .     1 |
> >>>       |-------------------------------------------------------------------------------------------------|
> >>>    6. |  4     1        1     ZOTON    6   23aug2000 01:00:00   23aug2000         NS   487   2000     1 |
> >>>    7. |  4     2        0     LOSEC    1                    .           .                .      .     2 |
> >>>    8. |  4     2        0    NEXIUM    2                    .           .                .      .     2 |
> >>>    9. |  4     2        1    PARIET    3   25sep2000 01:00:00   25sep2000          L   488   2000     2 |
> >>> 10. |  4     2        0   PROTIUM    4                    .           .                .      .     2 |
> >>>       |-------------------------------------------------------------------------------------------------|
> >>> 11. |  4     2        0    ZANTAC    5                    .           .                .      .     2 |
> >>> 12. |  4     2        0     ZOTON    6                    .           .                .      .     2 |
> >>> 13. |  4     3        0     LOSEC    1                    .           .                .      .     3 |
> >>> 14. |  4     3        0    NEXIUM    2                    .           .                .      .     3 |
> >>> 15. |  4     3        0    PARIET    3                    .           .                .      .     3 |
> >>>       |-------------------------------------------------------------------------------------------------|
> >>> 16. |  4     3        0   PROTIUM    4                    .           .                .      .     3 |
> >>> 17. |  4     3        0    ZANTAC    5                    .           .                .      .     3 |
> >>> 18. |  4     3        1     ZOTON    6   20sep2000 00:00:00   20sep2000          R   488   2000     3 |
> >>> 19. |  4     4        0     LOSEC    1                    .           .                .      .     4 |
> >>> 20. |  4     4        0    NEXIUM    2                    .           .                .      .     4 |
> >>>       |-------------------------------------------------------------------------------------------------|
> >>> 21. |  4     4        1    PARIET    3   27oct2000 00:00:00   27oct2000         NL   489   2000     4 |
> >>> 22. |  4     4        0   PROTIUM    4                    .           .                .      .     4 |
> >>> 23. |  4     4        0    ZANTAC    5                    .           .                .      .     4 |
> >>> 24. |  4     4        0     ZOTON    6                    .           .                .      .     4 |
> >>> 25. |  4     5        0     LOSEC    1                    .           .                .      .     5 |
> >>>       |-------------------------------------------------------------------------------------------------|
> >>> 26. |  4     5        0    NEXIUM    2                    .           .                .      .     5 |
> >>> 27. |  4     5        0    PARIET    3                    .           .                .      .     5 |
> >>> 28. |  4     5        0   PROTIUM    4                    .           .                .      .     5 |
> >>> 29. |  4     5        0    ZANTAC    5                    .           .                .      .     5 |
> >>> 30. |  4     5        1     ZOTON    6   23oct2000 03:00:00   23oct2000         NS   489   2000     5 |
> >>>
> >>> I wish to fill in the missing values in each choice set, delineated by dn (doctor), obs (observation number) and the choices (1 to 6).
> >>> For each choice set, one choice is chosen, and only that row contains the full time information for the set; e.g. in set 1, choice 6 was chosen and shows the month 487. The other 5 choices show mth as missing. I want to fill these with the correct mth.
> >>>
> >>> I am sure there must be an elegant way to do this in R?
> >>>
> >>>
> >>> Best wishes
> >>>
> >>>
> >>>
> >>> Graham
> >>>
> >>>
> >>> ______________________________________________
> >>> R-help at r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide 
> >>> http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> 


