[R] replacing Na's with values on different records

arun smartpink111 at yahoo.com
Tue Sep 10 15:03:56 CEST 2013


Hi,

In the example you showed, there were only two cases: 1) for each ID, only the 2nd element of AUC24 is observed (the 1st and 3rd are NA), or 2) all the entries for a particular ID are missing.  Suppose the NAs can also fall in other positions (for example, elements 1 and 2, or 2 and 3, or only a single element missing), in addition to the cases you already described. For example:
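u3s1 <- u3s
u3s1$AUC24[1] <- 2
u3sNew <- rbind(u3s1, u3s, u3s, u3s)
u3sNew$ID[10:36] <- rep(104:112, each = 3)
u3sNew$AUC24[c(12, 19:22, 27, 28, 30)] <- c(20, 2, 4, 8, 10.115, 18.3268, 2, 8)
res <- unsplit(lapply(split(u3sNew, u3sNew$ID), function(x) {
  indx <- !is.na(x$AUC24)            # which of the 3 records are observed
  x1 <- x$AUC24[indx]                # the observed values
  x2 <- 2^(0:floor(log(4, 2)))       # multipliers 1, 2, 4
  x$AUC24 <- if (sum(indx) == 3) {
    x$AUC24                          # nothing to fill
  } else if (sum(indx) == 2) {
    # one NA: if the 1st record is observed, build the series from it;
    # otherwise the first observed value is the 2nd record, so halve it
    if (which(!indx) == 2 | which(!indx) == 3) {
      x1[1] * x2
    } else (x1[1] / 2) * x2
  } else if (sum(indx) == 1) {
    # two NAs: scale the single observed value back to the 1st record
    if (which(indx) == 1) {
      x1 * x2
    } else if (which(indx) == 2) {
      (x1 / 2) * x2
    } else (x1 / 4) * x2
  } else NA                          # all NA: leave as NA
  x
}), u3sNew$ID)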

 u3sNew$AUC24
# [1]  2.0000      NA      NA      NA 20.2300      NA      NA  9.1634      NA
#[10]      NA      NA 20.0000      NA 20.2300      NA      NA  9.1634      NA
#[19]  2.0000  4.0000  8.0000 10.1150 20.2300      NA      NA  9.1634 18.3268
#[28]  2.0000      NA  8.0000      NA 20.2300      NA      NA  9.1634      NA

res$AUC24
# [1]  2.0000  4.0000  8.0000 10.1150 20.2300 40.4600  4.5817  9.1634 18.3268
#[10]  5.0000 10.0000 20.0000 10.1150 20.2300 40.4600  4.5817  9.1634 18.3268
#[19]  2.0000  4.0000  8.0000 10.1150 20.2300 40.4600  4.5817  9.1634 18.3268
#[28]  2.0000  4.0000  8.0000 10.1150 20.2300 40.4600  4.5817  9.1634 18.3268
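
The branches above can probably be collapsed into one rule: take the first observed value, scale it back to the first record, and rebuild the doubling series from there. The following is only a sketch under that assumption; `fill_doubling` and the small data frame `d` are names made up for illustration, not part of the original data. Unlike the code above, it leaves observed values untouched and fills only the NAs.

```r
# Hypothetical helper: fill NAs in a per-patient doubling series
fill_doubling <- function(v) {
  n <- length(v)
  k <- which(!is.na(v))[1]                  # position of first observed value
  if (is.na(k)) return(rep(NA_real_, n))    # nothing observed: keep all NA
  base <- v[k] / 2^(k - 1)                  # implied value at the 1st record
  filled <- base * 2^(0:(n - 1))            # base, 2*base, 4*base, ...
  ifelse(is.na(v), filled, v)               # keep observed values as they are
}

# Made-up data: four patients, three records each
d <- data.frame(ID = rep(1:4, each = 3),
                AUC24 = c(NA, 20.23, NA,  NA, NA, 20,  2, NA, 8,  NA, NA, NA))

# ave() applies the helper within each ID and returns a full-length vector
d$AUC24 <- ave(d$AUC24, d$ID, FUN = fill_doubling)
d$AUC24
# expected values: 10.115 20.23 40.46  5 10 20  2 4 8  NA NA NA
```

Because it uses `length(v)` rather than a fixed 3, the same helper should also cope with patients that have a different number of records, provided the doubling pattern still holds.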


A.K.








----- Original Message -----
From: arun <smartpink111 at yahoo.com>
To: R help <r-help at r-project.org>
Cc: 
Sent: Monday, September 9, 2013 6:58 PM
Subject: Re: replacing Na's with values on different records



Hi Ahmed,

No problem.

You got the error because one of the IDs had all NAs for AUC24.
Also, in your dataset there are NAs in BLADDERWALL, BLADDERWALLC, AUC12d and
Cmaxd, in addition to WT and AUC24.  How do you want to fill those NAs?


res <- unsplit(lapply(split(u3s, u3s$ID), function(x) {
  # note: as.integer() truncates, so 20.23/2 = 10.115 becomes 10 here
  x$AUC24 <- if (all(is.na(x$AUC24))) NA else
    as.integer(x$AUC24[!is.na(x$AUC24)] / 2) * (2^(0:floor(log(4, 2))))
  x$WT <- if (all(is.na(x$WT))) NA else x$WT[!is.na(x$WT)]
  x
}), u3s$ID)
res
#   ID PERIOD DOSE VOL1stD MCC BLADDERWALL visit VOL1stDC MCCC BLADDERWALLC
#1 101     p1 0.03      22  72         2.9     3       -8  -21          1.2
#2 101     p2 0.06      24  80         1.0     4       -6  -13         -0.7
#3 101     p3 0.12      17  59         4.6     5      -13  -34          2.9
#4 102     p1 0.03       5  25         0.3     3      -10  -20         -1.3
#5 102     p2 0.06      67 125          NA     4       52   80           NA
#6 102     p3 0.12      10  24         0.2     5       -5  -21         -1.4
#7 103     p1 0.03       6  15         0.0     3      -23  -20         -0.1
#8 103     p2 0.06      58  72         0.8     4       29   37          0.7
#9 103     p3 0.12      15  35         0.5     5      -14    0          0.4
#   AUC12d Cmaxd Cmind   WT                 PHENO RACE SEX     AGE_y AUC24
#1      NA    NA    NA 12.7                          1   2 1.8152448    NA
#2      NA    NA    NA 12.7                          1   2 1.8152448    NA
#3      NA    NA    NA 12.7                          1   2 1.8152448    NA
#4      NA    NA    NA 13.4 Extensive Metabolizer    1   1 2.1465338    10
#5 10.1150  2.98     0 13.4 Extensive Metabolizer    1   1 2.1465338    20
#6      NA    NA    NA 13.4 Extensive Metabolizer    1   1 2.1465338    40
#7      NA    NA    NA 10.0                          1   1 0.5010404     4
#8  4.5817  1.41     0 10.0                          1   1 0.5010404     8
#9      NA    NA    NA 10.0                          1   1 0.5010404    16
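
For reference, the error can be reproduced in isolation. The sketch below uses a made-up three-row data frame `d` (not the real data): when every AUC24 in a group is NA, the subset of observed values has length 0, so the computed replacement has length 0 as well, and assigning it to a 3-row column fails. Guarding with all(is.na(...)), as in the code above, avoids this.

```r
# Hypothetical one-patient data frame in which every AUC24 is missing
d <- data.frame(ID = rep(101L, 3), AUC24 = c(NA_real_, NA_real_, NA_real_))

obs  <- d$AUC24[!is.na(d$AUC24)]   # numeric(0): nothing observed
repl <- (obs / 2) * 2^(0:2)        # still length 0

# Assigning a zero-length vector to a 3-row column raises an error
msg <- tryCatch({
  d$AUC24 <- repl
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# Guarded version: fall back to NA when nothing is observed
d$AUC24 <- if (all(is.na(d$AUC24))) NA_real_ else repl
```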

A.K.


----- Original Message -----
From: "El-Tahtawy, Ahmed" <Ahmed.El-Tahtawy at pfizer.com>
To: arun <smartpink111 at yahoo.com>
Cc: 
Sent: Monday, September 9, 2013 6:06 PM
Subject: RE: replacing Na's with values on different records

Dear Arun,

Thanks a million for the sophisticated code; it is a little above my skill level. I have never seen such a brilliant use of function(x) before! (I am a clinical pharmacologist who loves to explore patient data!) I guess it is impossible to do this with a for loop or a simpler function?

The code worked with the simple data, but when I tried it with the actual data I got an error!

################################################################################
dput(head(u3s,9))     # only 3 patients

structure(list(ID = c(101L, 101L, 101L, 102L, 102L, 102L, 103L, 
103L, 103L), PERIOD = c("p1", "p2", "p3", "p1", "p2", "p3", "p1", 
"p2", "p3"), DOSE = c("0.03", "0.06", "0.12", "0.03", "0.06", 
"0.12", "0.03", "0.06", "0.12"), VOL1stD = c(22L, 24L, 17L, 5L, 
67L, 10L, 6L, 58L, 15L), MCC = c(72L, 80L, 59L, 25L, 125L, 24L, 
15L, 72L, 35L), BLADDERWALL = c(2.9, 1, 4.6, 0.3, NA, 0.2, 0, 
0.8, 0.5), visit = c(3L, 4L, 5L, 3L, 4L, 5L, 3L, 4L, 5L), VOL1stDC = c(-8L, 
-6L, -13L, -10L, 52L, -5L, -23L, 29L, -14L), MCCC = c(-21L, -13L, 
-34L, -20L, 80L, -21L, -20L, 37L, 0L), BLADDERWALLC = c(1.2, 
-0.7, 2.9, -1.3, NA, -1.4, -0.1, 0.7, 0.4), AUC12d = c(NA, NA, 
NA, NA, 10.115, NA, NA, 4.5817, NA), Cmaxd = c(NA, NA, NA, NA, 
2.98, NA, NA, 1.41, NA), Cmind = c(NA, NA, NA, NA, 0, NA, NA, 
0, NA), WT = c(NA, 12.7, NA, NA, 13.4, NA, NA, 10, NA), PHENO = c("", 
"", "", "Extensive Metabolizer", "Extensive Metabolizer", "Extensive Metabolizer", 
"", "", ""), RACE = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), SEX = c(2L, 
2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L), AGE_y = c(1.815244771, 1.815244771, 
1.815244771, 2.146533786, 2.146533786, 2.146533786, 0.501040412, 
0.501040412, 0.501040412), AUC24 = c(NA, NA, NA, NA, 20.23, NA, 
NA, 9.1634, NA)), .Names = c("ID", "PERIOD", "DOSE", "VOL1stD", 
"MCC", "BLADDERWALL", "visit", "VOL1stDC", "MCCC", "BLADDERWALLC", 
"AUC12d", "Cmaxd", "Cmind", "WT", "PHENO", "RACE", "SEX", "AGE_y", 
"AUC24"), row.names = c(NA, 9L), class = "data.frame")
######################################################################
Here is your code, with Current-ID changed to ID and AUC changed to AUC24 to match the actual data:

u3s1<-unsplit(lapply(split(u3s,u3s$"ID"),function(x) {
  (x$AUC24<-as.integer(x$AUC24[!is.na(x$AUC24)]/2)* (2^(0:floor(log(4,2))))); 
  x$WT<- x$WT[!is.na(x$WT)];
  x }),u3s$"ID")

################################################################################
Error in `$<-.data.frame`(`*tmp*`, "AUC24", value = numeric(0)) : 
  replacement has 0 rows, data has 3




Thank you again for your time and help..
Ahmed


-----Original Message-----
From: arun [mailto:smartpink111 at yahoo.com] 
Sent: Monday, September 09, 2013 5:00 PM
To: El-Tahtawy, Ahmed
Subject: Re: replacing Na's with values on different records

Hi Ahmed,
No problem.  I don't know whether you got the reply or not.  This is what I sent:
u3s<- read.table(text="Current-ID visit AUC Wight ID1
101 3 . . 1
101 4 10 13 2
101 5  . . 3
102 3 .  . 4
102 4 4 10 5
102 5 . . 6
103 3 . . 7
103 4 6 9 8
103 5 . . 9",sep="",header=TRUE,na.strings=".",check.names=FALSE)

u3d<- read.table(text="Desired-ID visit AUC Wight ID1
101 3 5 13 1
101 4 10 13 2
101 5 20 13 3
102 3 2 10 4
102 4 4 10 5
102 5 8 10 6
103 3 3 9 7
103 4 6 9 8
103 5 12 9 9",sep="",header=TRUE,check.names=FALSE)

u3s1 <- unsplit(lapply(split(u3s, u3s$`Current-ID`), function(x) {
  x$AUC <- as.integer(x$AUC[!is.na(x$AUC)] / 2) * (2^(0:floor(log(4, 2))))
  x$Wight <- x$Wight[!is.na(x$Wight)]
  x
}), u3s$`Current-ID`)

attr(u3s1,"row.names")<- attr(u3d,"row.names")
colnames(u3s1)<- colnames(u3d)
all.equal(u3s1,u3d)






----- Original Message -----
From: "El-Tahtawy, Ahmed" <Ahmed.El-Tahtawy at pfizer.com>
To: arun <smartpink111 at yahoo.com>
Cc: 
Sent: Monday, September 9, 2013 4:48 PM
Subject: RE: replacing Na's with values on different records

Hi Arun,

Thanks a million...
I am looking forward to seeing your reply tomorrow.

Best Regards
Ahmed


-----Original Message-----
From: arun [mailto:smartpink111 at yahoo.com] 
Sent: Monday, September 09, 2013 4:17 PM
To: El-Tahtawy, Ahmed
Subject: Re: replacing Na's with values on different records

Hi Ahmed,

I already sent the reply.  Let me know if that works.
A.K.




----- Original Message -----
From: "El-Tahtawy, Ahmed" <Ahmed.El-Tahtawy at pfizer.com>
To: "smartpink111 at yahoo.com" <smartpink111 at yahoo.com>
Cc: 
Sent: Monday, September 9, 2013 2:50 PM
Subject: RE: replacing Na's with values on different records

Greetings,

Thank you for your quick response.

There are 5 columns and only 3 patients. The actual data has 50 variables and many more patients, but I made a simplified version to help focus on the main problem. I used dput as suggested, and an "L" was added to all the numbers!

Please let me know if you have any more questions...

head(u3s,10)
   ID visit AUC Wight ID1
1 101     3  NA    NA   1
2 101     4  10    13   2
3 101     5  NA    NA   3
4 102     3  NA    NA   4
5 102     4   4    10   5
6 102     5  NA    NA   6
7 103     3  NA    NA   7
8 103     4   6     9   8
9 103     5  NA    NA   9

> dput(u3s)
structure(list(ID = c(101L, 101L, 101L, 102L, 102L, 102L, 103L, 103L, 103L), visit = c(3L, 4L, 5L, 3L, 4L, 5L, 3L, 4L, 5L), AUC = c(NA, 10L, NA, NA, 4L, NA, NA, 6L, NA), Wight = c(NA, 13L, NA, NA, 10L, NA, NA, 9L, NA), ID1 = 1:9), .Names = c("ID", "visit", "AUC", "Wight", "ID1"), class = "data.frame", row.names = c(NA, -9L))

Best Regards
Ahmed

-----Original Message-----
From: smartpink111 at yahoo.com [mailto:smartpink111 at yahoo.com]
Sent: Monday, September 09, 2013 12:15 PM
To: El-Tahtawy, Ahmed
Subject: replacing Na's with values on different records

Hi,

The example datasets "Current" and "Desired" are mangled by HTML.  It is not clear how many columns you have.  I tried to read them like this, but it is still not clear:

ID visit AUC Wight ID1
101 3 1 101 4
10 13 2 101 5
3 102 3 4 102
4 4 10 5 102
5 6 103 3 7
103 4 6 9 8
103 5 9   ####two elements are missing.

Please use dput() (see ?dput).
For example:
dput(head(dataset,20))
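
As a rough illustration of why dput() helps: it prints R code that rebuilds the object exactly, so nothing is lost when the text passes through email. In the sketch below `dataset` is a made-up data frame, not the real one. Integer columns are written with an L suffix (e.g. 101L), which is simply R's notation for integer constants and does not change the values.

```r
# Made-up data standing in for the real dataset
dataset <- data.frame(ID = c(101L, 101L, 102L), AUC = c(NA, 10, 4))

# dput() prints R code that rebuilds the object
dput(head(dataset, 20))

# Round trip: capture that text and evaluate it again
txt  <- paste(capture.output(dput(dataset)), collapse = "\n")
copy <- eval(parse(text = txt))

all.equal(copy, dataset)   # TRUE
is.integer(copy$ID)        # TRUE: the L suffix marks integer values
```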

<quote author='El-Tahtawy, Ahmed'>
I'm sure I'm missing something really obvious in the "for loop"...

Here is simplified data for 3 patients. We need to fill in the NAs with the same WT for each patient, and with the AUC halved for visit 3 and doubled for visit 5 for the same patient, based on visit 4.


for(i in unique(u3s$ID)){                             #fill in same Wt for each patient
  u3s$WT <- ifelse(is.na(u3s$WT),u3s$WT[u3s$visit == "4"],u3s$WT)

  for(j in length(u3s$ID1)){                        #fill in .5 AUC for visit 3, 2*AUC for visit 5

    u3s$AUC24 <- ifelse(is.na(u3s$AUC24),u3s$AUC24[u3s$visit == "4"]*0.5,u3s$AUC24)
    u3s$AUC24 <- ifelse(!is.na(u3s$AUC24),u3s$AUC24[u3s$visit == "4"]*1.0,u3s$AUC24)
    u3s$AUC24 <- ifelse(is.na(u3s$AUC24),u3s$AUC24[u3s$visit == "4"]*2.0,u3s$AUC24)
  }
}
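
One likely reason the loop above does not do what was intended: neither loop body uses `i` or `j`, so every pass works on the whole column at once, and `for(j in length(u3s$ID1))` runs only once because `length()` returns a single number. A minimal sketch of a loop that works patient by patient is given below. It uses the simplified columns (`AUC`, `Wight`) and assumes each patient has exactly one visit-4 record that serves as the reference; `rows` and `ref` are names introduced here for illustration.

```r
# Simplified data for 3 patients (same values as the "Current" table)
u3s <- data.frame(
  ID    = rep(101:103, each = 3),
  visit = rep(3:5, times = 3),
  AUC   = c(NA, 10, NA, NA, 4, NA, NA, 6, NA),
  Wight = c(NA, 13, NA, NA, 10, NA, NA, 9, NA)
)

for (i in unique(u3s$ID)) {
  rows <- which(u3s$ID == i)              # records of this patient
  ref  <- rows[u3s$visit[rows] == 4]      # the visit-4 record
  # skip patients without a usable reference record
  if (length(ref) != 1 || is.na(u3s$AUC[ref])) next
  u3s$Wight[rows] <- u3s$Wight[ref]       # same weight on every record
  # visit 3 -> half, visit 4 -> same, visit 5 -> double
  u3s$AUC[rows] <- u3s$AUC[ref] * 2^(u3s$visit[rows] - 4)
}

u3s$AUC     # 5 10 20 2 4 8 3 6 12
u3s$Wight   # 13 13 13 10 10 10 9 9 9
```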

Current-

 ID visit AUC Wight ID1
101     3  NA    NA   1
101     4  10    13   2
101     5  NA    NA   3
102     3  NA    NA   4
102     4   4    10   5
102     5  NA    NA   6
103     3  NA    NA   7
103     4   6     9   8
103     5  NA    NA   9


Desired-

 ID visit AUC Wight ID1
101     3   5    13   1
101     4  10    13   2
101     5  20    13   3
102     3   2    10   4
102     4   4    10   5
102     5   8    10   6
103     3   3     9   7
103     4   6     9   8
103     5  12     9   9



Your help is greatly appreciated...


Best Regards

    [[alternative HTML version deleted]]


</quote>
Quoted from: 
http://r.789695.n4.nabble.com/replacing-Na-s-with-values-on-different-records-tp4675696.html

