[R] R first.id last.id function error

jim holtman jholtman at gmail.com
Sat Sep 8 03:30:59 CEST 2007


This function should do it for you:


> file1 <- read.table(textConnection("   id rx week dv1
+ 1   1  1    1   1
+ 2   1  1    2   1
+ 3   1  1    3   2
+ 4   2  1    1   3
+ 5   2  1    2   4
+ 6   2  1    3   1
+ 7   3  1    1   2
+ 8   3  1    2   3
+ 9   3  1    3   4
+ 10  4  1    1   2
+ 11  4  1    2   6
+ 12  4  1    3   5
+ 13  5  2    1   7
+ 14  5  2    2   8
+ 15  5  2    3   5
+ 16  6  2    1   2
+ 17  6  2    2   4
+ 18  6  2    3   6
+ 19  7  2    1   7
+ 20  7  2    2   8
+ 21  8  2    1   9
+ 22  9  2    1   4
+ 23  9  2    2   5"), header=TRUE)
>
> mark.function <-
+ function(df){
+     df <- df[order(df$id, df$week),]
+     # create 'diff' of 'id' to determine where the breaks are
+     breaks <- diff(df$id)
+     # the first entry will be TRUE, and then every occurrence of
+     # non-zero in breaks
+     df$first.id <- c(TRUE, breaks != 0)
+     # every non-zero in breaks, and then the last entry will be TRUE
+     df$last.id <- c(breaks != 0, TRUE)
+     df
+ }
>
> mark.function(file1)
   id rx week dv1 first.id last.id
1   1  1    1   1     TRUE   FALSE
2   1  1    2   1    FALSE   FALSE
3   1  1    3   2    FALSE    TRUE
4   2  1    1   3     TRUE   FALSE
5   2  1    2   4    FALSE   FALSE
6   2  1    3   1    FALSE    TRUE
7   3  1    1   2     TRUE   FALSE
8   3  1    2   3    FALSE   FALSE
9   3  1    3   4    FALSE    TRUE
10  4  1    1   2     TRUE   FALSE
11  4  1    2   6    FALSE   FALSE
12  4  1    3   5    FALSE    TRUE
13  5  2    1   7     TRUE   FALSE
14  5  2    2   8    FALSE   FALSE
15  5  2    3   5    FALSE    TRUE
16  6  2    1   2     TRUE   FALSE
17  6  2    2   4    FALSE   FALSE
18  6  2    3   6    FALSE    TRUE
19  7  2    1   7     TRUE   FALSE
20  7  2    2   8    FALSE    TRUE
21  8  2    1   9     TRUE    TRUE
22  9  2    1   4     TRUE   FALSE
23  9  2    2   5    FALSE    TRUE
>
>
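One caveat: diff() needs a numeric 'id'. If the ids might be character strings or factors, a variation along the following lines may work instead. This is only a sketch: the function name mark.first.last and the toy data frame are made up for illustration and are not from the thread. It relies on duplicated(), which flags repeats of a value, and on its fromLast argument, which scans from the end.

```r
# Sketch only: 'mark.first.last' is a hypothetical name, not from the thread.
mark.first.last <- function(df) {
    # sort so that rows of the same id are adjacent and ordered by week
    df <- df[order(df$id, df$week), ]
    # first row of each id: the id has not been seen earlier in the column
    df$first.id <- !duplicated(df$id)
    # last row of each id: the id is not seen again later in the column
    df$last.id <- !duplicated(df$id, fromLast = TRUE)
    df
}

# made-up example with character ids
toy <- data.frame(id = c("a", "a", "b", "c", "c", "c"),
                  week = c(1, 2, 1, 1, 2, 3))
res <- mark.first.last(toy)
```

An id with a single record (like "b" here, or id 8 in file1) comes out TRUE in both columns, the same as with the diff() version.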


On 9/7/07, Gerard Smits <g_smits at verizon.net> wrote:
> Hi R users,
>
> I have a test dataframe ("file1," shown below) for which I am trying
> to create a flag for the first and last ID record (equivalent to SAS
> first.id and last.id variables).
>
> Dump of file1:
>
>  > file1
>    id rx week dv1
> 1   1  1    1   1
> 2   1  1    2   1
> 3   1  1    3   2
> 4   2  1    1   3
> 5   2  1    2   4
> 6   2  1    3   1
> 7   3  1    1   2
> 8   3  1    2   3
> 9   3  1    3   4
> 10  4  1    1   2
> 11  4  1    2   6
> 12  4  1    3   5
> 13  5  2    1   7
> 14  5  2    2   8
> 15  5  2    3   5
> 16  6  2    1   2
> 17  6  2    2   4
> 18  6  2    3   6
> 19  7  2    1   7
> 20  7  2    2   8
> 21  8  2    1   9
> 22  9  2    1   4
> 23  9  2    2   5
>
> I have written code that correctly assigns the first.id and last.id variables:
>
> require(Hmisc)  #for Lags
> #ascending order to define first dot
> file1<- file1[order(file1$id, file1$week),]
> file1$first.id <- (Lag(file1$id) != file1$id)
> file1$first.id[1]<-TRUE      #force NA to TRUE
>
> #descending order to define last dot
> file1<- file1[order(-file1$id,-file1$week),]
> file1$last.id  <- (Lag(file1$id) != file1$id)
> file1$last.id[1]<-TRUE       #force NA to TRUE
>
> #resort to original order
> file1<- file1[order(file1$id,file1$week),]
>
>
>
> I am now trying to get the above code to work as a function, and am
> clearly doing something wrong:
>
>  > first.last <- function (df, idvar, sortvars1, sortvars2)
> +   {
> +   #sort in ascending order to define first dot
> +   df<- df[order(sortvars1),]
> +   df$first.idvar <- (Lag(df$idvar) != df$idvar)
> +   #force first record NA to TRUE
> +   df$first.idvar[1]<-TRUE
> +
> +   #sort in descending order to define last dot
> +   df<- df[order(-sortvars2),]
> +   df$last.idvar  <- (Lag(df$idvar) != df$idvar)
> +   #force last record NA to TRUE
> +   df$last.idvar[1]<-TRUE
> +
> +   #resort to original order
> +   df<- df[order(sortvars1),]
> +   }
>  >
>
> Function call:
>
>  > first.last(df=file1, idvar=file1$id,
> sortvars1=c(file1$id,file1$week), sortvars2=c(-file1$id,-file1$week))
>
> R Error:
>
> Error in as.vector(x, mode) : invalid argument 'mode'
>  >
>
> I am not sure about the passing of the sort strings.  Perhaps this is
> where things are off.  Any help greatly appreciated.
>
> Thanks,
>
> Gerard
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
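
On the quoted first.last function itself, a few things stand out. df$idvar refers to a column literally named "idvar", not to the column passed in the idvar argument. c(file1$id, file1$week) concatenates the two columns into one vector of length 46, so order() is no longer sorting 23 rows by id and then week. And the last statement in the body is an assignment, so the result is returned invisibly. One possible way around all three is to pass column names as strings, roughly as below. This is a sketch under assumptions, not a tested replacement: the sortvars default and the toy data are invented for illustration, and it uses duplicated() rather than Lag() from Hmisc.

```r
# Sketch only: passing column names as strings is an assumption about
# how the function would be called; it is not the original interface.
first.last <- function(df, idvar, sortvars = idvar) {
    # order() wants one argument per sort key, so hand it the columns as a list
    df <- df[do.call(order, unname(as.list(df[sortvars]))), , drop = FALSE]
    # [[ ]] looks up the column whose name is stored in 'idvar'
    id <- df[[idvar]]
    df[[paste("first", idvar, sep = ".")]] <- !duplicated(id)
    df[[paste("last", idvar, sep = ".")]] <- !duplicated(id, fromLast = TRUE)
    # return the data frame visibly
    df
}

# made-up, unsorted example
toy <- data.frame(id = c(2, 1, 1, 2, 3), week = c(2, 2, 1, 1, 1))
res <- first.last(toy, idvar = "id", sortvars = c("id", "week"))
```

For the data in this thread the call would be first.last(file1, idvar = "id", sortvars = c("id", "week")), which adds columns named first.id and last.id.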


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


