[R] repeated searching of no-missing values

Patrizio Frederic frederic.patrizio at gmail.com
Wed Dec 10 23:09:06 CET 2008


Hi all,
I have a data frame such as:

1 blue  0.3
1 NA    0.4
1 red   NA
2 blue  NA
2 green NA
2 blue  NA
3 red   0.5
3 blue  NA
3 NA    1.1

Within each group of three rows (defined by the first column), I wish
to find the last non-missing value of every column, i.e. I want a 3 by 3
data.frame such as:

1 red   0.4
2 blue  NA
3 blue  1.1

I have written a little script:

data = structure(list(V1 = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L
), V2 = structure(c(1L, NA, 3L, 1L, 2L, 1L, 3L, 1L, NA), .Label = c("blue",
"green", "red"), class = "factor"), V3 = c(0.3, 0.4, NA, NA,
NA, NA, 0.5, NA, 1.1)), .Names = c("V1", "V2", "V3"), class =
"data.frame", row.names = c(NA,
-9L))

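# last non-missing element of x
cl          = function(x) x[max(which(!is.na(x)))]
# apply cl within each group; groups come from the first column of data
# (x is a single column here, so it cannot be indexed as x[,1])
choose.last = function(x) tapply(x, data[,1], cl)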

# now function choose.last works properly on numeric vectors:

> choose.last(data[,3])
  1   2   3
0.4  NA 1.1

# but not on factors (I lose the factor labels):

> choose.last(data[,2])
1 2 3
3 1 1
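
A minimal sketch of one way the labels might be kept for a single factor: let tapply return row positions rather than values, then subset the factor with those positions, since subsetting a factor preserves its levels. The vectors g and f below simply copy V1 and V2 from the data above, and last.index is a made-up helper name, not an existing R function.

```r
# Toy copies of columns V1 and V2 from the post
g <- c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L)
f <- factor(c("blue", NA, "red", "blue", "green", "blue", "red", "blue", NA))

# last.index (hypothetical helper): position of the last non-missing
# element of x within each group, or NA if the whole group is missing
last.index <- function(x, group) {
  idx <- seq_along(x)
  idx[is.na(x)] <- NA                      # hide positions of missing values
  tapply(idx, group, function(i) {
    if (all(is.na(i))) NA_integer_ else max(i, na.rm = TRUE)
  })
}

pos <- last.index(f, g)
res <- f[as.vector(pos)]                   # still a factor, levels unchanged
res
```

Working with positions also avoids calling max() on an empty set when a group is entirely missing.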

# moreover, if I apply this function to the whole data.frame
# the output is a character matrix

> apply(data,2,choose.last)
  V1  V2     V3
1 "1" "red"  "0.4"
2 "2" "blue" NA
3 "3" "blue" "1.1"

# and if I use sapply, I lose the factor labels

> sapply(data,choose.last)
  V1 V2  V3
1  1  3 0.4
2  2  1  NA
3  3  1 1.1
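
For the whole data.frame, one possible direction (a sketch only, assuming the groups are given by V1) is to split the data by group, reduce each column with lapply so that every column keeps its own type, and stack the one-row pieces again. The data are rebuilt here so the snippet runs on its own; last.obs is a made-up helper name.

```r
# Same data as in the post, rebuilt so the example is self-contained
data <- data.frame(
  V1 = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L),
  V2 = factor(c("blue", NA, "red", "blue", "green", "blue", "red", "blue", NA)),
  V3 = c(0.3, 0.4, NA, NA, NA, NA, 0.5, NA, 1.1)
)

# last.obs (hypothetical helper): last non-missing element of x,
# or a single NA of the same type if everything is missing
last.obs <- function(x) {
  ok <- which(!is.na(x))
  if (length(ok)) x[ok[length(ok)]] else x[NA_integer_]
}

# One piece per group; lapply over columns avoids the coercion to a
# character matrix that apply() performs
pieces <- lapply(split(data, data$V1),
                 function(d) as.data.frame(lapply(d, last.obs)))
result <- do.call(rbind, pieces)
result
str(result)   # V2 should still be a factor with its labels
```

The difference from apply(data, 2, ...) is that apply first converts the data.frame to a matrix, which forces a single type on all columns, whereas lapply treats each column separately.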

Any hints?

Thanks in advance,

Patrizio

+-------------------------------------------------
| Patrizio Frederic, PhD
| Research associate in Statistics,
| Department of Economics,
| University of Modena and Reggio Emilia,
| Via Berengario 51,
| 41100 Modena, Italy
|
| tel:  +39 059 205 6727
| fax:  +39 059 205 6947
| mail: patrizio.frederic at unimore.it
+-------------------------------------------------


