[R] Computing stats on common parts of multiple dataframes

Tue Feb 13 22:23:44 CET 2007

Suppose our data frames are called DF1, DF2 and DF3.  Then
find the least number of rows, n, among them.  Create a
list, DFs, of the last n rows of the data frames and another
list, mats, which is the same but in which each component is a
matrix.  Create a parallel median function, pmedian, analogous
to pmax and mapply it to the matrices.  Finally replace that
back into a data frame.

n <- min(sapply(L, nrow))
DFs <- lapply(list(DF1, DF2, DF3), tail, n)
mats <- lapply(DFs, as.matrix)
pmedian <- function(...) median(c(...))
medians <- do.call(mapply, c(pmedian, mats))
replace(DFs[[1]], TRUE, medians)

On 2/13/07, Murali Menon <feanor0 at hotmail.com> wrote:
> Folks,
>
> I have three dataframes storing some information about
> two currency pairs, as follows:
>
> R> a
>
> EUR-USD NOK-SEK
> 1.23    1.33
> 1.22    1.43
> 1.26    1.42
> 1.24    1.50
> 1.21    1.36
> 1.26    1.60
> 1.29    1.44
> 1.25    1.36
> 1.27    1.39
> 1.23    1.48
> 1.22    1.26
> 1.24    1.29
> 1.27    1.57
> 1.21    1.55
> 1.23    1.35
> 1.25    1.41
> 1.25    1.30
> 1.23    1.11
> 1.28    1.37
> 1.27    1.23
>
>
>
> R> b
> EUR-USD NOK-SEK
> 1.23    1.22
> 1.21    1.36
> 1.28    1.61
> 1.23    1.34
> 1.21    1.22
>
>
>
> R> d
>
> EUR-USD NOK-SEK
> 1.27    1.39
> 1.23    1.48
> 1.22    1.26
> 1.24    1.29
> 1.27    1.57
> 1.21    1.55
> 1.23    1.35
> 1.25    1.41
> 1.25    1.33
> 1.23    1.11
> 1.28    1.37
> 1.27    1.23
>
> The twist is that these entries correspond to dates where the
> *last* rows in each frame are today's entries, and so on
> backwards in time.
>
> I would like to create a matrix of medians (a median for each row
> and for each currency pair), but only for those rows where all
> dataframes have entries.
>
> My answer in this case should look like:
>
> EUR-USD NOK-SEK
>
> 1.25    1.41
> 1.25    1.33
> 1.23    1.11
> 1.28    1.37
> 1.27    1.23
>
> where the last EUR-USD entry = median(1.27, 1.21, 1.27), etc.
>
> Notice that the output is of the same dimensions as the smallest dataframe
> (in this case 'b').
>
> I can do it in a clumsy fashion by first obtaining the number
> of rows in the smallest matrix, chopping off the top rows
> of the other matrices to reduce them this size, then doing a
> for-loop across each currency pair, row-wise, to create a
> 3-vector which I then apply median() on.
>
> Surely there's a better way to do this?
>
> Please advise.
>
> Thanks,
>
> Murali Menon
>
> _________________________________________________________________
> Valentine's Day -- Shop for gifts that spell L-O-V-E at MSN Shopping
>
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>