[R] calculate row median of every three columns for a dataframe

PIKAL Petr
Fri Apr 17 08:53:43 CEST 2020


Hi

As usual in R, things can be done in different ways.

idx <- (0:(ncol(dfr)-1))%/%3

aggregate(t(dfr), list(idx), median)
  Group.1 V1 V2 V3
1       0  2  3  4
2       1  4  5  1

The results should be correct, although the structure is different (one row per group of columns rather than one column per group). Performance is not tested.
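
If the same layout as the sapply() result is wanted (rows of dfr as rows, column groups as columns), a minimal sketch would be to drop the grouping column and transpose. This assumes the example dfr from David's message quoted below; the names agg and res are only illustrative, and this too is not benchmarked.

```r
# Example data copied from David's message quoted below.
dfr <- data.frame(a = c(2,3,4), b = c(3,5,1), c = c(1,3,6),
                  d = c(7,2,1), e = c(2,5,3), f = c(4,5,1))

# Group index: 0 for columns 1-3, 1 for columns 4-6, and so on.
idx <- (0:(ncol(dfr) - 1)) %/% 3

# One row per column group, one column per original row of dfr.
agg <- aggregate(t(dfr), list(idx), median)

# Drop the Group.1 column and transpose, so that rows correspond to
# the rows of dfr and columns to the three-column groups.
res <- t(as.matrix(agg[, -1]))
dimnames(res) <- NULL
res
#      [,1] [,2]
# [1,]    2    4
# [2,]    3    5
# [3,]    4    1
```

Because idx is built from the column positions, this should also work when ncol(dfr) is not a multiple of 3; the last group is then simply narrower.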

Cheers
Petr

> -----Original Message-----
> From: R-help <r-help-bounces using r-project.org> On Behalf Of David McPearson
> Sent: Friday, April 17, 2020 7:50 AM
> To: r-help using r-project.org
> Cc: dcmcp using telstra.com
> Subject: Re: [R] calculate row median of every three columns for a dataframe
> 
> Anna wrote:
> >
> > Hi all,
> > I need to calculate a row median for every three columns of a
> > dataframe.  I made it work using the following script, but not happy
> > with the script.  Is there a simpler way for doing this?
> >
> 
> 
> 
> To which Jim L responded:
> >
> > Hi Anna,
> > I can't think of a simple way, but this function may make you happier:
> >
> > step_median<-function(x,window) {
> > x<-unlist(x)
> > stop<-length(x)-window+1
> > xout<-NA
> > nindx<-1
> > for(i in seq(1,stop,by=window)) {
> > xout[nindx]<-do.call("median",list(x[i:(i+window-1)]))
> > nindx<-nindx+1
> > }
> > return(xout)
> > }
> > apply(df,1,step_median,3)
> >
> > This should return a matrix where the columns are the medians
> > calculated from blocks of "window" width on each row of "df". As Bert
> > noted, you may want to think about a "rolling" median where the
> > "windows" overlap. This can be done like so:
> >
> > library(zoo)
> > apply(df,1,rollmedian,3)
> >
> > Jim
> 
> Another approach you might try is multiple calls to sapply/lapply. This won't
> rid you of loops, but it will hide them:
> 
> # Example data. Some names changed to avoid collisions between
> # R functions (collisions are in the gap between the headphones,
> # not in R).
> 
> dfr <- data.frame(a = c(2,3,4), b = c(3,5,1), c = c(1,3,6),
>    d = c(7,2,1), e = c(2,5,3), f = c(4,5,1))
> 
> # Turn each of the three-column groups into their own element
> # in a list. Note: the subsetting (probably) fails with an
> # error if ncol(dfr) is not a multiple of 3
> 
>   dlist <- lapply(seq(1, ncol(dfr), by = 3), function(enn)
>    dfr[ , enn + 0:2])
> 
> # Then you can use sapply to calculate the row medians for each
> # of the elements.
> 
> # Both of the following seem to work. I'm not sure which is
> # more readable...
> 
>   sapply(dlist, function(xx) apply(xx, 1, median))
> 
>   sapply(dlist, apply, 1, median)
> 
> # I'm sure the cognoscenti will have a much more elegant way
> # of doing this.
> 
> 
> Cheers y'all,
> DMcP

