Eric Berger
Fri Apr 17 11:36:46 CEST 2020
Some comments on the contributions:
a) for Petr's suggestion, to return the desired structure modify the
statement to
t(aggregate(t(dfr), list(idx), median)[,-1])
Although less readable, this can also be written as a one-liner by
inlining the idx definition:
t(aggregate(t(dfr), list((0:(ncol(dfr)-1))%/%3), median)[,-1])
b) to DMcP: "# I'm sure the cognoscenti will have a much more elegant way"
+1 for elegance (in my view)
c) to Jim: I think your code is instructive. From a style viewpoint I would
recommend against naming a local variable 'stop' :-)
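
For what it's worth, here is a minimal sketch of point a) run against the
example data frame dfr from David's message quoted below. The names res,
res2 and chk are only illustrative, and chk is an independent cross-check
added here, not part of Petr's suggestion.

```r
# Example data copied from David's message quoted below
dfr <- data.frame(a = c(2,3,4), b = c(3,5,1), c = c(1,3,6),
                  d = c(7,2,1), e = c(2,5,3), f = c(4,5,1))

# Group index: columns 1-3 map to 0, columns 4-6 map to 1
idx <- (0:(ncol(dfr) - 1)) %/% 3

# Petr's aggregate() call, with the Group.1 column dropped and the
# result transposed back so that rows correspond to rows of dfr
res <- t(aggregate(t(dfr), list(idx), median)[, -1])

# The same computation with the idx definition inlined
res2 <- t(aggregate(t(dfr), list((0:(ncol(dfr) - 1)) %/% 3), median)[, -1])

# Cross-check: row medians of each three-column block, computed directly
chk <- cbind(apply(dfr[, 1:3], 1, median),
             apply(dfr[, 4:6], 1, median))

print(res)
```

Both forms should give a 3 x 2 matrix with one row per row of dfr and one
column per three-column block.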
Best,
Eric
On Fri, Apr 17, 2020 at 9:54 AM PIKAL Petr <petr.pikal using precheza.cz> wrote:
> Hi
>
> As usual in R, things can be done in different ways.
>
> idx <- (0:(ncol(dfr)-1))%/%3
>
> aggregate(t(dfr), list(idx), median)
>   Group.1 V1 V2 V3
> 1       0  2  3  4
> 2       1  4  5  1
>
> The results should be OK, although the structure is different;
> performance is not tested.
>
> Cheers
> Petr
>
> > -----Original Message-----
> > From: R-help <r-help-bounces using r-project.org> On Behalf Of David McPearson
> > Sent: Friday, April 17, 2020 7:50 AM
> > To: r-help using r-project.org
> > Cc: dcmcp using telstra.com
> > Subject: Re: [R] calculate row median of every three columns for a dataframe
> >
> > Anna wrote:
> > >
> > > Hi all,
> > > I need to calculate a row median for every three columns of a
> > > dataframe. I made it work using the following script, but not happy
> > > with the script. Is there a simpler way for doing this?
> > >
> >
> >
> >
> > To which Jim L responded:
> > >
> > > Hi Anna,
> > > I can't think of a simple way, but this function may make you happier:
> > >
> > > > step_median<-function(x,window) {
> > > >   x<-unlist(x)
> > > >   stop<-length(x)-window+1
> > > >   xout<-NA
> > > >   nindx<-1
> > > >   for(i in seq(1,stop,by=window)) {
> > > >     xout[nindx]<-do.call("median",list(x[i:(i+window-1)]))
> > > >     nindx<-nindx+1
> > > >   }
> > > >   return(xout)
> > > > }
> > > > apply(df,1,step_median,3)
> > >
> > > This should return a matrix where the columns are the medians
> > > calculated from blocks of "window" width on each row of "df". As Bert
> > > noted, you may want to think about a "rolling" median where the
> > > "windows" overlap. This can be done like so:
> > >
> > > library(zoo)
> > > apply(df,1,rollmedian,3)
> > >
> > > Jim
> >
> > Another approach you might try is multiple calls to sapply/lapply. This
> > won't rid you of loops, but it will hide them:
> >
> > # Example data. Some names changed to avoid collisions between
> > # R functions (collisions are in the gap between the headphones,
> > # not in R).
> >
> > dfr <- data.frame(a = c(2,3,4), b = c(3,5,1), c = c(1,3,6),
> >                   d = c(7,2,1), e = c(2,5,3), f = c(4,5,1))
> >
> > # Turn each of the three-column groups into their own element
> > # in a list. Note: the subsetting (probably) fails with an
> > # error if ncol(dfr) is not a multiple of 3
> >
> > dlist <- lapply(seq(1, ncol(dfr), by = 3), function(enn)
> >                 dfr[ , enn + 0:2])
> >
> > # Then you can use sapply to calculate the row medians for each
> > # of the elements.
> >
> > # Both of the following seem to work. I'm not sure which is
> > # more readable…
> >
> > sapply(dlist, function(xx) apply(xx, 1, median))
> >
> > sapply(dlist, apply, 1, median)
> >
> > # I'm sure the cognoscenti will have a much more elegant way
> > # of doing this.
> >
> >
> > Cheers y'all,
> > DMcP
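
A possible variation on the sapply approach above, for the case David flags
where ncol(dfr) is not a multiple of 3: the sketch below groups column
indices with the same integer-division trick Petr used, so a short final
block is simply summarised on its own. The extra column g and the names
dfr7, grp and cols are made up for illustration; this is one way to handle
the ragged case, not something proposed in the thread.

```r
# Example data from David's message
dfr <- data.frame(a = c(2,3,4), b = c(3,5,1), c = c(1,3,6),
                  d = c(7,2,1), e = c(2,5,3), f = c(4,5,1))

# Hypothetical seventh column, so ncol is no longer a multiple of 3
dfr7 <- cbind(dfr, g = c(9, 8, 7))

# Block index per column: 0 0 0 1 1 1 2
grp <- (seq_len(ncol(dfr7)) - 1) %/% 3

# List of column positions per block; the last block holds one column
cols <- split(seq_len(ncol(dfr7)), grp)

# Row medians per block; drop = FALSE keeps a one-column block a data frame
res <- sapply(cols, function(j) apply(dfr7[, j, drop = FALSE], 1, median))

print(res)
```

With a single column in the last block, the "median" is just that column's
value, which may or may not be what you want; dropping incomplete blocks
before the sapply call would be the alternative.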
