# [R] apply formula over columns by subset of rows in a dataframe (to get a new dataframe)

Massimo Bressan massimo.bressan at arpa.veneto.it
Sat May 14 10:44:54 CEST 2016

```
thank you, what a nice compact solution with ave()

I learned something new about the subtleties of R

let me summarize the alternative solutions here, in case someone else is interested...

thanks, bye

#

# my user function (an example)
mynorm <- function(x) {(x - min(x, na.rm=TRUE))/(max(x, na.rm=TRUE) - min(x, na.rm=TRUE))}

# my dataframe to apply the formula by blocks
mydf<-data.frame(blocks=rep(c("a","b","c"),each=5), v1=round(runif(15,10,25),0), v2=round(rnorm(15,30,5),0))

# blocks (factors) to be used for splitting
b <- mydf$blocks

# 1 - split-lapply-unsplit with an anonymous function to return a new df
s <- split(mydf, b)
l<- lapply(s, function(x) data.frame(x, v1mod=mynorm(x$v1)))
mydf_new <- unsplit(l, mydf$blocks)

# 2 - split-lapply-unsplit with the function transform() to return a new df
l <- split(mydf, b)
l <- lapply(l, transform, v1.mod = mynorm(v1))
mydf_new <- unsplit(l, b)

# 3 - ave() encapsulating split-lapply-unsplit approach
mydf_new<-transform(mydf, v1.mod = ave(v1, blocks, FUN=mynorm))

#

From: "William Dunlap" <wdunlap at tibco.com>
To: "Massimo Bressan" <massimo.bressan at arpa.veneto.it>
Cc: "David L Carlson" <dcarlson at tamu.edu>, "r-help" <r-help at r-project.org>
Sent: Friday, 13 May 2016 19:22:21
Subject: Re: [R] apply formula over columns by subset of rows in a dataframe (to get a new dataframe)

ave() encapsulates the split/lapply/unsplit stuff so
transform(mydf, v1.mod = ave(v1, blocks, FUN=mynorm))
also gives what you got above.

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Fri, May 13, 2016 at 7:44 AM, Massimo Bressan < massimo.bressan at arpa.veneto.it > wrote:

yes, thanks

you pointed me in the right direction: split/unsplit was the trick

I had completely overlooked that possibility!

here is the final version

############

mynorm <- function(x) {(x - min(x, na.rm=TRUE))/(max(x, na.rm=TRUE) - min(x, na.rm=TRUE))}

mydf<-data.frame(blocks=rep(c("a","b","c"),each=5), v1=round(runif(15,10,25),0), v2=round(rnorm(15,30,5),0))

g <- mydf$blocks
l <- split(mydf, g)
l <- lapply(l, transform, v1.mod = mynorm(v1))
mydf_new <- unsplit(l, g)

############

thanks again

massimo


```
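
As a follow-up to the summary in the message, here is a minimal sketch that checks whether the three approaches produce the same normalized column. It is not part of the original thread: the fixed seed, the `res1`/`res2`/`res3` and `same12`/`same23` names, the use of `v1.mod` as the column name in all three variants, and the anonymous-function wrapper around `transform()` in approach 2 (used here so that `mynorm` is found regardless of where the code is run) are all assumptions made for illustration.

```r
# Hypothetical equivalence check, not part of the original thread.
set.seed(42)  # assumption: fixed seed so the random data are reproducible

# min-max normalization, as defined in the thread
mynorm <- function(x) {
  (x - min(x, na.rm = TRUE)) / (max(x, na.rm = TRUE) - min(x, na.rm = TRUE))
}

mydf <- data.frame(
  blocks = rep(c("a", "b", "c"), each = 5),
  v1 = round(runif(15, 10, 25), 0),
  v2 = round(rnorm(15, 30, 5), 0)
)
b <- mydf$blocks

# 1 - split/lapply/unsplit with an anonymous function
res1 <- unsplit(
  lapply(split(mydf, b), function(x) data.frame(x, v1.mod = mynorm(x$v1))),
  b
)

# 2 - split/lapply/unsplit with transform(), wrapped in an anonymous function
res2 <- unsplit(
  lapply(split(mydf, b), function(d) transform(d, v1.mod = mynorm(v1))),
  b
)

# 3 - ave(), which does the split/lapply/unsplit internally
res3 <- transform(mydf, v1.mod = ave(v1, blocks, FUN = mynorm))

# compare the normalized column across the three results
same12 <- isTRUE(all.equal(res1$v1.mod, res2$v1.mod))
same23 <- isTRUE(all.equal(res2$v1.mod, res3$v1.mod))
```

Note that `mynorm` divides by the range of each block, so a block whose values are all identical would yield `NaN`; the thread does not address that case, and this sketch does not guard against it either.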