[R] Comparing "transform" to "with"

Peter Dalgaard p.dalgaard at biostat.ku.dk
Sat Sep 1 22:49:45 CEST 2007


Muenchen, Robert A (Bob) wrote:
> Hi All,
>
> I've been successfully using the with function for analyses and the
> transform function for multiple transformations. Then I thought, why not
> use "with" for both? I ran into problems & couldn't figure them out from
> help files or books. So I created a simplified version of what I'm
> doing:
>
> rm( list=ls() )
> x1<-c(1,3,3)
> x2<-c(3,2,1)
> x3<-c(2,5,2)
> x4<-c(5,6,9)
> myDF<-data.frame(x1,x2,x3,x4)
> rm(x1,x2,x3,x4)
> ls()
> myDF
>
> This creates two new variables just fine:
>
> transform(myDF,
>   sum1=x1+x2,
>   sum2=x3+x4
> )
>
> This next code does not see sum1, so it appears that "transform" cannot
> see the variables that it creates. Would I need to transform new
> variables in a second pass?
>
> transform(myDF,
>   sum1=x1+x2,
>   sum2=x3+x4,
>   total=sum1+sum2
> )
>
> Next I'm trying the same thing using "with". It doesn't work but
> also does not generate error messages, giving me the impression that I'm
> doing something truly idiotic:
>
> with(myDF, {
>   sum1<-x1+x2
>   sum2<-x3+x4
>   total <- sum1+sum2
> } )
> myDF
> ls()
>
> Then I thought, perhaps one of the advantages of "transform" is that it
> works on the left side of the equation without using a longer name like
> myDF$sum1. "with" probably doesn't do that, so I use the longer form
> below. It also does not work and generates no error messages. 
>
> # Try it again, writing vars to myDF explicitly.
> # It generates no errors, and no results.
> with(myDF, {
>   myDF$sum1<-x1+x2
>   myDF$sum2<-x3+x4
>   myDF$total <- myDF$sum1+myDF$sum2
> } )
> myDF
> ls()
>
> I would appreciate some advice about the relative roles of these two
> functions & why my attempts with "with" have failed.
>   
Yes, transform() calculates all its new values, then assigns to the 
given names. This is expedient, but it has the drawback that new 
variables are not usable inside the expressions. A possible alternative 
implementation would be equivalent to a series of nested calls to 
transform, which of course you could also do manually:

transform(
  transform(myDF,
     sum1=x1+x2,
     sum2=x3+x4
  ),
  total=sum1+sum2
)
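To make the evaluation order concrete, here is a minimal sketch using the myDF from the question. The names one_pass and two_pass are just illustrative, and the tryCatch() wrapper is only there so the failing call does not stop the script. It assumes a fresh session: if objects called sum1 and sum2 already existed in the workspace, the one-pass call would quietly use those instead of failing.

```r
myDF <- data.frame(x1 = c(1, 3, 3), x2 = c(3, 2, 1),
                   x3 = c(2, 5, 2), x4 = c(5, 6, 9))

# One pass: every expression is evaluated against the original columns,
# so 'sum1' and 'sum2' do not exist yet when 'total' is computed.
# (Assumes no objects named sum1/sum2 in the calling environment.)
one_pass <- tryCatch(
  transform(myDF, sum1 = x1 + x2, sum2 = x3 + x4, total = sum1 + sum2),
  error = function(e) conditionMessage(e)  # keep the message, not the failure
)

# Two passes: the inner call adds sum1 and sum2, the outer call can see them.
two_pass <- transform(
  transform(myDF, sum1 = x1 + x2, sum2 = x3 + x4),
  total = sum1 + sum2
)

print(one_pass)
print(two_pass)
```

Note that neither call changes myDF itself; the result has to be assigned, as with two_pass above.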

The problem with with() on data frames and lists is that, like the 
"eval" family of functions, it _converts_ the object to an environment 
and then evaluates the expression in the converted environment. The 
environment is temporary, so assignments to it get lost. The same goes 
for myDF$sum1 <- ... inside with(): it modifies a copy of myDF created 
in that temporary environment, not the original. The current 
development sources have a new (experimental) function within() which is 
like with(), but stores any modified variables back. (This is very 
recent and may or may not make it to 2.6.0).
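As a rough illustration of the difference, the sketch below assumes an R version in which within() is available. The names with_within and with_value are made up for the example. It also shows a way to get by with with() alone, by using the value it returns instead of assigning inside it.

```r
myDF <- data.frame(x1 = c(1, 3, 3), x2 = c(3, 2, 1),
                   x3 = c(2, 5, 2), x4 = c(5, 6, 9))

# with(): the assignment happens in a temporary environment and is discarded.
with(myDF, {
  sum1 <- x1 + x2
})
names(myDF)  # still only x1, x2, x3, x4

# within(): later lines can use earlier ones, and a modified copy is returned.
with_within <- within(myDF, {
  sum1  <- x1 + x2
  sum2  <- x3 + x4
  total <- sum1 + sum2
})

# with() alone: compute a value and assign it from the outside.
with_value <- myDF
with_value$total <- with(myDF, x1 + x2 + x3 + x4)

print(with_within)
print(with_value)
```

As with transform(), within() does not alter myDF in place; the result has to be assigned to keep it.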

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907


