[R] Mean-Centering Question

arun smartpink111 at yahoo.com
Sun Dec 9 04:58:41 CET 2012


Hi,

It works for me also:
by(dat1[c("Units","AveragePrice")],dat1[,1],specialFunction)
#dat1[, 1]: Los Angeles
#      Units AveragePrice
#1  0.2136827  0.071790268
#2  2.2735148 -2.351758623
#3 -0.2083118  0.001082696
#----------------------------------------------
#or

by(cbind(Units=dat1[,3],AveragePrice=dat1[,4]),dat1[,1],specialFunction)
#INDICES: Los Angeles
#      Units AveragePrice
#1  0.2136827  0.071790268
#2  2.2735148 -2.351758623
#3 -0.2083118  0.001082696
#--------------------------------------------
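
Note that both calls run without error, but the row 2 values in the output (2.2735148 and -2.351758623) differ from the expected results listed in the original message (-0.005370907 and -0.072872965). The likely cause is vector recycling: colMeans() returns one mean per column, and subtracting that length-2 vector from a 3 x 2 data frame recycles it down each column, so row 2 of "Units" has the "AveragePrice" mean subtracted and vice versa. Below is a minimal sketch of one way around this, assuming the data are read into a data frame named dat1 as above. The function name centerLogs is hypothetical and not part of the original thread; it uses sweep() so that each column mean is subtracted from its own column only.

```r
# Sample data from the original question.
dat1 <- read.csv(text = "Location,TimePeriod,Units,AveragePrice
Los Angeles,5/1/11,61,5.42
Los Angeles,5/8/11,49,4.69
Los Angeles,5/15/11,40,5.05
New York,5/1/11,259,6.4
New York,5/8/11,187,5.3
New York,5/15/11,177,5.7
Paris,5/1/11,672,6.26
Paris,5/8/11,514,5.3
Paris,5/15/11,455,5.2", stringsAsFactors = FALSE)

# Hypothetical replacement for specialFunction: take logs, then subtract
# each column's mean from that column only (MARGIN = 2 means "by column").
centerLogs <- function(x) {
  lx <- log(as.matrix(x))
  sweep(lx, 2, colMeans(lx, na.rm = TRUE), FUN = "-")
}

# Apply the centering separately within each Location group.
res <- by(dat1[c("Units", "AveragePrice")], dat1$Location, centerLogs)

# Mean-centered logs for the Los Angeles group.
la <- res[["Los Angeles"]]
round(la, 6)
```

With this version, row 2 of the Los Angeles group should come out near -0.005371 for "Units" and -0.072873 for "AveragePrice", in line with the expected results in the original message. scale(log(x), center = TRUE, scale = FALSE) would be another way to get the same centering.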

A.K.






----- Original Message -----
From: "Ray DiGiacomo, Jr." <rayd at liondatasystems.com>
To: R Help <r-help at r-project.org>
Cc: 
Sent: Saturday, December 8, 2012 6:54 PM
Subject: [R] Mean-Centering Question

Hello,

I'm trying to create a custom function that "mean-centers" data and can be
applied across many columns.

Here is an example dataset, which is similar to my dataset:

Location,TimePeriod,Units,AveragePrice
Los Angeles,5/1/11,61,5.42
Los Angeles,5/8/11,49,4.69
Los Angeles,5/15/11,40,5.05
New York,5/1/11,259,6.4
New York,5/8/11,187,5.3
New York,5/15/11,177,5.7
Paris,5/1/11,672,6.26
Paris,5/8/11,514,5.3
Paris,5/15/11,455,5.2

I want to mean-center the "Units" and "AveragePrice" columns.

So, I created this function:

specialFunction <- function(x){ log(x) - colMeans(log(x), na.rm = TRUE) }

If I use only "one" column in the first argument of the "by" function,
everything is fine.  For example, the following code works:

by(data[c("Units")],
data["Location"],
specialFunction)

But the following code will "not" work, because I have "two" columns in the
first argument...

by(data[c("Units", "AveragePrice")],
data["Location"],
specialFunction)

Does anyone have any ideas as to what I am doing wrong?

Please note that I'm trying to get the following results (for the "Los
Angeles" group):

Los Angeles "Units" variable (Mean-Centered)
0.213682659
-0.005370907
-0.208311751

Los Angeles "AveragePrice" variable (Mean-Centered)
0.071790268
-0.072872965
0.001082696

Best Regards,

Ray DiGiacomo, Jr.
Healthcare Predictive Analytics Specialist
President, Lion Data Systems LLC
President, The Orange County R User Group
Board Member, TDWI
rayd at liondatasystems.com
(m) 408-425-7851
San Juan Capistrano, California USA
twitter.com/liondatasystems
linkedin.com/in/raydigiacomojr
youtube.com/user/liondatasystems/videos
