[R] aggregate

Gang Chen gangchen6 at gmail.com
Wed Aug 24 18:55:47 CEST 2016


Thanks a lot, David! I want to further expand the operation a little
bit. With a new dataframe:

myData <- data.frame(X=c(1, 2, 3, 4, 5, 6, 7, 8), Y=c(8, 7, 6, 5, 4,
3, 2, 1), S=c('S1', 'S1', 'S1', 'S1', 'S2', 'S2', 'S2', 'S2'),
Z=c('A', 'A', 'B', 'B', 'A', 'A', 'B', 'B'))

> myData

  X Y  S Z
1 1 8 S1 A
2 2 7 S1 A
3 3 6 S1 B
4 4 5 S1 B
5 5 4 S2 A
6 6 3 S2 A
7 7 2 S2 B
8 8 1 S2 B

I would like to obtain the same cross product between columns X and Y,
but for each combination of the levels of factors S and Z. In other
words, the cross product would still be computed over each pair of
rows in the new dataframe myData. How can I achieve that?
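
One possible way to do this, offered as a sketch rather than as an answer from the thread: split() accepts a list of grouping variables, so the data frame can be split on S and Z together and crossprod() applied to each piece. The names groups and result are illustrative, and drop = TRUE is assumed to be wanted so that S/Z combinations with no rows are skipped.

```r
myData <- data.frame(
  X = c(1, 2, 3, 4, 5, 6, 7, 8),
  Y = c(8, 7, 6, 5, 4, 3, 2, 1),
  S = c('S1', 'S1', 'S1', 'S1', 'S2', 'S2', 'S2', 'S2'),
  Z = c('A', 'A', 'B', 'B', 'A', 'A', 'B', 'B')
)

# Split on both grouping columns at once; drop = TRUE discards
# S/Z combinations that have no rows.
groups <- split(myData, list(myData$S, myData$Z), drop = TRUE)

# One row per group: the group labels plus the X-Y cross product.
# crossprod() returns a 1 x 1 matrix, so as.numeric() turns it
# into a plain number.
result <- do.call(rbind, lapply(groups, function(g) {
  data.frame(S = g$S[1], Z = g$Z[1],
             CP = as.numeric(crossprod(g$X, g$Y)))
}))
rownames(result) <- NULL
result
```

For the S1/A group this gives 1*8 + 2*7 = 22. The sketch refers to columns by name rather than by position, and it does not depend on S and Z being factors, so it should behave the same whether or not data.frame() converts strings to factors.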

On Wed, Aug 24, 2016 at 11:54 AM, David L Carlson <dcarlson at tamu.edu> wrote:
> Yours is fine, but it will be a little simpler if you use sapply() instead:
>
>> data.frame(Z=levels(myData$Z), CP=sapply(split(myData, myData$Z),
> +     function(x) crossprod(x[, 1], x[, 2])))
>   Z CP
> A A 10
> B B 10
>
> David C
>
>
> -----Original Message-----
> From: Gang Chen [mailto:gangchen6 at gmail.com]
> Sent: Wednesday, August 24, 2016 10:17 AM
> To: David L Carlson
> Cc: Jim Lemon; r-help mailing list
> Subject: Re: [R] aggregate
>
> Thank you all for the suggestions! Yes, I'm looking for the cross
> product between the two columns of X and Y.
>
> A follow-up question: what is a nice way to merge the output of
>
> lapply(split(myData, myData$Z), function(x) crossprod(x[, 1], x[, 2]))
>
> with the column Z in myData so that I would get a new dataframe as the
> following (the 2nd column is the cross product between X and Y)?
>
> Z   CP
> A   10
> B   10
>
> Is the following legitimate?
>
> data.frame(Z=levels(myData$Z), CP= unlist(lapply(split(myData,
> myData$Z), function(x) crossprod(x[, 1], x[, 2]))))
>
>
> On Wed, Aug 24, 2016 at 10:37 AM, David L Carlson <dcarlson at tamu.edu> wrote:
>> Thank you for the reproducible example, but it is not clear what cross product you want. Jim's solution gives you the cross product of the 2-column matrix with itself. If you want the cross product between the columns you need something else. The aggregate function will not work since it will treat the columns separately:
>>
>>> A <- as.matrix(myData[myData$Z=="A", 1:2])
>>> A
>>   X Y
>> 1 1 4
>> 2 2 3
>>> crossprod(A) # Same as t(A) %*% A
>>    X  Y
>> X  5 10
>> Y 10 25
>>> crossprod(A[, 1], A[, 2]) # Same as t(A[, 1]) %*% A[, 2]
>>      [,1]
>> [1,]   10
>>>
>>> # For all the groups
>>> lapply(split(myData, myData$Z), function(x) crossprod(as.matrix(x[, 1:2])))
>> $A
>>    X  Y
>> X  5 10
>> Y 10 25
>>
>> $B
>>    X  Y
>> X 25 10
>> Y 10  5
>>
>>> lapply(split(myData, myData$Z), function(x) crossprod(x[, 1], x[, 2]))
>> $A
>>      [,1]
>> [1,]   10
>>
>> $B
>>      [,1]
>> [1,]   10
>>
>> -------------------------------------
>> David L Carlson
>> Department of Anthropology
>> Texas A&M University
>> College Station, TX 77840-4352
>>
>>
>> -----Original Message-----
>> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Jim Lemon
>> Sent: Tuesday, August 23, 2016 6:02 PM
>> To: Gang Chen; r-help mailing list
>> Subject: Re: [R] aggregate
>>
>> Hi Gang Chen,
>> If I have the right idea:
>>
>> for(zval in levels(myData$Z))
>>   print(crossprod(as.matrix(myData[myData$Z==zval,c("X","Y")])))
>>
>> Jim
>>
>> On Wed, Aug 24, 2016 at 8:03 AM, Gang Chen <gangchen6 at gmail.com> wrote:
>>> This is a simple question: With a dataframe like the following
>>>
>>> myData <- data.frame(X=c(1, 2, 3, 4), Y=c(4, 3, 2, 1), Z=c('A', 'A', 'B', 'B'))
>>>
>>> how can I get the cross product between X and Y for each level of
>>> factor Z? My difficulty is that I don't know how to deal with the fact
>>> that crossprod() acts on two variables in this case.
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>


