[R] correlation by factor

Rui Barradas ru|pb@rr@d@@ @end|ng |rom @@po@pt
Wed Jul 28 07:51:52 CEST 2021


Hello,

And here are three more ways. I will put the data, corrected in Bert's 
post, in a data.frame.


R <- c(1,8,3,6,7,2,3,7,2,3,3,4,3,7,3)
Day <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3)
Freq <- paste0("a", rep(1:5,3))
df1 <- data.frame(R, Day, Freq)

# Base R; for the anonymous function shortcut \(x), see Bert's post (needs R >= 4.1)
sapply(split(df1[-3], df1$Freq), \(x) cor(x)[1,2])
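
For R versions older than 4.1, where the \(x) shortcut is not available, the same split/sapply idea can be written with the classic function() syntax. This is only a minimal sketch: it rebuilds the example data so that it runs on its own, and the names d and res_old are placeholders of mine, not from the posts above. It should give the same values as the output in Bert's post.

```r
# Example data from this thread, rebuilt so the block is self-contained
df1 <- data.frame(
  R    = c(1,8,3,6,7,2,3,7,2,3,3,4,3,7,3),
  Day  = c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3),
  Freq = paste0("a", rep(1:5, 3))
)

# Split the rows by Freq, then correlate R and Day within each piece.
# function(d) is the pre-4.1 spelling of \(d).
res_old <- sapply(split(df1, df1$Freq), function(d) cor(d$R, d$Day))
res_old
```

Calling cor() on the two columns directly returns a single number per group, so there is no need to pick the off-diagonal element out of a 2x2 matrix.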


# tidyverse
library(dplyr)
df1 %>%
   group_by(Freq) %>%
   summarise(Cor = cor(R, Day))


# data.table
library(data.table)
setDT(df1)[, .(Cor = cor(R, Day)), by = Freq]
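

The original question mentions hundreds of data sets with the same three columns. One possible way to handle that is to wrap the per-factor correlation in a small function and apply it to a list of data frames. This is only a sketch under assumptions: the helper name cor_by_freq and the list df_list are hypothetical, and two copies of the example data stand in for the real data sets, which would have to be read into a list first.

```r
# Hypothetical helper: correlation of R and Day within each level of Freq
cor_by_freq <- function(d) {
  sapply(split(d, d$Freq), function(g) cor(g$R, g$Day))
}

# Example data from this thread, rebuilt so the block is self-contained
df1 <- data.frame(
  R    = c(1,8,3,6,7,2,3,7,2,3,3,4,3,7,3),
  Day  = c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3),
  Freq = paste0("a", rep(1:5, 3))
)

# Assumption: the data sets are held in a named list;
# here two copies of df1 stand in for the real ones
df_list <- list(set1 = df1, set2 = df1)

# Result: a matrix with one row per Freq level and one column per data set
res <- sapply(df_list, cor_by_freq)
res
```

Any of the three versions above could go inside the helper; base R is used here only to avoid extra package dependencies.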


Hope this helps,

Rui Barradas


Às 03:30 de 28/07/21, Bert Gunter escreveu:
> Well, first of all, your example is messed up. You missed the "c" in front
> of the ( in Freq <-; and all of the Freq entries need to be enclosed in
> quotes for proper syntax. A simpler way to do it is just to use paste0() and
> rep():
> 
> Freq <- paste0("a", rep(1:5,3))
> (If you are not familiar with such "utility" functions, you should consider
> spending time with a basic R tutorial or two.)
> 
> Ordinarily, your individual vectors, R, Day and Freq, would be in a data
> frame or similar (e.g. a tibble or data.table) structure and you would use
> functions like by() in base R; or "tidyverse" or "data.table" package
> equivalents/elaborations of these.
> 
> Here is a base R version (you must have version 4.1.x for the anonymous
> function shortcut, \(x)  ) using by, but you may prefer tidyverse or
> data.table versions that others may  provide:
> 
>> ## to just get the off-diagonal of the 2x2 cor matrix
>> out <- by(cbind(R,Day), factor(Freq), FUN = \(x)cor(x)[1,2])
>> as.list(out)
> $a1
> [1] 1
> 
> $a2
> [1] -0.7559289
> 
> $a3
> [1] 0
> 
> $a4
> [1] 0.1889822
> 
> $a5
> [1] -0.8660254
> 
> See ?by and ?cor for details as needed.
> 
> 
> Bert Gunter
> 
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> 
> 
> On Tue, Jul 27, 2021 at 5:30 PM Marlin Keith Cox <marlinkcox using gmail.com>
> wrote:
> 
>> I am having problems making a correlation/association between two variables
>> by a factor.
>>
>> In the case below, I need to know the correlation between R and Day at each
>> frequency (a1-a5). Each frequency would have a corresponding correlation
>> between R and day.
>>
>> I have found an lm() call that is similar to what I need,
>> lm(R~Day*Freq), but this won't apply to the cor function.
>>
>> Mind you, I have hundreds of these with these same three columns, so if
>> there is an association package, I would be interested in that too.  I did
>> research it, but it quickly went over my head, so I thought I would
>> approach my problem this way.
>>
>> Data is below.
>>
>> Keith
>>
>> R<-c(1,8,3,6,7,2,3,7,2,3,3,4,3,7,3)
>> Day<-c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3)
>> Freq<-(a1,a2,a3,a4,a5,a1,a2,a3,a4,a5,a1,a2,a3,a4,a5,)
>>
>>
>>
>> M. Keith Cox, Ph.D.
>> Principal
>> MKConsulting
>> 17415 Christine Ave.
>> Juneau, AK 99801
>> U.S. 907.957.4606
>>
>>
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>


