[R] N Sizes between Pairs of Columns using cor(, , , use = 'pairwise')

Doran, Harold HDoran using air.org
Tue Jan 21 21:06:31 CET 2020


Now that’s brilliant! And to get a vector of counts I could extend it to

rr <- crossprod(!is.na(tmp))

rr[lower.tri(rr)]

Thanks, Bill!
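
For readers who want to know which count belongs to which pair, here is a minimal sketch that labels the vector. The seed and the names idx and n_pairs are assumptions made for illustration; they are not part of the thread.

```r
set.seed(1)  # assumed seed, only so the example is reproducible
tmp <- data.frame(v1 = rnorm(10), v2 = rnorm(10), v3 = rnorm(10), v4 = rnorm(10))
for (i in 1:4) tmp[sample(1:10, 2, replace = FALSE), i] <- NA

# Pairwise counts of rows where both columns are non-missing
rr <- crossprod(!is.na(tmp))

# Row/column positions of the lower triangle, in the same (column-major)
# order in which rr[lower.tri(rr)] returns the counts
idx <- which(lower.tri(rr), arr.ind = TRUE)

n_pairs <- data.frame(
  var1 = colnames(rr)[idx[, "col"]],
  var2 = rownames(rr)[idx[, "row"]],
  n    = rr[lower.tri(rr)],
  stringsAsFactors = FALSE
)
n_pairs
```

The diagonal of rr, which the lower triangle drops, holds the number of non-missing values in each single column.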

From: William Dunlap <wdunlap using tibco.com>
Sent: Tuesday, January 21, 2020 3:00 PM
To: Doran, Harold <HDoran using air.org>
Cc: r-help using r-project.org
Subject: Re: [R] N Sizes between Pairs of Columns using cor(, , , use = 'pairwise')

crossprod(!is.na(tmp))
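
A short sketch of why this one-liner gives the pairwise N sizes: !is.na(tmp) is a logical matrix that is TRUE where a value is present, and crossprod() computes t(x) %*% x, so entry (i, j) sums the products of the indicators for columns i and j, which is the number of rows where both are observed. The seed and the names obs, n_mat and n_12 below are assumptions for illustration only.

```r
set.seed(42)  # assumed seed, only so the example is reproducible
tmp <- data.frame(v1 = rnorm(10), v2 = rnorm(10), v3 = rnorm(10), v4 = rnorm(10))
for (i in 1:4) tmp[sample(1:10, 2, replace = FALSE), i] <- NA

obs   <- !is.na(tmp)     # logical matrix: TRUE where a value is present
n_mat <- crossprod(obs)  # symmetric 4 x 4 matrix of pairwise counts

# Direct count for one pair: rows where both v1 and v2 are observed
n_12 <- sum(obs[, "v1"] & obs[, "v2"])

n_mat
n_mat["v1", "v2"] == n_12
```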

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Tue, Jan 21, 2020 at 11:56 AM Doran, Harold <HDoran using air.org> wrote:
I'm trying to find an efficient way to find the N size on correlations produced when using the pairwise option in cor().

Here is a sample to illustrate:

### Create a sample data frame
tmp <- data.frame(v1 = rnorm(10), v2 = rnorm(10), v3 = rnorm(10), v4 = rnorm(10))

### Create some random missingness
for(i in 1:4) tmp[sample(1:10, 2, replace = FALSE), i] <- NA

### Correlate
cor(tmp, use = 'pairwise')

Now, a REALLY bad idea would be this (but conceptually it illustrates what I want)

### Identify all column pairs
pairs <- combn(4,2)

### Now, write code to loop over each pair of columns and identify where both rows are TRUE
!is.na(tmp[, pairs[,1]])

Of course, doing this when the number of pairwise combinations is large is silly. I don't see N sizes as a by-product of the cor() function, and looping over pairs of columns would certainly be doable but not efficient. Any suggestions?
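
For reference, the loop described above might look like the following sketch. It is a baseline for comparison, not a recommended approach, and the seed and the names n_loop, ok, r_manual and r_pairwise are assumptions for illustration.

```r
set.seed(123)  # assumed seed, only so the example is reproducible
tmp <- data.frame(v1 = rnorm(10), v2 = rnorm(10), v3 = rnorm(10), v4 = rnorm(10))
for (i in 1:4) tmp[sample(1:10, 2, replace = FALSE), i] <- NA

# All column pairs: a 2 x 6 matrix, one pair per column
pairs <- combn(ncol(tmp), 2)

# For each pair, count the rows where both columns are non-missing
n_loop <- apply(pairs, 2, function(p) sum(complete.cases(tmp[, p])))
names(n_loop) <- apply(pairs, 2, function(p) paste(names(tmp)[p], collapse = ":"))
n_loop

# Those rows are the ones the pairwise correlation is based on,
# checked here for the first pair (v1, v2)
ok         <- complete.cases(tmp[, pairs[, 1]])
r_manual   <- cor(tmp[ok, 1], tmp[ok, 2])
r_pairwise <- cor(tmp, use = "pairwise")[1, 2]
all.equal(r_manual, r_pairwise)
```

The cost of this version grows with the number of pairs, choose(ncol(tmp), 2), which is what makes it unattractive for wide data.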

Thanks,
Harold

______________________________________________
R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
