[R] correlation, import of large tables, test for point-biserial c.c.?

Adaikalavan Ramasamy gisar at nus.edu.sg
Fri Jul 11 16:39:05 CEST 2003


1) Calculating the full 121 x 121 correlation matrix and then extracting the relevant correlations does far more work than you need. Instead try this:

data <- as.matrix(data)        # data is your 91 x 121 matrix or data frame
colInterest <- data[, 1]       # the column to correlate everything else with
apply(data, 2, function(x) cor(x, colInterest))   # one correlation per column


Here each column of data is taken in turn (at which point it becomes a vector called x) and its correlation with colInterest is calculated. apply() is a compact way of writing a for() loop.
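
As a possible shortcut (a sketch, not something tested on your data): cor() also accepts a second argument y, so the first column can be correlated with all the others in a single call. The simulated matrix below is only a stand-in for the real table.

```r
# Simulated stand-in for the real 91 x 121 table (values are made up).
set.seed(1)
data <- matrix(rnorm(91 * 121), nrow = 91, ncol = 121)

colInterest <- data[, 1]

# Column-by-column version, as in the apply() call above.
r_apply <- apply(data, 2, function(x) cor(x, colInterest))

# One call: cor(x, y) with a vector x and a matrix y returns a 1 x 121 matrix;
# drop() turns it into a plain vector.
r_direct <- drop(cor(colInterest, data))

# Row 1 of the full 121 x 121 correlation matrix holds the same numbers.
r_full <- cor(data)[1, ]

all.equal(r_apply, r_direct)
all.equal(r_apply, r_full)
```

The first element of each result is the correlation of the column with itself, so drop it with [-1] if you only want the other 120 values.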


3) I have imported files of much bigger dimensions without any problem. First, make sure the data is saved as tab-delimited or comma-separated text, not .xls. Then use read.delim() or read.csv() to read in the file.

If the file is only partially loaded or comes out garbled, check for special characters. Most often the culprit is the comment character #; sometimes %, @ and similar characters can also cause problems.
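
To illustrate the comment-character problem, here is a minimal sketch. The file and its contents are invented for the example; comment.char and quote are documented arguments of read.table(), and setting them to "" switches off comment and quote handling.

```r
# Write a small tab-delimited file with a '#' inside a data field
# (file name and contents are hypothetical).
tmp <- tempfile(fileext = ".txt")
writeLines(c("site\tspecies\tcover",
             "A\tCarex #1\t12",
             "B\tFestuca\t30"), tmp)

# With '#' treated as a comment character, the first data row is cut off
# after "Carex ", so the import fails or is incomplete; try() keeps the
# script running either way.
bad <- try(read.table(tmp, header = TRUE, sep = "\t", comment.char = "#"),
           silent = TRUE)

# Switching off comment and quote handling reads every field as-is.
good <- read.table(tmp, header = TRUE, sep = "\t",
                   comment.char = "", quote = "")
dim(good)
```

In current versions of R, read.csv() and read.delim() already default to comment.char = "", while read.table() defaults to "#", so it is worth checking which function and which defaults you are actually using.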

I have no idea about the other questions. If you type in help.start(), you will get a help page where you can do a keyword search etc.


-----Original Message-----
From: ArneSaatkamp at gmx.de [mailto:ArneSaatkamp at gmx.de] 
Sent: Friday, July 11, 2003 10:01 PM
To: r-help at stat.math.ethz.ch
Subject: [R] correlation, import of large tables,test for point-biserial c.c.?


Dear R help community,

I want to calculate correlations between environment parameters and species abundance data. When I use cor() on my table (121 columns, 91 rows), R generates a dataset with the correlations between all columns.

1) How can I limit the calculations to the correlations of only the first column with every other ? (Or:) How can I extract the line/row in question from the cor() dataset produced by R ?

2) I assume that with one continuous factor and the other binary (0/1), 
cor() gives the point-biserial correlation coefficient, but how can I find the method used by R ?

3) I was not able to import (from "Excel") the whole 121x67 table, so I divided it into pieces. Is there a simple way to import the whole file?

4) In the end I want to test the correlation coefficients. Where do I find an appropriate test for the point-biserial correlation? Can R calculate the coefficient and test it for all data in one step?

I have just started working and learning with R, but even after reading the R-help and Introduction to R, I still have big difficulties, so thanks in advance for your help!

Arne Saatkamp

Inst. f. Biol. II - Abt. f. Geobotanik
Schänzlestr. 1
79104 Freiburg
Germany


______________________________________________
R-help at stat.math.ethz.ch mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help



