[R] Correlation matrix for pearson correlation (r,p,BH(FDR))

Sarah Bazzocco sarah.bazzocco at vhir.org
Thu Jun 18 10:19:55 CEST 2015


This post was previously titled "help"; I have changed the subject.
Thanks for the comments.
Here is the example (I have the two tables saved as .csv files and can open them in R):

Sheet one - Genes (expression of 10 genes, not binary, measured in 10 cell lines)
> genes
     Genes  Cell.line1 Cell.line2  Cell.line3  Cell.line4  Cell.line5
1   KCNAB3 12.02005181 11.1400910 15.60381163 13.44151596 25.37161030
2    KCNB1  0.02457449  1.3028535  0.81538294  0.59318327  0.15332321
3    KCNB2  0.44791862  0.1060137  0.09864136  0.00000000  0.00000000
4     KERA  0.06090217  0.0000000  0.03352993  0.03634781  0.04190912
5   KGFLP1  0.02450101  0.0000000  0.00000000  0.00000000  0.00000000
6   KGFLP2  0.00000000  0.0000000  0.00000000  0.00000000  0.00000000
7    KHDC1  0.00000000  0.0000000  0.00000000  0.00000000  0.00000000
8   KHDC1L  2.31894450  2.8252262  5.29099724  7.44183228  1.94629741
9   KHDC3L  0.00000000  0.0000000  0.00000000  0.00000000  0.00000000
10 KHDRBS1  0.00000000  0.0000000  0.00000000  0.00000000  0.00000000
   Cell.line6 Cell.line7  Cell.line8  Cell.line9 Cell.line10
1  8.12373424 7.67506261 24.43776341 18.33244818    9.224225
2  4.18181234 1.65268403  5.98346320  1.51423807    0.000000
3  0.05857207 0.05945414  0.20733924  0.05830982    0.000000
4  0.00000000 0.00000000  0.07752608  0.01585643   16.664245
5  0.02563099 0.03902548  0.00000000  0.00000000    0.000000
6  0.00000000 0.00000000  0.00000000  0.00000000    0.000000
7  0.00000000 0.00000000  0.00000000  0.00000000    0.000000
8  8.56022436 7.50838343  7.17964645  3.28602729    0.000000
9  0.00000000 0.00000000  0.00000000  0.00000000    3.598534
10 0.00000000 0.03081180  0.00000000  0.00000000    2.600173

Sheet two - Features (2 features, growth rate and drug sensitivity, for 10 cell lines)
> features
         Cell.line Cell.line1 Cell.line2 Cell.line3 Cell.line4 Cell.line5
1      Growth rate         NA         NA         NA      51.41         NA
2 Drug sensitivity       5.03       6.57          8       1.26          3
  Cell.line6 Cell.line7 Cell.line8 Cell.line9 Cell.line10
1      41.33      26.76      24.19         NA          NA
2       1.40       1.88       1.33       5.05        9.12

What I found:
corr.test {psych}
corr.test(x, y = NULL, use = "pairwise",method="pearson",adjust="BH",alpha=.01)
--> I adjusted the original command to what I need: adjust="BH" instead of "holm", and alpha=.01 instead of .05.

I would be very grateful if someone could show me how to use this command, in particular how to pass the two sheets I have (genes and features) as x and y. I would take it from there.
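One way the two tables could be lined up is sketched below in base R, under a few assumptions. The first column of each data frame holds the labels, the remaining columns are the cell lines in the same order in both tables, and BH adjustment is wanted across genes separately for each feature. The helper names to_matrix() and correlate_feature() are made up for this illustration. The values are rounded from the tables above, with only four genes kept. The sketch uses cor.test() and p.adjust() rather than corr.test(), so it needs nothing beyond base R.

```r
cell_lines <- paste0("Cell.line", 1:10)

# Stand-ins for the two data frames read from the .csv files
# (values rounded from the tables above; only four genes kept).
genes <- data.frame(
  Genes = c("KCNAB3", "KCNB1", "KHDC1L", "KGFLP2"),
  rbind(
    c(12.02, 11.14, 15.60, 13.44, 25.37, 8.12, 7.68, 24.44, 18.33, 9.22),
    c(0.02, 1.30, 0.82, 0.59, 0.15, 4.18, 1.65, 5.98, 1.51, 0.00),
    c(2.32, 2.83, 5.29, 7.44, 1.95, 8.56, 7.51, 7.18, 3.29, 0.00),
    rep(0, 10)
  )
)
names(genes)[-1] <- cell_lines

features <- data.frame(
  Cell.line = c("Growth rate", "Drug sensitivity"),
  rbind(
    c(NA, NA, NA, 51.41, NA, 41.33, 26.76, 24.19, NA, NA),
    c(5.03, 6.57, 8, 1.26, 3, 1.40, 1.88, 1.33, 5.05, 9.12)
  )
)
names(features)[-1] <- cell_lines

# Hypothetical helper: move the label column into the row names and
# keep only the numeric part as a matrix.
to_matrix <- function(df) {
  m <- as.matrix(df[, -1])
  rownames(m) <- as.character(df[[1]])
  m
}

expr <- to_matrix(genes)     # genes x cell lines
feat <- to_matrix(features)  # features x cell lines

# The cell lines must be in the same order in both tables.
stopifnot(identical(colnames(expr), colnames(feat)))

# Hypothetical helper: Pearson r and p for every gene against one feature,
# using only cell lines where both values are present, then BH across genes.
correlate_feature <- function(expr, y) {
  res <- t(apply(expr, 1, function(x) {
    ok <- complete.cases(x, y)
    # cor.test() needs at least 3 pairs; constant vectors have no correlation.
    if (sum(ok) < 3 || sd(x[ok]) == 0 || sd(y[ok]) == 0) {
      return(c(r = NA_real_, p = NA_real_))
    }
    ct <- cor.test(x[ok], y[ok], method = "pearson")
    c(r = unname(ct$estimate), p = ct$p.value)
  }))
  data.frame(
    gene = rownames(expr),
    r    = unname(res[, "r"]),
    p    = unname(res[, "p"]),
    BH   = unname(p.adjust(res[, "p"], method = "BH")),
    stringsAsFactors = FALSE
  )
}

# One result table per feature, each with one row per gene.
results <- lapply(rownames(feat), function(f) correlate_feature(expr, feat[f, ]))
names(results) <- rownames(feat)

print(results[["Drug sensitivity"]])
```

Genes with no variation (all zeros here) get NA, since a correlation is undefined for them. As far as I understand corr.test(), it expects variables in columns, so the transposed matrices t(expr) and t(feat) would presumably be what to pass as x and y; I have not checked how its adjustment groups the p-values when several features are given at once.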

Thanks a lot in advance.

Sarah






----- Original Message -----
From: "Rainer Schuermann" <Rainer.Schuermann at gmx.net>
To: "Sarah Bazzocco" <sarah.bazzocco at vhir.org>
Sent: Thursday, 18 June, 2015 8:14:56 AM
Subject: Re: [R] help



Hi Sarah,

Not an answer to your question, but a piece of well-intended advice:

1. Don't post HTML; post plain text. Not only will people tell you this, sometimes in a not very friendly manner, but HTML actually makes posts illegible on this mailing list. Code, and R _is_ code, is always plain text.

2. Don't pose an abstract problem - this looks too much like "Can you please do my work for me". Show us what you have tried already, and people will happily jump in and provide their thoughts and advice.

3. Always make sure that you have a reproducible example in your mail, and a set of data of the same type and structure you are using - ideally using dput().
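
To illustrate point 3, here is a minimal sketch of what dput() does. The table genes_small is a hypothetical cut-down stand-in with values rounded from the post; in practice one would run dput() on a small slice of the real data.

```r
# Hypothetical cut-down table standing in for the real data.
genes_small <- data.frame(
  Genes = c("KCNAB3", "KCNB1"),
  Cell.line1 = c(12.02, 0.02),
  Cell.line2 = c(11.14, 1.30),
  stringsAsFactors = FALSE
)

# dput() prints R code that recreates the object; that text goes into the mail.
dput(genes_small)

# A reader gets the object back by evaluating the pasted text.
pasted <- paste(capture.output(dput(genes_small)), collapse = "\n")
rebuilt <- eval(parse(text = pasted))
```

Unlike a printed table, the dput() output keeps column types and names exactly, so the reader does not have to retype or guess anything.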

  

See further advice here:

PLEASE do read the posting guide   http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

and here:

http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

For your problem, R has an immense wealth of ideas and solutions.

Rgds,
Rainer

On Wed June 17 2015 16:57:24 Sarah Bazzocco wrote:

> Hello,
>
> I am an R beginner and I need some help. The question is very simple: I need to compute Pearson correlations (r, p-value and FDR with BH) between an expression array (with several thousand genes for, let's say, 20 cell lines) and some features of those cell lines.
>
> My problem is how to organize the Excel sheets, how to get the data into R and how to run the script. I thought the easiest and most organized way for me would be two Excel sheets:
>
> 1- Only expression data (genes in rows and cell lines in columns)
>
> 2- Only the features (features in rows, e.g. a) growth rate, b) sensitivity to some drugs, and cell lines in columns).
>
> --> That would create two sheets, both with 20 columns.
>
> Now I would like to get the correlation for gene 1: its expression across all cell lines against the growth rate.
>
> The same for gene 2, and so forth. I should obtain as many r, p and BH (FDR) values as there are genes.
>
> I would need to do the same for the sensitivity, and so on.
>
> Do you think this is doable? I am not at all a bioinformatics expert, so all help is very welcome.
>
> Thank you very much!
>
> Kind regards,
>
> Sarah

  

-- 


Sarah Bazzocco, PhD student 
Group of Molecular Oncology, 
CIBBIM-Nanomedicine, 
Vall d'Hebron Hospital Research Institute, 
Passeig Vall d'Hebron 119-129, 
Barcelona 08035, Spain. 
Tel: +34-93-489-4056 

Fax: +34-93-489-3893 
Email: sarah.bazzocco at vhir.org 





