[BioC] identifying sets of correlated genes

Robert Castelo robert.castelo at upf.edu
Mon Nov 26 14:35:59 CET 2012


hi Alyaa,

i also think you should try the simple approach that Sean 
described in his previous email to see whether you get a clustering of 
samples close to what you're looking for. take a look at the 
MLInterfaces package and its first vignette for doing that with 
microarray expression data stored in ExpressionSet objects.

as for what you are specifically asking about, correlations, 
you can always use the function cor() to calculate all pairwise Pearson 
correlations (this function needs a matrix of expression values with 
genes on the columns), and then threshold them at some cutoff to get the 
sets of correlated genes that you want to use later for a clustering of 
samples, but this is not much different from what Sean was already 
proposing.
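
a minimal sketch of the cor()-and-threshold idea, using base R only. the 
simulated matrix, the gene names and the 0.8 cutoff are assumptions made 
up for illustration, and single-linkage clustering is just one possible 
way to turn thresholded correlations into gene sets:

```r
# simulated stand-in for an expression matrix: rows = genes, columns = samples
# (the same layout that exprs() returns for an ExpressionSet)
set.seed(123)
nsamples <- 50
f1 <- rnorm(nsamples)   # hidden factor driving genes g1-g3
f2 <- rnorm(nsamples)   # hidden factor driving genes g4-g5
X <- rbind(g1 = f1 + rnorm(nsamples, sd = 0.2),
           g2 = f1 + rnorm(nsamples, sd = 0.2),
           g3 = f1 + rnorm(nsamples, sd = 0.2),
           g4 = f2 + rnorm(nsamples, sd = 0.2),
           g5 = f2 + rnorm(nsamples, sd = 0.2),
           g6 = rnorm(nsamples))

# cor() correlates columns, so transpose to put genes on the columns
r <- cor(t(X), method = "pearson")

# arbitrary cutoff on the absolute Pearson correlation (an assumption)
cutoff <- 0.8

# single-linkage clustering on 1 - |r|, cut at height 1 - cutoff, groups
# genes that are linked by a chain of correlations above the cutoff
hc <- hclust(as.dist(1 - abs(r)), method = "single")
geneSets <- split(rownames(X), cutree(hc, h = 1 - cutoff))
geneSets <- geneSets[sapply(geneSets, length) > 1]  # drop singleton genes
```

each element of geneSets can then be used to re-cluster the samples, for 
instance with hclust(dist(t(X[geneSets[[1]], ]))).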

in any case, you should be aware that Pearson correlations are a 
marginal measure of association and thus sensitive to confounding 
factors. although you say you do not expect any, with 257 samples the 
chances of non-biological variability are high. you may want 
to try a more restrictive measure of association, such as 
conditional dependence, which can give you better results in the presence 
of confounding. for that purpose you can use the 'qpgraph' package.
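
to illustrate the difference between marginal and conditional association, 
here is a small base R sketch. it does not use the qpgraph API; the 
simulated "batch" variable and the two genes are hypothetical, and the 
confounder is assumed to be observed so that it can be conditioned on:

```r
set.seed(1)
n <- 257
batch <- rnorm(n)                  # hypothetical confounding factor
gA <- batch + rnorm(n, sd = 0.5)   # two genes with no direct relationship,
gB <- batch + rnorm(n, sd = 0.5)   # both affected by the confounder
X <- cbind(gA = gA, gB = gB, batch = batch)

# marginal (Pearson) correlation: high, driven only by the confounder
marginal <- cor(X)["gA", "gB"]

# partial correlations from the inverse covariance matrix K:
# rho_ij|rest = -K_ij / sqrt(K_ii * K_jj)
K <- solve(cov(X))
pcor <- -cov2cor(K)
diag(pcor) <- 1

# partial correlation of gA and gB given batch: close to zero
partial <- pcor["gA", "gB"]
```

inverting the full covariance matrix only works when there are more 
samples than variables, which is not the case with thousands of genes and 
257 samples, so this sketch is only meant to show the idea on a toy example.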

cheers,
robert.

On 11/26/2012 01:50 PM, Alyaa Mahmoud wrote:
>
>
>
> On Mon, Nov 26, 2012 at 1:51 PM, Robert Castelo <robert.castelo at upf.edu> wrote:
>
>     hi,
>
>     a few more questions,
>
>     how many samples do you have?
>
> 257
>
>
>     what is the structure of these data: are all samples from the same
>     experimental condition?
>
> yes
>
>
>     do you suspect the presence of some confounding factors such as
>     batch, gender (if applicable), strain (if applicable), etc...
>
> not really, I need to obtain sets of correlated genes (in
> expression/regulation...etc) and then re-cluster using these sets and
> observe the pattern of samples clustering.
>
>
>     are you looking for some specific type of correlated genes, such as
>     targets of DNA or RNA binding proteins?
>
> no, I am rather more interested in the behaviour of the samples, but I
> want to re-cluster using subsets of the genes.
>
>
>
>     robert.
>
>
>     On 11/26/2012 12:34 PM, Alyaa Mahmoud wrote:
>
>         Hi Dr Castelo
>
>         Gene expression data
>
>         Thanks
>
>
>         On Mon, Nov 26, 2012 at 1:28 PM, Robert Castelo
>         <robert.castelo at upf.edu> wrote:
>
>              hi Alyaa,
>
>              from what kind of data?
>
>              cheers,
>              robert.
>
>
>              On 11/22/2012 10:14 AM, Alyaa Mahmoud wrote:
>
>                  Dear Group
>
>                  What is the most convenient direct way of identifying
>                  sets of correlated genes?
>
>                  Thanks a lot
>                  Alyaa
>
>
>
>
>
>
>
>         --
>         Alyaa Mahmoud
>
>         "Love all, trust a few, do wrong to none"- Shakespeare
>
>
>
>
>
>
> --
> Alyaa Mahmoud
>
> "Love all, trust a few, do wrong to none"- Shakespeare
>

-- 
Robert Castelo, PhD
Associate Professor
Dept. of Experimental and Health Sciences
Universitat Pompeu Fabra (UPF)
Barcelona Biomedical Research Park (PRBB)
Dr Aiguader 88
E-08003 Barcelona, Spain
telf: +34.933.160.514
fax: +34.933.160.550


