[BioC] Nested Design (Again) & Subset WithinArray Correlation

Jenny Drnevich drnevich at illinois.edu
Thu Jul 29 00:12:45 CEST 2010


Hi Everyone,

I've been helping Osee with the second question he posted today. I'll 
explain it a bit further, as I'd like some help on how to interpret 
his results. He has an array where some of the probes (ENSGACT) were 
designed from known transcript sequences and other probes (GENSCAN) 
were designed from predicted sequences from a sequencing project. 
Further annotation of the predicted sequences has revealed that many 
of them actually overlap with the known transcript sequences. He 
would like to estimate how correlated the expression values from the 
GENSCAN probes are to their matching ENSGACT probes. I thought this 
could be done by treating the probe pairs as technical replicates and 
running duplicateCorrelation() on them. On an array with true 
technical replication of probes, you'd hope the consensus correlation 
would be strongly positive, close to 1. Well, the consensus 
correlation for the GENSCAN:ENSGACT pairs is strongly _negative_ : 
between -0.8 and -0.92 depending on the subset of pairs we use. I 
can't quite figure out what the strong negative correlation means - 
it's probably something simple that I'm overlooking. We have no idea 
right now how much overlap there may be between ENSGACT probe oligo 
sequences and their corresponding GENSCAN probe oligo sequences.
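
As a cross-check on the duplicateCorrelation() number, one could also 
compute an ordinary Pearson correlation for each GENSCAN:ENSGACT pair 
across arrays. This is only a sketch of a different and simpler 
statistic, not limma's consensus correlation. The function names, the 
layout of `expr` (probe ID mapped to one log-expression value per 
array) and the numbers are all hypothetical, made up for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # a constant probe has no defined correlation
    return sxy / sqrt(sxx * syy)


def pair_correlations(expr, pairs):
    """Correlate each (GENSCAN, ENSGACT) pair across arrays.

    expr  -- dict: probe ID -> list of log-expression values, one per array
    pairs -- list of (genscan_id, ensgact_id) tuples
    Pairs with a probe missing from expr are skipped.
    """
    return {
        (g, e): pearson(expr[g], expr[e])
        for g, e in pairs
        if g in expr and e in expr
    }


# Hypothetical expression values (four arrays), for illustration only.
expr = {
    "GENSCAN00000010293": [7.1, 7.9, 8.4, 9.0],
    "ENSGACT00000002218": [9.2, 10.1, 10.5, 11.3],
}
pairs = [("GENSCAN00000010293", "ENSGACT00000002218")]
result = pair_correlations(expr, pairs)
```

Because the Pearson correlation centers each probe on its own mean, a 
constant intensity offset between the two probes of a pair does not 
affect it, so comparing these per-pair values with the consensus 
correlation may help narrow down where the negative value comes from.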

Anyone have an explanation for the strong negative correlation?

Thanks,
Jenny

>Question 2: Testing a subset of within-array replicates with different gene
>names. I have a list of "overlapping" genes [as below] and I would like
>to see how they correlate, to assess the hybridization efficiency on the
>chip. The sequences and the probes are not identical, but overlap
>significantly. From reading the postings, I know I can't use
>duplicateCorrelation, because the probes are randomly scattered on the
>array, and I was not sure how to use "avedups" on a subset of genes with
>different names.
>
>GENSCAN_ID                              Matched transcript ID
>GENSCAN00000010293      ENSGACT00000002218
>GENSCAN00000003508      ENSGACT00000001310
>GENSCAN00000021873      ENSGACT00000000225
>GENSCAN00000007931      ENSGACT00000000496
>GENSCAN00000022171      ENSGACT00000002296
>GENSCAN00000026278      ENSGACT00000000071
>GENSCAN00000000631      ENSGACT00000002139
>GENSCAN00000008636      ENSGACT00000002427
>GENSCAN00000008635      ENSGACT00000002432
>GENSCAN00000022111      ENSGACT00000007564
>
>Thank you so much and my apologies if this has been addressed before (you
>can point me to the discussion).
>
>Cheers,
>
>Osee

Jenny Drnevich, Ph.D.

Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign

330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801
USA

ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at illinois.edu


