[BioC] handle clustering and replicated probes in Agilent 4x44K : "philosophical" question?

Daniela Marconi daniela.marconi at gmail.com
Sat Feb 9 00:36:07 CET 2008


Hi everybody,
I have to come back to the issue of replicated probes on the Agilent 4x44K arrays.
See, for example, the answer from Gordon Smyth:

http://article.gmane.org/gmane.science.biology.informatics.conductor/13846/match=agilent+probe+replicates

I completely agree with him that the replicated probes should be
treated as independent when doing the analysis to select the
differentially expressed probes.
In fact, I think that averaging these probes (as the Feature
Extraction software and Rosetta Resolver do) before performing the
analysis to identify differentially expressed genes is not a safe
solution in general (for example, because of within-array problems).

Now the question is: after having identified a set of differentially
expressed probes, suppose we want to perform a hierarchical
clustering to "visualize" the differential gene expression profiles,
adding a third class to evaluate how similar this new class is
to the profiles of the other two. What should we do with the
replicates?

1) CONSIDER THE PROBES AS INDEPENDENT ALSO WHEN WE USE HIERARCHICAL CLUSTERING?
In my opinion the implicit drawback of this approach is that it
introduces a "literature bias", because the replicated genes are those
that are best known in the literature as central players in many
different processes (for example p53, ER and so on). In this way we
implicitly force the algorithm to be guided by those genes, if all (or
most) of their probes appear as differentially expressed in the list.
But, in my experience, this kind of bias is introduced anyway by
biologists or clinicians when they go through the list of
differentially expressed genes to decide which genes to focus on
(for validation and further investigation).
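
To make the weighting effect in option 1 concrete, here is a minimal
sketch in plain Python. The gene names and log-ratio values are made up
for illustration, and Euclidean distance is only an assumption about the
metric used for clustering. It compares how much of the squared distance
between two samples comes from one gene when its three replicated probes
are kept as separate rows and when they are averaged into one row.

```python
import math

# Hypothetical log-ratios for two samples; names and values are invented.
# Each entry is (gene, value_in_sample_a, value_in_sample_b).
probes = [
    ("TP53", 2.0, 0.0),    # replicated probe 1
    ("TP53", 2.0, 0.0),    # replicated probe 2
    ("TP53", 2.0, 0.0),    # replicated probe 3
    ("GENE_X", 0.0, 2.0),  # gene with a single probe
]

def euclidean(rows):
    """Euclidean distance between sample a and sample b over the given rows."""
    return math.sqrt(sum((a - b) ** 2 for _, a, b in rows))

def collapse_by_mean(rows):
    """Average replicated probes so that each gene contributes one row."""
    grouped = {}
    for gene, a, b in rows:
        grouped.setdefault(gene, []).append((a, b))
    return [
        (gene,
         sum(a for a, _ in vals) / len(vals),
         sum(b for _, b in vals) / len(vals))
        for gene, vals in grouped.items()
    ]

def gene_share(rows, gene):
    """Fraction of the squared distance contributed by one gene."""
    total = sum((a - b) ** 2 for _, a, b in rows)
    part = sum((a - b) ** 2 for g, a, b in rows if g == gene)
    return part / total

share_independent = gene_share(probes, "TP53")
share_collapsed = gene_share(collapse_by_mean(probes), "TP53")
print(share_independent, share_collapsed)
```

With these toy numbers the replicated gene accounts for three quarters
of the squared distance when its probes are kept separate, and for half
of it after averaging, which is the sense in which replicated genes
"guide" the clustering.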

2) REDUCE "THE PROBES" TO JUST ONE "GENE"?
In this case the problem is how. I was thinking of selecting the probe
with the best adjusted p-value, for example, or at least of averaging
only the probes that are identified as differentially expressed.
The p-value could, in my opinion, be the best choice, but at the moment
this is just an opinion.
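
The two reductions proposed in option 2 could be sketched as follows in
plain Python. This is only an illustration under stated assumptions: the
probe identifiers, gene names, adjusted p-values, log-ratios and the
0.05 cutoff are all hypothetical, and the helper functions are not part
of limma or any other package.

```python
# Hypothetical probe-level results (all identifiers and numbers are invented):
# (probe_id, gene, adjusted p-value, log-ratios across samples)
results = [
    ("probe_1", "TP53", 0.001, [2.0, 2.5, -0.5]),
    ("probe_2", "TP53", 0.030, [1.0, 1.5, 0.5]),
    ("probe_3", "TP53", 0.400, [0.5, 0.0, 0.0]),
    ("probe_4", "ESR1", 0.010, [-1.5, -1.0, 0.5]),
]

def best_probe_per_gene(rows):
    """Keep, for each gene, the single probe with the smallest adjusted p-value."""
    best = {}
    for probe_id, gene, adj_p, profile in rows:
        if gene not in best or adj_p < best[gene][1]:
            best[gene] = (probe_id, adj_p, profile)
    return best

def mean_of_de_probes(rows, alpha=0.05):
    """Average, per gene, only the probes whose adjusted p-value is below alpha.

    The cutoff alpha=0.05 is an assumed threshold for 'differentially expressed'.
    Genes with no probe below the cutoff are left out of the result.
    """
    grouped = {}
    for _, gene, adj_p, profile in rows:
        if adj_p < alpha:
            grouped.setdefault(gene, []).append(profile)
    return {
        gene: [sum(column) / len(column) for column in zip(*profiles)]
        for gene, profiles in grouped.items()
    }

by_best_p = best_probe_per_gene(results)
by_de_mean = mean_of_de_probes(results)
print(by_best_p["TP53"][0])
print(by_de_mean["TP53"])
```

Either result gives one row per gene, which can then be passed to the
hierarchical clustering in place of the probe-level matrix.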

Has anyone faced this issue?
Thank you for any help, suggestion or comment.
Daniela


Daniela Marconi
PhD Student
Physics Department
University of Bologna
Viale Berti Pichat 6/2
Bologna
Italy
office: +39 051 2095136


