[BioC] Analysing multiple-platform gene expression data

Matthew McCall mccallm at gmail.com
Mon Jan 31 17:26:51 CET 2011


Max,

1) I've explored this with mixed results. There are certainly
differences in the behavior of the same probe across different
platforms. But if you want to try it out for yourself, take a look at
the hgu133a2ASaFrma function in the frmaTools package. Just be
cautious in your interpretation of the results.

2) Everything that we've made publicly available from the barcode work
is on this website: http://rafalab.jhsph.edu/barcode/. We don't have
the entire data matrices available to download (these are huge files),
but we do have the annotation files listing all the publicly available
data we used. If you want the data matrices, simply download all of
the CEL files listed in the annotation, preprocess them with frma, and
you'll have them.

Best,
Matt
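
For anyone following the thread: the reason z-scores can be compared across platforms (see the quoted message below) is that each platform is scored against its own null distribution. The following is a minimal conceptual sketch in Python, not the frma implementation. The null means and standard deviations, the expression values, the cutoff of 5, the probeset IDs and the helper names z_scores, to_barcode and shared_probesets are all invented for illustration; in practice the barcode function in the frma package supplies the platform-specific estimates.

```python
# Conceptual sketch only: NOT the frma implementation.
# All numbers below are invented for illustration.

# Hypothetical null ("unexpressed") parameters per platform:
# probeset ID -> (mean, sd) of log2 expression when the gene is not expressed.
NULL_PARAMS = {
    "hgu133a": {
        "1007_s_at": (6.0, 0.5),
        "1053_at": (5.0, 0.5),
    },
    "hgu133plus2": {
        "1007_s_at": (6.5, 0.75),
        "1053_at": (4.5, 0.25),
        "1552256_a_at": (3.0, 0.5),  # treated here as present on one platform only
    },
}


def z_scores(expr, platform):
    """Number of null sds above the null mean, using that platform's parameters."""
    params = NULL_PARAMS[platform]
    return {p: (x - params[p][0]) / params[p][1] for p, x in expr.items()}


def to_barcode(z, cutoff=5.0):
    """Binarize z-scores: 1 = called expressed, 0 = not. The cutoff is an assumption."""
    return {p: int(v > cutoff) for p, v in z.items()}


def shared_probesets(*samples):
    """Probesets measured in every sample, so only comparable rows are combined."""
    return sorted(set.intersection(*(set(s) for s in samples)))


# One made-up sample per platform (log2 expression values).
z_a = z_scores({"1007_s_at": 9.0, "1053_at": 5.25}, "hgu133a")
z_plus2 = z_scores(
    {"1007_s_at": 11.0, "1053_at": 4.75, "1552256_a_at": 3.5}, "hgu133plus2"
)

# Keep only probesets measured on both platforms, then pair up the scores.
common = shared_probesets(z_a, z_plus2)
combined = {p: (z_a[p], z_plus2[p]) for p in common}
barcodes = {p: (to_barcode(z_a)[p], to_barcode(z_plus2)[p]) for p in common}
print(combined)
print(barcodes)
```

In this toy example the raw values for the first probeset differ between the two samples (9.0 vs 11.0), yet both map to the same z-score because each is measured against its own platform's null distribution. The binarized values discard the remaining magnitude differences, which is the sense in which they may be more robust to batch effects.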


On Mon, Jan 31, 2011 at 7:31 AM, Kauer Max <maximilian.kauer at ccri.at> wrote:
> Thanks Matt!
> Great! Actually, I have two more questions about frma:
> 1) Could one also use frma for hgu133a2 arrays (with the null-distribution vectors for hgu133a arrays)? I guess not, but I thought I'd ask anyway.
> 2) Is there somewhere I could access the expression distribution (not only the null distribution) for all genes, i.e. the data matrices that you used to construct these distributions?
>
> Thanks!
> max
>
>
>
>
>
>
> -----Original Message-----
> From: Matthew McCall [mailto:mccallm at gmail.com]
> Sent: Thu 27.01.2011 14:39
> To: bioconductor at r-project.org; Kauer Max
> Subject: Re: [BioC] Analysing multiple-platform gene expression data
>
> Max,
>
> You can certainly use the z-scores from the barcode function to
> combine hgu133a and hgu133plus2 data. Since the z-scores are based on
> platform-specific null distributions, they have the same meaning
> (number of sd's above the unexpressed mean) on all platforms. To gain
> robustness to batch effects, you might consider going further and
> using the actual barcode values (zeros and ones), but obviously this
> depends on what downstream analysis you want to do.
>
> Best,
> Matt
>
> On Thu, Jan 27, 2011 at 8:04 AM, Harris A. Jaffee <hj at jhu.edu> wrote:
>>
>>
>> Begin forwarded message:
>>
>>> From: Kauer Max <maximilian.kauer at ccri.at>
>>> Date: January 27, 2011 4:29:50 AM EST
>>> To: Marc Carlson <mcarlson at fhcrc.org>, bioconductor at r-project.org
>>> Subject: Re: [BioC] Analysing multiple-platform gene expression data
>>>
>>>
>>> Hi,
>>> along the same lines I wondered if one can take the z-scores from the
>>> barcode() function in the frma package. From my understanding these scores
>>> give a "distance" from the empirically defined value of no expression
>>> (separately for hgu133a and hgu133plus2), so in theory these could be
>>> comparable between platforms (?)
>>> Does anybody have an opinion on that?
>>>
>>> Best,
>>> Max
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: bioconductor-bounces at r-project.org on behalf of Marc Carlson
>>> Sent: Wed 26.01.2011 18:45
>>> To: bioconductor at r-project.org
>>> Subject: Re: [BioC] Analysing multiple-platform gene expression data
>>>
>>> Hi Gabriel,
>>>
>>> I would urge caution. Even though "on paper" the different
>>> platforms might claim to use many of the same probesets, it is
>>> possible to measure differences that seem to be caused by
>>> nothing other than the fact that a given probeset was measured on one
>>> chip type rather than another.
>>>
>>>
>>>  Marc
>>>
>>>
>>> On 01/26/2011 01:25 AM, gabriel teku wrote:
>>>>
>>>> Hi Jordi,
>>>> When I said multiple Affy platforms I meant different Affy chips, e.g.
>>>> hgu133a, hgu133plus2.
>>>> Is it OK and possible to remove probes not present in both platforms?
>>>> What are the biological/statistical implications of doing this?
>>>>
>>>> Thanks in advance
>>>>
>>>> On Mon, Jan 17, 2011 at 2:45 PM, Jordi Altirriba
>>>> <altirriba at hotmail.com> wrote:
>>>>
>>>>
>>>>> Dear Gabriel,
>>>>> We would need more information. What do you mean by different types of
>>>>> Affymetrix platforms? Platforms situated in different places, different
>>>>> machines, different Affymetrix chips, etc.?
>>>>> Regards,
>>>>>
>>>>> Jordi Altirriba
>>>>>
>>>>>
>>>>> Date: Mon, 10 Jan 2011 15:53:43 +0200
>>>>> From: gabriel teku <gabbyteku at gmail.com>
>>>>> To: bioconductor at r-project.org
>>>>> Subject: [BioC] Analysing multiple-platform gene expression data
>>>>> Message-ID:
>>>>> <AANLkTinzi+n2A=D4Nj_H3c4d4P7ruW3HF2BnKsB=vJO_ at mail.gmail.com>
>>>>> Content-Type: text/plain
>>>>>
>>>>> Hi All,
>>>>> I'm trying to analyse microarray experiment data in which two types of
>>>>> Affymetrix platforms were used. However, I don't know how to handle
>>>>> these.
>>>>> It'd be great if I could get a heads-up right from the beginning in
>>>>> terms of statistics, etc.
>>>>>
>>>>> Thanx
>>>>> Gabriel
>>>>>
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>
>>>
>>
>>
>
>
>
> --
> Matthew N McCall, PhD
> 112 Arvine Heights
> Rochester, NY 14611
> Cell: 202-222-5880
>
>
>
>





