[BioC] #Identify differentially expressed genes

Paolo Kunderfranco paolo.kunderfranco at gmail.com
Tue Aug 28 17:22:04 CEST 2012


Hi,
Thanks for the fast reply.
One more question: I finally obtained a list of genes significantly
differentially expressed between my conditions, and I would now like to
build a heatmap for each condition. For each one I constructed a data
matrix from a lumi.N.Q object (after VST, variance-stabilizing
transformation, and normalization).

>dataMatrix <- exprs(lumi.N.Q)
>head(dataMatrix)
                      CMe_2    mES_2    CMa_1    CMp_1    CMe_1    mES_1    CMa_3    CMp_3    CMe_3    mES_3    CMa_2    CMp_2
T.kvUVfFyruaCeqZVs 7.306945 7.312214 7.302783 7.302670 7.294789 7.273550 7.381502 7.312916 7.301191 7.268140 7.272041 7.278205
Q96MoiqM_z49FU37tE 7.257576 7.278205 7.300322 7.226146 7.281574 7.281210 7.294743 7.393729 7.340853 7.315099 7.312916 7.270886

>heatmap(dataMatrix[1:5,])

I was wondering about the scale of the heatmap. Usually the values range
from -3 to 3, but the VST-transformed data in my example range from 7 to
14, with a median of 7.5.
Is it correct to plot a heatmap with VST-transformed data, or is it
preferable to perform variance stabilization with a log2 transform?
If it is correct to use VST-transformed data, how should I interpret the
scale?
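
My guess is that the usual -3 to 3 range comes from row-wise centering
and scaling (z-scores per probe) rather than from the transformation
itself, and as far as I can tell base heatmap() applies scale = "row" by
default. Here is a minimal sketch of what I mean. It uses simulated
values, not my data: simMatrix, its dimensions and the probe/sample names
are made up for illustration only.

```r
# Simulated VST-like values (hypothetical, for illustration only):
# 5 probes x 12 samples, centred near 7.5.
set.seed(1)
simMatrix <- matrix(rnorm(5 * 12, mean = 7.5, sd = 0.3), nrow = 5,
                    dimnames = list(paste0("probe", 1:5),
                                    paste0("sample", 1:12)))

# Row-wise z-scores: subtract each probe's mean and divide by its SD.
# scale() operates on columns, hence the double transpose.
zMatrix <- t(scale(t(simMatrix)))

range(simMatrix)  # still on the original (VST-like) scale
range(zMatrix)    # centred on 0, in units of per-probe SD

# The rows are already standardized, so switch off heatmap()'s own scaling.
heatmap(zMatrix, scale = "none")
```

If this is right, the colours would show each probe's deviation from its
own mean across samples, whichever transformation (VST or log2) was used.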

Thanks for any suggestion,
Cheers,
Paolo





2012/8/23 Pan Du <dupan.mail at gmail.com>:
> Hi Paolo
>
> It's pretty normal that some probes cannot map to any genes because
> Human genome annotation keeps updating but the probe design is
> unchanged. So usually you can ignore those NA probes. If you are
> really interested in them, you can easily convert the nuID to probe
> sequence (use id2seq function) and map them to genome or refseq.
> I will update the vignette later to avoid such confusion.
>
> Pan
>
> On Thu, Aug 23, 2012 at 3:26 AM, Paolo Kunderfranco
> <paolo.kunderfranco at gmail.com> wrote:
>> Dear All,
>> I am working with the lumi / limma packages to detect differentially
>> expressed genes between two or more samples.
>> I was wondering why, when I add geneSymbol and geneName to my Illumina
>> probe list, a number of them (around 10%) are not annotated and remain
>> NA, for instance (last row):
>>
>>
>> if (require(lumiMouseAll.db) && require(annotate)) {
>>     # Map each nuID probe to its gene symbol and (first) gene name
>>     geneSymbol <- getSYMBOL(probeList, 'lumiMouseAll.db')
>>     geneName <- sapply(lookUp(probeList, 'lumiMouseAll.db', 'GENENAME'),
>>                        function(x) x[1])
>>     # Attach the annotation to the limma fit object
>>     fit1_2$genes <- data.frame(ID = probeList, geneSymbol = geneSymbol,
>>                                geneName = geneName,
>>                                stringsAsFactors = FALSE)
>> }
>>
>>
>> 7671  69fpKOOuFduFbAjNVU Dppa5a    developmental pluripotency associated 5A  7.828381  9.333743 149.31710 2.773571e-18 6.144846e-14 29.80858
>> 16014 QpWgiAmByT4gW7iui0 Pou5f1 POU domain, class 5, transcription factor 1  5.305532  8.633706 103.85793 1.098423e-16 8.143726e-13 27.72832
>> 20450 HlUzpCHheswfSZNdQo    Trh               thyrotropin releasing hormone  5.603441  8.761965 103.81774 1.102739e-16 8.143726e-13 27.72571
>> 7670  o7Ah_nzF7JdZOTtd9U  Dppa4     developmental pluripotency associated 4  5.300619  8.626239  99.82457 1.640729e-16 9.087587e-13 27.45790
>> 7672  xjn0tTp4isUXmUkAKI Dppa5a    developmental pluripotency associated 5A  7.663922  9.439668  97.09091 2.173661e-16 9.631491e-13 27.26346
>> 17719 ZXvxHuC6s3xogRFJfo  Sall4                     sal-like 4 (Drosophila)  4.456642  8.585243  90.39110 4.484584e-16 1.655932e-12 26.74469
>> 14084 06jqfFxe5_X97NRXuk   Myl3                 myosin, light polypeptide 3 -7.736059 13.128014 -88.39591 5.622067e-16 1.779384e-12 26.57755
>> 8757  oii7mSFyrr_AMWODH0   <NA>                                        <NA>  4.608167  8.438770  78.65631 1.833459e-15 5.077535e-12 25.66512
>>
>> any ideas?
>> thanks
>> paolo
>>


