[BioC] #Identify differentially expressed genes

Paolo Kunderfranco paolo.kunderfranco at gmail.com
Thu Jul 12 17:08:49 CEST 2012


Thanks but there is still something that i miss, here is the code that
I used, I set up sampletype, 4 samples, in triplicates, respect to the
position if the dataMatrix with my normalized expression values:

dataMatrix <- exprs(lumi.N.Q)
presentCount <- detectionCall(x.lumi)
selDataMatrix <- dataMatrix[presentCount > 0,]
probeList <- rownames(selDataMatrix)
sampleType <- c('A','A','A','B','B','B','C','D','D','C','C','D')

then I created a design matrix.

design <- model.matrix(~ factor(sampleType))
colnames(design) <- c('A','B','C','D')
fit1 <- lmFit(selDataMatrix, design)
constrast.matrix <- makeContrasts (A-B,C-D,levels=design)
fit1_2 <- contrasts.fit(fit1,constrast.matrix)
fit1_2 <- eBayes(fit1_2)
topTable(fit1_2,coef=2, adjust="BH")

and for instance I get top genes in coef=2 -> C-D
I tried to change sampletype excluding sample C like this:
sampleType <- c('A','A','A','B','B','B','x','D','D','x','x,'D')
run the same command above and I obtain some other topgenes, I would
rather expect to encounter some error because sample C is not anymore
defined, where am i wrong?





2012/7/12 Yao Chen <chenyao.bioinfor at gmail.com>:
> Hi Paolo,
>
> You just need to put all 4 groups in the design matrix. It's a clear example
> in the Limma userguide.:
>
>> f <- factor(targets$Target, levels=c("RNA1","RNA2","RNA3"))
>> design <- model.matrix(~0+f)
>> colnames(design) <- c("RNA1","RNA2","RNA3")
> To make all pair-wise comparisons between the three groups one could proceed
>> fit <- lmFit(eset, design)
>> contrast.matrix <- makeContrasts(RNA2-RNA1, RNA3-RNA2, RNA3-RNA1,
> + levels=design)
>> fit2 <- contrasts.fit(fit, contrast.matrix)
>> fit2 <- eBayes(fit2)
> A list of top genes for RNA2 versus RNA1 can be obtained from
>> topTable(fit2, coef=1, adjust="BH")
>
> Best,
>
> Jack
>
> 2012/7/12 Paolo [guest] <guest at bioconductor.org>
>>
>>
>> Hi all,
>> I have an illumina dataset, VST transformed and normalized. 4 samples each
>> in triplicates,
>> this is the sampleType=
>>
>> sampleType <- c('A','A','A','B','B','B','C','D','D','C','C','D')
>> Now I would like to do perform each pairwise comparison, A vs B, A vs C, A
>> vs D...etc.)
>>
>> but I am confused how to set the colnames(design)
>>
>> here i what i did
>>
>> if (require(limma)) {
>> sampleType <-
>> c('TG_TAC','TG_TAC','TG_TAC','WT_TAC','WT_TAC','WT_TAC','WT_SHAM','TG_SHAM','TG_SHAM','WT_SHAM','WT_SHAM','TG_SHAM')
>> ## compare 'A' and 'B'
>> design <- model.matrix(~ factor(sampleType))
>> colnames(design) <-c('A','B' )
>> fit <- lmFit(selDataMatrix, design)
>> fit <- eBayes(fit)
>> ## Add gene symbols to gene properties
>>         if (require(lumiMouseAll.db) & require(annotate)) {
>>                geneSymbol <- getSYMBOL(probeList, 'lumiMouseAll.db')
>>                geneName <- sapply(lookUp(probeList, 'lumiMouseAll.db',
>> 'GENENAME'), function(x) x[1])
>>                fit$genes <- data.frame(ID= probeList,
>> geneSymbol=geneSymbol, geneName=geneName, stringsAsFactors=FALSE)
>>           }
>> # print the top 50 genes
>>         print(topTable(fit, adjust='fdr', number=5))
>>
>> ## get significant gene list with FDR adjusted p.values less than 0.01
>>         p.adj <- p.adjust(fit$p.value[,2])
>>         sigGene.adj <- probeList[ p.adj < 0.01]
>>         ## without FDR adjustment
>>         sigGene <- probeList[ fit$p.value[,2] < 0.01]
>> }
>>
>>
>>
>> how do I properly set up each pairwise comparison?
>>
>> thanks
>> paolo
>>
>>
>>
>>
>>
>>
>>  -- output of sessionInfo():
>>
>> R version 2.15.0 (2012-03-30)
>> Platform: i386-pc-mingw32/i386 (32-bit)
>>
>> locale:
>> [1] LC_COLLATE=Italian_Italy.1252  LC_CTYPE=Italian_Italy.1252
>> LC_MONETARY=Italian_Italy.1252 LC_NUMERIC=C
>> [5] LC_TIME=Italian_Italy.1252
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>>  [1] annotate_1.34.1           lumiMouseAll.db_1.18.0
>> org.Mm.eg.db_2.7.1        limma_3.12.1
>>  [5] lumiMouseIDMapping_1.10.0 RSQLite_0.11.1            DBI_0.2-5
>> AnnotationDbi_1.18.1
>>  [9] lumi_2.8.0                nleqslv_1.9.3             methylumi_2.2.0
>> ggplot2_0.9.1
>> [13] reshape2_1.2.1            scales_0.2.1              Biobase_2.16.0
>> BiocGenerics_0.2.0
>>
>> loaded via a namespace (and not attached):
>>  [1] affy_1.34.0           affyio_1.24.0         bigmemory_4.2.11
>> BiocInstaller_1.4.7   Biostrings_2.24.1
>>  [6] bitops_1.0-4.1        BSgenome_1.24.0       colorspace_1.1-1
>> dichromat_1.2-4       digest_0.5.2
>> [11] DNAcopy_1.30.0        GenomicRanges_1.8.7   genoset_1.6.0
>> grid_2.15.0           hdrcde_2.16
>> [16] IRanges_1.14.4        KernSmooth_2.23-8     labeling_0.1
>> lattice_0.20-6        MASS_7.3-19
>> [21] Matrix_1.0-7          memoise_0.1           mgcv_1.7-18
>> munsell_0.3           nlme_3.1-104
>> [26] plyr_1.7.1            preprocessCore_1.18.0 proto_0.3-9.2
>> RColorBrewer_1.0-5    RCurl_1.91-1.1
>> [31] Rsamtools_1.8.5       rtracklayer_1.16.2    stats4_2.15.0
>> stringr_0.6           tools_2.15.0
>> [36] XML_3.9-4.1           xtable_1.7-0          zlibbioc_1.2.0
>>
>> --
>> Sent via the guest posting facility at bioconductor.org.
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>



More information about the Bioconductor mailing list