[BioC] linear model for illumina Data(identifying differentially expressed genes)

Md.Mamunur Rashid mamunur.rashid at kcl.ac.uk
Wed Aug 26 11:27:47 CEST 2009


Dear All,

first I would like to apologize for a long mail.

I am working with probe profile file(tab separated file) generated by illumina beadstudio software. Main objective is to identify differentially expressed genes.
the experiment is done using single channel illumina microarray data.
There are 8 samples (from 3 groups)

1. 4135447095_A&  4135447095_B the replicates (patient with rupture 'R')
2. 4135447095_C, 4135447095_D are replicates (control 1  - 'C')
3. 4135447095_E&  4135447095_F are replicates (control 2 - 'C')
4. 4135447095_G&  4135447095_H are replicates( patient without rupture 'W')

I have used following code (lumi and limma package ) to pre-process the data and identify the differentially expressed genes.
Finally using the Venn diagram method in limma shows that around 5000 genes are differentially expressed which is really unexpected
(!!! OR may be I am not understanding it. I would really appreciate if anybody can enlight me about the data in the diagram).
I have attached the code below and the*probe profile file*  is uploaded in the link below.

http://www.esnips.com/doc/471328df-77e3-4c70-ba03-e12ccea21ac2/test_Sample_Probe_Profile



thanks in advance.
regards,
mamun


## ***************************************
## **********   R code   ******************
## loading library

library(lumi)
library(limma)
library(mgcv)
library(lumiHumanAll.db)
library(lumiHumanIDMapping)
library(annotate)
library(GO.db)

## data acquisition and pre-process

pilot_raw<- lumiR("test_Sample_Probe_Profile.txt",detectionTh =0.05)
pilot_b<- lumiB(pilot_raw)
pilot_t<- lumiT(pilot_b)
pilot_norm<- lumiN(pilot_t)

## identifying differentially expressed genes  ##
#################################################

pilot_Matrix<- exprs(pilot_norm)
probeList<- rownames(pilot_Matrix)

## creating the design matrix
pilot_sampleType<- c('R','R','C','C','C','C','W','W')
design_norm_pilot<- model.matrix(~0+pilot_sampleType)
colnames(design_norm_pilot)<- c('C','R','W')

pilot_fit<- lmFit(pilot_Matrix,design_norm_pilot)
pilot.contrast<- makeContrasts (R-C,W-R,W-C, levels=design_norm_pilot)
pilot_fit2<- contrasts.fit(pilot_fit,pilot.contrast)
pilot_fit2<- eBayes(pilot_fit2)

topTable(pilot_fit2,coef=1, adjust="BH")
result<- decideTests(pilot_fit2)
*vennDiagram(result)*

## ******* End of Code *********** ##


*Result of top expressed genes :*

line     ID     logFC   AveExpr         t      P.Value    adj.P.Val        B
9271  5220477  3.207187  7.518875  47.05839 1.099642e-13 2.439447e-09 19.48339
13425 2190519  2.783416  6.019115  43.02541 2.837996e-13 3.147905e-09 18.97582
20501 4590356 -2.479955  8.163643 -33.27002 4.288202e-12 3.170982e-08 17.26973
19555 4180546  1.647948  6.693443  31.70328 7.128156e-12 3.953275e-08 16.91037
12424 4150224 -1.335114  8.892449 -25.96928 5.799401e-11 2.573078e-07 15.30650
14224 3710184  1.776777  6.479575  25.18035 8.012080e-11 2.962333e-07 15.04307
7469  2230601  2.028836  8.904665  24.04887 1.296230e-10 3.757779e-07 14.64359
6648  7650678  1.425908  9.373222  23.94687 1.355131e-10 3.757779e-07 14.60626
4552  6770646 -1.651834  7.018473 -22.85906 2.202405e-10 5.428683e-07 14.19372
21007 5570673 -1.369335 10.130594 -22.26554 2.897959e-10 6.428833e-07 13.95698






More information about the Bioconductor mailing list