[BioC] how to find out the differentially expressed genes?how to downweight the arrays?

Sat Sep 24 08:49:35 CEST 2005

I am not sure if someone has answered you.

First, regarding small sample sizes: the following article shows that
LIMMA does a pretty decent job with small samples
http://www3.interscience.wiley.com/cgi-bin/abstract/110492632/ABSTRACT

However, with small sample sizes, poor-quality experiments, outliers
and misclassified samples have a much larger influence. This is an
argument against small sample sizes in general.

Second, you have a highly imbalanced dataset (3 normal versus 14 tumor
samples). As I mentioned before, you need to increase the size of the
normal group. So I am not surprised your results are "not nice", and I
suspect they will be quite sensitive to the choice of test (i.e. not
robust).
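
To put a rough number on the imbalance: assuming equal variance in both
groups (my simplifying assumption, not something checked on your data),
the standard error of the difference between two group means scales
with sqrt(1/n1 + 1/n2). The sketch below compares your 3 vs 14 design
with two hypothetical alternatives; the function name se_factor is just
an illustration.

```r
# Standard error of a two-group mean difference, in units of the
# per-gene standard deviation (assumes equal variance in both groups).
se_factor <- function(n_normal, n_tumor) sqrt(1 / n_normal + 1 / n_tumor)

current     <- se_factor(3, 14)   # the design in this thread
more_tumors <- se_factor(3, 28)   # hypothetical: double the tumor samples
more_normals <- se_factor(6, 14)  # hypothetical: double the normal samples

print(round(c(current = current,
              more_tumors = more_tumors,
              more_normals = more_normals), 3))
# current 0.636, more_tumors 0.607, more_normals 0.488
```

Under that assumption, three extra normals help far more than fourteen
extra tumors, because the small group dominates the standard error.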

Third, you can ask your biologists if the top k genes are
interesting despite being insignificant. If they find them interesting,
then suggest they find the money to do a larger study.

Now answering your specific questions:

1) This depends on the type and number of arrays as well as what else
is running in the background. But I assume you have already done this
to produce the output below, so I am not sure what your question is.

2) Google gave http://bioinf.wehi.edu.au/folders/arrayweights/ but I
cannot find a related article, which means that it might be in
preparation or in press. You can try asking the first author or look at
the simple example given in help("arrayWeights").

3) From the example given in help("arrayWeights"), it appears that the
weights are only used during model fitting, so the expression values
themselves are not changed.
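
For questions 2 and 3, here is a minimal sketch of the workflow, along
the lines of the help page example. It runs on simulated data (the
matrix y and the noisy "array 5" are my invention, not your chips); with
your real data you would pass eset and your own design instead.
arrayWeights() estimates one relative weight for every array, and the
weights are then handed to lmFit() through its weights argument.

```r
library(limma)

set.seed(1)
n_genes <- 500
tissue  <- factor(c(rep("C", 14), rep("N", 3)))  # same layout as in the thread
design  <- model.matrix(~ tissue)

# Simulated log-expression values (hypothetical data, not the real arrays).
y <- matrix(rnorm(n_genes * 17, mean = 7, sd = 0.3), nrow = n_genes)
colnames(y) <- paste0("array", 1:17)

# Make array 5 deliberately noisy to mimic a poor-quality chip.
y[, 5] <- y[, 5] + rnorm(n_genes, sd = 1)
y_before <- y

aw  <- arrayWeights(y, design)         # one relative weight per array
fit <- lmFit(y, design, weights = aw)  # weights enter the linear model fit
fit <- eBayes(fit)

print(round(aw, 2))                    # the noisy array should get the lowest weight
print(identical(y, y_before))          # expression values are left untouched
print(topTable(fit, coef = 2, number = 5))
```

So every array gets a weight, not only the poor ones; a poor array just
ends up with a smaller weight than the rest.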

Regards, Adai

On Thu, 2005-09-15 at 21:27 -0700, weinong han wrote:
> 17 samples (3 normal samples, 14 NPC tumor samples from different patients) were used in my Affymetrix microarray experiments. LIMMA was recommended for analyzing small microarray experiments like this. After computing the moderated t statistic, I found the results were not so nice. Please see attachment.
>  
> Quality assessment was recommended as one of the first steps of data analysis. In some cases poor-quality arrays will have to be dropped, but an alternative is to downweight the lower-quality arrays using the arrayWeights() function in limma or array-level standard errors from affyPLM.
>  
> My Questions:
> 1. My RAM is 512 MB (Windows XP). Is that enough for 17 Affymetrix chips?
> 2. If poor-quality arrays are found, how do I downweight them using the arrayWeights() function in limma or array-level standard errors from affyPLM? Are all chips downweighted, or only the poor-quality arrays?
> 3. Does downweighting change the original expression values or not?
>  
> Any advice and suggestions will be much appreciated.
>  
> dir()
> >  [1] "G05.CEL"    "G09.CEL"    "G10.CEL"    "G12.CEL"    "G15.CEL"
> >  [6] "G19.CEL"    "GF.CEL"     "GM.CEL"     "H044.CEL"   "H05.CEL"
> >[11] "H07.CEL"    "H10.CEL"    "H11.CEL"    "H14.CEL"    "hgu133acdf"
> >[16] "N01.CEL"    "N02.CEL"    "N03.CEL"
> > > library(limma)
> > > library(affy)
> >Loading required package: Biobase
> >Loading required package: tools
> >Welcome to Bioconductor
> >          Vignettes contain introductory material.  To view,
> >          simply type: openVignette()
> >          For details on reading vignettes, see
> >          the openVignette help page.
> >Loading required package: reposTools
> > > Data <- ReadAffy()
> > > eset <- rma(Data)
> >Background correcting
> >Normalizing
> >Calculating Expression
> > > pData(eset)
> >          sample
> >G05.CEL       1
> >G09.CEL       2
> >G10.CEL       3
> >G12.CEL       4
> >G15.CEL       5
> >G19.CEL       6
> >GF.CEL        7
> >GM.CEL        8
> >H044.CEL      9
> >H05.CEL      10
> >H07.CEL      11
> >H10.CEL      12
> >H11.CEL      13
> >H14.CEL      14
> >N01.CEL      15
> >N02.CEL      16
> >N03.CEL      17
> > > tissue <- c("C","C","C","C","C","C","C","C","C","C","C","C","C","C","N","N","N")
> > > design <- model.matrix(~factor(tissue))
> > > colnames(design) <- c("C", "CvsN")
> > > design
> >    C CvsN
> >1  1    0
> >2  1    0
> >3  1    0
> >4  1    0
> >5  1    0
> >6  1    0
> >7  1    0
> >8  1    0
> >9  1    0
> >10 1    0
> >11 1    0
> >12 1    0
> >13 1    0
> >14 1    0
> >15 1    1
> >16 1    1
> >17 1    1
> >attr(,"assign")
> >[1] 0 1
> >attr(,"contrasts")
> >attr(,"contrasts")$"factor(tissue)"
> >[1] "contr.treatment"
> >
> >
> > > fit <-lmFit(eset,design)
> > > fit <-eBayes(fit)
> > > options(digits=2)
> > > topTable(fit,coef=2,n=50,adjust="fdr")
> >                ID     M   A    t P.Value    B
> >22193  78047_s_at  0.60 7.3  5.3    0.82 -3.4
> >2594  203065_s_at -1.26 6.7 -5.0    0.82 -3.5
> >10680 211245_x_at  0.58 4.9  4.7    1.00 -3.6
> >17919 218554_s_at  0.59 4.7  4.5    1.00 -3.6
> >9431  209945_s_at -0.67 6.1 -4.5    1.00 -3.6
> >4556  205029_s_at  3.09 3.6  4.4    1.00 -3.6
> >4557    205030_at  3.58 4.6  4.3    1.00 -3.6
> >5845  206319_s_at  0.82 4.0  4.3    1.00 -3.7
> >21838    36019_at  0.67 6.7  4.2    1.00 -3.7
> >5209  205682_x_at  0.61 4.8  4.2    1.00 -3.7
> >6791  207266_x_at -0.95 7.8 -4.0    1.00 -3.7
> >21916    38447_at  0.66 7.3  4.0    1.00 -3.7
> >21914    38340_at  0.59 6.3  3.9    1.00 -3.8
> >16241   216871_at  0.59 3.4  3.9    1.00 -3.8
> >982   201454_s_at -0.65 6.2 -3.9    1.00 -3.8
> >22024    46256_at  0.62 7.2  3.9    1.00 -3.8
> >7489  207978_s_at  0.47 4.3  3.8    1.00 -3.8
> >4452    204925_at  0.48 5.0  3.8    1.00 -3.8
> >7121    207600_at  0.48 5.5  3.7    1.00 -3.8
> >12443 213060_s_at  1.41 6.0  3.7    1.00 -3.8
> >1619    202091_at  0.51 3.3  3.7    1.00 -3.8
> >9890    210412_at  0.53 3.5  3.6    1.00 -3.8
> >21922  38707_r_at  0.45 7.8  3.6    1.00 -3.9
> >2715    203187_at  0.59 5.8  3.6    1.00 -3.9
> >3354    203827_at -0.99 5.5 -3.6    1.00 -3.9
> >5340  205813_s_at  0.52 5.8  3.5    1.00 -3.9
> >2445  202916_s_at -0.61 6.1 -3.5    1.00 -3.9
> >18810   219446_at -0.68 5.9 -3.5    1.00 -3.9
> >14010   214632_at -0.54 4.2 -3.4    1.00 -3.9
> >2915    203388_at  0.46 6.2  3.4    1.00 -3.9
> >21936    396_f_at  0.70 7.7  3.4    1.00 -3.9
> >16292 216922_x_at  0.61 3.8  3.4    1.00 -3.9
> >13378   213999_at  0.44 4.5  3.4    1.00 -3.9
> >9642    210158_at  0.58 4.4  3.4    1.00 -3.9
> >19117   219753_at  0.65 5.6  3.4    1.00 -3.9
> >10820 211405_x_at  0.53 5.3  3.4    1.00 -3.9
> >19242 219878_s_at -0.58 4.5 -3.4    1.00 -3.9
> >3275  203748_x_at -0.90 7.9 -3.4    1.00 -3.9
> >16554   217187_at  0.58 5.7  3.4    1.00 -3.9
> >8627  209133_s_at  0.54 4.7  3.3    1.00 -3.9
> >17983 218618_s_at -1.15 8.0 -3.3    1.00 -3.9
> >20977   221615_at  0.50 3.7  3.3    1.00 -3.9
> >18562   219198_at  0.54 5.7  3.3    1.00 -3.9
> >19513   220149_at  0.58 4.8  3.3    1.00 -3.9
> >1770    202242_at  1.04 5.4  3.3    1.00 -3.9
> >10081 210616_s_at -0.56 8.4 -3.3    1.00 -3.9
> >17995   218630_at  0.37 5.4  3.3    1.00 -3.9
> >3018  203491_s_at -0.67 5.1 -3.3    1.00 -3.9
> >10823 211410_x_at  0.56 5.3  3.3    1.00 -3.9
> >16351 216981_x_at  0.57 6.3  3.3    1.00 -3.9
> 
> 
> Best Regards
>  
> Han Weinong  
> hanweinong at yahoo.com
> 