[BioC] Need help: no MTC possible

suparna mitra smitra at liverpool.ac.uk
Tue Oct 16 17:57:08 CEST 2012


Dear James,
 Thanks for the support. But after doing these step also still no
significant genes (see attached ven diagram as all 0). I realize my data is
very variable. But isn't there any fix?
Thanks a lot,
Suparna.

> topTable(fit2, coef = 1, adjust = "fdr")

           ID      logFC   AveExpr         t      P.Value adj.P.Val
 B

6238  7917530  0.6251124 11.170012  5.592012 2.536230e-05 0.6518774
-0.0171572

11556 7970507  0.9123944  7.490579  5.057525 7.969967e-05 0.6518774
-0.5140431

15234 8007228  0.6400697  9.710164  4.854888 1.239777e-04 0.6518774
-0.7143350

8819  7943047 -0.3189082  4.177681 -4.755607 1.541497e-04 0.6518774
-0.8148014

9675  7950951 -0.3189082  4.177681 -4.755607 1.541497e-04 0.6518774
-0.8148014

18889 8043581 -0.3189082  4.177681 -4.755607 1.541497e-04 0.6518774
-0.8148014

19899 8053785 -0.3189082  4.177681 -4.755607 1.541497e-04 0.6518774
-0.8148014

6239  7917532  0.7207256 10.449577  4.677368 1.831220e-04 0.6518774
-0.8950366

25845 8113130  0.6264274  8.957173  4.640816 1.984975e-04 0.6518774
-0.9328368

3759  7896284 -0.8301988  5.727302 -4.616141 2.096126e-04 0.6518774
-0.9584680

> topTable(fit2, coef = 2, adjust = "fdr")

           ID      logFC   AveExpr         t      P.Value adj.P.Val
B

2088  7894602  0.5606499  2.841855  5.401484 3.800711e-05 0.9778984
-1.825861

685   7893190 -0.5281344  6.726990 -4.966811 9.708481e-05 0.9778984
-2.059423

6238  7917530 -0.5550876 11.170012 -4.786129 1.441524e-04 0.9778984
-2.162851

621   7893126 -0.6332961  4.412764 -4.785753 1.442714e-04 0.9778984
-2.163070

26642 8120756 -1.2198288  5.439265 -4.615075 2.101065e-04 0.9778984
-2.264206

1687  7894197 -1.0441762  2.631359 -4.526834 2.553958e-04 0.9778984
-2.317791

20947 8065084 -0.4297158  6.630412 -4.274936 4.470448e-04 0.9778984
-2.475534

154   7892657  0.9444466  3.997249  4.150578 5.900236e-04 0.9778984
-2.555951

20151 8056222 -0.8638926  7.892249 -4.144942 5.974996e-04 0.9778984
-2.559635

7851  7933619  0.4101773  8.480778  4.128728 6.195398e-04 0.9778984
-2.570249

> topTable(fit2, coef = 3, adjust = "fdr")

           ID      logFC  AveExpr         t      P.Value adj.P.Val         B

6210  7917182 -0.3334646 3.273225 -5.847896 1.483281e-05 0.2755645 2.2740621

27812 8132245 -0.5028082 5.409405 -5.795271 1.655191e-05 0.2755645 2.1981127

2366  7894884  0.6507323 8.436001  5.335322 4.378172e-05 0.3234851 1.5133904

26802 8122099 -0.4655070 4.548920 -5.279910 4.930640e-05 0.3234851 1.4284277

587   7893092 -1.0604644 6.013864 -5.143614 6.613900e-05 0.3234851 1.2172775

2562  7895081  0.6962641 6.898546  4.999306 9.045119e-05 0.3234851 0.9904391

867   7893372  1.2334593 3.017891  4.971552 9.608676e-05 0.3234851 0.9464374

685   7893190 -0.5196216 6.726990 -4.948033 1.011423e-04 0.3234851 0.9090560

808   7893313  0.9743437 7.938503  4.893437 1.139496e-04 0.3234851 0.8219582

15234 8007228  0.6486240 9.710164  4.801424 1.393931e-04 0.3234851 0.6741646

>

>  results <- decideTests(fit2)

>

> vennDiagram(results)

On 16 October 2012 14:48, James W. MacDonald <jmacdon at uw.edu> wrote:

> Hi Suparna,
>
>
> On 10/16/2012 5:58 AM, suparna mitra wrote:
>
>> Hello group,
>> Related to my previous post, I further tried arrayweight as:
>>
>> > f.invivo <- factor(InVivoTargets$**Treatment, levels = c("A", "R",
>> "T"))
>>
>> > design.invivo <- model.matrix(~0 + f.invivo)
>>
>> > colnames(design.invivo) <- c("A", "R", "T")
>>
>> > design.invivo
>>
>>    A R T
>>
>> 1  1 0 0
>>
>> 2  1 0 0
>>
>> 3  1 0 0
>>
>> 4  1 0 0
>>
>> 5  1 0 0
>>
>> 6  1 0 0
>>
>> 7  0 1 0
>>
>> 8  0 1 0
>>
>> 9  0 1 0
>>
>> 10 0 1 0
>>
>> 11 0 1 0
>>
>> 12 0 1 0
>>
>> 13 0 0 1
>>
>> 14 0 0 1
>>
>> 15 0 0 1
>>
>> 16 0 0 1
>>
>> 17 0 0 1
>>
>> 18 0 0 1
>>
>> attr(,"assign")
>>
>> [1] 1 1 1
>>
>> attr(,"contrasts")
>>
>> attr(,"contrasts")$f.invivo
>>
>> [1] "contr.treatment"
>>
>>
>> >
>>
>> > arrayw <- arrayWeightsSimple(**rmaOligoinvivo, design.invivo)
>>
>> > fit <- lmFit(rmaOligoinvivo, design.invivo, weights=arrayw)
>>
>> > arrayw
>>
>>         1         2         3         4         5         6         7
>>     8         9        10        11        12        13        14
>>
>> 0.3749711 0.8578285 1.9289731 1.2390065 0.8116796 1.7846502 1.0741852
>> 1.4277605 0.6533368 0.7637412 1.2647738 1.4520790 0.8309346 0.9328655
>>
>>        15        16        17        18
>>
>> 1.1926458 0.7280477 0.5130294 1.8503073
>>
>> > contrast.matrix.invivo <- makeContrasts(R-A, T-R, T-A,levels =
>> design.invivo)
>>
>> > fit2<-contrasts.fit(fit, contrast.matrix.invivo)
>>
>> > fit2 <- eBayes(fit2)
>>
>>
> Looks good to me.
>
> Best,
>
> Jim
>
>
>  >
>>
>> Can anybody please suggest if I am doing it right? Actually being new in
>> this I am bit afraid to make errors.
>> Thanks,
>> Suparna.
>>
>> On 16 October 2012 10:36, suparna mitra <smitra at liverpool.ac.uk <mailto:
>> smitra at liverpool.ac.uk**>> wrote:
>>
>>     Dear James,
>>       Thanks for your suggestion. I was reading arrayWeights package
>>     in limma.
>>     But being novice in bioC I have one confusion. Should I
>>     perform arrayWeights on normalized (rmaOligo) expression data or
>>     on the raw data?
>>
>>     This is what i have done so far:
>>
>>     > sessionInfo()
>>     R version 2.15.1 (2012-06-22)
>>     Platform: i386-apple-darwin9.8.0/i386 (32-bit)
>>
>>     locale:
>>     [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.**UTF-8/C/en_GB.UTF-8/en_GB.UTF-**8
>>
>>     attached base packages:
>>     [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>>     other attached packages:
>>      [1] statmod_1.4.15              limma_3.12.1
>> annotate_1.34.1             hugene10stprobeset.db_8.0.1
>>     org.Hs.eg.db_2.7.1
>>      [6] BiocInstaller_1.4.7         affycoretools_1.28.0
>> KEGG.db_2.7.1               GO.db_2.7.1
>>  AnnotationDbi_1.18.1
>>     [11] affy_1.34.0                 Biobase_2.16.0
>> BiocGenerics_0.2.0          pd.hugene.1.0.st.v1_3.6.0      RSQLite_0.11.1
>>     [16] DBI_0.2-5                   oligo_1.20.4
>> oligoClasses_1.18.0
>>
>>
>>      rmaOligoinvivo = oligo::rma(InVivodat1)
>>     Background correcting
>>     Normalizing
>>     Calculating Expression
>>
>>     > maplot(rmaOligoinvivo)
>>     >hist(rmaOligoinvivo)
>>     > InVivoTargets=readTargets("~/**Desktop/Recent/Liverpool-work-**
>> related/Micro_RawData/**InVivoTargets.txt")
>>     > InVivoTargets
>>      FileName Treatment
>>     1   MC1       A
>>     2   MC2       A
>>     3   MC3       A
>>     4   MC4       A
>>     5   MC5       A
>>     6   MC6       A
>>     7   MC7       R
>>     8   MC8        R
>>     9   MC9        R
>>     10 MC10        R
>>     11 MC11        R
>>     12 MC12        R
>>     13 MC13       T
>>     14 MC14        T
>>     15 MC15        T
>>     16 MC16        T
>>     17 MC17        T
>>     18 MC18        T
>>
>>     f.invivo <- factor(InVivoTargets$**Treatment, levels = c("A", "R",
>> "T"))
>>
>>     design.invivo <- model.matrix(~0 + f.invivo)
>>
>>     > colnames(design.invivo) <- c("A", "R", "T")
>>
>>     > fit.invivo <- lmFit(rmaOligoinvivo, design.invivo)
>>
>>     > contrast.matrix.invivo <- makeContrasts(R-A, T-R, T-A,levels =
>>     design.invivo)
>>
>>     > fit2.invivo <- contrasts.fit(fit.invivo, contrast.matrix.invivo)
>>
>>     > fit2.invivo <-eBayes(fit2.invivo)
>>
>>     Thanks a lot,
>>     Suparna.
>>
>>
>>     On 15 October 2012 14:33, James W. MacDonald <jmacdon at uw.edu
>>     <mailto:jmacdon at uw.edu>> wrote:
>>
>>         Hi Suparna,
>>
>>
>>         On 10/15/2012 7:01 AM, suparna mitra wrote:
>>
>>             Hi all,
>>                I have been working in a project where I have
>>             Affymetrix Hgene 1.0 St V1
>>             data. And I have tree groups of patients having 6 samples
>>             each. I tried to
>>             perform rma normalization and to filter my data based on
>>             expression values
>>             20%. After that went for unpaired t-test to test each two
>>             combination of
>>             groups. But the problem is my data is extremely variable.
>>             I have tried to filter my genes based on variance and/or
>>             CV before testing,
>>             to try to reduce the number of genes entering your test
>>             and multiple
>>             correction.  But with different reasonable filtering also
>>             I am with no
>>             luck. And I don't have the option to increase sample size
>>             of my project.
>>             Further I tried to check for the bad samples and bad
>>             probes from
>>             experimentand remove outlier if these are not of interest.
>>             Still the same
>>             when run t-test (and other possible test like
>>             Mann-Whitney) with MTC there
>>             are no genes.
>>             On the other hand if I go on with out MTC and select a
>>             good p value cutoff
>>             and reasonable fold change I get a list of significant
>>             gene which may be
>>             good or reasonable for my study. but the problem is I
>>             somehow need to
>>             justify the method for my finding. Do you know any study
>>             or paper where
>>             anybody has treated their data without MTC?
>>             My main concern is if I find a good story matching
>>             biological prospective,
>>             would it be anyhow possible to justify the method without MTC?
>>
>>
>>         It's not clear to me what you are doing here - when you filter
>>         on variance are you keeping or removing the high variability
>>         genes (keeping, I hope)? I am also not sure what MTC stands
>>         for - is this multiple test correction?
>>
>>         Anyway, assuming I have things correct, some suggestions.
>>         First, you might want to use array weights when fitting your
>>         model. If you have a lot of intra-group variability, this will
>>         tend to help.
>>
>>         Second, the t-statistic is the universally most powerful test
>>         (assuming the underlying data are relatively hump-shaped), so
>>         going to a non-parametric test will usually reduce rather than
>>         increase power to detect differences.
>>
>>         Third, univariate tests are arguably not the most
>>         sophisticated way of analyzing expression data, and you might
>>         get better (or at least more satisfactory) results if you
>>         instead looked at analyzing for groups of genes rather than
>>         individually.
>>
>>         Depending on your experiment, you could accomplish this task
>>         with a gene set analysis (there are multiple ways of doing
>>         this - perhaps the easiest being romer() and roast() in
>>         limma), or if you have phenotypic data, especially continuous
>>         measures, a WGCNA analysis might be of some use.
>>
>>         Best,
>>
>>         Jim
>>
>>
>>             Thanks a lot,
>>             Suparna.
>>
>>                     [[alternative HTML version deleted]]
>>
>>             ______________________________**_________________
>>             Bioconductor mailing list
>>             Bioconductor at r-project.org <mailto:Bioconductor at r-**
>> project.org <Bioconductor at r-project.org>>
>>
>>             https://stat.ethz.ch/mailman/**listinfo/bioconductor<https://stat.ethz.ch/mailman/listinfo/bioconductor>
>>             Search the archives:
>>             http://news.gmane.org/gmane.**science.biology.informatics.**
>> conductor<http://news.gmane.org/gmane.science.biology.informatics.conductor>
>>
>>
>>         --         James W. MacDonald, M.S.
>>         Biostatistician
>>         University of Washington
>>         Environmental and Occupational Health Sciences
>>         4225 Roosevelt Way NE, # 100
>>         Seattle WA 98105-6099
>>
>>
>>
>>
>>     --     Dr. Suparna Mitra
>>     Wolfson Centre for Personalised Medicine
>>     Department of Molecular and Clinical Pharmacology
>>     Institute of Translational Medicine University of Liverpool
>>     Block A: Waterhouse Buildings,  L69 3GL Liverpool
>>
>>     Tel. +44 (0)151 795 5394 <tel:%2B44%20%280%29151%20795%**205394>,
>>     Internal ext: 55394
>>     M: +44 (0) 7511387895 <tel:%2B44%20%280%29%**207511387895>
>>     Email id: smitra at liverpool.ac.uk <mailto:smitra at liverpool.ac.uk**>
>>     Alternative Email id: suparna.mitra.sm at gmail.com
>>     <mailto:suparna.mitra.sm@**gmail.com <suparna.mitra.sm at gmail.com>>
>>
>>
>>
>>
>>
>> --
>> Dr. Suparna Mitra
>> Wolfson Centre for Personalised Medicine
>> Department of Molecular and Clinical Pharmacology
>> Institute of Translational Medicine University of Liverpool
>> Block A: Waterhouse Buildings,  L69 3GL Liverpool
>>
>> Tel.  +44 (0)151 795 5394, Internal ext: 55394
>> M: +44 (0) 7511387895
>> Email id: smitra at liverpool.ac.uk <mailto:smitra at liverpool.ac.uk**>
>> Alternative Email id: suparna.mitra.sm at gmail.com <mailto:
>> suparna.mitra.sm@**gmail.com <suparna.mitra.sm at gmail.com>>
>>
>>
> --
> James W. MacDonald, M.S.
> Biostatistician
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099
>
>


-- 
Dr. Suparna Mitra
Wolfson Centre for Personalised Medicine
Department of Molecular and Clinical Pharmacology
Institute of Translational Medicine University of Liverpool
Block A: Waterhouse Buildings,  L69 3GL Liverpool

Tel.  +44 (0)151 795 5394, Internal ext: 55394
M: +44 (0) 7511387895
Email id: smitra at liverpool.ac.uk
Alternative Email id: suparna.mitra.sm at gmail.com
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Rplot.pdf
Type: application/pdf
Size: 75900 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20121016/549060c6/attachment.pdf>


More information about the Bioconductor mailing list