[BioC] Please comment the way I'm thinking about the way to find differentially expressed genes
James W. MacDonald
jmacdon at uw.edu
Fri Jul 25 17:52:11 CEST 2014
Hi Kaj,
I don't see how resampling is going to help you at all with just 2-3
samples per group. Anyway, the bootstrap is in general used to generate
improved estimates of the variance, not to generate 'new' data sets.
Figuring out ways to improve variance estimates was a fairly hot area of
research about 10 years ago, and people have in general settled on the
idea of empirical Bayesian estimates like you get with limma.
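To put rough numbers on both points, here is a minimal Python sketch. It is not limma code. The function names and the prior values are made up for illustration; limma estimates the prior variance and prior degrees of freedom from all genes on the array. The first function lists the distinct resamples (ignoring order) you can draw with replacement from n replicates. The second applies the usual empirical Bayes weighted average, (d0*s0^2 + d*s^2) / (d0 + d), that pulls a gene's sample variance toward the prior.

```python
from itertools import combinations_with_replacement
from math import comb


def distinct_bootstrap_resamples(n):
    """List every distinct resample (order ignored) of n labelled replicates."""
    return list(combinations_with_replacement(range(n), n))


def moderated_variance(s2_gene, df_gene, s2_prior, df_prior):
    """Shrink a per-gene sample variance toward a prior variance.

    Weighted average by degrees of freedom:
    (df_prior * s2_prior + df_gene * s2_gene) / (df_prior + df_gene)
    """
    return (df_prior * s2_prior + df_gene * s2_gene) / (df_prior + df_gene)


# With 2-3 replicates there are very few distinct resamples to draw from.
for n in (2, 3):
    resamples = distinct_bootstrap_resamples(n)
    # Sanity check against the closed form C(2n - 1, n).
    assert len(resamples) == comb(2 * n - 1, n)
    print(n, "replicates:", len(resamples), "distinct resamples")

# Hypothetical values: a gene whose three replicates happen to agree closely
# (sample variance 0.01 on 2 df), and an assumed prior of 0.25 on 4 df.
print("moderated variance:", moderated_variance(0.01, 2, 0.25, 4))
```

A gene with an implausibly small sample variance ends up with a moderated variance much closer to the prior, which is what keeps such genes from dominating the top of the list when you only have a handful of replicates.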
As a self-professed 'starter' in gene expression analysis, are you sure
you are best equipped to improve on the accepted methods that were
developed over several years by PhD statisticians? If not, I would just
stick with using limma, especially if you want to publish your results.
It's much easier to say 'I used the Bioconductor limma package' than to
explain your newfangled, unpublished method, especially if you are not a
PhD statistician yourself.
Best,
Jim
On 7/25/2014 11:20 AM, Kaj Chokeshaiusaha [guest] wrote:
> Dear R helpers,
>
> I'm a starter in gene expression analysis, and I apologize in advance if this post is irritating. My attempt is just to figure out an alternative way to find differentially expressed genes in datasets with few replicates.
>
> Suppose I have very few replicates per group (2-3). I'm wondering whether I can generate several datasets from my original data (using methods like the bootstrap) and then test each generated dataset to get a list of differentially expressed genes. After that, I would count how often each gene recurs across all the lists and pick the most frequent ones as differentially expressed.
>
> Please comment on the idea; I don't want to go too far down the wrong path.
>
> With Respects,
> Kaj
>
>
> -- output of sessionInfo():
>
> R version 3.1.0 (2014-04-10)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
> [5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
> [7] LC_PAPER=en_GB.UTF-8 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] parallel stats graphics grDevices utils datasets methods
> [8] base
>
> other attached packages:
> [1] CMA_1.22.0 Biobase_2.24.0 BiocGenerics_0.10.0
> [4] e1071_1.6-3
>
> loaded via a namespace (and not attached):
> [1] class_7.3-10 tools_3.1.0
>
> --
> Sent via the guest posting facility at bioconductor.org.
>
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099