[BioC] Please comment the way I'm thinking about the way to find differentially expressed genes

Kaj Chokeshaiusaha kaj.chk at gmail.com
Fri Jul 25 20:32:06 CEST 2014


Dear all,
Thank you very much for your comments. I now feel confident about
sticking with the usual approach.
One thing still stays on my mind, probably because I lack some basic
knowledge. Some people generate several data sets from their original
data using methods such as leave-one-out, apply a test (such as limma)
to each set, and finally pick the genes that recur most often across
the resulting lists of differentially expressed genes (for example,
genes appearing in 4 out of 6 lists).
Is this kind of approach better than relying on the single list of
differentially expressed genes produced from the original data?
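
For illustration only, the leave-one-out procedure asked about above
might be sketched as follows. This is a rough sketch under stated
assumptions, not anyone's published method: the data are simulated, a
plain Welch t-statistic stands in for limma's moderated test, and the
names welch_t, top_genes and loo_recurrence are hypothetical helpers
made up for this example.

```python
import random
import statistics
from collections import Counter


def welch_t(a, b):
    """Welch t-statistic for two lists of expression values."""
    va, vb = statistics.variance(a), statistics.variance(b)
    se = (va / len(a) + vb / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se if se > 0 else 0.0


def top_genes(expr, groups, k):
    """Return the k genes with the largest |t| between groups 'A' and 'B'."""
    idx_a = [i for i, g in enumerate(groups) if g == "A"]
    idx_b = [i for i, g in enumerate(groups) if g == "B"]
    scores = {
        gene: abs(welch_t([vals[i] for i in idx_a], [vals[i] for i in idx_b]))
        for gene, vals in expr.items()
    }
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def loo_recurrence(expr, groups, k):
    """Count how often each gene enters the top-k list when one sample is left out."""
    counts = Counter()
    for left_out in range(len(groups)):
        keep = [i for i in range(len(groups)) if i != left_out]
        sub_expr = {g: [v[i] for i in keep] for g, v in expr.items()}
        sub_groups = [groups[i] for i in keep]
        counts.update(top_genes(sub_expr, sub_groups, k))
    return counts


# Simulated data (an assumption for this sketch): 50 genes, 3 replicates
# per group, with the first 5 genes shifted upwards in group B.
random.seed(1)
groups = ["A"] * 3 + ["B"] * 3
expr = {}
for j in range(50):
    shift = 3.0 if j < 5 else 0.0
    expr[f"gene{j}"] = ([random.gauss(0, 1) for _ in range(3)]
                        + [random.gauss(shift, 1) for _ in range(3)])

full_list = top_genes(expr, groups, k=5)       # list from the original data
counts = loo_recurrence(expr, groups, k=5)     # recurrence over 6 subsets
stable = {g for g, c in counts.items() if c >= 4}  # "4 out of 6" rule

print(sorted(full_list))
print(sorted(stable))
```

Note that the six leave-one-out subsets share all but one of their
samples, so the six lists are not independent of one another.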

Thank you very much in advance for your patience with me.

With Respects,
Kaj

2557-07-25 22:53 GMT+07:00, Sean Davis <sdavis2 at mail.nih.gov>:
> Hi, Kaj.
>
> You may be overthinking things a bit.  Differential gene expression
> analysis has a lot of history and has developed around the constraints
> imposed by small sample sizes, so most modern tools for doing differential
> expression analysis will handle your data in a rational and statistically
> sound way.  I would consider starting with limma; the user guide is
> excellent and the package is very highly utilized for experiments
> presumably just like yours.  I don't want to discourage experimentation,
> but it is often best to start with a known analysis if only for comparison
> if you do try something more exotic.
>
> Sean
>
>
>
> On Fri, Jul 25, 2014 at 11:20 AM, Kaj Chokeshaiusaha [guest] <
> guest at bioconductor.org> wrote:
>
>> Dear R helpers,
>>
>> I'm a beginner in gene expression analysis, and I apologize in advance
>> if my post is irritating. I am just trying to figure out an alternative
>> way to find differentially expressed genes in datasets with few
>> replicates.
>>
>> I have very few replicates per group (2-3 replicates per group). I'm
>> wondering whether I can generate several datasets from my original data
>> (using methods like the bootstrap), run the test on each generated
>> dataset to obtain a list of differentially expressed genes, then count
>> how often each gene recurs across all the lists and pick the most
>> frequent ones as differentially expressed.
>>
>> Please comment on the idea; I don't want to go too far down the wrong
>> path.
>>
>> With Respects,
>> Kaj
>>
>>
>>  -- output of sessionInfo():
>>
>> R version 3.1.0 (2014-04-10)
>> Platform: x86_64-pc-linux-gnu (64-bit)
>>
>> locale:
>>  [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
>>  [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
>>  [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
>>  [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
>>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] parallel  stats     graphics  grDevices utils     datasets  methods
>> [8] base
>>
>> other attached packages:
>> [1] CMA_1.22.0          Biobase_2.24.0      BiocGenerics_0.10.0
>> [4] e1071_1.6-3
>>
>> loaded via a namespace (and not attached):
>> [1] class_7.3-10 tools_3.1.0
>>
>> --
>> Sent via the guest posting facility at bioconductor.org.