[BioC] Testing for no difference

Wolfgang Huber whuber at embl.de
Mon Jul 23 17:31:30 CEST 2012


Btw, a less complex way to approach such an analysis is highlighted here:

http://nsaunders.wordpress.com/2012/07/23/we-really-dont-care-what-statistical-method-you-used/

	Best wishes
	Wolfgang

Jul/23/12 5:09 PM, Wolfgang Huber scripsit::
> Gustavo,
>
> it seems that your question can be rephrased as 'there is no evidence
> for these 5 samples forming any (nontrivial, i.e. different from size 1
> or 5) clusters'. If so, have a look at the package 'clue':
> http://cran.r-project.org/web/packages/clue/vignettes/clue.pdf
>
> Of course, proving the absence of something (e.g., a systematic
> difference) is very difficult, and in your case, as in most, it's
> probably better to aim for saying that any difference that may exist
> is smaller than some (more or less arbitrary) threshold.
>
>      Best wishes
>      Wolfgang
>
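
One way to read the suggestion above (bounding any difference rather than
proving its absence) is sketched below. It is only an illustration under
stated assumptions: the toy beta values, the margin `delta` and the helper
names are all made up, and the between-sample range per probe is just one
possible measure of difference, not necessarily the one intended.

```python
# Per-probe spread across samples, compared against a chosen margin.
# All data and names here are hypothetical, for illustration only.

def probe_ranges(beta):
    """Return max - min across samples for each probe (one row per probe)."""
    return [max(row) - min(row) for row in beta]

def fraction_within(ranges, delta):
    """Fraction of probes whose between-sample range is at most delta."""
    return sum(r <= delta for r in ranges) / len(ranges)

# Toy data: 4 probes x 5 samples of beta values.
beta = [
    [0.10, 0.12, 0.11, 0.09, 0.10],
    [0.80, 0.82, 0.79, 0.81, 0.80],
    [0.50, 0.48, 0.65, 0.51, 0.49],
    [0.30, 0.31, 0.29, 0.30, 0.32],
]

delta = 0.1  # assumed margin; the choice is more or less arbitrary
ranges = probe_ranges(beta)
frac = fraction_within(ranges, delta)
print(f"{frac:.0%} of probes differ by at most {delta} across samples")
```

A statement of the form "x% of probes differ by at most delta across the
five samples" makes the arbitrary margin explicit instead of hiding it
behind a p-value.
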
> Jul/23/12 9:52 AM, Gustavo Fernández Bayón scripsit::
>> Hi everybody.
>>
>> I have a set of only 5 samples of Illumina27k methylation data. We
>> have extracted some info from the probes, but now the researcher in
>> charge of the project wants something that could support his idea
>> that the five samples are practically equivalent with respect to
>> their methylation levels.
>>
>> I know that the sample size is quite small. Intuitively, if you plot
>> densities from the 5 samples, they are almost equal. The problem is
>> that I do not know how to attach statistical significance to this
>> observation (yes, as always, there is the "I need a p-value" problem).
>>
>> 1) I did PCA on both beta values and m-values, and found that the
>> first principal component accounts for between 90 and 91% of the total
>> variance. In the biplot, the five samples appear to be very close.
>>
>> 2) I asked a statistician friend for advice, and we tried the
>> following: probe by probe, we used a Leave-One-Out approach,
>> calculating a confidence interval from 4 of the samples and checking
>> whether the remaining sample falls within the interval. Then, for each
>> probe, I counted the number of times a methylation value fell outside
>> the confidence interval, only to find that nearly 53% of the probes
>> contain -in this sense- 'outliers'.
>>
>> At first it surprised me, but then I noticed -by plotting the outliers
>> against the samples- that these 'outliers' were uniformly distributed
>> among samples, which I think is again intuitive, i.e., there are
>> indeed differences (statistical differences, maybe not biological)
>> among samples, but there is no global difference of one of the samples
>> w.r.t. the others.
>>
>> These differences might be related to technical noise, so I was
>> thinking of imposing a minimum difference in order to test again for
>> outliers. Would this be ok?
>>
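
The leave-one-out check with a minimum difference, as described above,
might be sketched as follows. This is a guess at the procedure, not the
code actually used: it assumes a t-based 95% interval from the 4 remaining
samples, and `loo_flags`, `min_diff` and the example values are
hypothetical. A confidence interval for the mean of 4 values is narrower
than the interval in which a single further value is expected to fall, so
the sketch defaults to a prediction interval and keeps the confidence
interval as an option for comparison.

```python
import math
import statistics

# 97.5% quantile of Student's t with 3 degrees of freedom (4 values - 1).
T_975_DF3 = 3.182

def loo_flags(values, min_diff=0.0, prediction=True):
    """Flag each value that falls outside the interval built from the others.

    Assumes exactly 5 values, so each interval is based on 4 values (3 df).
    A value is flagged only if it also differs from the mean of the others
    by more than min_diff.
    """
    flags = []
    for i, x in enumerate(values):
        rest = values[:i] + values[i + 1:]
        m = statistics.mean(rest)
        s = statistics.stdev(rest)
        n = len(rest)
        # Prediction interval for one new value, or confidence interval
        # for the mean of the remaining values.
        scale = math.sqrt(1 + 1 / n) if prediction else 1 / math.sqrt(n)
        half_width = T_975_DF3 * s * scale
        diff = abs(x - m)
        flags.append(diff > half_width and diff > min_diff)
    return flags

# One probe, five samples (made-up beta values).
values = [0.50, 0.52, 0.51, 0.49, 0.56]
print(loo_flags(values))                # only the last sample is flagged
print(loo_flags(values, min_diff=0.1))  # nothing exceeds the minimum difference
```

Counting flags per sample over all probes would then show whether the
'outliers' remain evenly spread once a minimum difference is imposed.
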
>> Is there any method I can use to try to show there is no difference
>> among the samples? Or should I stay with the graphs and the intuitive
>> description in the text?
>>
>> Thanks. Any help or hint would be much appreciated.
>>
>> Regards,
>> Gustavo
>
>


-- 
Best wishes
	Wolfgang

Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber


