[BioC] Data set for comparing statistical tests

Jorge Miró jorgma86 at gmail.com
Sat Sep 1 00:46:39 CEST 2012


Hi James,

thank you. I checked and found what looks like the arrays as rows and
the genes in the arrays as columns:

                                         203508_at 204563_at 204513_s_at
12_13_02_U133A_Mer_Latin_Square_Expt1_R1      0.000     0.000       0.000
12_13_02_U133A_Mer_Latin_Square_Expt2_R1      0.125     0.125       0.125
12_13_02_U133A_Mer_Latin_Square_Expt3_R1      0.250     0.250       0.250
...
12_13_02_U133A_Mer_Latin_Square_Expt1_R2      0.000     0.000       0.000
12_13_02_U133A_Mer_Latin_Square_Expt2_R2      0.125     0.125       0.125
12_13_02_U133A_Mer_Latin_Square_Expt3_R2      0.250     0.250       0.250
...
12_13_02_U133A_Mer_Latin_Square_Expt1_R3      0.000     0.000       0.000
12_13_02_U133A_Mer_Latin_Square_Expt2_R3      0.125     0.125       0.125
12_13_02_U133A_Mer_Latin_Square_Expt3_R3      0.250     0.250       0.250
...

What do the numbers in the pData matrix mean? Are they the concentrations?
Is there any paper or lab description with a guide on how to
compare statistical tests when using spike-in data? I really cannot
figure out how I should proceed with the comparison. It seems that the
genes have the same concentrations among the three groups of arrays
(from 0.000 to 512.000), so I guess I should take only some from each
group and compare the tests for differentially expressed genes, e.g. four
from group R1 (concentrations 0.000 to 0.500), four from group R2
(concentrations 4.000 to 32.000) and four from group R3
(concentrations 64.000 to 512.000).

Am I thinking right?
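
One way such a comparison could be set up, sketched here under assumptions: treat the spiked-in probesets as the known positives, rank all probesets by a test's p-value, and count how many known positives land among the k smallest p-values. The example uses simulated log-expression values, not SpikeIn133, and the names `expr`, `truth` and `hits_in_top_k` are made up for illustration. Only Student's t-test is computed; p-values from limma (`lmFit` followed by `eBayes`) could be scored with the same helper.

```r
# Toy illustration with simulated data (not SpikeIn133).
set.seed(1)
n_genes <- 200
group   <- factor(rep(c("A", "B"), each = 3))   # 3 arrays per group

# Simulated log2 expression: genes in rows, arrays in columns.
expr <- matrix(rnorm(n_genes * 6, mean = 8, sd = 0.3), nrow = n_genes)

# Assumption for the toy: the first 10 genes play the role of spike-ins
# and are shifted upwards in group B.
truth <- rep(c(TRUE, FALSE), c(10, n_genes - 10))
expr[truth, group == "B"] <- expr[truth, group == "B"] + 2

# Ordinary Student's t-test (equal variances), one gene at a time.
p_t <- apply(expr, 1, function(y) t.test(y ~ group, var.equal = TRUE)$p.value)

# Number of known positives among the k smallest p-values.
hits_in_top_k <- function(p, truth, k) sum(truth[order(p)[seq_len(k)]])

hits_in_top_k(p_t, truth, k = 10)
```

Running the helper over a range of k values for each test would give two curves that can be compared directly.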


Also, I checked the size of the pData:
> dim(pdata)
[1] 42 42

Are there really only 42 genes in the SpikeIn133 dataset, or am I
missing something here?
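
For reference, `dim()` on a pData table counts arrays (rows) and phenotype columns, here one column per probeset listed in the varLabels; it does not count the probesets measured on the chip. A minimal sketch that rebuilds only the three rows and three columns printed earlier in this message (the object name `conc` and the shortened row names are made up for illustration):

```r
# Corner of the table shown earlier: arrays in rows, probesets in columns.
# Row names are shortened versions of the full array names.
conc <- matrix(
  c(0.000, 0.125, 0.250), nrow = 3, ncol = 3,
  dimnames = list(
    c("Expt1_R1", "Expt2_R1", "Expt3_R1"),
    c("203508_at", "204563_at", "204513_s_at")
  )
)

dim(conc)             # first number = arrays, second = columns (probesets listed)
conc[, "203508_at"]   # values for one probeset across the arrays
```

On the real object, the number of probesets on the chip would come from the expression matrix instead, for example `nrow(exprs(rma(SpikeIn133)))`, assuming the affy package and the matching CDF package are installed.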

Best regards
Jorge


On Fri, Aug 31, 2012 at 9:02 PM, James W. MacDonald <jmacdon at uw.edu> wrote:
> Hi Jorge,
>
> pData(phenoData(SpikeIn133))
>
> Best,
>
> Jim
>
>
>
>
> On 8/31/2012 2:12 PM, Jorge Miró wrote:
>>
>> Hi again,
>>
>> I have been trying to understand how I should proceed with the spike-in
>> data, but in vain.
>> Here are the commands I used:
>>
>>
>> ************ Code *************************
>> > library(SpikeIn)
>> > data(SpikeIn133)
>>
>> # Checked phenoData as suggested...
>> > phenoData(SpikeIn133)
>>
>> An object of class "AnnotatedDataFrame"
>>    sampleNames: 12_13_02_U133A_Mer_Latin_Square_Expt1_R1
>> 12_13_02_U133A_Mer_Latin_Square_Expt2_R1 ...
>> 12_13_02_U133A_Mer_Latin_Square_Expt14_R3 (42 total)
>>    varLabels: 203508_at 204563_at ... AFFX-ThrX-3_at (42 total)
>>    varMetadata: labelDescription
>>
>> # ... but I could not see the concentrations for the samples. Is there
>> something else I should do? I tried pData too and I could not
>> find any information about the samples' concentrations.
>>
>> *************************** End of code ******************
>> I guess SpikeIn133 contains raw intensities, so I should apply
>> rma to it and then use e.g. limma to test for differential expression of
>> the genes. Am I right?
>>
>> I read the manual for SpikeIn but I can't see anything about the
>> concentrations for each sample in the data set
>>
>> (http://www.bioconductor.org/packages/2.10/data/experiment/manuals/SpikeIn/man/SpikeIn.pdf)
>>
>>
>> Best regards
>> Jorge
>>
>> On Fri, Aug 31, 2012 at 12:01 PM, Benilton Carvalho
>> <beniltoncarvalho at gmail.com>  wrote:
>>>
>>> check the SpikeIn package... in particular the phenoData slot for the
>>> datasets available. b
>>>
>>> On 31 August 2012 10:58, Jorge Miró<jorgma86 at gmail.com>  wrote:
>>>>
>>>> Hi everybody,
>>>>
>>>> I need to compare Student's t-test and the test implemented in the
>>>> limma package. Does anybody have an idea of how I should do this?
>>>>
>>>> I guess I need a data set with already known differentially expressed
>>>> genes (maybe this can be done by specially designing the probesets in
>>>> the arrays used?) and then compare the results of a t-test and the limma
>>>> test with the expected differentially expressed genes. Where can I get
>>>> such a data set?
>>>>
>>>> Sorry if the question is a bit stupid but I'm new to microarray
>>>> analysis and statistics... By the way, should this kind of question
>>>> be posted here or should I use another forum?
>>>>
>>>>
>>>>
>>>> Best regards
>>>> Jorge
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099
>


