[BioC] read.ilmn function query

Gordon K Smyth smyth at wehi.EDU.AU
Sun Jul 17 02:39:02 CEST 2011


Hi Natasha,

> Date: Fri, 15 Jul 2011 18:03:59 +0100
> From: Natasha Sahgal <nsahgal at well.ox.ac.uk>
> To: bioconductor at r-project.org
> Subject: [BioC] read.ilmn function query
>
> Dear List,
>
> Normally for Illumina arrays, instead of the functions given in the
> limma user guide (e.g. neqc, read.ilmn etc.), I use:
>
>    * read.delim - to load probe profile data and sample table control
>      data respectively
>    * perform bg correction using the negative control probes from the
>      sample table control
>    * filter data based on _"detection scores"_
>    * normalise data using the _"vsn2"_ function
>
>
> However, as I have just realised that these can be used I have some queries:
>
>   1. Will there be much difference between the quantile normalisation
>      in the neqc function (as compared to vsn2 ?)

The neqc() strategy differs from that of vsn not only in normalization,
but also in background correction and variance stabilization.  There are
some parallels, however, in the mathematical theory between normexp
background correction and the vsn transformation.  How different the
practical results will be, though, I don't know.  We compared neqc() to vst
and other strategies that have been proposed in the literature for Illumina
BeadChip data, but vsn wasn't one of those.

>   2. How does one interpret the boxplots for the various controls
>      (apart from x$genes$Status=="regular")?
>          * as the median/mean vary a lot
>          * much more for my samples (than the example shown in the user
>            guide)

This is a property of your data.  If the boxplots vary a lot, then there
must be a lot of variability in your data.

>   3. When filtering: based on the help of read.ilmn
>          * The "Detection" column appears to be detection p-value by
>            default
>          * What does one do if the output is different from the
>            GenomeStudio and it gives a "Detection Score" instead??
>                o Would: expressed <- apply(y$other$Detection < 0.05,1,any)
>                      + change to: expressed <- apply(y$other$Detection > 0.95,1,any)

Yes.
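
As a minimal sketch of that change, assuming the "Detection Score" is one
minus the detection p-value (so that scores near 1 indicate confident
detection), the two filters below select the same probes.  The matrix is
simulated purely for illustration; with real data it would be
y$other$Detection.

```r
# Simulated detection scores (hypothetical values): 4 probes x 3 samples.
# With real data this matrix would be y$other$Detection.
detection <- matrix(
  c(0.99, 0.97, 0.40,
    0.10, 0.20, 0.30,
    0.50, 0.96, 0.60,
    0.94, 0.90, 0.80),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("probe", 1:4), c("A1", "A2", "A3"))
)

# Keep a probe if its detection score exceeds 0.95 in at least one sample.
expressed <- apply(detection > 0.95, 1, any)

# Equivalent filter on p-values (assumption: p-value = 1 - score).
pvalue <- 1 - detection
expressed_p <- apply(pvalue < 0.05, 1, any)
```

If the export documents a different definition of the score, the threshold
would need to be adjusted accordingly.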

>   4. Also, I do not fully understand the estimation of probes expressed
>      using the propexpr function
>          * one of my samples, A7, shows 0.0 (I see that the housekeeping
>            gene intensity for this is ~ 200 whereas for the others it is
>            1000+); it is a similar case for samples A11 and A12
>                o propexpr(x)
>                o        A1        A2        A7        A8        A3        A4       A11       A12
>                  0.3380243 0.4066500 0.0000000 0.4232871 0.3131936 0.3819055 0.1934197 0.2036340
>                         A5        A6        A9       A10
>                  0.3363844 0.3476216 0.3445201 0.3834617

This seems to flag a possible problem with your sample A7.  The regular 
probes (the majority of them anyway) are no brighter than background 
probes.  This could suggest a problem with the RNA extraction, for 
example, in this case.  The proportion of expressed probes might not be 
truly zero, but the spread of intensities must be different from that 
usually seen for a good quality array.
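
To illustrate what "no brighter than background" looks like, here is a
rough sketch on simulated intensities.  It is not the estimator that
propexpr() uses; it simply counts how many regular probes exceed the 95th
percentile of the negative controls on the same array.  All values and the
helper frac_above_bg() are hypothetical.

```r
set.seed(1)

# Simulated raw intensities (hypothetical): 200 negative control probes
# per array, plus 1000 regular probes.
neg_good   <- rnorm(200, mean = 100, sd = 10)
neg_failed <- rnorm(200, mean = 100, sd = 10)

# "Good" array: 40% of regular probes carry signal well above background.
reg_good   <- c(rnorm(600, mean = 100, sd = 10),
                rnorm(400, mean = 1000, sd = 200))

# "Failed" array: regular probes are indistinguishable from background.
reg_failed <- rnorm(1000, mean = 100, sd = 10)

# Fraction of regular probes brighter than the 95th percentile of the
# negative controls.  Roughly 5% is expected by chance alone.
frac_above_bg <- function(regular, negative) {
  mean(regular > quantile(negative, 0.95))
}

frac_good   <- frac_above_bg(reg_good, neg_good)
frac_failed <- frac_above_bg(reg_failed, neg_failed)
```

On an array like the simulated failed one, the fraction stays close to the
chance level, which is the kind of pattern that would lead to an estimate
near zero.  With real data, propexpr(x) remains the appropriate call.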

Best wishes
Gordon

> sessionInfo()
> R version 2.13.0 (2011-04-13)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
>  [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
>  [5] LC_MONETARY=C              LC_MESSAGES=en_GB.UTF-8
>  [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] gdata_2.8.2 limma_3.8.2
>
> loaded via a namespace (and not attached):
> [1] gtools_2.6.2 tools_2.13.0
>
> Many Thanks,
> Natasha