[BioC] Beadarray and illumina methylation arrays

Wed Oct 22 16:42:11 CEST 2008

Hi Katrina,

I only have limited experience with methylation data, but hopefully I might
be able to give you a few pointers.

-The error with readIllumina is quite hard to diagnose without seeing the
example. I haven't seen any data from this type of methylation array. What
do the first few lines of the bead-level text files (.csv in your case) look
like? It could be that the x and y coordinates are in a slightly different
format from the ones we have seen before.
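
A quick way to check this is to print the raw lines without parsing them.
This is only a minimal sketch in base R: peek_bead_file is a made-up helper
(not part of beadarray), and the file name in the usage comment is a guess
based on the .tif names in your log.

```r
# Return the first n raw lines of a bead-level text file, unparsed,
# so that the header and the coordinate columns can be checked by eye.
peek_bead_file <- function(path, n = 5) {
  stopifnot(file.exists(path))
  readLines(path, n = n)
}

# Hypothetical file name, guessed from the .tif names in the log:
# peek_bead_file("4408100017_A.csv")
```

Things worth checking in the output are the column separator, the column
order, and whether the coordinates are integers or decimals.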

-Yes, 25% outliers does seem a little high, I'm afraid. Have you also
looked at where the outliers are located on the arrays, or made some
imageplots? We have just developed a new function for automatic artefact
detection called BASH that will be available in the forthcoming Bioconductor
release. It could be interesting to run that on your data, as Illumina do
sometimes miss some beads in obvious artefacts and remove too many beads on
the rest of the array. BASH should be a good compromise.
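
For reference, the outlier fraction on one array can be recomputed by hand.
The sketch below is base R only and is not the beadarray or BASH
implementation. It assumes the commonly described Illumina rule (a bead is an
outlier if it lies more than 3 MADs from the median of its bead type); the
names intensity, probeID and nMADs are made up for illustration, and whether
to work on the logged or unlogged scale is left to you.

```r
# Fraction of beads on one array flagged as outliers.
# intensity: numeric vector, one value per bead (assumed input)
# probeID:   bead-type identifier for each bead, same length (assumed input)
# nMADs:     cutoff in MADs from the bead-type median (3 is an assumption)
outlier_fraction <- function(intensity, probeID, nMADs = 3) {
  flagged <- ave(intensity, probeID, FUN = function(v) {
    # 1 if the bead is further than nMADs * MAD from its bead-type median
    as.numeric(abs(v - median(v)) > nMADs * mad(v))
  })
  mean(flagged == 1)
}

# Toy data: two bead types, one obvious outlier (the value 100)
intensity <- c(10, 11, 9, 10, 100, 5, 7, 6, 4, 5)
probeID   <- rep(c("a", "b"), each = 5)
outlier_fraction(intensity, probeID)
```

Note that a bead type whose MAD is zero will have every non-median bead
flagged, so very small bead types can inflate the figure.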

-Yes, currently the only way of reading methylation data into beadarray is
by using the bead-level data.

-I'm not very familiar with the output of BeadStudio. Do you get separate
detection values for the green and red channels? If so, then I don't think
it would be a problem if a bead type was detected in one channel but not the
other (since each channel measures either the methylated or the
unmethylated signal). Bead types that are not detected in either channel
could be a problem though.
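
If separate per-channel detection p-values are available, that rule is easy
to apply once they are in R. This is a minimal sketch under that assumption:
pGrn, pRed and flag_undetected are made-up names, and the 0.05 cutoff is
simply the one quoted in your message.

```r
# Flag bead types that fail detection in BOTH channels.
# pGrn, pRed: detection p-values per bead type (assumed inputs)
# alpha:      detection cutoff (0.05 taken from the original question)
flag_undetected <- function(pGrn, pRed, alpha = 0.05) {
  stopifnot(length(pGrn) == length(pRed))
  (pGrn > alpha) & (pRed > alpha)
}

# Toy example: only the second bead type is undetected in both channels
pGrn <- c(0.01, 0.50, 0.20)
pRed <- c(0.30, 0.60, 0.01)
flag_undetected(pGrn, pRed)
```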

-I haven't really seen many guidelines for normalisation, and this is
something I would like to look into. There is an obvious dye bias that needs
to be corrected, and the background normalisation used by Illumina might
actually help in this regard (although I wouldn't usually recommend it for
other Illumina data). Quantile normalisation has been used for other types
of two-colour Illumina data (http://www.biomedcentral.com/1471-2105/9/409,
http://genome.cshlp.org/cgi/content/abstract/17/3/368) so it could work
here. 
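
For illustration, quantile normalisation itself is simple enough to write in
base R. This is only a sketch of the general technique, not a validated
method for methylation arrays: the function name is made up, the matrix is
assumed to have one row per bead type and one column per array (or per
array and channel, which is a choice you would have to make), and missing
values are not handled.

```r
# Quantile-normalise the columns of a numeric matrix so that every
# column ends up with the same empirical distribution.
# x: numeric matrix with at least two rows and no NAs (assumed layout:
#    rows = bead types, columns = arrays or array/channel combinations)
quantile_normalise <- function(x) {
  stopifnot(is.matrix(x), nrow(x) >= 2, !anyNA(x))
  sorted <- apply(x, 2, sort)     # sort each column independently
  target <- rowMeans(sorted)      # mean of the k-th smallest values
  out <- x
  for (j in seq_len(ncol(x))) {
    # give the k-th smallest value in column j the k-th target value
    out[order(x[, j]), j] <- target
  }
  out
}

# Toy example with two columns
x <- cbind(c(5, 2, 3), c(4, 1, 6))
quantile_normalise(x)
```

In practice the normalize.quantiles function in preprocessCore (already in
your sessionInfo) does the same job and deals with ties more carefully.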

Hope this helps,

Mark

-----Original Message-----
From: bioconductor-bounces at stat.math.ethz.ch
[mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Katrina bell
Sent: 21 October 2008 03:04
To: bioconductor at stat.math.ethz.ch
Subject: [BioC] Beadarray and illumina methylation arrays

Hello,

This is the first time I have used beadarray. I am using it for the
analysis of an Illumina 27 methylation array and I am having a few issues
that I hope you can help me with.

1.      The first time I tried to load the methylation data, I didn't set
singleChannel=FALSE. It happily read in my 12 arrays with no problems
whatsoever. I tried plotting a few things, which worked fine. Seeing my
mistake, I then went back to reload my data with the red channel
(singleChannel=FALSE) and got the following error.

 > BLData = readIllumina(arrayNames = targets$ChipBarcode, textType=".csv", 
targets=targets, backgroundMethod="none", singleChannel=FALSE)
Found 12 arrays
Reading pixels of ./4408100017_A_Grn.tif
Calculating background
Sharpening Image
Calculating foregound
Background correcting: method = none
Reading pixels of ./4408100017_A_Red.tif
Calculating background

*** caught segfault ***
address (nil), cause 'memory not mapped'

Traceback:
1: .C("readBeadImage", as.character(tifFiles2[i]), as.double(RedX[ord]), 
as.double(RedY[ord]), as.integer(numBeads), foreGround = double(length = 
numBeads), backGround = double(length = numBeads), 
as.integer(backgroundSize), as.integer(manip), as.integer(fground), PACKAGE 
= "beadarray")
2: readIllumina(arrayNames = targets$ChipBarcode, textType = ".csv", 
targets = targets, backgroundMethod = "none", singleChannel = FALSE)

Session info is below.

So I ended up loading in the data with images=FALSE, which worked, but I 
would like to be able to look at the background. Is there a way around this 
issue?

2. When I plotted the outliers (bar chart) I got an astounding 25% for the 
majority of my 12 samples, both in the red and green channel (unlogged 
data). In addition, 3 of the samples had a peak of intensity at 5 in the 
green channel, leading me to believe that I have some real quality control 
issues with my samples. Any opinions/suggestions on these results would be 
most welcome.

3. Is it correct that readBeadSummaryData is not set up for two-colour
arrays such as the methylation arrays? So the only way to look at
methylation data is through reading in BLData?

4. Some of my samples seem to have a large number of targets with a
detection p-value above 0.05 (BeadStudio output). Illumina have
indicated that they disregard these. If I cannot read in the bead summary
data from BeadStudio, I am assuming that these detection p-values cannot
be taken into account in the analysis? Or are there other methods that
remove or downgrade these less-than-optimal probes (are most removed as
outliers?).

5. Has anyone had any experience with normalisation of the methylation
arrays? I know that many of the usual array methods are out of the question
because the assumption that most probes on the array will not be
differentially expressed is invalid. I read on the Bioconductor list that
someone suggested quantile normalisation. I would really appreciate any
feedback from people who have tried this or other methods, especially if
they have verified their methylation results.

Thanks for any help/advice you may be able to give.

Cheers
Katrina

 > sessionInfo() below
R version 2.7.0 (2008-04-22)
x86_64-redhat-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] tools stats graphics grDevices utils datasets methods
[8] base

other attached packages:
[1] beadarray_1.8.0 affy_1.18.2 preprocessCore_1.2.1
[4] affyio_1.8.0 geneplotter_1.18.0 annotate_1.18.0
[7] xtable_1.5-2 AnnotationDbi_1.2.2 RSQLite_0.6-9
[10] DBI_0.2-4 lattice_0.17-6 Biobase_2.0.1
[13] limma_2.14.5

loaded via a namespace (and not attached):
[1] grid_2.7.0 KernSmooth_2.22-22 RColorBrewer_1.0-2
