[BioC] duplicated spots on oligonucleotide array

Gordon Smyth smyth at wehi.EDU.AU
Sat Jun 30 09:10:28 CEST 2007

Dear Leonardo,

Firstly, the function duplicateCorrelation() is designed for 
situations in which all of your probes are duplicated, including control 
spots. You cannot "remove" spots for this purpose by setting zero weights.

Secondly, control spots which are repeated more than once, but are 
still duplicated top and bottom, cause no problems. You do not need to 
remove them.

Thirdly, sorting on gene ID should work ok if your gene IDs are 
unique throughout.
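
As a minimal sketch of that check, using made-up gene IDs rather than your data (the vector ids and the other names below are hypothetical), you could confirm that every ID occurs exactly twice and that sorting puts the two copies on consecutive rows:

```r
# Hypothetical gene IDs in print order; the two copies are not adjacent here
ids <- c("g3", "g1", "g2", "g2", "g1", "g3")

# Every ID should occur exactly ndups times for ndups = 2 to be valid
ndups <- 2
counts <- table(ids)
all_duplicated <- all(counts == ndups)

# Row order that puts the copies of each ID on consecutive rows (spacing = 1)
o <- order(ids)
sorted_ids <- ids[o]

# After sorting, rows 1-2, 3-4, ... should each hold a single ID
pairs <- matrix(sorted_ids, nrow = ndups)
adjacent_ok <- all(pairs[1, ] == pairs[2, ])
```

If both checks are TRUE, the same index o could be used to reorder the rows of the data object, as in MA[o, ], before calling duplicateCorrelation() with spacing=1.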

Fourthly, there is no reason that I know of why your command

     lmFit(MA, design, ndups=2, 

should give an error, apart from the obvious one that you haven't defined 
the object dupcor1.teste. There may be other troubleshooting that 
you need to do on your data.
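
For reference, here is a sketch of the usual pattern on simulated data, not on your arrays. The matrix M, the sizes and the noise levels are invented for illustration. It assumes limma and statmod are installed and that the two copies of each probe sit on consecutive rows:

```r
library(limma)

set.seed(1)
n_genes <- 200
n_arrays <- 4

# Simulated log-ratios: each gene is spotted twice on adjacent rows,
# and the two copies share a gene-by-array effect plus independent noise
gene_effect <- matrix(rnorm(n_genes * n_arrays), n_genes, n_arrays)
noise <- matrix(rnorm(2 * n_genes * n_arrays, sd = 0.5), 2 * n_genes, n_arrays)
M <- gene_effect[rep(seq_len(n_genes), each = 2), ] + noise

# Dye-swap design with a dye effect and a treatment effect
design <- cbind(Dye = 1, Treatment = c(-1, 1, -1, 1))

# Estimate the correlation between duplicate spots
dupcor <- duplicateCorrelation(M, design, ndups = 2, spacing = 1)

# The object passed as 'correlation' must exist before lmFit() is called
fit <- lmFit(M, design, ndups = 2, spacing = 1,
             correlation = dupcor$consensus.correlation)
fit2 <- eBayes(fit)
topTable(fit2, coef = "Treatment", adjust.method = "BH", sort.by = "P")
```

The fitted object has one row per gene rather than one per spot, because the duplicates are combined in the fit.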

Best wishes

>Date: Tue, 26 Jun 2007 10:17:48 -0500
>From: "Leonardo Rocha" <leobernardesrocha at gmail.com>
>Subject: [BioC] duplicated spots on oligonucleotide array
>To: <Bioconductor at stat.math.ethz.ch>
>Dear List,
>I am very sorry for the previous emails, I do not know what happened, so I
>am trying to use another email. I am looking for help to account for
>duplication in analysis using lmFit in limma of data from a two-channel
>microarray. The experiment is comparing differences between breeds (A and N)
>using a dye-swap labelling. The array has the following layout:
>  $ngrid.r
>[1] 12
>  $ngrid.c
>[1] 4
>  $nspot.r
>[1] 19
>  $nspot.c
>[1] 19
>  attr(,"class")
>[1] "PrintLayout"
>  The array has been duplicated on the top half and the bottom half of the
>slide (spots on tip 1 are duplicated on tip 48, spots on tip 2
>are duplicated on tip 47, and so on). Moreover, the slide has control
>spots with different numbers of duplicates, to which I have assigned
>weights=0, so I am using only spots with 2 replicates ("genes"). I have been
>successful in accounting for duplicated spots using the following commands:
>  myfun <- function(x) as.numeric(x$Flags ==0)
>  targets1 <-  as.matrix(read.table("targets.txt", header = TRUE)); targets1
>      SlideNumber Name         FileName      Cy5      Cy3
>  [1,] "1"         "Treatment1" "N16_A11.gpr" "A"  "N"
>  [2,] "2"         "Treatment2" "A08_N15.gpr" "N" "A"
>  [3,] "3"         "Treatment3" "N12_A06.gpr" "A"  "N"
>[4,] "4"         "Treatment4" "A02_N07.gpr" "N" "A"
>  filenames <- matrix (c(targets1[,3]),nrow=4,ncol=1); filenames
>      [,1]
>  [1,] "N16_A11.gpr"
>[2,] "A08_N15.gpr"
>[3,] "N12_A06.gpr"
>[4,] "A02_N07.gpr"
>  RG <- read.maimages(filenames, source="genepix", wt.fun=myfun)
>  MA <- normalizeWithinArrays(RG, method="loess", bc.method="none")
>  MA1 <-MA[order(RG$genes[,4]),]
>  design = matrix(cbind(Dye = 1, c(-1,1, -1,1)), nrow=4,
>ncol=2,dimnames=list(c("N16_A11", "A08_N15", "N12_A06", "A02_N07"),
>  dupcor <-duplicateCorrelation(MA1, design, ndups=2, spacing=1)
>  fit <- lmFit(MA1, design, ndups=2, spacing=1,
>  fit2 <- eBayes(fit)
>  topTable (fit2, coef = "Treatment", adjust="BH", sort.by="P")
>  However, if duplicateCorrelation is used to estimate spatial
>correlation in the slide, I am not sure if it makes sense to rearrange MA by
>GeneID and then apply duplicateCorrelation with spacing=1, so I have
>tried (unsuccessfully) to use the argument spacing="topbottom".
>  dupcor1 <-duplicateCorrelation(MA, design, ndups=2, spacing="topbottom")
>  fit.topbottom <- lmFit(MA, design, ndups=2, spacing="topbottom",
>Error in nspots/ndups/spacing : non-numeric argument to binary operator
>  sessionInfo()
>R version 2.5.0 (2007-04-23)
>  i386-pc-mingw32
>  locale:
>LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
>States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>  attached base packages:
>[1] "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"
>  other attached packages:
>statmod    limma
>  "1.3.0" "2.10.0"
>  Is it correct what I am doing? Could anyone give me some suggestions with
>this problem?
>  Thank you a lot for your help!
