[BioC] different gal files using limma

Gordon Smyth smyth at wehi.EDU.AU
Tue Sep 11 01:39:44 CEST 2007


Dear Tiandao,

Dealing with multiple gal files is very tricky, but possible. In 
limma, you need to read in the GPR files for each GAL file 
separately, identify control spots separately, and normalize 
separately. So, if you have two GAL files, you will end up with two 
normalized MAList objects MA1 and MA2.
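
As a rough sketch of that per-GAL workflow, the targets can be split by batch and each batch read, labelled and normalized on its own. The GalBatch column and the helper name normalize_by_gal are hypothetical (the targets file in the question has no batch column), and the normalization method is simply left at the limma default.

```r
# Sketch only: process each GAL batch separately.
# Assumes a hypothetical column 'GalBatch' in targets that says which
# GAL file each slide was printed with. Requires limma when called.
normalize_by_gal <- function(targets, spottypes, batch_col = "GalBatch") {
  # one vector of GPR file names per GAL batch
  files_by_batch <- split(targets$FileName, targets[[batch_col]])
  lapply(files_by_batch, function(files) {
    # read only the slides that share this GAL file
    RG <- limma::read.maimages(files, source = "genepix", ext = "gpr")
    # identify control spots for this batch
    RG$genes$Status <- limma::controlStatus(spottypes, RG)
    # normalize this batch (default method; choose your own as needed)
    limma::normalizeWithinArrays(RG)
  })
}

# Possible usage, with targets and spottypes read as in the question:
# MAs <- normalize_by_gal(targets, spottypes)
# MA1 <- MAs[[1]]
# MA2 <- MAs[[2]]
```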

You will then need to align MA1 and MA2 by gene ID. There is a merge 
command, but very often the situation is too complex for this command 
to handle. Usually you will need to remove the control spots from MA1 
and MA2 separately, to get down to a list of common genes, then sort 
MA1 to match the gene order of MA2, then cbind them together.
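
The steps in that paragraph can be sketched on plain ID and Status vectors, which keeps the index logic easy to check by hand. The helper align_indices and the toy IDs are hypothetical, and the sketch assumes that non-control spots carry the Status "gene". If an ID is spotted more than once, only its first occurrence is kept, which may or may not be what you want.

```r
# Sketch only: row indices that bring two arrays down to their common,
# non-control genes in the same order.
align_indices <- function(id1, status1, id2, status2, gene_label = "gene") {
  keep1 <- which(status1 == gene_label)   # non-control rows of array set 1
  keep2 <- which(status2 == gene_label)   # non-control rows of array set 2
  common <- intersect(id1[keep1], id2[keep2])
  # first occurrence of each common ID, expressed as original row numbers
  list(i1 = keep1[match(common, id1[keep1])],
       i2 = keep2[match(common, id2[keep2])])
}

# Toy example with made-up IDs
id1 <- c("g1", "Neg1", "g2", "g3")
st1 <- c("gene", "Negative", "gene", "gene")
id2 <- c("g3", "g1", "MSP", "g4")
st2 <- c("gene", "gene", "MSP", "gene")
idx <- align_indices(id1, st1, id2, st2)

# Both selections now list the same genes in the same order
stopifnot(identical(id1[idx$i1], id2[idx$i2]))

# With MAList objects the same indices would be used as
# MA <- cbind(MA1[idx$i1, ], MA2[idx$i2, ])
```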

If MA1 and MA2 are of the same length, with the same gene IDs, then 
something like this will do the merge:

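    # for each gene in MA2, find its row in MA1
    m <- match(MA2$genes$ID, MA1$genes$ID)
    # reorder MA1 to the gene order of MA2, then combine the arrays
    MA <- cbind(MA1[m,], MA2)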
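
Because match() returns NA for missing IDs and silently picks the first copy of a duplicated ID, it can be worth checking the assumptions before the cbind. The following is a minimal sketch in base R; the helper name match_checked and the toy IDs are hypothetical.

```r
# Sketch only: a match() that fails loudly when the simple merge is unsafe.
match_checked <- function(ids_from, ids_to) {
  if (length(ids_from) != length(ids_to))
    stop("the two ID vectors differ in length")
  if (anyDuplicated(ids_from) > 0 || anyDuplicated(ids_to) > 0)
    stop("IDs are not unique, so match() would be ambiguous")
  m <- match(ids_to, ids_from)
  if (anyNA(m))
    stop("some IDs in the second set are missing from the first")
  m
}

# Toy example with made-up IDs
ids1 <- c("g1", "g2", "g3")
ids2 <- c("g3", "g1", "g2")
m <- match_checked(ids1, ids2)

# After reordering, the first set follows the gene order of the second
stopifnot(identical(ids1[m], ids2))

# With MAList objects:
# m <- match_checked(MA1$genes$ID, MA2$genes$ID)
# MA <- cbind(MA1[m, ], MA2)
```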

There is an alternative method, which is to use the printorder() 
function to map spots back to the original 384-well plate positions, 
then align the arrays by 384-well plate. This method requires that 
the plates were used in the same order throughout the printing, 
except for control plates.

You need to be very careful!
Good luck.
Gordon

>Date: Sun, 9 Sep 2007 14:26:47 -0500 (CDT)
>From: Tiandao Li <Tiandao.Li at usm.edu>
>Subject: [BioC] different gal files using limma
>To: Bioconductor_help <bioconductor at stat.math.ethz.ch>
>Message-ID: <Pine.LNX.4.64.0709091401440.32134 at orca.st.usm.edu>
>Content-Type: TEXT/PLAIN; charset=US-ASCII
>
>Hello,
>
>I am analyzing cDNA microarray data using limma. I generated the GAL file
>using the program coming with chipwriter, everything looks great. However,
>when I printed the first batch of chips, after the last dip of pins in the
>first plates, print, wash, and the pins redipped again in the first plate
>from the beginning, and print, wash, then stop to change the plate. The
>company gave us the patch to solve this problem. So this gal file is a
>little different than the rest batches of chips, the locations of genes,
>MSP, and controls are different (5%). After hybridization, I used GenePix
>Pro 6.1 for spotfinding. After reading the data into limma, I want to use
>MSP and control spots for normalization. I don't know how to label
>different gal files using readSpotTypes() in all chips.
>
>Thanks,
>
>Tiandao
>
>I am kind of new to R and limma. The following is my setting.
>
> > sessionInfo()
>R version 2.5.1 (2007-06-27)
>i386-pc-mingw32
>
>locale:
>LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
>States.1252;LC_MONETARY=English_United
>States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>
>attached base packages:
>[1] "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"
>[7] "base"
>
>other attached packages:
>  statmod    limma
>  "1.3.0" "2.10.5"
>
>Codes for analysis
>
>library(limma)
>
>A <- list(R="F635 Median",G="F532 Median",Rb="B635",Gb="B532")
>B <- list("Block", "Column", "Row", "Name", "ID", "X", "Y", "Dia.", "F635
>Median", "F635 Mean", "F635 SD", "F635 CV", "B635", "B635 Median", "B635
>Mean", "B635 SD", "B635 CV", "% > B635+1SD", "% > B635+2SD", "F635 %
>Sat.", "F532 Median", "F532 Mean", "F532 SD", "F532 CV", "B532", "B532
>Median", "B532 Mean", "B532 SD", "B532 CV", "% > B532+1SD", "% >
>B532+2SD", "F532 % Sat.", "Ratio of Medians (635/532)", "Ratio of Means
>(635/532)", "Median of Ratios (635/532)", "Mean of Ratios (635/532)",
>"Ratios SD (635/532)", "Rgn Ratio (635/532)", "Rgn R2 (635/532)", "F
>Pixels", "B Pixels", "Circularity", "Sum of Medians (635/532)", "Sum of
>Means (635/532)", "Log Ratio (635/532)", "F635 Median - B635", "F532
>Median - B532", "F635 Mean - B635", "F532 Mean - B532", "F635 Total
>Intensity", "F532 Total Intensity", "SNR 635", "SNR 532", "Flags",
>"Normalize", "Autoflag")
>
># read 6 test files
>targets<-readTargets(file="targets.txt", row.name="Name") # 6 test files
>RG <-
>read.maimages(targets$FileName,source="genepix",ext="gpr",columns=A,other.columns=B)
>spottypes <- readSpotTypes("spottypes3.txt") # short spot types
>RG$genes$Status <- controlStatus(spottypes,RG)
>
>targets
>SlideNumber     FileName        Cy3     Cy5     Name
>1       13582917        N0      N1      N0N121
>2       13582918        N0      N1      N0N122
>3       13590446        N0      N1      N0N123
>4       13590420        N1      H1      N1H121
>5       13590521        N1      H1      N1H122
>6       13591193        N1      H1      N1H123
>
>spottypes3
>SpotType        ID      Color
>gene    *       black
>Calibration     Calib*  blue
>Ratio   Ratio*  red
>Negative        Neg*|Util*      brown
>MSP     MSP     orange
>Alexa   Alexa*  yellow
>blank   NotDefined      green
