[BioC] different gal files using limma

Gordon Smyth smyth at wehi.EDU.AU
Wed Sep 12 03:54:19 CEST 2007


Dear Tiandao,

It doesn't necessarily make sense to try to merge MALists if they
aren't the same length and don't have the same IDs. I suggest you get
down to a subset of probes for which this is true, then try the merge
command again. This assumes that the ID column of RG$genes has
unambiguous identifiers for each probe. (I can't give you a lot of
detail, because trying to troubleshoot this over email is very hard.)
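
As a rough sketch of that subsetting step (illustrative only, not code from this thread): the toy ID vectors below stand in for MA1$genes$ID and MA2$genes$ID, and the helper occurrence_key() is a made-up name. Because replicate spots share an ID, a plain match() returns NA for IDs missing from MA1 and always points at the first replicate, so the sketch first appends an occurrence number to make each spot's key unambiguous. It assumes the k-th replicate of a gene on one array corresponds to the k-th replicate on the other, which may not hold for your print layout.

```r
# Toy stand-ins for MA1$genes$ID and MA2$genes$ID (hypothetical IDs).
id1 <- c("g1", "g2", "g1", "g3", "g2")
id2 <- c("g2", "g1", "g4", "g1", "g2", "g3")

# A plain match() gives NA for IDs absent from id1 ("g4") and
# reuses the first replicate for every duplicated ID.
m <- match(id2, id1)

# Hypothetical helper: append the within-ID occurrence number so that
# replicate spots get distinct keys (g1.1, g1.2, ...).
occurrence_key <- function(id) {
  paste(id, ave(seq_along(id), id, FUN = seq_along), sep = ".")
}
key1 <- occurrence_key(id1)
key2 <- occurrence_key(id2)

# Keep only spots present on both arrays, in the order of the second array.
common <- intersect(key2, key1)
i1 <- match(common, key1)
i2 <- match(common, key2)

# After subsetting, both sides have the same length and the same IDs.
stopifnot(!anyNA(i1), !anyNA(i2), identical(id1[i1], id2[i2]))
```

With real MALists the same index vectors would be applied as cbind(MA1[i1,], MA2[i2,]), under the replicate-order assumption stated above.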

BTW, I notice that you're reading the entire GPR files into your 
RGList objects. This will make huge objects. Do you need to do that? 
Why not just

   RG <- read.maimages(targets,source="genepix.median",ext="gpr")

Best wishes
Gordon

At 07:26 AM 12/09/2007, Tiandao Li wrote:
>Dear Dr. Smyth,
>
>Thanks for your help. I read in the gpr files for the 2 gal files
>separately, then identified the spot types, normalized, and removed all
>control spots separately, keeping only the gene type for further analysis.
>Both MA1 and MA2 use the same gene IDs; however, MA2$genes$ID has 8 more
>genes than MA1. I used your code to match MA1 to MA2:
>
>m <- match(MA2$genes$ID, MA1$genes$ID)
>MA <- cbind(MA1[m,], MA2)
>
>I compared MA2 to the MA2 part of MA, and the numbers are identical. However,
>there are some "NA" values in MA$genes$ID instead of gene IDs from
>MA2$genes$ID, because MA1 and MA2 don't have the same length and IDs. Could I
>still use it? There are 4 duplicate spots per gene on the array.
>
>I combined the 2 targets files into a new targets file and used it to
>build the design matrix for the linear model. Is that OK?
>
>Sincerely,
>
>Tiandao
>
>On Tue, 11 Sep 2007, Gordon Smyth wrote:
>
>Dear Tiandao,
>
>Dealing with multiple gal files is very tricky, but possible. In limma, you
>need to read in the GPR files for each GAL file separately, identify control
>spots separately, and normalize separately. So, if you have two GAL files, you
>will end up with two normalized MAList objects MA1 and MA2.
>
>You will then need to align MA1 and MA2 by gene ID. There is a merge command,
>but very often the situation is too complex for this command to handle.
>Usually you will need to remove the control spots from MA1 and MA2 separately,
>to get down to a list of common genes, then sort MA1 to match the gene order
>of MA2, then cbind them together.
>
>If MA1 and MA2 are of the same length, with the same gene IDs, then something
>like this will do the merge:
>
>    m <- match(MA2$genes$ID, MA1$genes$ID)
>    MA <- cbind(MA1[m,], MA2)
>
>There is an alternative method, which is to use the printorder() function to
>map spots back to the original 384-well plate positions, then align the arrays
>by 384-well plate. This method requires that the plates were used in the same
>order throughout the printing, except for control plates.
>
>You need to be very careful!
>Good luck.
>Gordon
>
> > Date: Sun, 9 Sep 2007 14:26:47 -0500 (CDT)
> > From: Tiandao Li <Tiandao.Li at usm.edu>
> > Subject: [BioC] different gal files using limma
> > To: Bioconductor_help <bioconductor at stat.math.ethz.ch>
> > Message-ID: <Pine.LNX.4.64.0709091401440.32134 at orca.st.usm.edu>
> > Content-Type: TEXT/PLAIN; charset=US-ASCII
> >
> > Hello,
> >
> > I am analyzing cDNA microarray data using limma. I generated the GAL file
> > using the program that comes with chipwriter, and everything looked fine.
> > However, when I printed the first batch of chips, after the last dip of
> > the pins in the first plate (print, wash), the pins redipped in the first
> > plate from the beginning (print, wash), and only then stopped for the
> > plate change. The company gave us a patch to solve this problem. So the
> > gal file for this batch is a little different from that of the remaining
> > batches of chips: the locations of genes, MSP, and controls differ (5%).
> > After hybridization, I used GenePix Pro 6.1 for spotfinding. After reading
> > the data into limma, I want to use MSP and control spots for
> > normalization. I don't know how to label different gal files using
> > readSpotTypes() in all chips.
> >
> > Thanks,
> >
> > Tiandao
> >
> > I am kind of new to R and limma. The following is my setting.
> >
> > > sessionInfo()
> > R version 2.5.1 (2007-06-27)
> > i386-pc-mingw32
> >
> > locale:
> > LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
> > States.1252;LC_MONETARY=English_United
> > States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
> >
> > attached base packages:
> > [1] "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"
> > [7] "base"
> >
> > other attached packages:
> >  statmod    limma
> >  "1.3.0" "2.10.5"
> >
> > Code for analysis
> >
> > library(limma)
> >
> > A <- list(R="F635 Median",G="F532 Median",Rb="B635",Gb="B532")
> > B <- list("Block", "Column", "Row", "Name", "ID", "X", "Y", "Dia.",
> > "F635 Median", "F635 Mean", "F635 SD", "F635 CV", "B635", "B635 Median",
> > "B635 Mean", "B635 SD", "B635 CV", "% > B635+1SD", "% > B635+2SD",
> > "F635 % Sat.", "F532 Median", "F532 Mean", "F532 SD", "F532 CV", "B532",
> > "B532 Median", "B532 Mean", "B532 SD", "B532 CV", "% > B532+1SD",
> > "% > B532+2SD", "F532 % Sat.", "Ratio of Medians (635/532)",
> > "Ratio of Means (635/532)", "Median of Ratios (635/532)",
> > "Mean of Ratios (635/532)", "Ratios SD (635/532)", "Rgn Ratio (635/532)",
> > "Rgn R2 (635/532)", "F Pixels", "B Pixels", "Circularity",
> > "Sum of Medians (635/532)", "Sum of Means (635/532)",
> > "Log Ratio (635/532)", "F635 Median - B635", "F532 Median - B532",
> > "F635 Mean - B635", "F532 Mean - B532", "F635 Total Intensity",
> > "F532 Total Intensity", "SNR 635", "SNR 532", "Flags", "Normalize",
> > "Autoflag")
> >
> > # read 6 test files
> > targets<-readTargets(file="targets.txt", row.name="Name") # 6 test files
> > RG <- read.maimages(targets$FileName,source="genepix",ext="gpr",columns=A,other.columns=B)
> > spottypes <- readSpotTypes("spottypes3.txt") # short spot types
> > RG$genes$Status <- controlStatus(spottypes,RG)
> >
> > targets
> > SlideNumber     FileName        Cy3     Cy5     Name
> > 1       13582917        N0      N1      N0N121
> > 2       13582918        N0      N1      N0N122
> > 3       13590446        N0      N1      N0N123
> > 4       13590420        N1      H1      N1H121
> > 5       13590521        N1      H1      N1H122
> > 6       13591193        N1      H1      N1H123
> >
> > spottypes3
> > SpotType        ID      Color
> > gene    *       black
> > Calibration     Calib*  blue
> > Ratio   Ratio*  red
> > Negative        Neg*|Util*      brown
> > MSP     MSP     orange
> > Alexa   Alexa*  yellow
> > blank   NotDefined      green


