[BioC] Affy ST 1.0 gene chip background probes cut off set up

Wed Aug 11 20:31:48 CEST 2010

Hi Stacy,

On 8/11/2010 10:51 AM, Stacy_Xu at bd.com wrote:
>
>     Hi, guys
>     There are several ways to do background correction for the new ST 1.0
>     chip from Affy in Bioconductor, such as xps, affy-aroma, pdfInfo and rma,
>     and almost all of them have a parameter to indicate background correction
>     or subtraction. According to the Affy data sheet for the human ST 1.0 array,
>     "Background is estimated using a set of approximately 17,000 generic
>     background probes. Standard poly-A controls and hybridization controls are
>     represented on the arrays to allow convenient troubleshooting along the
>     entire experimental process."
>     I am wondering whether the information or IDs of those 17,000 background
>     RNA probes are somewhere in the CEL file, so that I can set up my negative
>     cutoff expression level. They do have some "antigenomic" controls, but
>     there are only about 60 of those and they are not what we want.

Actually, I believe those are what you want (and I count 45 of them). 
Note that Affy states there are 17k _probes_, not probesets. If you look 
at the number of probes in a probeset for controls, you will see that 
they can number in the thousands. For instance, the AFFX-Bs-thr_st 
control probeset has 1189 probes.

And the data for these controls are in the celfile, as they are used by 
Affy's software. However, the cdf package that we make available for 
this chip, which is based on the unsupported cdf from Affy, doesn't 
contain the locations for these probesets.

However, the pd.hugene.1.1.st.v1 package that you can use in conjunction 
with the oligo package does contain these probe locations. As a check, I 
downloaded the annotation csv file from Affy and grep'ed out the 
antigenomic probe IDs. I then loaded up the pd.hugene.1.1.st.v1 package 
and looked to see how many probes there are:

 > bgp <- read.csv("bgp.csv", header=FALSE) ## csv file with only antigenomic probes
 > dim(bgp)
[1] 45 39
 > sql <- paste("select * from pmfeature where fsetid in ('",
 +              paste(bgp[,1], collapse = "','"), "');", sep = "")
 > library(pd.hugene.1.1.st.v1)
 > con <- db(pd.hugene.1.1.st.v1)
 > a <- dbGetQuery(con, sql)
 > nrow(a)
[1] 16943

So the 45 antigenomic probesets have just about 17k probes.
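
To see how those probes are spread over the individual probesets, the query
result can be tabulated by fsetid. Below is a minimal sketch of that idea in
base R. The helper name build_fsetid_query and the toy data frame are made up
for illustration (they stand in for the real bgp[,1] IDs and the result of
dbGetQuery), and the check assumes the probeset IDs are plain integers.

```r
# Sketch: build the fsetid query and count probes per probeset.
# 'build_fsetid_query' and the toy IDs below are illustrative only.
build_fsetid_query <- function(ids, table = "pmfeature") {
  ids <- unique(as.character(ids))
  # Assumes integer-like IDs; refuse anything else rather than paste it into SQL.
  stopifnot(length(ids) > 0, all(grepl("^[0-9]+$", ids)))
  paste0("select * from ", table, " where fsetid in ('",
         paste(ids, collapse = "','"), "');")
}

# Toy stand-in for the data frame returned by dbGetQuery(con, sql).
toy <- data.frame(fid = 1:5, fsetid = c(101, 101, 101, 102, 102))

# Number of probes in each probeset.
probes_per_set <- table(toy$fsetid)

# Query string for the toy IDs.
sql_toy <- build_fsetid_query(c(101, 102))
```

With the real query result, table(a$fsetid) would give the per-probeset
counts, and their sum should match nrow(a).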

AFAIK, oligo uses basic RMA for background correction, so it would not
use these antigenomic probes. However, you can access them if you are
willing to do some work.
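
One way that work might look is sketched below. It relies on the fid column
of the pmfeature query result indexing the rows of exprs() for a FeatureSet
read by oligo, which is how I understand oligo's own pm() accessor to work.
Everything else is an assumption of mine rather than something from this
thread: the helper names, the "cel_dir" path, and in particular the choice of
the 95th percentile of the log2 background intensities as the cutoff. Use the
pd package that matches your CEL files.

```r
# Sketch only: per-array cutoff from the antigenomic background probes.
# Helper names, "cel_dir", and the 0.95 quantile are illustrative assumptions.

# Raw intensities of the background probes.
# 'raw' is a FeatureSet from oligo::read.celfiles(); 'bg' is the pmfeature
# query result (the data frame 'a' above) with its 'fid' column.
antigenomic_intensities <- function(raw, bg) {
  Biobase::exprs(raw)[bg$fid, , drop = FALSE]
}

# Per-array cutoff on the log2 scale: one quantile per column (array).
background_cutoff <- function(bg_intensity, prob = 0.95) {
  apply(log2(bg_intensity), 2, quantile, probs = prob, names = FALSE)
}

# Usage with real data (needs oligo, the pd package, and CEL files):
# library(oligo)
# raw <- read.celfiles(list.celfiles("cel_dir", full.names = TRUE))
# cutoff <- background_cutoff(antigenomic_intensities(raw, a))
```

The cutoff is computed on raw intensities here, so it would need to be
compared against expression values on a comparable scale, not directly
against RMA-summarized values.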

Best,

Jim

>     Thanks,
>     Stacy

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826