[BioC] probe length was:RE: [BioC] GCRMA backgrounds?

paul.boutros at utoronto.ca paul.boutros at utoronto.ca
Fri Jul 23 14:54:16 CEST 2004

One reference I know is the initial paper introducing the Agilent chips by Tim Hughes:

Hughes, T. R., M. Mao, et al. (2001). "Expression profiling using microarrays 
fabricated by an ink-jet oligonucleotide synthesizer." Nat Biotechnol 19(4): 

There is a section in the results "impact of oligonucleotide length on 
hybridization properties" that you might want to check out.


Date: Thu, 22 Jul 2004 14:48:28 -0400 
From: "Michael Barnes" <Michael.Barnes at cchmc.org> 
Subject: probe length was:RE: [BioC] GCRMA backgrounds? 
To: <Hannah at mpimp-golm.mpg.de> 
Cc: bioconductor at stat.math.ethz.ch 
Message-ID: <s0ffd3d8.039 at n6mcgw16.cchmc.org> 
Content-Type: text/plain; charset=US-ASCII 

I wasn't trying to be difficult and I hope you didn't take it that way. 

Simply I am currently in need of information regarding what probe 
length is best and I thought following up your comment might be a way to 
find references.  Of course, Affy says 25-mers are best.  And there must 
be an optimal length for the reasons you explained.  However, I wonder 
what is the evidence that 25-mers are best as opposed to, say 20-mers, 
30-mers, 50-mers, 70-mers or anything else.  Hopefully there are some 
suggestions and references out there that could help me.   

On a related question...  Affy claims 25-mers, yet they synthesize 
their oligos on the chips.  We all know reactions are not perfect so 
there must be some amount of synthesis failure.  Does anyone have a feel 
for the percentage of complete/incomplete oligos on an affy feature? 
And are the short oligos prevented from binding to your sample in some way?
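
One rough way to frame the synthesis-failure question: if each base-addition step succeeded independently with the same probability, the full-length fraction of a 25-mer would simply be that probability raised to the 25th power. The sketch below works this out in Python. The step yields it loops over are hypothetical placeholders, not measured Affymetrix figures, and it also assumes that a failed step terminates the chain, which may not match the real chemistry.

```python
def full_length_fraction(step_yield: float, length: int = 25) -> float:
    """Fraction of oligos reaching `length` bases, assuming every
    coupling step succeeds independently with probability `step_yield`."""
    if not 0.0 <= step_yield <= 1.0:
        raise ValueError("step_yield must be between 0 and 1")
    return step_yield ** length


def length_distribution(step_yield: float, length: int = 25) -> list:
    """Probability of ending at each length 0..length, assuming
    (hypothetically) that the first failed step terminates the chain."""
    # A chain of n bases needs n successes followed by one failure.
    dist = [(step_yield ** n) * (1.0 - step_yield) for n in range(length)]
    # Full-length chains need `length` successes and no failure.
    dist.append(step_yield ** length)
    return dist


if __name__ == "__main__":
    # Hypothetical per-step yields, chosen only to show the trend.
    for y in (0.90, 0.95, 0.99):
        print(f"step yield {y:.2f}: full-length fraction "
              f"{full_length_fraction(y):.3f}")
```

Under these assumptions the full-length fraction falls off quickly as the per-step yield drops, which is why the actual per-step figure matters so much for the question above.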

BTW:  If you can find ANYTHING on the Affy site, more power to you:) 


>>> "Matthew  Hannah" <Hannah at mpimp-golm.mpg.de> 07/22/04 03:29AM >>> 
I should have said it was just a logical guess. 

What I meant was that if you had 2 homologous genes, obviously it 
is going to be harder to avoid homologous regions if you need to find 
50bp versus 25bp? But this is referring to cross-hybridisation between 
PM and related sequences; I don't know how it would affect non-specific 
binding of PM to non-complementary sequences (am I right to distinguish 
these?). I should have said 'less-' rather than non-homologous, or 
dropped the 'non-' in the initial post. Also this would only apply if 
there were related sequences present, but then different probe-lengths 
for different sequences wouldn't be ideal. 

Also while we're on logic another reason to consider is that with 
probesets per mRNA, for short mRNAs there is already some overlap, which 
would be worse for longer probes, making them less independent. It would 
also extend the probed region further from the 3' end from where 
occurs and so efficiency may be reduced? 
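
The overlap argument can be made concrete with a little arithmetic: if a fixed number of probes is spread evenly across a target region, neighbouring probes start to share bases once the probe length exceeds the spacing between start positions. A minimal sketch in Python follows. The region lengths and the probe count are hypothetical values chosen for illustration, and even spacing is itself an assumption, since real probe selection is not uniform.

```python
def adjacent_overlap(region_len: int, probe_len: int, n_probes: int) -> float:
    """Bases shared by neighbouring probes when `n_probes` probes of
    `probe_len` bases are spaced evenly across `region_len` bases."""
    if probe_len > region_len:
        raise ValueError("probe does not fit in the region")
    if n_probes < 2:
        return 0.0
    # Distance between consecutive probe start positions.
    spacing = (region_len - probe_len) / (n_probes - 1)
    # Probes overlap only when they are longer than the spacing.
    return max(0.0, probe_len - spacing)


if __name__ == "__main__":
    n_probes = 11  # hypothetical probe count, for illustration only
    for region_len in (200, 300, 600):  # hypothetical target regions (bases)
        for probe_len in (25, 50):
            shared = adjacent_overlap(region_len, probe_len, n_probes)
            print(f"region {region_len}, {probe_len}-mers: "
                  f"{shared:.1f} bases shared with each neighbour")
```

With these illustrative numbers, 25-mers overlap only on the shortest region, while 50-mers overlap on regions where 25-mers do not, which is the loss of independence described above.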

If you need a reference I'm sure the affy website or some of their 
would have something. 

Sorry for any confusion. 


-----Original Message----- 
From: Michael Barnes [mailto:Michael.Barnes at cchmc.org] 
Sent: Mittwoch, 21. Juli 2004 19:49 
To: Matthew Hannah; bioconductor at stat.math.ethz.ch 
Subject: Re: [BioC] GCRMA backgrounds? 

What are references for this? 


>>> "Matthew  Hannah" <Hannah at mpimp-golm.mpg.de> 07/21/04 12:45PM >>> 

As for the 25mers, the obvious thing to take into account is that 
as you increase in length it is more likely that non-homologous 
probes will bind as it would be more difficult to find sequences 
that are gene specific. 


Bioconductor mailing list 
Bioconductor at stat.math.ethz.ch 


        I've been using GCRMA and the new speedier version (1.1) 
gives different values than the older slower version (1.0). 

        Looking through the bioconductor mails suggests that 
a few other people identified a similar problem, related to a 
background not being subtracted. Hopefully people are on the case, 
but this problem seems to have been around since April. I've been 
plugging GCRMA to my colleagues, who are now starting to use it, 
so I hope the problem can be sorted out. 

        On a different note, what technical limitations stop 
Affymetrix going for much longer probes than 25 bases? The work 
of Naef and Magnasco, and Wu and Irizarry, highlight the 
limitations of Affy technology due to cross-hybridisation, when 
there are only 25 bases. Pushing upwards to 50 bases will reduce CH, 
but what other factors then come in? 

        My understanding is that the Affy SNP chips have 25 base 
oligos. What is stopping these chips from also having 
cross-hybridisation issues?         

        Best wishes, 
