probe length (was: RE: [BioC] GCRMA backgrounds?)

Michael Barnes Michael.Barnes at
Thu Jul 22 20:48:28 CEST 2004

I wasn't trying to be difficult and I hope you didn't take it that way. 

I simply need information on which probe length is best, and I thought
following up your comment might be a way to find references.  Of course,
Affy says 25-mers are best, and there must be an optimal length for the
reasons you explained.  However, I wonder what evidence there is that
25-mers are best as opposed to, say, 20-mers, 30-mers, 50-mers, 70-mers
or anything else.  Hopefully there are some suggestions and references
out there that could help me.

On a related question...  Affy claims 25-mers, yet they synthesize
their oligos on the chips.  We all know reactions are not perfect, so
there must be some amount of synthesis failure.  Does anyone have a feel
for the percentage of complete/incomplete oligos on an Affy feature?
And are the short oligos prevented from binding to your sample in some
way?
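
To make the synthesis question concrete, here is a rough sketch in Python.
It assumes the simplest possible model: each base addition succeeds
independently with the same probability, and a failed step leaves a
truncated oligo. The per-step yields below are hypothetical values picked
for illustration, not Affymetrix figures, and the function name is made up
for this example.

```python
def full_length_fraction(stepwise_yield: float, length: int) -> float:
    """Fraction of oligos reaching full length under an independent-step model.

    Assumption: every one of the `length` coupling steps succeeds with the
    same probability `stepwise_yield`, independently of the others.
    """
    if not 0.0 <= stepwise_yield <= 1.0:
        raise ValueError("stepwise_yield must be between 0 and 1")
    if length < 0:
        raise ValueError("length must be non-negative")
    return stepwise_yield ** length


if __name__ == "__main__":
    # Hypothetical per-step yields, for illustration only.
    for y in (0.90, 0.95, 0.99):
        for n in (25, 50, 70):
            frac = full_length_fraction(y, n)
            print(f"step yield {y:.2f}, {n}-mer: {frac:.3f} full length")
```

Under this model the full-length fraction falls geometrically with probe
length, so any per-step loss costs more for a 50-mer or 70-mer than for a
25-mer. It says nothing about whether the truncated oligos still bind.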

BTW:  If you can find ANYTHING on the Affy site, more power to you:)


>>> "Matthew  Hannah" <Hannah at> 07/22/04 03:29AM >>>
I should have said it was just a logical guess.

What I meant was that if you had 2 homologous genes, obviously it
is going to be harder to avoid homologous regions if you need to find
50bp versus 25bp. But this is referring to cross-hybridisation between
PM and related sequences; I don't know how it would affect non-specific
binding of PM to non-complementary sequences (am I right to distinguish
these?). I should have said 'less-' rather than non-homologous, or
dropped the 'non-' in the initial post. Also, this would only apply if
there were related sequences present, but then different probe lengths
for different sequences wouldn't be ideal.
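
The 50bp-versus-25bp point can be sketched with a toy calculation. The
code below assumes we already know which stretches of a gene are shared
with a related gene, and counts how many probe start positions avoid them
entirely. The gene length, the region coordinates and the function name
are all hypothetical, chosen only to illustrate the argument.

```python
def count_specific_windows(gene_length, homologous_regions, probe_length):
    """Count probe start positions that avoid every homologous region.

    `homologous_regions` is a list of half-open (start, end) intervals.
    A probe occupies [start, start + probe_length) and counts as specific
    only if it does not overlap any of those intervals.
    """
    count = 0
    for start in range(gene_length - probe_length + 1):
        end = start + probe_length
        if all(end <= h_start or start >= h_end
               for h_start, h_end in homologous_regions):
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical 300-base gene with two stretches shared with a relative.
    shared = [(40, 100), (140, 260)]
    for k in (25, 50, 70):
        print(f"{k}-mer: {count_specific_windows(300, shared, k)} candidate sites")
```

In this toy layout each unshared stretch is 40 bases long, so 25-base
probes fit but 50-base probes do not. Real probe design would also have
to score partial matches, which this sketch ignores.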

Also, while we're on logic, another reason to consider is that with
probesets per mRNA, for short mRNAs there is already some overlap; this
would be worse for longer probes, making them less independent. It
would also extend the probed region further from the 3' end from where
occurs and so efficiency may be reduced?
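
The overlap argument is simple arithmetic and can be sketched as below.
The probe count of 11 and the target lengths are hypothetical numbers
used for illustration, and the function names are invented for this
example.

```python
def minimum_excess_coverage(n_probes, probe_length, target_length):
    """Lower bound on bases covered more than once (counted with multiplicity).

    The probes together span n_probes * probe_length bases, but they can
    cover at most target_length distinct positions, so any surplus must be
    overlap between probes.
    """
    if probe_length > target_length:
        raise ValueError("a probe cannot be longer than the target region")
    return max(0, n_probes * probe_length - target_length)


def span_without_overlap(n_probes, probe_length):
    """Shortest stretch that holds all probes with no overlap at all."""
    return n_probes * probe_length


if __name__ == "__main__":
    n = 11  # hypothetical number of probes per transcript
    for target in (200, 400, 600):
        for k in (25, 50):
            forced = minimum_excess_coverage(n, k, target)
            print(f"target {target} bases, {k}-mers: "
                  f"at least {forced} overlapping bases, "
                  f"{span_without_overlap(n, k)} bases needed to avoid overlap")
```

Doubling the probe length doubles the stretch needed for independent
probes, which is the sense in which longer probes either overlap more on
short mRNAs or reach further from the 3' end.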

If you need a reference I'm sure the affy website or some of their
would have something.

Sorry for any confusion.


-----Original Message-----
From: Michael Barnes [mailto:Michael.Barnes at] 
Sent: Mittwoch, 21. Juli 2004 19:49
To: Matthew Hannah; bioconductor at 
Subject: Re: [BioC] GCRMA backgrounds?

What are references for this?


>>> "Matthew  Hannah" <Hannah at> 07/21/04 12:45PM >>>

As for the 25mers, the obvious thing to take into account is that
as you increase in length it is more likely that non-homologous
probes will bind as it would be more difficult to find sequences
that are gene specific.


Bioconductor mailing list
Bioconductor at 


	I've been using GCRMA and the new speedier version (1.1) 
gives different values than the older slower version (1.0). 

	Looking through the bioconductor mails suggests that 
a few other people identified a similar problem, related to a
background not being subtracted. Hopefully people are on the case, 
but this problem seems to have been around since April. I've been 
plugging GCRMA to my colleagues, who are now starting to use it, 
so I hope the problem can be sorted out.

	On a different note, what technical limitations stop 
Affymetrix going for much longer probes than 25 bases? The work 
of Naef and Magnasco, and of Wu and Irizarry, highlights the 
limitations of Affy technology due to cross-hybridisation (CH) when 
there are only 25 bases. Pushing upwards to 50 bases will reduce CH, 
but what other factors then come in? 

	My understanding is that the Affy SNP chips have 25 base 
oligos. What is stopping these chips from also having 
cross-hybridisation issues?	

	Best wishes,
