[BioC] Analysing expression with tiling arrays

Sat Sep 11 11:25:19 CEST 2010

  January,

On Sep/10/10 9:21 AM, January Weiner wrote:
> Dear all,
>
> I have two tiling arrays of a bacterial genome. Unfortunately, I do
> not have the original files (like the bpmap / cel files for Affy
> tiling chips), just lists of spot intensities in two conditions for
> each probe (i.e. two values for each probe), and a list of gene
> positions on the genome. Several probes map to each gene. The genome
> is not publicly available yet.
>
> What would be the best way to tackle this? I thought that I might just
> calculate the logFC for each probe, and then, for each gene, run a one
> sample t-test of the corresponding probe logFC values; then correct
> for multiple testing.

This sounds reasonable. Just be aware that the noise in the data from
neighbouring probes is likely correlated, so the t-distribution with
the 'naive' degrees of freedom (number of probes minus one) will give
you optimistic (too small) p-values. You can still use them for
ranking / prioritizing genes, and perhaps set the cutoff from known
positive and negative control genes.
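
A minimal sketch of the per-gene ranking, in plain Python, might look
like the following. Everything in it is an assumption made for
illustration: the function name gene_t_statistics, the input layout
(one (position, intensity_a, intensity_b) tuple per probe, with raw
positive intensities), the gene table with inclusive start/end
coordinates, and the toy numbers. It computes the one-sample
t-statistic of the probe logFC values for each gene and sorts genes by
its absolute value. It deliberately stops short of p-values, since
those would be too small when neighbouring probes are correlated.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def gene_t_statistics(probes, genes):
    """Rank genes by the one-sample t-statistic of their probe logFC values.

    probes: iterable of (position, intensity_a, intensity_b), raw intensities
    genes:  dict mapping gene name -> (start, end), inclusive coordinates
    Both layouts are hypothetical; adapt them to the actual files.
    """
    log_fc = defaultdict(list)
    for pos, a, b in probes:
        for name, (start, end) in genes.items():
            if start <= pos <= end:
                # log2 fold change of condition b relative to condition a
                log_fc[name].append(math.log2(b / a))

    results = []
    for name, values in log_fc.items():
        n = len(values)
        if n < 2:
            continue  # a standard deviation needs at least two probes
        sd = stdev(values)
        if sd == 0:
            continue  # t-statistic undefined without any variation
        m = mean(values)
        t = m / (sd / math.sqrt(n))
        results.append((name, n, m, t))

    # Largest |t| first: used for ranking only, not for calibrated p-values
    results.sort(key=lambda r: abs(r[3]), reverse=True)
    return results


# Toy data: geneA is roughly 4-fold up, geneB is essentially unchanged
genes = {"geneA": (1, 100), "geneB": (101, 200)}
probes = [
    (10, 100, 400), (30, 100, 420), (50, 100, 380), (70, 100, 410),
    (110, 100, 110), (130, 100, 90), (150, 100, 105), (170, 100, 95),
]
ranked = gene_t_statistics(probes, genes)
for name, n, m, t in ranked:
    print(f"{name}\tprobes={n}\tmean_logFC={m:.3f}\tt={t:.2f}")
```

With known positive and negative control genes in the table, one could
then look at where they fall in this ranking and choose the cutoff on
|t| accordingly.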

>
> Would that make sense? I looked up the approach described in Toedling
> and Huber in 2008 PLoS Comp Biol (doi:10.1371/journal.pcbi.1000227)
> but this is not exactly what I had in mind; rather than looking for
> enriched regions, I'm more interested in focusing on the genes
> directly -- as a bacterial genome is densely packed with probes and
> genes (I have 10-30 probes per gene).
>
> Best regards,
>
> January
>

-- 

Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber