[BioC] Affy's 500K SNP arrays - retrieval of probe info

De Bondt, An-7114 [PRDBE] ADBONDT at PRDBE.jnj.com
Tue May 29 12:54:38 CEST 2007


Exactly, Ben, thanks a lot!

Applying this to the Sty-based feature set (6553600 rows) gives
3 vectors, each of length 3201544 (the other 3352056 rows correspond to
MM), and a centralSnps vector of length 454224.

What I do not understand yet:
The number of rows after snprma() is 238304 for Sty. How is that number
related to the length of centralSnps?
I had expected the length of centralSnps to be 4 times the number of
rows after snprma():
     one central snp for alleleA on the sense strand
     one central snp for alleleB on the sense strand
     one central snp for alleleA on the antisense strand
     one central snp for alleleB on the antisense strand
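
To make the mismatch concrete, the counts quoted above can be put side by side. This is only arithmetic on the numbers in this message; the variable names are made up for illustration, and the sketch does not explain why the observed count differs from the expected one.

```r
# Numbers quoted in this message (Sty array); names are hypothetical
nFeatures   <- 6553600   # rows in the Sty feature set
nPM         <- 3201544   # length of each PM vector
nCentral    <- 454224    # length(centralSnps)
nSnpsSnprma <- 238304    # rows after snprma()

# Features not accounted for by PM probes
nMM <- nFeatures - nPM

# Expectation stated above: 2 alleles x 2 strands = 4 central probes per SNP
expectedCentral <- 4 * nSnpsSnprma

nMM                      # 3352056, matches the MM count quoted above
expectedCentral          # 953216
nCentral / nSnpsSnprma   # observed central probes per SNP on average, about 1.9
```

So the observed length of centralSnps is roughly half of what the "4 per SNP" assumption predicts.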

Kind regards,
An


-----Original Message-----
From: Benilton Carvalho [mailto:bcarvalh at jhsph.edu]
Sent: Friday, 25 May 2007 14:33
To: De Bondt, An-7114 [PRDBE]
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] Affy's 500K SNP arrays - retrieval of probe info


Hi An,

I'm assuming you want the offset and GC content for the PM probes, ok?

Say your probe-level data (SnpFeature object) is called "rawData".

theOffset <- pmPosition(get(annotation(rawData)))
theSequences <- pmSequence(get(annotation(rawData)))

centralSnps <- which(theOffset == 0)
# count only real matches: gregexpr() returns -1 when a sequence has no G or C
percentGC <- sapply(gregexpr("[GC]", theSequences), function(m) sum(m > 0))/25
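
For anyone who wants to try the idea without the annotation package loaded, here is a minimal sketch on toy data. The offsets and 25-mer sequences are invented stand-ins for what pmPosition() and pmSequence() would return, and countGC is a hypothetical helper, not part of any package. Since gregexpr() returns -1 for a sequence with no match, the helper counts positive match positions rather than taking the length of the result.

```r
# Toy stand-ins for pmPosition() / pmSequence() output (hypothetical values)
theOffset    <- c(-4, 0, 1, 0, 3)
theSequences <- c("ACGTACGTACGTACGTACGTACGTA",
                  "GGGGGCCCCCGGGGGCCCCCGGGGG",
                  "ATATATATATATATATATATATATA",
                  "ACGTTGCAACGTTGCAACGTTGCAA",
                  "CCCCCAAAAACCCCCAAAAACCCCC")

# Hypothetical helper: number of G or C bases in each sequence.
# gregexpr() gives -1 when there is no match, so count positions > 0.
countGC <- function(seqs) {
  matches <- gregexpr("[GC]", seqs)
  sapply(matches, function(m) sum(m > 0))
}

# Fraction of G/C per probe, using the actual sequence length
percentGC <- countGC(theSequences) / nchar(theSequences)

# Probes whose SNP sits exactly in the middle (offset 0)
centralSnps <- which(theOffset == 0)

percentGC               # 0.48 1.00 0.00 0.48 0.60
centralSnps             # 2 4
percentGC[centralSnps]  # 1.00 0.48
```

Dividing by nchar() instead of a fixed 25 is a design choice for the sketch only; for these arrays the probes are 25-mers, so both give the same result.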

b

On May 25, 2007, at 8:00 AM, De Bondt, An-7114 [PRDBE] wrote:

> Dear,
>
> From the raw probe-level data, we would like to select only the central
> SNP probes (position 0, with the SNP position exactly in the middle)
> from the sense as well as from the antisense strand.  How can we do this?
>
> We know we can get the GC content from that central probe based on the
> 'Mapping250K_Nsp snp info.txt' file.  How can we get %GC for each of the
> other probes as well? Is there a cdf for the Nsp and Sty arrays? Or can we
> get this info out of the pd.mapping250k.nsp/pd.mapping250k.sty? Or is
> there another way to get that info?
>
> Thanks in advance for your help!
>
> Regards,
> An


--
Benilton Carvalho
PhD Candidate
Department of Biostatistics
Bloomberg School of Public Health
Johns Hopkins University
bcarvalh at jhsph.edu


