[BioC] pd.mapping250k.sty package: featureSet:fragment_length

Zhu, Julie Julie.Zhu at umassmed.edu
Fri Sep 17 19:25:28 CEST 2010


Jim,

Thank you very much for the detailed information! It all makes sense.

Best regards,

Julie


On 9/17/10 12:58 PM, "James W. MacDonald" <jmacdon at med.umich.edu> wrote:

> Hi Julie,
> 
> On 9/17/2010 9:53 AM, Zhu, Julie wrote:
>> Hi,
>> 
>> Could someone please tell me whether the fragment_length in the featureSet
>> of pd.mapping250k.sty is the fragment_length of the sample? Is there
>> documentation available for looking up the meaning of each field?
> 
> The fragment_length is the length of the restriction fragment. You could
> hypothetically have figured this out yourself by comparing the fragment
> length to the data on the netaffx site. Unfortunately, it looks like the
> current version of the pd.mapping250k.sty package is out of date when
> compared to what netaffx has, as the fragment length data for these two
> probesets don't agree.
> 
> This is not true of the pd.genomewidesnp.6 package, which is what I have
> installed. So for instance,
> 
>> dbGetQuery(con, "select fragment_length, fragment_length2, man_fsetid
>   from featureSet limit 10;")
>     fragment_length fragment_length2    man_fsetid
> 1              395              217 SNP_A-2131660
> 2               NA              702 SNP_A-1967418
> 3              633              883 SNP_A-1969580
> 4              831              399 SNP_A-4263484
> 5              970              611 SNP_A-1978185
> 6             1508              711 SNP_A-4264431
> 7               NA              921 SNP_A-1980898
> 8               NA              243 SNP_A-1983139
> 9               NA              194 SNP_A-4265735
> 10             420              858 SNP_A-1995832
> 
> the fragment_length and fragment_length2 data here do agree (well, at
> least the two I checked agree ;-P) with netaffx.
> 
> As for the other field names, most seem clear to me. Is there one in
> particular that is not clear?
> 
>> 
>> Some rows have NAs for most of the fields even though the allele
>> information is known. Is this expected?
> 
> It is expected, depending on when the package was built. We are simply
> taking data from Affymetrix and re-packaging into an object that is
> easier to use, so we are dependent on the data we get from Affy. Since
> annotation of genetic data is a moving target, things are always changing.
> 
> We only build these packages on a semi-annual basis, so we end up out of
> date quite quickly. This is a tradeoff between having the most
> up-to-date data, and having stable data packages that people can rely on.
> 
> We do provide the functionality to build your own, so if you desire the
> most up-to-date package, you can build a personal package using the
> pdInfoBuilder package.
> 
> Best,
> 
> Jim
> 
> 
>> 
>> Thanks so much for your help!
>> 
>> library("pd.mapping250k.sty")
>> con = db(pd.mapping250k.sty)
>> dbListFields(con, "featureSet")
>>  [1] "fsetid"          "man_fsetid"      "dbsnp_rs_id"     "chrom"
>>  [5] "physical_pos"    "strand"          "cytoband"        "allele_a"
>>  [9] "allele_b"        "gene_assoc"      "fragment_length" "dbsnp"
>> [13] "cnv"
>> 
>> dbGetQuery(con, "select * from featureSet order by fsetid desc limit 2")
>>   fsetid    man_fsetid dbsnp_rs_id chrom physical_pos strand cytoband allele_a allele_b
>> 1 238378 SNP_A-4301986   rs6989223     8      5214036      -    p23.2        A        G
>> 2 238377 SNP_A-2291495  rs11644392  <NA>           NA   <NA>     <NA>        A        G
>>   fragment_length dbsnp
>> 1            1667     0
>> 2              NA    NA
>> 
>> 
>> Best regards,
>> 
>> Julie
>> 
>> sessionInfo()
>> R version 2.11.1 (2010-05-31)
>> x86_64-apple-darwin9.8.0
>> 
>> locale:
>> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>> 
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>> 
>> other attached packages:
>> [1] pd.mapping250k.sty_1.0.0 RSQLite_0.9-2            DBI_0.2-5
>> [4] oligo_1.12.2             oligoClasses_1.10.0      Biobase_2.8.0
>> [7] affxparser_1.20.0
>> 
>> loaded via a namespace (and not attached):
>> [1] affyio_1.16.0         Biostrings_2.16.9     IRanges_1.6.11        preprocessCore_1.10.0
>> [5] splines_2.11.1        tools_2.11.1
>> 
>> 
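A minimal sketch, assuming only the DBI and RSQLite packages, of how one
might count the featureSet rows that lack a fragment length, which is the
situation discussed in the thread above. The in-memory table is a toy
stand-in built from the ten pd.genomewidesnp.6 rows quoted in Jim's reply;
it is not the package database, and the names feature_set, n_missing and
missing_ids are illustrative only.

```r
# Sketch only: a toy copy of part of featureSet, not the real package database.
library(DBI)
library(RSQLite)

con <- dbConnect(SQLite(), ":memory:")

# Ten rows copied from the pd.genomewidesnp.6 query output quoted in the thread.
feature_set <- data.frame(
  man_fsetid = c("SNP_A-2131660", "SNP_A-1967418", "SNP_A-1969580",
                 "SNP_A-4263484", "SNP_A-1978185", "SNP_A-4264431",
                 "SNP_A-1980898", "SNP_A-1983139", "SNP_A-4265735",
                 "SNP_A-1995832"),
  fragment_length  = c(395L, NA, 633L, 831L, 970L, 1508L, NA, NA, NA, 420L),
  fragment_length2 = c(217L, 702L, 883L, 399L, 611L, 711L, 921L, 243L, 194L, 858L),
  stringsAsFactors = FALSE
)
dbWriteTable(con, "featureSet", feature_set)

# An R NA is stored as SQL NULL, so missing annotation is tested with IS NULL.
n_missing <- dbGetQuery(
  con,
  "SELECT COUNT(*) AS n FROM featureSet WHERE fragment_length IS NULL"
)$n

# List the probesets that lack a first fragment length, in a stable order.
missing_ids <- dbGetQuery(
  con,
  "SELECT man_fsetid FROM featureSet
   WHERE fragment_length IS NULL ORDER BY man_fsetid"
)$man_fsetid

dbDisconnect(con)
```

Against an installed annotation package, the same two queries should work
on the connection returned by db(pd.mapping250k.sty), as used in the
original message; in that case the connection belongs to the package and
should probably not be disconnected.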