[BioC] seqnames on dbSNP object error

Mark Cowley m.cowley at garvan.org.au
Mon Apr 11 02:49:00 CEST 2011


cheers Hervé, that works well
M

On 10/04/2011, at 5:38 AM, Hervé Pagès wrote:

> Hi Mark,
> 
> Sorry for the late answer.
> 
> On 11-04-05 08:38 PM, Mark Cowley wrote:
>> Hi folks,
>> I'm trying to compare a few different GRanges objects (mutations I've made, SNPs from SNPlocs.Hsapiens.dbSNP.20101109, and transcripts from GenomicFeatures). Since the dbSNP data package uses 'ch1' rather than 'chr1', I need to coerce the seqnames so that they can be matched.
>> Here's the error:
>> 
>>> snps<- getSNPlocs("ch17", as.GRanges=TRUE)
>>> seqnames(snps)<- rep('chr17', length(snps))
>> Error in validObject(.Object) :
>>   invalid class "Seqinfo" object: 'seqnames(x)' must be an unnamed character vector with no NAs
>> 
>> According to the man page this should be doable:
>> ?`seqnames,GRanges-method`
>>    ‘seqnames(x)’, ‘seqnames(x)<- value’: Gets or sets the sequence names.
>>           ‘value’ can be an Rle object, character vector, or factor.
> 
> There is an example at the bottom of the man page showing how to
> do this. In your case:
> 
>  seqnames(snps) <- sub("ch", "chr", seqnames(snps))
> 
> Note that the mechanism for renaming, dropping, adding or reordering
> the names of the underlying sequences has been revisited in BioC 2.8
> (soon-to-be-released). The user will do this through the seqlevels
> getter and setter.
> 
> Cheers,
> H.
> 
>> 
>> # Try using an Rle:
>>> seqnames(snps)<- Rle(rep('chr17', length(snps)))
>> Error in validObject(.Object) :
>>   invalid class "Seqinfo" object: 'seqnames(x)' must be an unnamed character vector with no NAs
>> 
>> 
>> Reproducible code in a fresh R session:
>> library(SNPlocs.Hsapiens.dbSNP.20101109)
>> snps<- getSNPlocs("ch17", as.GRanges=TRUE)
>> seqnames(snps)<- rep('chr17', length(snps))
>> Error in validObject(.Object) :
>>   invalid class "Seqinfo" object: 'seqnames(x)' must be an unnamed character vector with no NAs
>> 
>> snps
>> GRanges with 641905 ranges and 2 elementMetadata values
>>          seqnames               ranges strand   |   RefSNP_id alleles_as_ambig
>>             <Rle>             <IRanges>   <Rle>    |<character>       <character>
>>      [1]     ch17         [ 293,  293]      *   |     9747578                R
>>      [2]     ch17         [ 828,  828]      *   |    62053745                Y
>>      [3]     ch17         [ 834,  834]      *   |     9747082                R
>>      [4]     ch17         [1389, 1389]      *   |    62053747                R
>>      [5]     ch17         [1397, 1397]      *   |    34845611                Y
>>      [6]     ch17         [1665, 1665]      *   |    34151105                Y
>>      [7]     ch17         [1869, 1869]      *   |    62053748                W
>>      [8]     ch17         [1880, 1880]      *   |    77383171                Y
>>      [9]     ch17         [1897, 1897]      *   |    75157665                R
>>      ...      ...                  ...    ... ...         ...              ...
>> [641897]     ch17 [81180274, 81180274]      *   |    74430365                R
>> [641898]     ch17 [81189713, 81189713]      *   |    74334266                Y
>> [641899]     ch17 [81189731, 81189731]      *   |    75151244                W
>> [641900]     ch17 [81190324, 81190324]      *   |    76196913                S
>> [641901]     ch17 [81190344, 81190344]      *   |    78502756                R
>> [641902]     ch17 [81190367, 81190367]      *   |     2850176                Y
>> [641903]     ch17 [81190378, 81190378]      *   |    71264801                R
>> [641904]     ch17 [81190400, 81190400]      *   |    74838487                R
>> [641905]     ch17 [81193098, 81193098]      *   |    77334326                R
>> 
>> seqlengths
>>   ch1  ch2  ch3  ch4  ch5  ch6  ch7  ch8 ... ch19 ch20 ch21 ch22  chX  chY chMT
>>    NA   NA   NA   NA   NA   NA   NA   NA ...   NA   NA   NA   NA   NA   NA   NA
>> 
>>> sessionInfo()
>> R version 2.12.1 (2010-12-16)
>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>> 
>> locale:
>> [1] en_AU.UTF-8/en_AU.UTF-8/C/C/en_AU.UTF-8/en_AU.UTF-8
>> 
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>> 
>> other attached packages:
>> [1] SNPlocs.Hsapiens.dbSNP.20101109_0.99.2
>> [2] GenomicRanges_1.2.3
>> [3] IRanges_1.8.8
>> 
>> loaded via a namespace (and not attached):
>> [1] tools_2.12.1
>> 
> 
> 
> -- 
> Hervé Pagès
> 
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M2-B876
> P.O. Box 19024
> Seattle, WA 98109-1024
> 
> E-mail: hpages at fhcrc.org
> Phone:  (206) 667-5791
> Fax:    (206) 667-1319
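
For readers on BioC 2.8 or later, the seqlevels getter and setter that Hervé mentions could be used roughly as follows. This is a minimal sketch, not code from the thread: the anchored pattern "^ch" and the toy three-range object are assumptions made for illustration, and the GenomicRanges part only runs if that package is installed.

```r
# Seqlevel names in the style the dbSNP package uses
# (taken from the seqlengths output quoted above).
old_levels <- c("ch1", "ch17", "chX", "chMT")

# Anchor the pattern so only the leading "ch" prefix is rewritten.
new_levels <- sub("^ch", "chr", old_levels)
print(new_levels)

# Apply the same renaming to a GRanges object through the seqlevels setter.
# The toy object below is an assumption for illustration, not real dbSNP data.
if (requireNamespace("GenomicRanges", quietly = TRUE)) {
  suppressPackageStartupMessages(library(GenomicRanges))
  toy <- GRanges(seqnames = c("ch17", "ch17", "chMT"),
                 ranges = IRanges(start = c(293, 828, 100), width = 1))
  # Same-length replacement with no names in common renames the levels in place.
  seqlevels(toy) <- sub("^ch", "chr", seqlevels(toy))
  print(seqlevels(toy))
}
```

Note that this maps chMT to chrMT. UCSC-style annotations typically name the mitochondrial sequence chrM, so that level may need separate handling before comparing against such objects.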


