[BioC] seqnames on dbSNP object error

Hervé Pagès hpages at fhcrc.org
Sat Apr 9 21:38:43 CEST 2011


Hi Mark,

Sorry for the late answer.

On 11-04-05 08:38 PM, Mark Cowley wrote:
> Hi folks,
> I'm trying to compare a few different GRanges objects (such as mutations made by me, SNPs from SNPlocs.Hsapiens.dbSNP.20101109, and transcripts from GenomicFeatures). Since the dbSNP data package uses 'ch1' rather than 'chr1', I need to coerce the seqnames so that they can be matched.
> Here's the error:
>
>> snps<- getSNPlocs("ch17", as.GRanges=TRUE)
>> seqnames(snps)<- rep('chr17', length(snps))
> Error in validObject(.Object) :
>    invalid class "Seqinfo" object: 'seqnames(x)' must be an unnamed character vector with no NAs
>
> According to the man page this should be doable:
> ?`seqnames,GRanges-method`
>     ‘seqnames(x)’, ‘seqnames(x)<- value’: Gets or sets the sequence names.
>            ‘value’ can be an Rle object, character vector, or factor.

There is an example at the bottom of the man page showing how to
do this. In your case:

   seqnames(snps) <- sub("ch", "chr", seqnames(snps))
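
To see what the substitution does to the names themselves, here is a small sketch on a plain character vector rather than on the GRanges object. The vector old_names is made up for illustration, and the "^" anchor is my addition so that only a leading "ch" gets rewritten:

```r
# Hypothetical sequence names in the "ch" style used by the dbSNP package
old_names <- c("ch1", "ch17", "chX", "chMT")

# Anchor the pattern at the start so only the leading "ch" is replaced
new_names <- sub("^ch", "chr", old_names)

print(new_names)
```

The unanchored pattern gives the same result for these names; the anchor just guards against names that happen to contain "ch" somewhere else.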

Note that the mechanism for renaming, dropping, adding, or reordering
the names of the underlying sequences has been revisited in BioC 2.8
(soon to be released). Users will do this through the seqlevels getter
and setter.
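
As a rough sketch of what that could look like (assuming BioC 2.8 or later; the small GRanges object gr below is made up and only stands in for your snps object):

```r
library(GenomicRanges)  # also attaches IRanges

# Small made-up GRanges standing in for the dbSNP object (an assumption)
gr <- GRanges(seqnames = c("ch1", "ch17", "ch17"),
              ranges = IRanges(start = c(10, 293, 828), width = 1))

# Rename every sequence level in one step via the seqlevels setter
seqlevels(gr) <- sub("^ch", "chr", seqlevels(gr))

print(seqlevels(gr))
print(as.character(seqnames(gr)))
```

Because the setter works on the levels rather than on the per-range values, the sequence names of all ranges are renamed consistently in one go.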

Cheers,
H.

>
> # Try using an Rle:
>> seqnames(snps)<- Rle(rep('chr17', length(snps)))
> Error in validObject(.Object) :
>    invalid class "Seqinfo" object: 'seqnames(x)' must be an unnamed character vector with no NAs
>
>
> Reproducible code in a fresh R session:
> library(SNPlocs.Hsapiens.dbSNP.20101109)
> snps<- getSNPlocs("ch17", as.GRanges=TRUE)
> seqnames(snps)<- rep('chr17', length(snps))
> Error in validObject(.Object) :
>    invalid class "Seqinfo" object: 'seqnames(x)' must be an unnamed character vector with no NAs
>
> snps
> GRanges with 641905 ranges and 2 elementMetadata values
>           seqnames               ranges strand   |   RefSNP_id alleles_as_ambig
>              <Rle>             <IRanges>   <Rle>    |<character>       <character>
>       [1]     ch17         [ 293,  293]      *   |     9747578                R
>       [2]     ch17         [ 828,  828]      *   |    62053745                Y
>       [3]     ch17         [ 834,  834]      *   |     9747082                R
>       [4]     ch17         [1389, 1389]      *   |    62053747                R
>       [5]     ch17         [1397, 1397]      *   |    34845611                Y
>       [6]     ch17         [1665, 1665]      *   |    34151105                Y
>       [7]     ch17         [1869, 1869]      *   |    62053748                W
>       [8]     ch17         [1880, 1880]      *   |    77383171                Y
>       [9]     ch17         [1897, 1897]      *   |    75157665                R
>       ...      ...                  ...    ... ...         ...              ...
> [641897]     ch17 [81180274, 81180274]      *   |    74430365                R
> [641898]     ch17 [81189713, 81189713]      *   |    74334266                Y
> [641899]     ch17 [81189731, 81189731]      *   |    75151244                W
> [641900]     ch17 [81190324, 81190324]      *   |    76196913                S
> [641901]     ch17 [81190344, 81190344]      *   |    78502756                R
> [641902]     ch17 [81190367, 81190367]      *   |     2850176                Y
> [641903]     ch17 [81190378, 81190378]      *   |    71264801                R
> [641904]     ch17 [81190400, 81190400]      *   |    74838487                R
> [641905]     ch17 [81193098, 81193098]      *   |    77334326                R
>
> seqlengths
>    ch1  ch2  ch3  ch4  ch5  ch6  ch7  ch8 ... ch19 ch20 ch21 ch22  chX  chY chMT
>     NA   NA   NA   NA   NA   NA   NA   NA ...   NA   NA   NA   NA   NA   NA   NA
>
>> sessionInfo()
> R version 2.12.1 (2010-12-16)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
> locale:
> [1] en_AU.UTF-8/en_AU.UTF-8/C/C/en_AU.UTF-8/en_AU.UTF-8
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] SNPlocs.Hsapiens.dbSNP.20101109_0.99.2
> [2] GenomicRanges_1.2.3
> [3] IRanges_1.8.8
>
> loaded via a namespace (and not attached):
> [1] tools_2.12.1


-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319