[BioC] Repeat masker sequences as GRanges object

Hermann Norpois hnorpois at gmail.com
Mon Sep 15 13:19:58 CEST 2014


Thanks,

I followed your instructions but I got an error message:

> library(AnnotationHub)
> hub <- AnnotationHub()
> seq.masked <- hub$goldenpath.hg19.database.rmsk_0.0.1.RData
Retrieving ‘goldenpath/hg19/database/rmsk_0.0.1.RData’
Fehler: Lesefehler aus Verbindung   # German for "Error: read error from connection"

But I was able to download the package, so in principle a connection is
possible.
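
One way to tell a transient network failure from a persistent one is to retry the retrieval a few times and print each error message. The following is only a minimal base-R sketch; `with_retry` is a hypothetical helper written for illustration and is not part of AnnotationHub.

```r
# Hypothetical helper (not part of AnnotationHub): call `f` up to `tries`
# times, pausing `wait` seconds between attempts. If every attempt fails,
# the last error is re-raised.
with_retry <- function(f, tries = 3, wait = 1) {
  last_error <- NULL
  for (i in seq_len(tries)) {
    # Capture an error as a value instead of aborting.
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    last_error <- result
    message("Attempt ", i, " failed: ", conditionMessage(last_error))
    if (i < tries) Sys.sleep(wait)
  }
  stop(last_error)
}

# Usage, assuming `hub <- AnnotationHub()` as in the session above:
# seq.masked <- with_retry(function() hub$goldenpath.hg19.database.rmsk_0.0.1.RData)
```

If every attempt fails with the same message, the problem is probably not a momentary network glitch.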

sessionInfo ()
R version 3.1.0 (2014-04-10)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=de_DE.UTF-8        LC_COLLATE=de_DE.UTF-8
 [5] LC_MONETARY=de_DE.UTF-8    LC_MESSAGES=de_DE.UTF-8
 [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] AnnotationHub_1.0.2  GenomicRanges_1.12.5 IRanges_1.18.4
[4] BiocGenerics_0.6.0

loaded via a namespace (and not attached):
[1] AnnotationDbi_1.22.6 Biobase_2.20.1       BiocInstaller_1.10.4
[4] compiler_3.1.0       DBI_0.3.0            rjson_0.2.14
[7] RSQLite_0.11.4       stats4_3.1.0         tools_3.1.0

What's wrong with my connection?

Thanks
Hermann


2014-09-12 15:30 GMT+02:00 James W. MacDonald <jmacdon at uw.edu>:

> Hi Hermann,
>
> How about this:
>
> > library(AnnotationHub)
> > hub <- AnnotationHub()
> > hub$goldenpath.hg19.database.rmsk_0.0.1.RData
> GRanges with 5298130 ranges and 2 metadata columns:
>                          seqnames               ranges strand   |        name
>                             <Rle>            <IRanges>  <Rle>   | <character>
>         [1]                  chr1 [16777161, 16777470]      +   |       AluSp
>         [2]                  chr1 [25165801, 25166089]      -   |        AluY
>         [3]                  chr1 [33553607, 33554646]      +   |         L2b
>         [4]                  chr1 [50330064, 50332153]      +   |      L1PA10
>         [5]                  chr1 [58720068, 58720973]      -   |       L1PA2
>         ...                   ...                  ...    ... ...         ...
>   [5298126] chr21_gl000210_random       [25379, 25875]      +   |      MER74B
>   [5298127] chr21_gl000210_random       [26438, 26596]      -   |        MIRc
>   [5298128] chr21_gl000210_random       [26882, 27022]      -   |        MIRc
>   [5298129] chr21_gl000210_random       [27297, 27447]      +   |  HAL1-2a_MD
>   [5298130] chr21_gl000210_random       [27469, 27682]      +   |  HAL1-2a_MD
>                 score
>             <numeric>
>         [1]      2147
>         [2]      2626
>         [3]       626
>         [4]     12545
>         [5]      8050
>         ...       ...
>   [5298126]      1674
>   [5298127]       308
>   [5298128]       475
>   [5298129]       371
>   [5298130]       370
>   ---
>   seqlengths:
>                     chr1                  chr2 ... chr18_gl000207_random
>                249250621             243199373 ...                  4262
>
> This is a GRanges of all features from UCSC's Repeat Masker table.
>
> Best,
>
> Jim
>
>
>
>
> On Thu, Sep 11, 2014 at 3:16 AM, Hermann Norpois <hnorpois at gmail.com>
> wrote:
>
>> Hello,
>>
>> I would like to have repeat sequences as a GRanges object.
>> I started with ...
>>
>> library (BSgenome.Hsapiens.UCSC.hg19)
>> ch1 <- Hsapiens$chr1
>> active (masks (ch1))
>> AGAPS   AMB    RM   TRF
>>  TRUE  TRUE FALSE FALSE
>> active (masks(ch1))["RM"] <- TRUE
>> active (masks (ch1))
>> AGAPS   AMB    RM   TRF
>>  TRUE  TRUE  TRUE FALSE
>>
>> Can anybody give me a hint on how to continue?
>>
>> Thanks
>> hermann
>>
>
>
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099
>
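
Once the Repeat Masker GRanges quoted above has been retrieved, it can be subset like any other GRanges. The sketch below is hedged: so that it runs without a hub connection, it builds a toy object `rmsk` from the first five ranges printed in the quoted output. The variable names `rmsk`, `alu` and `repeat_regions` are made up for illustration.

```r
library(GenomicRanges)

# Toy stand-in for the hub object: the first five ranges printed above.
rmsk <- GRanges(
  seqnames = rep("chr1", 5),
  ranges = IRanges(
    start = c(16777161, 25165801, 33553607, 50330064, 58720068),
    end   = c(16777470, 25166089, 33554646, 50332153, 58720973)
  ),
  strand = c("+", "-", "+", "+", "-"),
  name   = c("AluSp", "AluY", "L2b", "L1PA10", "L1PA2"),
  score  = c(2147, 2626, 626, 12545, 8050)
)

# Keep only Alu elements, selected by the `name` metadata column.
alu <- rmsk[grepl("^Alu", mcols(rmsk)$name)]

# Merge overlapping or adjacent repeats into strand-agnostic regions.
repeat_regions <- reduce(rmsk, ignore.strand = TRUE)
```

With the real object, the same two lines would apply to all 5298130 ranges. The regular expression is an assumption about how the repeat names of interest are spelled.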
