[BioC] mutant allele read counts
Valerie Obenchain
vobencha at fhcrc.org
Sat Jun 14 01:33:03 CEST 2014
Hi,
Your sessionInfo() shows VariantAnnotation 1.8.13, which is an old
version of the package. The current release version is 1.10.1. FYI you
can see all release versions here:
http://www.bioconductor.org/checkResults/release/bioc-LATEST/
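To see which version you have installed, packageVersion("VariantAnnotation")
reports it. Below is a minimal sketch of comparing that against the release;
needsUpdate is a made-up helper name for illustration, not part of any
package. It uses numeric_version() because a plain string comparison would
sort "1.8.13" after "1.10.1".

```r
# Hypothetical helper (not part of VariantAnnotation): returns TRUE when
# the installed version is older than the release version.
needsUpdate <- function(installed, release = "1.10.1") {
  # numeric_version() compares component by component, so 8 < 10 here.
  numeric_version(installed) < numeric_version(release)
}

# Version string taken from the sessionInfo() quoted further down.
old <- needsUpdate("1.8.13")
print(old)

# With the package installed, the live check would be:
# needsUpdate(as.character(packageVersion("VariantAnnotation")))
```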
Please update R and reinstall your packages with biocLite(). If you need
any help with that, the installation guidelines are here:
http://www.bioconductor.org/install/
If you want to send me a small test file offline I can confirm that the
current version can read it.
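Once the file reads, the data.frame can be cut down to the CRAVAT columns
listed in the original post. Here is a rough sketch, not a tested recipe. It
assumes as.data.frame(vr) gives columns named seqnames, start, strand, ref,
alt and sampleNames (check names(df) on your own data), and that unstranded
VCF records ("*") should be written as "+". toCravat, the "TR" prefix and the
sample name "tumor1" are made up for illustration, and the toy data.frame
only stands in for real output.

```r
# Hypothetical helper: reshape the output of as.data.frame(vr) into the
# CRAVAT columns UID / Chr / Position / Strand / Ref / Alt / Sample ID.
toCravat <- function(df, prefix = "TR") {
  # Assumption: VCF alleles are given on the forward strand, so an
  # unstranded "*" is written out as "+".
  strand <- as.character(df$strand)
  strand[strand == "*"] <- "+"
  data.frame(
    UID      = sprintf("%s%d", prefix, seq_len(nrow(df))),
    Chr      = as.character(df$seqnames),
    Position = df$start,
    Strand   = strand,
    Ref      = as.character(df$ref),
    Alt      = as.character(df$alt),
    SampleID = as.character(df$sampleNames),
    stringsAsFactors = FALSE
  )
}

# Toy input standing in for as.data.frame(vr). Positions and alleles are
# the first two records of the rowData() output quoted below; the sample
# name is invented.
df <- data.frame(
  seqnames    = c("chr1", "chr1"),
  start       = c(109641L, 526561L),
  strand      = c("*", "*"),
  ref         = c("A", "T"),
  alt         = c("G", "G"),
  sampleNames = c("tumor1", "tumor1"),
  stringsAsFactors = FALSE
)

cravat <- toCravat(df)

# Tab-delimited, no header, no quotes; printed to the console here.
write.table(cravat, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
```

On real data, replace the toy df with as.data.frame(vr) and give
write.table() a file name.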
Valerie
On 06/13/2014 03:06 PM, Murli wrote:
> Sorry, forgot to post the code contained in the R file in the earlier
> post. These are the few lines in vcfToCravat.R that I am sourcing.
> library(VariantAnnotation)
> DataDir="../Project_NAI_01117_TNWGS/Sample_3_Middle_lobe_tumor/analysis/"
> vcfFile=paste(DataDir,"3_Middle_lobe_tumor--2_mucosal_normal.snv.mutect.v1.1.4.annotated.vcf",sep="")
> vcfFile=paste(DataDir,"3_Middle_lobe_tumor--2_mucosal_normal.indel.sominddet.v2.3-9.vcf",sep="")
> vr <- readVcfAsVRanges(vcfFile, "hg19")
> df <- as.data.frame(vr)
>
>
> On Fri, Jun 13, 2014 at 5:47 PM, Murli <murlinair at gmail.com> wrote:
>
> Hi Valerie,
> Thanks for the help. I am getting the following errors when I am
> reading the vcf files.
> vr <- readVcfAsVRanges(vcfFile, "hg19")
> Error in lapply(ivar[inms], drop) :
> error in evaluating the argument 'X' in selecting a method for
> function 'lapply': Error in normalizeSingleBracketSubscript(j, x) :
> subscript contains invalid names
>
> With another file I am getting the following
> > source("vcfToCravat.R")
> Error in validObject(.Object) :
> invalid class VRanges
>
> Cheers../Murli
>
>
> On Fri, Jun 13, 2014 at 3:29 PM, Valerie Obenchain
> <vobencha at fhcrc.org> wrote:
>
> Hi,
>
> Use readVcfAsVRanges() then coerce to a data.frame.
>
> fl <- system.file("extdata", "chr7-sub.vcf.gz",
> package="VariantAnnotation")
> vr <- readVcfAsVRanges(fl, "hg19")
> df <- as.data.frame(vr)
>
> You'll have some extra columns in the data.frame but you can
> remove / rename columns as necessary.
>
> Valerie
>
>
>
>
>
> On 06/13/2014 10:46 AM, Murli [guest] wrote:
>
> Hi,
> I am interested in extracting information for functional
> annotation using CRAVAT. It requires the data to be in the
> following format.
> =============================================
> # UID / Chr. / Position / Strand / Ref. base / Alt. base / Sample ID (optional)
> TR1    chr17  7577506    -  G      T   TCGA-02-0231
> TR2    chr10  123279680  -  G      A   TCGA-02-3512
> TR3    chr13  49033967   +  C      A   TCGA-02-3532
> TR4    chr7   116417505  +  G      T   TCGA-02-1523
> TR5    chr7   140453136  -  T      A   TCGA-02-0023
> TR6    chr17  37880998   +  G      T   TCGA-02-0252
> Ins1   chr17  37880998   +  G      GT  TCGA-02-0252
> Del1   chr17  37880998   +  GA     G   TCGA-02-0252
> CSub1  chr2   39871235   +  ATGCT  GA  TCGA-02-0252
>
> =============================================
> http://www.cravat.us/help.jsp?chapter=how_to_cite&article=#
>
> I am trying to extract this information from vcf files
> generated by mutect. I am using VariantAnnotation to extract
> this information. I have read the file using readVcf() and
> renamed the chromosomes according to txdb.
>
> rowData(newVcfData)
> GRanges with 62991 ranges and 5 metadata columns:
>                 seqnames                 ranges strand   | paramRangeID
>                    <Rle>              <IRanges>  <Rle>   |     <factor>
>    1:109641_A/G     chr1       [109641, 109641]      *   |         <NA>
>    1:526561_T/G     chr1       [526561, 526561]      *   |         <NA>
>    1:691958_G/A     chr1       [691958, 691958]      *   |         <NA>
>    1:763781_A/T     chr1       [763781, 763781]      *   |         <NA>
>       rs6594026     chr1       [782981, 782981]      *   |         <NA>
>             ...      ...                    ...    ... ...          ...
>        rs480725     chrX [154903224, 154903224]      *   |         <NA>
> X:154925893_C/T     chrX [154925893, 154925893]      *   |         <NA>
> X:155038107_C/G     chrX [155038107, 155038107]      *   |         <NA>
> X:155204257_G/T     chrX [155204257, 155204257]      *   |         <NA>
> X:155234730_T/C     chrX [155234730, 155234730]      *   |         <NA>
>                            REF                ALT      QUAL      FILTER
>                 <DNAStringSet> <DNAStringSetList> <numeric> <character>
>    1:109641_A/G              A                  G      8.90        PASS
>    1:526561_T/G              T                  G      9.19        PASS
>    1:691958_G/A              G                  A     13.74        PASS
>    1:763781_A/T              A                  T     16.03        PASS
>       rs6594026              C                  T     11.24        PASS
>             ...            ...                ...       ...         ...
>        rs480725              A                  T      6.39        PASS
> X:154925893_C/T              C                  T      6.53        PASS
> X:155038107_C/G              C                  G      6.64        PASS
> X:155204257_G/T              G                  T      6.35        PASS
> X:155234730_T/C              T                  C      6.51        PASS
> ---
> seqlengths:
>  chr1 chr10 chr11 chr12 chr13 chr14 ...  chr5  chr6  chr7  chr8  chr9  chrX
>    NA    NA    NA    NA    NA    NA ...    NA    NA    NA    NA    NA    NA
>
>
> Can the information be extracted using VariantAnnotation?
> I would appreciate your help with this.
> Thanks ../Murli
>
>
>
> -- output of sessionInfo():
>
> sessionInfo()
>
> R version 3.0.2 (2013-09-25)
> Platform: x86_64-redhat-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] parallel stats graphics grDevices utils datasets methods
> [8] base
>
> other attached packages:
> [1] TxDb.Hsapiens.UCSC.hg19.knownGene_2.10.1
> [2] GenomicFeatures_1.14.5
> [3] AnnotationDbi_1.24.0
> [4] Biobase_2.22.0
> [5] VariantAnnotation_1.8.13
> [6] Rsamtools_1.14.3
> [7] Biostrings_2.30.1
> [8] GenomicRanges_1.14.4
> [9] XVector_0.2.0
> [10] IRanges_1.20.7
> [11] BiocGenerics_0.8.0
>
> loaded via a namespace (and not attached):
> [1] biomaRt_2.18.0 bitops_1.0-6 BSgenome_1.30.0 DBI_0.2-7
> [5] RCurl_1.95-4.1 RSQLite_0.11.4 rtracklayer_1.22.7 stats4_3.0.2
> [9] tools_3.0.2 XML_3.98-1.1 zlibbioc_1.8.0
>
>
> --
> Sent via the guest posting facility at bioconductor.org.
>
>
>
> --
> Valerie Obenchain
> Program in Computational Biology
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, Seattle, WA 98109
>
> Email: vobencha at fhcrc.org
> Phone: (206) 667-3158
>
>
>
--
Valerie Obenchain
Program in Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, Seattle, WA 98109
Email: vobencha at fhcrc.org
Phone: (206) 667-3158