[BioC] Create a sample file / tab-delimited file in QuasR

chris [guest] guest at bioconductor.org
Thu Oct 31 17:30:01 CET 2013


I am fairly new to R and am trying to load a file into QuasR.  I downloaded the E. coli ChIP-seq data set SRR653521.sra from NCBI's SRA database (SRA accession number: SRX220452, GEO accession number: GSM1072327) and used the NCBI SRA Toolkit to convert it to SRR653521.fastq.  I want to use QuasR for alignment, GO/pathway analysis, etc., but first I need to get the FASTQ data into R.  I am using Windows 7 with current R, Bioconductor, and the required packages.

I followed the vignette "An Introduction to QuasR", which I opened with browseVignettes().  As in section 2.3, I loaded:

library(QuasR)
library(BSgenome)
library(Rsamtools)
library(rtracklayer)
library(GenomicFeatures)
library(Gviz)
as well as library(ShortRead)

For the reference genome I typed:
available.genomes();
genomeName="BSgenome.Ecoli.NCBI.20080805"

For the sample file, I initially tried:
sampleFile=readFastq("C:\\Users\\Chris\\Documents\\SRA\\SRX220452\\SRR653521\\SRR653521.fastq")

This did not work, so I tried to build a matrix that I could write out as a tab-delimited sample file, but that failed too:

> sampleMatrix=matrix(c("C:\\Users\\Chris\\Documents\\SRA\\SRX220452\\SRR653521\\SRR653521.fastq"=FileName, Sample1=SampleName),nrow=2,ncol=2,byrow=TRUE)
Error in matrix(c(`C:\\Users\\Chris\\Documents\\SRA\\SRX220452\\SRR653521\\SRR653521.fastq` = FileName,  : 
  object 'FileName' not found

> sampleMatrix=matrix(c("C:\\Users\\Chris\\Documents\\SRA\\SRX220452\\SRR653521\\SRR653521.fastq", Sample1),nrow=2,ncol=2,byrow=TRUE, dimnames=c(flies,samples)
+ )
Error in matrix(c("C:\\Users\\Chris\\Documents\\SRA\\SRX220452\\SRR653521\\SRR653521.fastq",  : 
  object 'Sample1' not found
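
Both errors above have the same cause: FileName and Sample1 are written without quotes, so R looks for variables with those names instead of treating them as text. In addition, dimnames expects a list, not a vector built with c(). A minimal sketch of a corrected call follows. It assumes the intended layout is one row per sample with the columns FileName and SampleName, and the variable name fastqPath is made up for illustration:

```r
# Path to the FASTQ file from the post (backslashes escaped for R).
fastqPath <- "C:\\Users\\Chris\\Documents\\SRA\\SRX220452\\SRR653521\\SRR653521.fastq"

# Assumed layout: one row per sample, two named columns.
# Strings are quoted, and dimnames is a list(rownames, colnames).
sampleMatrix <- matrix(c(fastqPath, "Sample1"),
                       nrow = 1, ncol = 2, byrow = TRUE,
                       dimnames = list(NULL, c("FileName", "SampleName")))

print(sampleMatrix)
```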

Can somebody please give me an example of the code/syntax needed to create a sample file?  Also, I am not sure whether my FASTQ file contains single-end or paired-end reads.  The examples in the vignette only use files from the "extdata" folder, and I don't know how to proceed with my own data.  Once I have a sample file, I can probably figure out the rest.  I hope this question isn't too basic, and thanks in advance for any help!
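
For reference, the sample files shown in the QuasR vignette are plain tab-delimited text with a header line: FileName and SampleName for single-end reads, or FileName1, FileName2 and SampleName for paired-end reads. A minimal sketch of writing such a file with base R follows, assuming single-end data. The output location (a temporary directory) and the sample name "Sample1" are placeholders, and the final qAlign() call is left commented out because it depends on the FASTQ file and genome package actually being present:

```r
# Hypothetical output location; any writable path would do.
sampleFile <- file.path(tempdir(), "samples.txt")

# One row per FASTQ file. Forward slashes are accepted by R on Windows.
samples <- data.frame(
  FileName   = "C:/Users/Chris/Documents/SRA/SRX220452/SRR653521/SRR653521.fastq",
  SampleName = "Sample1",
  stringsAsFactors = FALSE
)

# Tab-delimited, header line kept, no quotes and no row names.
write.table(samples, file = sampleFile, sep = "\t",
            quote = FALSE, row.names = FALSE)

# Read the file back to confirm the layout.
check <- read.delim(sampleFile, stringsAsFactors = FALSE)
print(check)

# Assumed next step, following the vignette (not run here):
# proj <- qAlign(sampleFile, genomeName)
```

Relative paths in FileName should also work if they are given relative to the location of the sample file, but an absolute path avoids any ambiguity.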

 -- output of sessionInfo(): 

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] grid      parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] BSgenome.Ecoli.NCBI.20080805_1.3.17 BiocInstaller_1.12.0                Gviz_1.6.0                          GenomicFeatures_1.14.0             
 [5] AnnotationDbi_1.24.0                Biobase_2.22.0                      rtracklayer_1.22.0                  BSgenome_1.30.0                    
 [9] ShortRead_1.20.0                    Rsamtools_1.14.1                    lattice_0.20-24                     Biostrings_2.30.0                  
[13] QuasR_1.2.0                         Rbowtie_1.2.0                       GenomicRanges_1.14.3                XVector_0.2.0                      
[17] IRanges_1.20.3                      BiocGenerics_0.8.0                 

loaded via a namespace (and not attached):
 [1] biomaRt_2.18.0      biovizBase_1.10.0   bitops_1.0-6        cluster_1.14.4      colorspace_1.2-4    DBI_0.2-7           dichromat_2.0-0     Hmisc_3.12-2       
 [9] hwriter_1.3         labeling_0.2        latticeExtra_0.6-26 munsell_0.4.2       plyr_1.8            RColorBrewer_1.0-5  RCurl_1.95-4.1      rpart_4.1-3        
[17] RSQLite_0.11.4      scales_0.2.3        stats4_3.0.2        stringr_0.6.2       tools_3.0.2         XML_3.98-1.1        zlibbioc_1.8.0     

--
Sent via the guest posting facility at bioconductor.org.


