[BioC] Problems reading multiple MAQ map files with ShortRead ReadAligned

Martin Morgan mtmorgan at fhcrc.org
Fri Mar 20 13:28:20 CET 2009


Martin Morgan <mtmorgan at fhcrc.org> writes:

> David Rossell <david.rossell at irbbarcelona.org> writes:
>
>> Hi, I'm using ShortRead package version 1.0.7 on R 2.8.0.
>>
>> I have three binary map files produced by MAQ that I want to read into
>> Bioconductor. I can read one file at a time with the function readAligned,
>> by typing
>>
>> aln <- readAligned(dir,"S1.map",type="MAQMap")
>>
>> This works like a charm. However, I cannot read all files in the directory
>>
>>> aln <- readAligned(dirPath=dir,type="MAQMap")
>> Error: UserArgumentMismatch
>>   'dirPath', 'pattern' must be 'character(1)'
>>
>> I tried specifying the pattern argument but the function still doesn't work.
>>
>>> aln <- readAligned(dirPath=dir,pattern="S.*",type="MAQMap")
>> Error: UserArgumentMismatch
>>   'dirPath', 'pattern' must be 'character(1)'
>>
>> Is this a bug or am I doing something wrong? Any ideas/help are most
>> welcome.
>
> Neither; it's a design inconsistency that should be smoothed out.
> MAQ binary files are read in one at a time; you can use 'append'
> on the results to combine them.
>
> aln <- append(readAligned(dirPath, "s_1.map", type="MAQMap"),
>               readAligned(dirPath, "s_2.map", type="MAQMap"))

Sorry, this advice assumes the development version of ShortRead;
append is not available in the release version (1.0.7), which is what
you are using.
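
Without append, one option is to read each file separately and keep
the results in a list. The following is only a minimal sketch under
stated assumptions: readEachFile and its reader argument are
hypothetical names introduced for illustration and are not part of
ShortRead. The reader is passed in so that the file-by-file logic can
be tried without any MAQ files at hand.

```r
# Hypothetical helper (not part of ShortRead): call `reader` once per
# file in `dirPath` whose name matches `pattern`, and return a list
# with one element per file, named by file name.
readEachFile <- function(dirPath, pattern, reader) {
    fileNames <- list.files(dirPath, pattern = pattern)
    result <- lapply(fileNames, function(f) reader(dirPath, f))
    names(result) <- fileNames
    result
}

# With ShortRead loaded, the reader could wrap the single-file call
# shown earlier in the thread (assumption: files end in ".map"):
# alns <- readEachFile(dir, "\\.map$",
#                      function(d, f) readAligned(d, f, type = "MAQMap"))
```

Each element of the list is then an ordinary single-file result, so
any per-file processing can be applied with lapply.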

> For these whole-lane files a more typical work flow is to process one
> lane (e.g., filter + coverage) into a much smaller data structure, and
> then perhaps collate the results. Along the lines of
>
> doALane <- function(dirPath, fileName)
> {
>     # read and process a lane
> }
> dataDir <- "/some/path"
> runs <- lapply(list.files(dataDir), doALane, dirPath=dataDir)
> # combine elements of runs as necessary
>
> Hope that helps,
>
> Martin
>
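
The one-lane-at-a-time workflow quoted above could be filled in
roughly as follows. This is a sketch, not the author's implementation:
the reader and summarize arguments and the collateCounts helper are
hypothetical, and the choice of per-chromosome read counts as the
"much smaller data structure" is only an example.

```r
# Read one lane and reduce it to a small summary. `reader` and
# `summarize` are placeholders supplied by the caller (hypothetical).
doALane <- function(dirPath, fileName, reader, summarize) {
    lane <- reader(dirPath, fileName)
    summarize(lane)
}

# Collate a named list of named count vectors (one per lane) into a
# matrix with one row per key and one column per lane; keys missing
# from a lane are filled with 0.
collateCounts <- function(counts) {
    keys <- sort(unique(unlist(lapply(counts, names))))
    columns <- lapply(counts, function(x) {
        out <- setNames(numeric(length(keys)), keys)
        out[names(x)] <- x
        out
    })
    do.call(cbind, columns)
}

# Example wiring (assumes ShortRead is loaded and that chromosome() is
# available as an accessor on the object returned by readAligned):
# fileNames <- list.files(dataDir, pattern = "\\.map$")
# counts <- lapply(fileNames, doALane, dirPath = dataDir,
#                  reader = function(d, f) readAligned(d, f, type = "MAQMap"),
#                  summarize = function(aln) table(chromosome(aln)))
# names(counts) <- fileNames
# collateCounts(counts)
```

Only the small per-lane summaries are kept in memory at the end, which
is the point of processing whole-lane files one at a time.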
>> David

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793


