[BioC] shortread error?

Thu Jan 24 05:40:22 CET 2013

On 01/23/2013 07:55 PM, Wang Peter wrote:
> i am running a pipeline to clean the reads
>
> there was an error
> Error in add(raw()) : record does not start with '@'
> Calls: trimRead -> yield -> yield -> <Anonymous> -> add -> .Call
> Execution halted
>
> i never met it before,i think that is maybe due to data problem?
> one fastq record start without  '@'

Maybe the sequence and quality lengths of a record do not match, or maybe it is
a programming error on my part. I guess you have a FastqStreamer() or
FastqSampler(), and you can print the streamer after the error to find out
roughly where in the file the error is occurring. To illustrate, after

   example(FastqStreamer)

we get a file 'fl' that is a fastq file, and create a streamer

 > strm = FastqStreamer(fl, 100)

and yield twice

 > yield(strm)
class: ShortReadQ
length: 100 reads; width: 36 cycles
 > yield(strm)
class: ShortReadQ
length: 100 reads; width: 36 cycles

and then find, from the status line, that the current buffer holds 100 records
and that 200 have been added so far.

 > strm
class: FastqStreamer
file: s_1_sequence.txt
status: n=100 current=100 added=200 total=200

You could print out your streamer after the error to find the approximate place
in the file where the error occurs. Also, if it is a programming error on my
part, then you can perhaps work around it by choosing a different yield size
or 'readerBlockSize'; making these prime numbers might be particularly effective
at avoiding the error (but I'd rather fix the bug).
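
As a rough sketch of what I mean by printing the streamer after the error, 
something like the following might do. The function name 
stream_with_diagnostics and its default yield size are made up for 
illustration and are not part of ShortRead; only FastqStreamer(), yield() and 
close() come from the package. I'm assuming the error is raised by yield(), as 
in your traceback.

```r
library(ShortRead)

## Hypothetical helper (not part of ShortRead): stream through a fastq file
## and print the streamer if yield() fails, to show roughly where it stopped.
stream_with_diagnostics <- function(fl, n = 1e6) {
    strm <- FastqStreamer(fl, n)
    on.exit(close(strm))
    total <- 0
    repeat {
        fq <- tryCatch(yield(strm), error = function(e) {
            message("yield() failed: ", conditionMessage(e))
            print(strm)   # shows the status line (n, current, added, total)
            NULL
        })
        if (is.null(fq) || length(fq) == 0L)
            break
        ## ... trimming / filtering of 'fq' would go here ...
        total <- total + length(fq)
    }
    total   # number of reads yielded before the end of file or the error
}
```

Calling it again with a different 'n' (for instance a prime number) is one way 
to see whether the error moves with the yield size or stays at the same place 
in the file.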

It would help to have more code to see what you are doing. If it is not clear 
that there is an error in the fastq file, then it would help to make a 
reproducible example (simple code and sample data file) available to me.
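
To check the fastq file itself, independently of ShortRead, a small base R 
scan along these lines could help. It is only a sketch: check_fastq is a 
made-up name, and it assumes plain four-line records (no wrapped sequence or 
quality lines) in a file small enough to read into memory. It flags records 
whose header does not start with '@', whose third line does not start with 
'+', or whose sequence and quality lengths differ.

```r
## Hypothetical helper: report fastq records that look malformed.
## Assumes exactly four lines per record.
check_fastq <- function(path) {
    lines <- readLines(path)
    if (length(lines) %% 4L != 0L)
        warning("number of lines is not a multiple of 4")
    idx <- seq_len(length(lines) %/% 4L)
    header <- lines[4L * idx - 3L]
    sread  <- lines[4L * idx - 2L]
    plus   <- lines[4L * idx - 1L]
    qual   <- lines[4L * idx]
    bad <- !startsWith(header, "@") |
           !startsWith(plus, "+") |
           nchar(sread, type = "bytes") != nchar(qual, type = "bytes")
    ## 'first_line' is the line number in the file where the record starts
    data.frame(record = idx[bad],
               first_line = 4L * idx[bad] - 3L,
               header = header[bad],
               stringsAsFactors = FALSE)
}
```

Once a line is missing or duplicated, every later record is shifted and will 
be flagged too, so the first row of the result is the one to look at. The 
lines around 'first_line' would also make a convenient small sample file to 
send.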

Martin


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793