[BioC] HTqPCR R Bioconductor question (problem reading SDS files for qPCR) (Alexander Williams)

Alexander Williams alex.williams at gladstone.ucsf.edu
Wed Jan 5 20:36:20 CET 2011


Hi everyone,

 Solution to my SDS file woes: in case anyone else is trying to read ".sds" files and analyze them with HTqPCR (a Bioconductor package for R), I just learned that Applied Biosystems has Windows-only software that will apparently convert these SDS files (the raw output from the machine) into plain-text, human-readable files of the type that HTqPCR expects.

 The person I got the SDS files from is now sending me plain-text, human-readable files with (I assume) the same data. So if anyone else runs into weirdness when handling SDS files, the simple solution appears to be to have them converted to plain-text files first.
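
 A quick way to tell the two kinds of file apart before calling readCtData is to look at the first few bytes. Here is a minimal sketch in R, assuming the raw files always begin with the characters "SDS2", as mine do (see the "less" output quoted below). I have not checked this against any format documentation, and looks_like_binary_sds is a made-up helper, not part of HTqPCR.

```r
# Hypothetical helper (not part of HTqPCR): returns TRUE if the file starts
# with the bytes "SDS2", the signature seen at the top of the raw files quoted
# in this thread. Assumption: all raw instrument files share this signature.
looks_like_binary_sds <- function(path) {
  # readBin() accepts a file name and reads at most n bytes as a raw vector
  magic <- readBin(path, what = "raw", n = 4L)
  identical(magic, charToRaw("SDS2"))
}

f <- "C_TaqMan_Data/A_HumanA_Plate_1_of_2//Bulk-1A-10-13-10.sds"
if (file.exists(f) && looks_like_binary_sds(f)) {
  message("Raw binary SDS file: export it to text before calling readCtData()")
}
```

 This only inspects the start of the file; it does not parse anything, so a text export that passes the check may still need the right column settings in readCtData.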

 Alex


> 
> 
> Hi,
> 
> I'm trying to use HTqPCR (the Bioconductor module for R, by Heidi Dvinge / Paul Bertone) to analyze "SDS"-format qPCR files.
> 
> However, this isn't working for me at all! I instead get an error, as shown below.
> 
> The example in the HTqPCR help page works just fine, but my files appear to be in the SDS format, and none of the example files are in SDS format. In fact, I can't even *find* another SDS file anywhere else on the Internet!
> 
> According to R's "sessionInfo()" command, I have version HTqPCR_1.5.0 of HTqPCR, which appears to be the latest.
> 
> 1. ==============================
> 
> The command I run is:
> 	> readCtData("C_TaqMan_Data/A_HumanA_Plate_1_of_2//Bulk-1A-10-13-10.sds",    SDS=TRUE)
> 
> And the error I get is:
> 	# Error in `[.data.frame`(sample, , Ct) : undefined columns selected
> 
> With the traceback information as follows:
> 	# Enter a frame number, or 0 to exit   
> 	1: readCtData("C_TaqMan_Data/A_HumanA_Plate_1_of_2//Bulk-1A-10-13-10.sds", SDS = TRUE)
> 	2: matrix(sample[, Ct], ncol = n.data[i])
> 	3: as.vector(data)
> 	4: sample[, Ct]
> 	5: `[.data.frame`(sample, , Ct)
> 
> 
> 2. ==============================
> 
> If I enter frame #1 and type ls(), I see the various variables that "readCtData" uses:
> 
> [1] "cat"         "cols"        "Ct"          "feature"     "file.header" "files"       "flag"        "header"      "i"           "na.value"    "ncum"        "n.data"      "n.features"  "n.header"   
> [15] "nsamples"    "nspots"      "out"         "path"        "position"    "readfile"    "sample"      "samples"     "SDS"         "s.names"     "type"        "X"  
> 
> 
> "sample" seems to be relevant, so I checked it out and found 391 entries of seemingly random data. I assume this is because the file is not being read in the way I expect:
> 
> 388		\xbd
> 389		\x80
> 390		\036\b\xf9\b\xbf\bw\a\xf7\a\x82\a_\a\x87\bY
> 391		\xc2\v\xbd\016\002\020-\021L\021\xac\021\023\017\xa5
> 
> 
> 3. ==============================
> 
> Just to try it out, if I try running the same command with "SDS=FALSE", then I get a similar error:
> 
> > readCtData("C_TaqMan_Data/A_HumanA_Plate_1_of_2//Bulk-1A-10-13-10.sds", SDS=FALSE)
> Error in `[.data.frame`(sample, , Ct) : undefined columns selected
> In addition: Warning message:
> In readCtData("C_TaqMan_Data/A_HumanA_Plate_1_of_2//Bulk-1A-10-13-10.sds",  :
>  384 gene names (rows) expected, got 1003839
> 
> ("sample" is again filled with random-seeming data, exactly the same data as when SDS=TRUE.)
> 
> ==============================
> 
> 
> The files that I'm trying to read are ".sds" files. Each one is about 16 MB, and they look like this at the top (viewing with /bin/less, and truncating long lines):
> 
> SDS2^@^B^@^BRelQRtRnc3C ^@^P^@^X^@^Ao^\^ ...
> ^ANAME^@^@^@^PBulk-1A-10-13-10DETT^@^@^@>^@^CDNAM^@^@^@^NMammU6-4395470TASK^@^@^@ ...
> ^ANAME^@^@^@^PBulk-1A-10-13-10DETT^@^@^@>^@^CDNAM^@^@^@^NMammU6-4395470TASK^@^@^@ ...
> ^ANAME^@^@^@^PBulk-1A-10-13-10DETT^@^@^@1^@^CDNAM^@^@^@^MRNU44-4373384TASK^@^@^@ ...
> ^ANAME^@^@^@^PBulk-1A-10-13-10DETT^@^@^@7^@^CDNAM^@^@^ ...
> ^ANAME^@^@^@^PBulk-1A-10-13-10DETT^@^@^@1^@^CDNAM^@^@^@^MRNU48-4373383TASK ...
> 
> The files are almost-but-not-quite human-readable. Below the text at the top is a whole bunch of binary-format data that makes little sense to me.
> 
> 
> 
> =================================
> 
> 
> If anyone knows what I'm doing wrong when using the Bioconductor package HTqPCR to analyze SDS files, please let me know!
> 
> Thanks,
> 
> Alex Williams
> 
> 


