[BioC] makePdInfoPackage in preparation for RMA with oligo on Nimblegen Expression Arrays

Jack Schonbrun schonbrun at amyris.com
Tue Jul 14 23:58:23 CEST 2009


Here's what I get:

> ndf <- read.delim(ndfFile, stringsAsFactors=FALSE, nrow=100)
> str(ndf)
'data.frame':   100 obs. of  17 variables:
 $ PROBE_DESIGN_ID   : chr  "6531_0301_0005" "6531_0311_0005" "6531_0331_0005" "6531_0333_0005" ...
 $ CONTAINER         : chr  "SACCHAROMYCES1" "SACCHAROMYCES1" "NGS_CONTROLS" "NGS_CONTROLS" ...
 $ DESIGN_NOTE       : chr  "rank_selected" "rank_selected" "upper right fiducial" "" ...
 $ SELECTION_CRITERIA: chr  "rank:03;score:379;uniq:14;count:37;freq:01;rules:1;tm:82.4" "rank:05;score:046;uniq:14;count:1110;freq:30;rules:1;tm:78.3" "bright" "" ...
 $ SEQ_ID            : chr  "SCER070900001885" "SCER070900001596" "FIDUCIAL_UPPER_RIGHT" "CROSSHYBE" ...
 $ PROBE_SEQUENCE    : chr  "GTCAACCCTGCAAGATCTCTGGGTGCCGCCGTTGCTGCCAGATATTTCCCTCATTACCAC" "TCAGTTGGAACGCCTCTGAGCACTCCATCACCTGAGTCAGGTAATACATTTACTGATTCA" "TGAGTTGTTTGATAGGATTATTCATAGAGGTCATTACAGCGAGAGGAANNNNNNNNN" "CGATGCGACGCGAACTAAGCAGTTCGGCGCAGTCGACTAGTATAACAGNNNNNNNN" ...
 $ MISMATCH          : int  0 0 0 0 0 0 0 0 0 0 ...
 $ MATCH_INDEX       : int  72062965 72061238 2000207 70654015 70652179 65069272 65069273 65069274 65069275 65069276 ...
 $ FEATURE_ID        : int  72062965 72061238 71722817 71722819 71722820 71722824 71722825 71722826 71722827 71722828 ...
 $ ROW_NUM           : int  5 5 5 5 6 6 6 6 6 6 ...
 $ COL_NUM           : int  301 311 331 333 1 5 6 7 8 9 ...
 $ PROBE_CLASS       : chr  "experimental" "experimental" "fiducial" "control:crosshybe" ...
 $ PROBE_ID          : chr  "SCER070900001885P00271" "SCER070900001596P00406" "CPK6" "XENOTRACK48P02" ...
 $ POSITION          : int  271 406 0 2 0 0 5 0 6 0 ...
 $ DESIGN_ID         : int  6531 6531 6531 6531 6531 6531 6531 6531 6531 6531 ...
 $ X                 : int  301 311 331 333 1 5 6 7 8 9 ...
 $ Y                 : int  5 5 5 5 6 6 6 6 6 6 ...
>
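
A minimal sketch of a sanity check on the columns this thread singles out (X, Y and SEQ_ID), assuming the NDF has been read into a data frame as above. The helper name checkNdfColumns is hypothetical and not part of oligo or pdInfoBuilder, and the toy data frame only stands in for the real NDF.

```r
# Hypothetical helper (not part of pdInfoBuilder): report which of the
# columns mentioned in this thread are absent, and how many NA values
# each of the present ones contains.
checkNdfColumns <- function(ndf, required = c("X", "Y", "SEQ_ID")) {
  missingCols <- setdiff(required, names(ndf))
  present <- intersect(required, names(ndf))
  # Count NA values in each required column that is present
  naCounts <- vapply(ndf[present], function(col) sum(is.na(col)), integer(1))
  list(missing = missingCols, naCounts = naCounts)
}

# Toy stand-in for the first rows of an NDF (values taken from the str() output)
toy <- data.frame(SEQ_ID = c("SCER070900001885", "CROSSHYBE"),
                  X = c(301L, 333L),
                  Y = c(5L, 5L),
                  stringsAsFactors = FALSE)
res <- checkNdfColumns(toy)
```

An empty res$missing and all-zero res$naCounts would suggest the columns are being read as expected, at least for the rows inspected.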

-----Original Message-----
From: Benilton Carvalho [mailto:bcarvalh at jhsph.edu] 
Sent: Tuesday, July 14, 2009 2:56 PM
To: Jack Schonbrun
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] makePdInfoPackage in preparation for RMA with oligo on Nimblegen Expression Arrays

what do you get if you run the following (assuming ndfFile is a
variable that holds the file name)?

ndf <- read.delim(ndfFile, stringsAsFactors=FALSE, nrows=100)
str(ndf)
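
The delimiter and header checks suggested earlier in the thread can be sketched roughly as follows. This is only an illustration under assumptions: a small temporary file stands in for the real NDF, and the variable names are made up for the example.

```r
# Write a small tab-delimited stand-in file (the real NDF path would go here)
tmp <- tempfile(fileext = ".ndf")
writeLines(c("PROBE_DESIGN_ID\tSEQ_ID\tX\tY",
             "6531_0301_0005\tSCER070900001885\t301\t5",
             "6531_0333_0005\tCROSSHYBE\t333\t5"), tmp)

# Number of tab-separated fields on each line; quote = "" and
# comment.char = "" stop quotes or '#' characters from altering the count
nFields <- count.fields(tmp, sep = "\t", quote = "", comment.char = "")

# The first line should be the header (column names)
header <- strsplit(readLines(tmp, n = 1), "\t", fixed = TRUE)[[1]]

# Every line has the same number of fields, and more than one field
tabDelimited <- length(unique(nFields)) == 1 && nFields[1] > 1
# The header names the X and Y columns that the warnings refer to
hasXY <- all(c("X", "Y") %in% header)
```

If the file were delimited by something other than tabs, nFields would typically be 1 on every line and hasXY would be FALSE.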

thanks,

b

On Jul 14, 2009, at 6:49 PM, Jack Schonbrun wrote:

> Benilton,
>
> Thanks for your suggestions.
>
> By every means I have tested, the file is tab delimited.  The
> first row is the header, and all other rows are data.
>
> Here is how the first (header) row looks:
> PROBE_DESIGN_ID CONTAINER       DESIGN_NOTE      
> SELECTION_CRITERIA      SEQ_ID  PROBE_SEQUENCE  MISMATCH         
> MATCH_INDEX     FEATURE_ID      ROW_NUM COL_NUM PROBE_CLASS      
> PROBE_ID        POSITION        DESIGN_ID       X       Y
>
> Any other details on how the ndf is expected to look?
>
> Thanks again,
> Jack
>
>
>
>
>
> -----Original Message-----
> From: Benilton Carvalho [mailto:bcarvalh at jhsph.edu]
> Sent: Tuesday, July 14, 2009 1:34 PM
> To: Jack Schonbrun
> Cc: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] makePdInfoPackage in preparation for RMA with  
> oligo on Nimblegen Expression Arrays
>
> Jack,
>
> it looks like your NDF isn't as expected.
>
> When it shows: "inserting 0 rows into table 'featureSet'", it makes me
> wonder what the SEQ_ID column in the NDF looks like.
>
> But, instead of looking at the columns' contents right now, please
> make sure the delimiters of the NDF are tabs. It doesn't appear that's
> the case. Note the warning "In max(ndfdata[["X"]]): no non-missing
> arguments to max; returning -Inf"... It suggests that ndfdata[["X"]]
> is NULL.
>
> Another thing: ensure the first line of the NDF is the header (column
> names) and the data start on the 2nd line.
>
> Please let me know how it goes.
>
> b
>
> On Jul 14, 2009, at 3:57 PM, Jack Schonbrun wrote:
>
>> Hello,
>>
>> I would like to use the oligo package to run the RMA algorithm on
>> Nimblegen expression arrays.  To that end, I am attempting to
>> construct an annotation package using makePdInfoPackage().
>>
>> I have followed the pattern in the "Building Annotation Packages
>> with pdInfoBuilder for Use with the oligo Package" vignette:
>>
>> ----------------
>>
>>> ndfFile.test <- "test.ndf"
>>> xysFile.test <- "test.xys"
>>> seed.test <- new("NgsExpressionPDInfoPkgSeed", ndfFile =
>>> ndfFile.test, xysFile = xysFile.test)
>>> makePdInfoPackage(seed.test, destDir = "./Annotation")
>> =====================================================================
>> Building annotation package for Nimblegen Expression Array
>> NDF:  test.ndf
>> XYS:  test.xys
>> =====================================================================
>> Parsing file: test.ndf ... OK
>> Parsing file: test.xys ... OK
>> Merging NDF and XYS files ...OK
>> Preparing contents for featureSet table ...OK
>> Preparing contents for bgfeature table ...OK
>> Preparing contents for pmfeature table ...OK
>> Creating package in ./Annotation/pd.test
>> Inserting 0 rows into table "featureSet"... Error in
>> sqliteExecStatement(con, statement, bind.data) :
>> RS-DBI driver: (incomplete data binding: expected 2 parameters, got
>> 0)
>> In addition: Warning messages:
>> 1: In max(ndfdata[["Y"]]) :
>> no non-missing arguments to max; returning -Inf
>> 2: In max(ndfdata[["X"]]) :
>> no non-missing arguments to max; returning -Inf
>> 3: In sqliteExecStatement(con, statement, bind.data) :
>> ignoring zero-row bind.data
>>
>> ------------------
>>
>> Any help on why it would only be inserting 0 rows, or any of the
>> other messages would be greatly appreciated.  It does make some
>> files in the destDir, but does not run to completion.  Listing of
>> this directory available if it would help.
>>
>> I am running on Windows XP SP 2.  sessionInfo follows.
>>
>>> sessionInfo()
>> R version 2.9.1 (2009-06-26)
>> i386-pc-mingw32
>>
>> locale:
>> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] pdInfoBuilder_1.8.1      affxparser_1.16.0        RSQLite_0.7-1            DBI_0.2-4                makePlatformDesign_1.8.0 oligo_1.8.1
>> [7] preprocessCore_1.6.0     oligoClasses_1.6.0       Biobase_2.4.1            affyio_1.12.0
>>
>> loaded via a namespace (and not attached):
>> [1] Biostrings_2.12.7 IRanges_1.2.3     splines_2.9.1     tools_2.9.1
>>
>>
>> ===========================
>> Jack Schonbrun Ph.D.
>> Software Developer
>> Amyris Biotech
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>


