[BioC] pd.mapping 10Karray

Seth Falcon sfalcon at fhcrc.org
Sat Aug 18 19:30:31 CEST 2007


Hi Tobias,

Tobias Verbeke <tobias.verbeke at telenet.be> writes:
> It appears the cause is that the author and genomebuild
> field are empty. It might be a good idea to check for
> this or enforce the presence of these fields.

I agree.  The pdInfoBuilder code is a bit rough around the edges.  We
wanted to make a prototype available asap and the interface is not as
friendly as it could be.

> However, along the way, we discovered other issues.
> For example, in the loadAffyCsv function (loaders.R),
> there is a selection of columns based on column number
> that is not appropriate for the 10k files:
>
> This is the relevant snippet:
>
>   wantedCols <- c(1,2,3,4,7,8,10,12,13,14,17)
>                                         # added 10/14
>   df <- read.table(con, sep=",", stringsAsFactors=FALSE, nrows=10,
>                    na.strings="---", header=TRUE)[, wantedCols]
>
> To match the needed columns for 10k files, the numbers 5, 6 and 15 are
> needed as well. It might however be a better idea to just read in
> the header and match on a character vector with prespecified names
> to determine the wanted columns (before reading in the rest for real).

Yes, I'm not sure why we are using the column numbers instead of
names.
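
Selecting by name could look roughly like the sketch below.  This is
not the pdInfoBuilder code: selectColsByName is a hypothetical helper,
it assumes a file path rather than an open connection, and the column
names passed to it would have to be taken from the real Affymetrix
annotation headers.

```r
# Hypothetical helper, not part of pdInfoBuilder.
# Assumes 'file' is a path to a comma-separated annotation file.
selectColsByName <- function(file, wantedNames, ...) {
    # Read the header plus one data row, only to learn the column layout.
    # check.names=FALSE keeps the names exactly as they appear in the file.
    header <- names(read.table(file, sep=",", header=TRUE, nrows=1,
                               stringsAsFactors=FALSE, na.strings="---",
                               check.names=FALSE))
    idx <- match(wantedNames, header)
    if (any(is.na(idx)))
        stop("columns not found in header: ",
             paste(wantedNames[is.na(idx)], collapse=", "))
    # Read the file for real and keep the wanted columns, in the
    # order given by wantedNames.
    read.table(file, sep=",", header=TRUE, stringsAsFactors=FALSE,
               na.strings="---", check.names=FALSE, ...)[, idx, drop=FALSE]
}
```

A missing column then fails with a message naming it, instead of
silently picking up the wrong field when the layout differs between
chip types.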

> Once this problem is solved, the function runs fine. There is however
> another error message in the loadAffySeqCsv
> file
>
> t <- ST(loadAffySeqCsv(db, csvSeqFile, cdfFile, batch_size=batch_size))
>
> Error in sqliteExecStatement(con, statement, bind.data) :
> 	RS-DBI driver: (RS_SQLite_exec: could not execute: PRIMARY KEY must be
> unique)
> Timing stopped at: 0.58 0.05 0.73 NA NA

The error is telling you that you are trying to insert a record into
the sequence table with a feature ID (fid) that is already in the
table.  Why that would be occurring, I'm not sure.  There could be
something different about how the 10k chips are organized, I suppose.
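
One way to narrow it down would be to look for repeated IDs in the
data before it is inserted.  A minimal sketch, assuming the rows are
available as a data frame with an ID column named "fid" (both the
helper and that column name are assumptions here, not the actual
loader internals):

```r
# Hypothetical helper, not part of pdInfoBuilder.
# Returns every row whose ID occurs more than once in 'df'.
findDuplicateIds <- function(df, idCol="fid") {
    ids <- df[[idCol]]
    dupIds <- unique(ids[duplicated(ids)])
    df[ids %in% dupIds, , drop=FALSE]
}
```

Since the inserts are done in batches, a repeated ID may sit in two
different batches, so the check would need to see the full data (or
the IDs seen so far) rather than one batch at a time.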

+ seth

-- 
Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
BioC: http://bioconductor.org/
Blog: http://userprimary.net/user/


