[BioC] custom affy chip, mm values with NA (affy, makecdfenv)

James Bullard bullard at berkeley.edu
Wed May 4 07:03:19 CEST 2005


Sorry to respond to my own thread, but I believe I have some more 
insight into what is happening. As described in my previous post, I was 
having problems with some mm values which were NA. I looked up which 
ones they were and got the following:

 > indexProbes(affybatch,  which = "both",  "02280106180000.6757_pPM_GC")
$"02280106180000.6757_pPM_GC"
[1] 505546     NA

I have exactly 5 genes where this is true (pm = index, mm = NA). When I 
examined the code for make.cdf.env, I found this block:

if(length(mm)==0)
      return(cbind(pm=pm, rep(NA,length(pm))))
else
      return(cbind(pm=pm,mm=mm))
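
A minimal base-R sketch of why that length check matters, assuming mm really is a zero-length vector for the affected probesets (the index values here are made up). As far as I understand cbind, zero-length vectors are dropped unless the result would have zero rows, so without the guard the matrix would have no mm column at all:

```r
pm <- c(505546L, 505547L)  # made-up PM indices for illustration
mm <- integer(0)           # this probeset has no MM probes

# Without the guard: the empty vector is ignored, leaving a single column.
unguarded <- cbind(pm = pm, mm = mm)

# With the guard, mirroring the block above: a second column filled with NA.
guarded <- if (length(mm) == 0) {
  cbind(pm = pm, rep(NA, length(pm)))
} else {
  cbind(pm = pm, mm = mm)
}

print(dim(unguarded))
print(guarded)
```

So the NA in the mm column looks like a deliberate placeholder for "no MM probe", not a value any downstream function is guaranteed to accept.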

I put some print statements in there and the mm length is 0 for my five 
genes. I assumed that, since the code checks for and handles this case, 
NA values in the mm field were probably okay, but now I am not sure 
either way. On a possibly related note, I realized that the cbind was 
giving me many warnings saying that the lengths of the columns don't 
match, i.e.:

1: number of rows of result
    is not a multiple of vector length (arg 2) in: cbind(pm = pm, mm = mm)
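
For reference, the same warning can be reproduced in plain R with toy vectors (the values are made up, and this says nothing about why the lengths differ in the CDF file). It is only a sketch of what cbind does when pm and mm have different nonzero lengths:

```r
pm <- 1:4
mm <- 1:3   # one entry short of pm

warned <- FALSE
res <- withCallingHandlers(
  cbind(pm = pm, mm = mm),
  warning = function(w) {
    warned <<- TRUE                 # record that cbind complained
    invokeRestart("muffleWarning")  # keep going so the result can be inspected
  }
)

print(warned)
print(res)  # the mm column is recycled: 1, 2, 3, 1
```

The shorter vector is recycled to fill the column, so the last pm index ends up paired with an mm index that belongs to a different row.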

This occurs in the cbind above for a fair number of the probesets. I 
had assumed (again probably incorrectly) that because the code wasn't 
even checking whether they were the same length, these were 
*ignorable* warnings. Now, upon typing that thought, I realize that it 
is probably the opposite, but I am pretty sure I don't know enough about 
cdf files to say what this indicates or how I can correct the problem.
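
One way to list the affected probesets is to scan the environment directly. This is a sketch only: it assumes the CDF environment maps each probeset name to a two-column pm/mm index matrix, which is what the indexProbes output above suggests. The environment, the probeset names, and the helper has.na.index are all hypothetical stand-ins:

```r
# Toy stand-in for a CDF environment: probeset name -> pm/mm index matrix.
toy.env <- new.env()
assign("ok_probeset", cbind(pm = 1:3, mm = 4:6), envir = toy.env)
assign("na_probeset", cbind(pm = 7L, mm = NA), envir = toy.env)

# Hypothetical helper: names of probesets whose index matrix contains an NA.
has.na.index <- function(env) {
  ids <- ls(env)
  flagged <- vapply(
    ids,
    function(id) any(is.na(get(id, envir = env))),
    logical(1)
  )
  ids[flagged]
}

flagged.ids <- has.na.index(toy.env)
print(flagged.ids)
```

Note that this only finds the NA placeholders. Probesets whose pm and mm lengths differ have already been recycled by cbind and would not show up in such a scan.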

thanks again, jim


James Bullard wrote:

>
> I have a custom affy chip (I don't know what more information would be 
> relevant here, as I am new to this). I am attempting to perform 
> background correction using the bg.correct.mas function and am running 
> into problems because of NA values in a very small (5 or so) number of 
> the mm values (at least, I think this is why I am running into problems).
>
> First, I noticed that none of the bg.correct.*, normalize.* methods 
> have an na.rm parameter - this seems to indicate to me that having NA 
> values in pm, mm matrices is not expected and therefore I am reading 
> in the data incorrectly. Is this true? I took it for granted that it 
> was not true, and have been trying to exclude them after the fact.
>
> First, to get the data into R I do:
>
> > cdf.env     <- make.cdf.env("Mar_12_2004.cdf")
> > affybatch <- read.affybatch(filenames = c("T1.CEL"))
> > affybatch@cdfName <- "cdf.env"
>
> This occurs without incident (save the warning: Incompatible phenoData 
> object. Created a new one.)
>
> So then I want to do the following:
>
> > bg.correct.mas(affybatch)
> Error in as.vector(data) : NA/NaN/Inf in foreign function call (arg 1)
>
> So... My first thought was to find all probesets with NA values, and 
> then remove them from the AffyBatch object (please excuse the code's 
> ugliness, just trying to make it work for now):
>
> remove.na.vals <- function(affybatch) {
>  na.probesets <- NULL
>
>  for (ps in probeset(affybatch)) {
>    if (is.na(sum(pm(ps))) || is.na(sum(mm(ps)))) {
>      na.probesets <- c(na.probesets, ps@id)
>    }
>  }
>  na.probesets
> }
>
> So using the above function I do the following:
>
> ab.probes <-  probeset(affybatch, setdiff(geneNames(affybatch), 
> remove.na.vals(affybatch)))
>
> This gives me a list of probeset objects which have no NA values in 
> either the pm or mm column. I then want to create/modify the AffyBatch 
> object to use just these probesets. I cannot set the pm, mm values 
> because they have different dimensions. I am sure there are 
> alternate/superior solutions to this problem. As I said before, I am 
> new to bioconductor and so potentially I am on the wrong track 
> entirely. Some information which might be important
> is below. Thanks in advance. Jim
>
> R 2.0.1
> bioconductor 1.5
> affy_1.5.8-1
> makecdfenv_1.4.8
> debian 2.6 (sarge)
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
>


