[BioC] flowCore split function with curv2Filter and population argument

Thu Apr 14 02:21:06 CEST 2011

Hi, Aric
Looking at the source code for the split method with signature("flowSet","list") in flowCore, the method assumes that every filterResult contains the same number of populations, which is clearly not the case here. Comments in the code note this limitation, so it appears to be a known bug.
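
As a toy illustration of that assumption (base R only; the sample names and population sets below are hypothetical stand-ins, not flowCore objects), the following checks whether every sample has the same number of populations and builds the union of names that a more tolerant split would need to iterate over:

```r
# Hypothetical stand-ins for per-sample filter results: each entry lists
# the population names found in one sample.
pops <- list(
  s1 = c("rest", "area 1", "area 2", "area 3"),
  s2 = c("rest", "area 1", "area 2", "area 3"),
  s3 = c("rest", "area 1", "area 2", "area 3", "area 4")
)

# Code that takes its population names from one sample implicitly assumes
# that every sample has the same number of populations.
n_per_sample <- vapply(pops, length, integer(1))
same_count <- length(unique(n_per_sample)) == 1

# The union of names covers every population seen in any sample.
all_pops <- unique(unlist(pops))
```

Here same_count is FALSE because s3 has one more population than the others, which is the situation the original method does not handle.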

Below is a modified "split" method for signature("flowSet","filterResultList").

I'll work on getting this fix into the development version.

setMethod("split",
          signature=signature(x="flowSet",
                              f="filterResultList"),
          definition=function(x, f, drop=FALSE, population=NULL,
                              prefix=NULL, ...)
{
    lf <- length(f)
    sample.name <- sampleNames(x)
    if(length(x) != length(f)){
        stop("filterResultList and flowSet must be same ",
             "length.", call.=FALSE)
    }
    ## Check every filterResult against the first one.
    lapply(f, flowCore:::compatibleFilters, f[[1]])
    if(is.null(population)){
        ## Default to the union of population names across all samples.
        if(all(unlist(lapply(f, function(q) !is.null(names(q))))))
            population <- unique(unlist(lapply(f, names)))
        else
            population <- c("positive", "negative")
    } else if(!all(sapply(population, is, "character")))
        stop("'population' must be a single character vector ",
             "or a list of character vectors", call.=FALSE)
    if(!is.list(population)){
        n <- population
        population <- as.list(population)
        names(population) <- n
    }
    finalRes <- vector(mode="list", length=length(population))
    names(finalRes) <- names(population)
    for(p in seq_along(population)){
        tp <- population[p]
        res <- vector(mode="list", length=lf)
        for(i in seq_len(lf)){
            l <- try(split(x[[i]], f[[i]], population=tp,
                           prefix=prefix, flowSet=FALSE, ...), silent=TRUE)
            if(inherits(l, "try-error")){
                ## Note: this test relies on the exact wording of the error message.
                if(geterrmessage() == paste("Error : The following are not valid population names in this filterResult:\n\t", tp, "\n", sep="")){
                    message("Creating an empty flowFrame for population ", tp, "\n")
                    ## Create an empty flowFrame
                    l <- x[[i]][0,]
                    identifier(l) <- paste(identifier(l), paste("(", tp, ")", sep=""), sep=" ")
                    l <- list(l)
                } else
                    stop("Can't split flowFrame ", sampleNames(x[i]), " on population ", tp)
            }
            res[[i]] <- l[[1]]
            if(!is.null(prefix)){
                if(is.logical(prefix) && prefix)
                    names(res)[i] <- paste(names(l), "in", sample.name[i])
                else if(is.character(prefix))
                    names(res)[i] <- paste(prefix, sample.name[i])
            } else
                names(res)[i] <- sample.name[i]
        }
        np <- names(population)[p]
        finalRes[[np]] <- flowSet(res, phenoData=phenoData(x))
        phenoData(finalRes[[np]])$population <- np
        varMetadata(finalRes[[np]])["population", "labelDescription"] <-
            "population identifier produced by splitting"
    }
    #n <- f@frameId
    #f <- f@.Data
    #names(f) <- n
    #split(x, f, drop=drop, population=NULL, prefix=NULL, ...)
    return(finalRes)
})
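
The strategy used by the method above (iterate over the union of population names, and substitute an empty object when a sample lacks a population) can be sketched in base R. This is only an illustration under stated assumptions: plain data.frames stand in for flowFrames, and the names samples, pop and split_by_population are hypothetical, not part of flowCore.

```r
# Hypothetical stand-ins: each sample is a data.frame of events, with a
# 'pop' column holding the population assigned to each event.
samples <- list(
  s1 = data.frame(x = c(1, 2, 3),
                  pop = c("area 1", "area 3", "area 3"),
                  stringsAsFactors = FALSE),
  s2 = data.frame(x = c(4, 5),
                  pop = c("area 1", "area 2"),
                  stringsAsFactors = FALSE)
)

# Union of population names across all samples.
populations <- unique(unlist(lapply(samples, function(s) s$pop)))

# For each population, take the matching rows from every sample. A sample
# without that population contributes a zero-row data.frame, so every
# population ends up with one entry per sample.
split_by_population <- lapply(
  setNames(populations, populations),
  function(p) lapply(samples, function(s) s[s$pop == p, , drop = FALSE])
)
```

In this toy data, split_by_population[["area 3"]] holds two rows for s1 and a zero-row data.frame for s2, mirroring the empty flowFrame that the method creates for samples without the requested population.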

On 2011-04-12, at 5:21 PM, Aric Gregson wrote:

> Hello,
> 
> I am attempting to split the flowSet result of a curv2Filter applied to
> a flowSet. I realize that to split without using the 'population'
> argument all samples must have the same populations, which is not the
> case here. Plotting the result of the filter shows populations 'area 1'
> through 'area 4'. I am interested in obtaining 'area 3' which is present
> in flowFrames 1, 2 and 5. I attempt this with the following code:
> 
>> nk <- split(Data(wf[['NK-']])[c(1,2,5)], filter_cv2_cd3,
> population="area 3")
> Error in `names<-`(`*tmp*`, value = c("rest", "area 1", "area 2", "area 3" : 
> Length of replacement vector doesn't match.
> 
> which suggests to me that it is ignoring the 'population'
> argument. Taking a look at the result 'nk' confirms this:
> 
>> nk
> $rest
> A flowSet with 2 experiments....
> 
> $`area 1`
> A flowSet with 2 experiments....
> 
> $`area 2`
> A flowSet with 2 experiments....
> 
> $`area 3`
> A flowSet with 2 experiments....
> 
> These flowFrames do not have an 'area 4'. Is this not the proper way to
> go about splitting the flowSet or is 'split' not functioning correctly?
> 
> Thanks in advance for any suggestions. 
> 
> Aric
> (flowCore 1.16.0)

Greg Finak, PhD
Post-doctoral Research Associate
PS Statistics, Vaccine and Infectious Disease Division.
Fred Hutchinson Cancer Research Center
Seattle, WA
(206)667-3116
gfinak at fhcrc.org