[BioC] Agi4x44PreProcess probe summarization bug?

Tobias Straub tstraub at med.uni-muenchen.de
Tue Aug 4 20:24:09 CEST 2009


Hi Massimo,

in principle the effect should not be dramatic if the replicate  
signals are not too different. it's rather easy to correct the  
function by overriding it with the correct code. this is my quick fix,  
not tested at all.
have a look and try it out.

best
Tobias

######

`summarize.probe` <-
function(ddFILT,makePLOT,targets) {

	cat("SUMMARIZATION OF non-CTRL PROBES ","\n")
	cat("\n")
	if (!is(ddFILT, "RGList")){
	  stop("'input' must be a RGList")
   	 	if (is.null(dim(ddFILT)[1])) {
        		stop("'input' is empty")
	 	}
	}

	if(missing(targets)){
		stop("'targets' is missing ")
	}

	if("GErep" %in% colnames(targets)){
		GErep=targets$GErep
		nGE=sum(table(table(GErep)))
	}else{
		stop("'targets' needs 'GErep' field")
	}
		
	ddDUP=ddFILT
	dup=which(duplicated(ddDUP$genes$ProbeName)==TRUE)

		if(length(dup) == 0){
			ddPROC=ddFILT
			stop("NOT DUPLICATED ProbeName in chip")
			}

	
	Ldup=length(dup)
	reps=table(ddDUP$genes$ProbeName[dup])
	t=table(reps)
	rN=names(reps)
	Lreps=length(rN)
	
	nARR=dim(ddDUP)[2]

	if(Lreps !=0){
		for(i in 1:Lreps){
  			index=which(ddDUP$genes$ProbeName==rN[i])
  			MED=apply(ddDUP$G[index,],2,median)
  			for (k in 1:length(index)) {
  				ddDUP$G[index,][k,] <- MED
  			}
		}
		ddPROC=ddDUP[-dup,]
	} else{
		ddPROC=ddDUP
	}

		# cat("REPLICATED NonCtrl",Lreps,"\n")
		# cat("DISTRIBUTION OF REPLICATED NonControl Probes","\n")
		# print(t)
		# cat("# REPLICATED (redundant) probeNames",Ldup,"\n")
		# cat("\n")
		cat("SUMMARIZED DATA: ",dim(ddPROC),"\n")
		cat("------------------------------------------------------","\n")

	
	if(!missing(makePLOT)) {
	if(makePLOT){
	
	colorfill="green"		
	maintitle="NORMALIZED & SUMMARIZED SIGNAL"

	X11()
	par(mfrow=c(1,1))
	plotDensity(ddPROC$G,maintitle)

	X11()
	par(mfrow=c(1,1))
	BoxPlot(ddPROC$G,maintitle,colorfill)

	X11()
	maintitle="NORMALIZED & SUMMARIZED DATA - RLE "
	RLE(ddPROC$G,maintitle,colorfill)

	X11()
	par(mfrow=c(2,2),ask=T)
	maintitle="summ data"
	MVAplotMED(ddPROC$G,colorfill,maintitle)

	X11()
	par(mfrow=c(1,1))
	hierclus(ddPROC$G,GErep,methdis="euclidean",
        methclu="complete",sel=FALSE,100)

	}
	}

return(ddPROC)
} # end function

#####

On Aug 4, 2009, at 2:18 PM, Massimo Pinto wrote:

> Hi Tobias,
> I am trying to figure out what would be the effect of this potential  
> bug. Have you tried to correct the function to see what happens when  
> MED is returned?
> Massimo
>
> -- 
> Massimo Pinto
> Post Doctoral Research Fellow
> Enrico Fermi Centre and Italian Public Health Research Institute  
> (ISS), Rome
> http://claimid.com/massimopinto
>
>
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> ======================================================================
> L'Istituto Superiore di Sanità (ISS) è tra i beneficiari dei  
> proventi del 5 per mille dell'IRPEF.
> Nella scheda allegata alla dichiarazione dei redditi è sufficiente  
> apporre la propria firma nel riquadro "Finanziamento della Ricerca  
> Sanitaria" e indicare il Codice Fiscale dell'ISS, che è 80211730587,  
> per destinare tali fondi a sostegno dell'impegno scientifico  
> dell'ISS a difesa della salute di tutti.
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

----------------------------------------------------------------------
Tobias Straub   ++4989218075439   Adolf-Butenandt-Institute, München D



More information about the Bioconductor mailing list