[BioC] Problem in obtaining differentially expressed genes from an unbalanced set of data

Fátima Núñez fnunez at usal.es
Mon Jun 30 19:09:16 MEST 2003


Hello all,

Again, a very naïve question about the analysis of my data, which I am
carrying out with some of the scripts taught at the Milan BioC course.
I'm trying to obtain differentially expressed genes from a subset of
data (3 CONTROL files and 2 PROBLEM files, labelled CONT and VAV below)
by assigning scores and cutoffs. But I run into the following problem:

> rmadata3$Genotype
[1] CONT CONT CONT VAV  VAV 
Levels: CONT VAV
> Index1 <- which(rmadata3$Genotype=="CONT")
> Index2 <- which(rmadata3$Genotype=="VAV")
> scores <- esApply(rmadata3, 1, function(x) {
+       tmp <- t.test(x[Index2], x[Index1], var.equal = TRUE)
+       c(mean(tmp$estimate), -diff(tmp$estimate), tmp$statistic, tmp$p.value)
+  })
> scores <- t(scores)
> colnames(scores) <- c("a", "m", "t.test", "p.value")
> plot(scores[,1], scores[,2], xlab = "A", ylab = "M", pch =".")
> abline(h= c(-1, 1))
> plot(scores[,2], abs(scores[,3]), xlab = "M", ylab = "t.test", pch = ".")
> abline(v= c(-1, 1))
> a <- qt(0.975, 3)
> abline(h = a)
> sum(scores[, 4] <= 0.05)
[1] NA
> sum(scores[, 4] <= 0.01)
[1] NA

I have tried this with a balanced set of data (3 CONT, 3 PROB) with no
problems whatsoever. I wonder whether the NA arises because the data
set is unbalanced in the number of CONT and PROB samples. If that is
the case, how should I go about it?
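
One way to narrow this down, offered as a sketch and not as a diagnosis: in R, sum() returns NA as soon as any element of its argument is NA or NaN, so a single gene with an undefined p-value would be enough to give the result above. Whether that is what happens with rmadata3 cannot be told from the transcript alone. The example below uses a simulated matrix in place of the real expression data; `expr`, `genotype` and `safe_scores` are hypothetical names introduced only for illustration, and plain apply() stands in for esApply() so that the code runs without Biobase.

```r
# A toy p-value vector: one NA and one NaN are enough to make sum() return NA.
p <- c(0.001, 0.20, NA, 0.03, NaN)
sum(p <= 0.05)                # NA
sum(is.na(p))                 # 2 undefined p-values (is.na() is TRUE for NaN too)
sum(p <= 0.05, na.rm = TRUE)  # 2 genes pass the cutoff once NAs are dropped

# Simulated stand-in for the expression data: 10 genes, 3 CONT + 2 VAV arrays.
set.seed(1)
expr <- matrix(rnorm(50, mean = 8), nrow = 10, ncol = 5)
genotype <- factor(c("CONT", "CONT", "CONT", "VAV", "VAV"))
idx_cont <- which(genotype == "CONT")
idx_vav  <- which(genotype == "VAV")

# Assumption for illustration: one gene has identical values on every array,
# so its pooled variance is zero and the t statistic is undefined.
expr[3, ] <- 5

# Per-gene scores; a t.test() that fails outright is turned into a row of NAs.
safe_scores <- function(x, idx1, idx2) {
  tmp <- tryCatch(t.test(x[idx2], x[idx1], var.equal = TRUE),
                  error = function(e) NULL)
  if (is.null(tmp)) {
    return(c(a = NA_real_, m = NA_real_, t.test = NA_real_, p.value = NA_real_))
  }
  c(a = mean(tmp$estimate),
    m = unname(-diff(tmp$estimate)),
    t.test = unname(tmp$statistic),
    p.value = tmp$p.value)
}

scores <- t(apply(expr, 1, safe_scores, idx1 = idx_cont, idx2 = idx_vav))

# Locate the genes with an undefined p-value, then count the rest.
bad <- which(is.na(scores[, "p.value"]))
n_sig <- sum(scores[, "p.value"] <= 0.05, na.rm = TRUE)
```

With the real data, the same check would be which(is.na(scores[, 4])) on the existing scores matrix; looking at the expression values of the genes it reports should show whether they are constant or contain missing values.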

Thanks in advance for your help,

Fatima
_______
 

 
Fátima Núñez, PhD
Centre for Cancer Research (CIC)
University of Salamanca-CSIC
Campus Unamuno
37007 Salamanca                                                   
Spain
Phone: + 34 923 294802
Fax:     + 34 923 294743
E-mail: fnunez at usal.es
