[BioC] using limma package for paired t-test: Error: (subscript) logical subscript too long

Wed Jan 18 21:04:16 CET 2012

Hi Viritha,

On 1/18/2012 2:47 PM, viritha k wrote:
> Hi James,
> I used row.names = 1 as suggested by you and Martin. I did not get any 
> error, and added these steps:
> >fit_pair <- eBayes(fit_pair)
> > topTable(fit_pair, coef="TreatT")
> and got a list
>             ID      logFC    AveExpr         t      P.Value adj.P.Val         B
> 114659 3807490 1938.54365 4407.17510  93.62245 5.066916e-06 0.6773048 -4.555623
> 93178  3536336   89.25101   76.95120  57.13317 2.044799e-05 0.9313141 -4.555652
> 15914  2523632   85.55042   75.01551  48.02453 3.339131e-05 0.9313141 -4.555671
> 82198  3401197  114.17548  185.84179  46.26832 3.709470e-05 0.9313141 -4.555677
> 82576  3405396 -112.96339  214.93046 -43.24963 4.487733e-05 0.9313141 -4.555687
> 44334  2900091  227.49472  197.41073  39.63887 5.739658e-05 0.9313141 -4.555702
> 1373   2330451   76.19923  155.62751  38.99148 6.012674e-05 0.9313141 -4.555706
> 124852 3923312   62.34274   96.81574  38.91146 6.047631e-05 0.9313141 -4.555706
> 6531   2398894  279.18572  372.08333  38.28233 6.332302e-05 0.9313141 -4.555709
> 34618  2772414 -162.60332  150.62089 -36.12449 7.458547e-05 0.9313141 -4.555722
>
> Here I have considered only 6 samples (tumor vs. normal); I would 
> like to try the whole dataset on a 64-bit machine (due to memory issues) 
> later, if this code works.
> My actual intention is to set up the paired t-test for multiple 
> subgroups, for 80 patients with tumor samples and their matched normal 
> samples (80). The number of subjects is given in parentheses.
> Subgroups:
>
> Group(160)-   Tumor(80),   Normal(80)
> Gender(80)-   Female(27),  Male(53)
> Stage(80)-    I(4),  II(7),  III(54),  IV(15)
> Age(77)-      >=55(53),  <55(24),  unknown(3)
>
> How do I include these conditions? Is it enough to list them in the 
> targets file? And how do I have to change the rest of the code to get 
> this design? Is it possible to do this in one go, or should each 
> condition be analyzed individually?
> Waiting for your suggestions,

Hypothetically you could set this up by a correctly-designed targets 
file, but I generally forgo the targets file for direct construction of 
the design matrix.
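
As a minimal sketch of what direct construction looks like for the six 
arrays in your first post (the Pair and Treatment values are copied from 
the targets file you showed; this is an illustration of the mechanics 
only, not a recommendation for the larger design):

```r
# Sample annotation typed in directly instead of read from targets.txt.
# Values mirror the six-array targets file quoted in this thread.
Pair  <- factor(c(1, 1, 2, 2, 3, 3))
Treat <- factor(c("N", "T", "N", "T", "N", "T"), levels = c("N", "T"))

# Paired design: one blocking term per subject plus the tumor-vs-normal effect.
design <- model.matrix(~ Pair + Treat)
print(colnames(design))

# The design needs one row per array, i.e. nrow(design) must equal
# ncol(eset) before the fit from the original post is attempted:
# fit_pair <- lmFit(eset, design)   # requires limma and the expression matrix
print(nrow(design))
```

The last column of this design is the one you already passed to topTable() 
as coef="TreatT".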

That said, your question has diverged IMO from a technical (how do I get 
the software to work) into a statistical (how do I analyze these data) 
question. I am more than happy to help with technical issues, but I am 
not so keen to help with statistical questions. The reasons for this are 
many, but include the fact that there is much more to a given analysis 
than setting up a design matrix (and without the data in hand, I cannot 
say what other issues may exist), as well as the fact that I get paid to 
do analyses and it isn't in my best interest to give my work away for free.

I would suggest a close reading of the limma User's Guide, as well as 
any number of linear modeling textbooks (or perhaps a consultation with 
a local statistician).

Best,

Jim

> Thanks,
> Viritha
> On Tue, Jan 17, 2012 at 4:49 PM, James W. MacDonald 
> <jmacdon at med.umich.edu> wrote:
>
>     Hi Viritha,
>
>
>     On 1/17/2012 4:36 PM, viritha k wrote:
>
>         Hi group,
>         I am trying to perform a paired t-test on 6 samples that form
>         three pairs: in each pair, one sample is from normal tissue and
>         the other is from tumor tissue of the same subject.
>         I am following the code given in the limma User's Guide
>         (p. 40, section 8.3, Paired Samples).
>
>         Code:
>
>             source("http://bioconductor.org/biocLite.R")
>             biocLite("limma")
>             library(limma)
>             targets<-readTargets("targets.txt")
>             head(targets)
>
>            FileName Pair Treatment
>         1 GSM675890    1         N
>         2 GSM675891    1         T
>         3 GSM675892    2         N
>         4 GSM675893    2         T
>         5 GSM675894    3         N
>         6 GSM675895    3         T
>
>             eset<-as.matrix(read.table("6samples.txt",sep='\t',header=TRUE,colClasses=c(rep('numeric',7)),nrow=133673))
>             head(eset)
>
>
>     At the very least you should add a row.names = 1 to your call to
>     read.table(). You want the ID to be the row.names of your matrix,
>     not the first column.
>
>     Since the dimensions of your matrix don't match the number of rows
>     of your design matrix, I would expect a different error,
>
>     Error in lm.fit(design, t(M)) : incompatible dimensions
>
>     So there might be something else wrong. You don't show the final
>     design matrix, so no telling.
>
>     Best,
>
>     Jim
>
>
>               ID_REF GSM675890 GSM675891 GSM675892 GSM675893 GSM675894 GSM675895
>         [1,] 2315129  30.32278  20.42571   7.60854  17.15130  14.57533  22.22889
>         [2,] 2315145  12.74657   6.30117  11.43528   4.10696   3.12693  10.96096
>         [3,] 2315163 175.96267 125.77725  52.19822 102.07567 116.91966 174.41690
>         [4,] 2315198   6.57030   1.85541   3.34829   1.13516   0.34278   1.83917
>         [5,] 2315353  88.49511  48.77128  50.60524  62.92448  47.10977  45.06430
>         [6,] 2315371   2.01707   1.90644 536.07636   2.21359   0.00212   0.43249
>
>             Pair<-factor(targets$Pair)
>             Treat<-factor(targets$Treatment,levels=c("N","T"))
>             design<-model.matrix(~Pair+Treat)
>             fit_pair<-lmFit(eset,design)
>
>         Error: (subscript) logical subscript too long
>
>             sessionInfo()
>
>         R version 2.14.1 (2011-12-22)
>         Platform: i386-pc-mingw32/i386 (32-bit)
>         locale:
>         [1] LC_COLLATE=English_United States.1252
>         [2] LC_CTYPE=English_United States.1252
>         [3] LC_MONETARY=English_United States.1252
>         [4] LC_NUMERIC=C
>         [5] LC_TIME=English_United States.1252
>         attached base packages:
>         [1] stats     graphics  grDevices utils     datasets  methods   base
>         other attached packages:
>         [1] limma_3.10.1        BiocInstaller_1.2.1
>         loaded via a namespace (and not attached):
>         [1] tools_2.14.1
>         Please suggest where the issue is.
>         Thanks,
>         Viritha
>
>
>     -- 
>     James W. MacDonald, M.S.
>     Biostatistician
>     Douglas Lab
>     University of Michigan
>     Department of Human Genetics
>     5912 Buhl
>     1241 E. Catherine St.
>     Ann Arbor MI 48109-5618
>     734-615-7826
>
>     **********************************************************
>     Electronic Mail is not secure, may not be read every day, and
>     should not be used for urgent or sensitive issues
>
>
