[BioC] TCC::ERROR: Need the design matrix for GLM

Gordon K Smyth smyth at wehi.EDU.AU
Fri Apr 18 02:36:23 CEST 2014


Dear Pankaj,

It seems as if you are just using the TCC package to call methods from the 
edgeR package indirectly.

Why not use the edgeR package directly?  That would probably be easier and 
you would have a more direct understanding of the methods being used. 
Your experiment is almost identical to the oral carcinoma case study in 
the edgeR User's Guide.
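
As a rough illustration of what calling edgeR directly could look like for this layout, here is a minimal sketch. It assumes an additive patient + tissue design, which is my reading of how the case study handles paired samples. The sample names, grouping and pairing come from the original post; the counts are simulated stand-ins rather than the real data, and the object names (`tissue`, `patient`, `counts`) are hypothetical. The edgeR steps only run if the package is installed.

```r
# Sample annotation from the original post: two tumours, then two normals.
samples <- c("T_BRPC13.1118", "T_BRPC_13.764", "N_DU04_PBMC", "N_DU06_PBMC")
tissue  <- factor(c("T", "T", "N", "N"), levels = c("N", "T"))  # normal is the reference level
patient <- factor(c(1, 2, 1, 2))                                # pairing, as in `pair` above

# Additive paired design: patient is the blocking factor, tissue the effect of interest.
design <- model.matrix(~ patient + tissue)
rownames(design) <- samples
print(design)

# ASSUMPTION: simulated negative binomial counts, standing in for the real count table.
set.seed(1)
n_genes <- 500
counts <- matrix(rnbinom(n_genes * 4, mu = 200, size = 10), nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)), samples))

if (requireNamespace("edgeR", quietly = TRUE)) {
  y   <- edgeR::DGEList(counts = counts)
  y   <- edgeR::calcNormFactors(y)        # TMM normalization factors
  y   <- edgeR::estimateDisp(y, design)   # dispersions given the paired design
  fit <- edgeR::glmFit(y, design)
  lrt <- edgeR::glmLRT(fit, coef = 3)     # column 3 is the tumour vs normal effect
  print(edgeR::topTags(lrt, n = 5))
}
```

With four samples and three coefficients only one residual degree of freedom is left for estimating dispersion, so results from a design this small should be read with caution.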

Best wishes
Gordon


> Date: Tue, 15 Apr 2014 13:51:17 +0000
> From: Pankaj Agarwal <p.agarwal at duke.edu>
> To: "bioconductor at r-project.org" <bioconductor at r-project.org>
> Cc: "kadota at bi.a.u-tokyo.ac.jp" <kadota at bi.a.u-tokyo.ac.jp>
> Subject: [BioC] TCC::ERROR: Need the design matrix for GLM.
>
> Hi,
>
> I have RNA-seq data consisting of matched tumor/normal samples from two patients.  For normalization of the counts I am following the steps in the TCC vignette section "3.3 Normalization of two-group count data without replicates (paired)".  The output from the commands is as follows:
>
>>  data=read.delim("count_bt2_iGenomes_Ensembl.tsv")
>
>> head(data)
>                 A.sorted.bam B.sorted.bam C.sorted.bam D.sorted.bam
> ENSG00000000003         2400         1130           12           72
> ENSG00000000005            2            3            0            0
> ENSG00000000419         1819          575          938         1608
> ENSG00000000457         1317         1262          821         1469
> ENSG00000000460          799         1743          367          800
> ENSG00000000938          203           41        33303        16355
>
>> group <- c(1,1,2,2)
>> pair <- c(1,2,1,2)
>>  c1 <- data.frame(group=group, pair=pair)
>> colnames(data) <- c("T_BRPC13.1118", "T_BRPC_13.764", "N_DU04_PBMC", "N_DU06_PBMC")
>>  tcc <- new("TCC", data, c1)
>> tcc <- calcNormFactors(tcc, norm.method="tmm", test.method="edger", iteration=1, FDR=0.1, floorPDEG=0.05, paired=TRUE)
> TCC::INFO: Calculating normalization factors using DEGES
> TCC::INFO: (iDEGES pipeline : tmm - [ edger - tmm ] X 1 )
> Error in .testByEdger.3(design = design, coef = coef, contrast = contrast) :
>  TCC::ERROR: Need the design matrix for GLM.
>
> Reading further for steps needed for edgeR without TCC I saw something related to design and tried it, but got the same error:
>
>> design <- model.matrix(~ group + pair)
>>  tcc <- new("TCC", data, c1)
>> tcc <- calcNormFactors(tcc, norm.method="tmm", test.method="edger", iteration=1, FDR=0.1, floorPDEG=0.05, paired=TRUE)
> TCC::INFO: Calculating normalization factors using DEGES
> TCC::INFO: (iDEGES pipeline : tmm - [ edger - tmm ] X 1 )
> Error in .testByEdger.3(design = design, coef = coef, contrast = contrast) :
>  TCC::ERROR: Need the design matrix for GLM.
>
> I would appreciate help with understanding the cause of the error.
>
> The output from sessionInfo() and package description is as follows:
>
>> sessionInfo()
> R version 3.0.3 (2014-03-06)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
> [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
> [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> packageDescription("TCC")
> Package: TCC
> Type: Package
> Title: TCC: Differential expression analysis for tag count data with
>        robust normalization strategies
> Version: 1.2.0
> Author: Jianqiang Sun, Tomoaki Nishiyama, Kentaro Shimizu, and Koji
>        Kadota
> Maintainer: Jianqiang Sun <wukong at bi.a.u-tokyo.ac.jp>, Tomoaki
>        Nishiyama <tomoakin at staff.kanazawa-u.ac.jp>
> Description: This package provides a series of functions for performing
>        differential expression analysis from RNA-seq count data using
>        robust normalization strategy (called DEGES). The basic idea of
>        DEGES is that potential differentially expressed genes or
>        transcripts (DEGs) among compared samples should be removed
>        before data normalization to obtain a well-ranked gene list
>        where true DEGs are top-ranked and non-DEGs are bottom ranked.
>        This can be done by performing a multi-step normalization
>        strategy (called DEGES for DEG elimination strategy). A major
>        characteristic of TCC is to provide the robust normalization
>        methods for several kinds of count data (two-group with or
>        without replicates, multi-group/multi-factor, and so on) by
>        virtue of the use of combinations of functions in other
>        sophisticated packages (especially edgeR, DESeq, and baySeq).
> Depends: R (>= 2.15), methods, DESeq, edgeR, baySeq, ROC
> Imports: EBSeq, samr
> Suggests: RUnit, BiocGenerics
> Enhances: snow
> biocViews: HighThroughputSequencing, DifferentialExpression, RNAseq
> License: GPL-2
> Copyright: Authors listed above
> Packaged: 2013-10-15 05:31:33 UTC; biocbuild
> Built: R 3.0.3; ; 2014-03-31 20:00:49 UTC; unix
>
> -- File: /general/installs/R/R-3.0.3/lib64/R/library/TCC/Meta/package.rds
>
> Thank you,
>
> - Pankaj
> --------------------------------------
> Pankaj Agarwal, M.S
> Bioinformatician
> Bioinformatics Shared Resource
> Duke Cancer Institute
> Duke University
> 919-681-6573
> p.agarwal at duke.edu
