[BioC] edgeR with two different groups (unpaired) at two time points (paired): Need some input
Sindre Lee
sindre.lee at studmed.uio.no
Tue Oct 22 22:56:46 CEST 2013
On 2013-10-22 20:03, Sindre Lee wrote:
> Hi!
> The edgeR manual is quite nice, but I'm not quite sure if I'm on the
> right track..
>
> The question to answer is: "Is there any significantly changed genes
> in diabetics at time point 2, compared to healthy?"
>
> Heres my codes:
>
> #Making the design matrix
>
> status <- factor(c(rep("Healthy",26), rep("Diabetic",22)),
> levels=c("Healthy","Diabetic"))
> patients <-
> factor(c(88,87,86,83,82,81,75,72,71,70,13,08,01,88,87,86,83,82,81,75,72,71,70,13,08,01,79,77,76,73,67,62,61,55,53,21,04,79,77,76,73,67,62,61,55,53,21,04))
> timepoints =
> as.factor(c(1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2))
> design <- model.matrix(~patients + timepoints*status)
>> colnames(design)
> [1] "(Intercept)" "patients4"
> [3] "patients8" "patients13"
> [5] "patients21" "patients53"
> [7] "patients55" "patients61"
> [9] "patients62" "patients67"
> [11] "patients70" "patients71"
> [13] "patients72" "patients73"
> [15] "patients75" "patients76"
> [17] "patients77" "patients79"
> [19] "patients81" "patients82"
> [21] "patients83" "patients86"
> [23] "patients87" "patients88"
> [25] "timepoints2" "statusDiabetic"
> [27] "timepoints2:statusDiabetic"
>
> #Reading in HTSeq data
>
> description <- c("test");
> fileNames <-
> list.files(path="/Volumes/timemachine/HTseq_DEseq2",pattern="*.txt");
> files <- sort(fileNames,decreasing=TRUE);
> targets <- data.frame(files=files, group=status,
> description=description);
> library(edgeR);
> d <-
> readDGE(targets,path="/Volumes/timemachine/HTseq_DEseq2",skip=5,comment.char="!");
> colnames(d)
> <-(c("N288","N287","N286","N283","N282","N281","N275","N272","N271","N270","N213","N208","N201","N188","187","N186","N183","N182","N181","N175","N172","N171","N170","N113","N108","N101","D279","D277","D276","D273","D267","D262","D261","D255","D253","D221","D204","D179","D177","D176","D173","D167","D162","D161","D155","D153","D121","D104"));
> d$samples;
> dim(d)
>
> #Normalization
>
> d <- (calcNormFactors(d,method="RLE"));
>
> #Estimating the dispersion:
>
> d <- estimateCommonDisp(d,verbose=TRUE)
> Disp = 0.05823 , BCV = 0.2413
> d <- estimateTagwiseDisp(d,trend="none")
>
> #diffeksp
>
>> fit <- glmFit(d,design)
> Error in glmFit.default(y = y$counts, design = design, dispersion =
> dispersion, :
> Design matrix not of full rank. The following coefficients not
> estimable:
> statusDiabetic
>
> Can someone tell me why I get this error?
>
> Thanks alot!
MARK!
In group 1: 13 samples
In group 2: 11 samples
I guess this is the problem? How to handle this?
More information about the Bioconductor
mailing list