[BioC] Help With RNA-seq

Valerie Obenchain vobencha at fhcrc.org
Mon Jan 30 05:54:27 CET 2012


On 01/29/12 07:49, Tina Asante Boahene wrote:
> Hi all,
>
> I am still having problems with bayseq
>
> Having followed the pdf document associated with it and also tailoring it to the Marioni et al data I am using, it seems that the code has been running for over two days without any results.
>
> I am wondering this code be down to my code.
>
> I have therefore attached my code to this email hoping that someone can help me solve this problem, thank you.
>
>
> library(baySeq)
> library(edgeR)
> library(limma)
> library(snow)
>
> cl<- makeCluster(4, "SOCK")
>
>
> ##Calculating normalization factors##
> D=MA.subsetA$M
> head(D)
> names(D)
> dim(D)
Please provide the output of these (i.e., head, names, dim).
>
> g<- gsub("R[1-2]L[1-8]", "", colnames(D))
> d<- DGEList(counts = D, group = substr(colnames(D), 5, 30))
> d$samples
> names(d)
> dim(d)
These values would be helpful too.
>
>
> CD<- new("countData", data = as.matrix(MA.subsetA$M), libsizes =
> as.integer(d$samples$lib.size), replicates = g)
> groups(CD)<- list(rep(1, ncol(CD)), g)
>
> CD at libsizes<- getLibsizes(CD)
>
> plotMA.CD(CD, samplesA = c(1,3,6,8,10), samplesB = c(2,4,5,7,9), col = c(rep("red",
> 100), rep("black", 900)))

Did this work? Your original question was about plotMA.CD not 
recognizing your groups. Does the plot work for you now?
>
> ## Optionally adding annotation details to the @annotation slot of the countData object. ##
> CD at annotation<- data.frame(name = paste("gene", 1:1000, sep = "_"))
>
>
>
>
> ### Poisson-Gamma Approach ###
>
> CDP.Poi<- getPriors.Pois(CD, samplesize = 2, takemean = TRUE, cl = cl)
>
> CDP.Poi at priors  ## This takes time ###
Is the call to getPriors.Pois() that has been running for over 2 days? 
If not, please specify which function call is taking so long. Did the 
rest of the code below work for you?

Valerie
>
> CDPost.Poi<- getLikelihoods.Pois(CDP.Poi, pET = "BIC", cl = cl)
> CDPost.Poi at estProps
>
> CDPost.Poi at posteriors[1:10, ]  ## A list of the posterior likelihoods each model for the first 10 genes ##
> CDPost.Poi at posteriors[101:110, ]   ## A list of the posterior likelihoods each model for the genes from 101 to 110 ##
>
>
> ### Negative-Binomial Approach ###
>
> CDP.NBML<- getPriors.NB(CD, samplesize = 1000, estimation = "QL", cl = cl)
>
> CDPost.NBML<- getLikelihoods.NB(CDP.NBML, pET = 'BIC', cl = cl)
>
> CDPost.NBML at estProps
>
> CDPost.NBML at posteriors[1:10,]
>
> CDPost.NBML at posteriors[101:110,]
>
>
> Kind Regards
>
> Tina
> ________________________________________
> From: Valerie Obenchain [vobencha at fhcrc.org]
> Sent: 25 January 2012 18:14
> To: Tina Asante Boahene
> Cc: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] Help With RNA-seq
>
> Hi Tina,
>
> It's difficult to help without knowing what your data look like or what
> error message you are seeing. Both pieces of information would be helpful.
>
> For starters I think you need to provide 'replicate' and 'groups'
> arguments when you create your new "countData" object. Depending on what
> order your data are in you need something like,
>
>       groups<- list(NDE = c(1,1,1,1,1,1,1,1,1,1), DE =
> c(1,2,1,2,2,1,2,1,2,1))
>       replicates<- c("Kidney", "Liver", "Kidney", "Liver", "Liver',
> "Kidney", "Liver", "Kidney", "Liver", "Kidney")
>
> Then create your "countData" with these variables,
>
>       CD<- new("countData", data = as.matrix(MA.subsetA$M), libsizes =
> as.integer(d$samples$lib.size),
>           replicates = replicates, groups = groups)
>
> Now look at the CD object and make sure the columns are labeled as they
> should be and the other slot values make sense. The MA plot call would
> look something like,
>
>       plotMA.CD(CD, samplesA = "Kidney", samplesB = "Liver")
>
> The author used the red and black colors for the vignette plot because
> there was a known structure to the data; the first 100 counts showed
> differential expression and the last 900 did not. You probably have a
> different situation in your data so using the same color scheme may not
> make sense.
>
> Valerie
>
>
> On 01/23/2012 06:13 AM, Tina Asante Boahene wrote:
>> Hi all,
>>
>> I am conducting some analysis using the Marioni et al data.
>>
>> However, I am having a bit of trouble using my data to conduct the analysis based on the baySeq package.
>>
>>    And I was wondering if you could stir me in the right direction.
>>
>> I have already used edgeR to find the library sizes for the ten libraries I have for my data as well as for the groups (Liver and Kidney) as stated below.
>>
>>
>> library(baySeq)
>> library(edgeR)
>> library(limma)
>> library(snow)
>>
>> cl<- makeCluster(4, "SOCK")
>>
>>
>> ##Calculating normalization factors##
>> D=MA.subsetA$M
>> head(D)
>> names(D)
>> dim(D)
>>
>> g<- gsub("R[1-2]L[1-8]", "", colnames(D))
>> d<- DGEList(counts = D, group = substr(colnames(D), 5, 30))
>> d$samples
>> names(d)
>> dim(d)
>>
>>
>> I will like to know how to model my code in order to produce the MA plot for count data
>>
>>
>> This is what I have, however it runs with the wrong response.
>>
>> Can someone help me fix this please.
>>
>> CD<- new("countData", data = as.matrix(MA.subsetA$M), libsizes = as.integer(d$samples$lib.size))
>>
>> plotMA.CD(CD, samplesA = 1:5, samplesB = 6:10, col = c(rep("red",
>> 100), rep("black", 900)))
>>
>>
>> How can I get it to recognise the "groups" as "g" (Library and Kidney)
>>
>> This is the output for the groups   [1] "Kidney" "Liver"  "Kidney" "Liver"  "Liver"  "Kidney" "Liver"  "Kidney"  "Liver"  "Kidney"
>>
>> thank you.
>>
>>
>>
>>
>>
>>
>> Kind Regards
>>
>> Tina
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor



More information about the Bioconductor mailing list