[BioC] Tigre Package question 3

Solanki, Anisha a.solanki.12 at ucl.ac.uk
Tue Feb 11 13:30:38 CET 2014


Dear Antti,

Thanks for your reply. I should mention that the data I have doesn't have
any replicates; it's just a time series with 7 samples.

Is there a different method of calculating the variances?

Please advise.

Thanks

Anisha




On 11/02/2014 10:32, "Antti Honkela" <antti.honkela at hiit.fi> wrote:

>Dear Anisha,
>
>(FYI: taking the discussion back to the list.)
>
>mmgmos is only meant for processing Affymetrix microarray data so it
>cannot be used with RNA-seq data.
>
>For RNA-seq data it is possible to obtain error bars for expression
>estimates similar to those mmgmos provides, using suitable analysis tools
>such as BitSeq, but unfortunately using those in tigre does not currently
>work out of the box. If you need this, e.g. if you have data with very few
>or no replicates of your time series, please let me know so we can try to
>work it out.
>
>If you have sufficiently many replicate time series, you should be OK
>without any pre-specified variances - the model will fit a simple
>variance model in this case. To trigger this behaviour, you should use
>processRawData() to process your expression data matrix and pass the
>resulting ExpressionTimeSeries object to GP...() functions.
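
A minimal sketch of the preprocessing step described above, assuming a plain expression matrix with several replicate time series. The matrix contents, gene and sample names, and sampling times are simulated placeholders, and the `times` and `experiments` arguments follow my reading of the tigre documentation, so check `?processRawData` (including how log-transformation and normalisation are handled) before relying on it.

```r
# Simulated placeholder data: 5 genes, 7 time points, 3 replicate series.
# Values are assumed to be on a log scale already.
set.seed(1)
times       <- rep(c(0, 1, 2, 4, 6, 8, 12), times = 3)  # hypothetical sampling times
experiments <- rep(1:3, each = 7)                        # replicate index of each column
exprs_mat   <- matrix(rnorm(5 * 21, mean = 8), nrow = 5,
                      dimnames = list(paste0("gene", 1:5),
                                      paste0("sample", 1:21)))

if (requireNamespace("tigre", quietly = TRUE)) {
  # No variances are supplied here; per the advice above, the model then
  # fits a simple variance model using the replicates.
  procData <- tigre::processRawData(exprs_mat, times = times,
                                    experiments = experiments)
  # procData is the ExpressionTimeSeries object to pass to the GP...() functions.
}
```

Each column of the matrix is one sample, so `times` and `experiments` must both have one entry per column.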
>
>
>Antti
>
>
>On 2014-02-11 00:16 , Solanki, Anisha wrote:
>> Dear Antti,
>>
>> Thanks for your reply. The information you have given me has been very
>> useful. I had another quick question regarding the mmgmos command. I
>> understand that the command accepts data as an AffyBatch object. However,
>> I have data from RNA-Seq and not from Affymetrix microarrays, so I cannot
>> create an AffyBatch object from my data, as that requires CEL files. Is
>> there any alternative to using an AffyBatch object? I have tried running
>> mmgmos on a matrix with the expression values of every sample from my
>> data, but the command doesn't seem to accept this as a valid object.
>>
>> Please advise.
>>
>> Thanks
>>
>> Anisha
>>
>>
>> On 10/02/2014 09:38, "Antti Honkela" <antti.honkela at hiit.fi> wrote:
>>
>>> On 2014-02-09 18:49 , Solanki, Anisha wrote:
>>>
>>> Dear Anisha,
>>>
>>>> I have now solved the previous error by adding variances independently
>>>> to the expression dataset.
>>>
>>> The error variances are critical to the accuracy of the method, so you
>>> should never just impute any values there without careful
>>> consideration. More about how you could fix this better below.
>>>
>>>> I just had another quick question. The targets are ranked by the
>>>> log-likelihood. Does this mean that the higher the log-likelihood, the
>>>> greater the probability of the gene being a target, or vice versa?
>>>> Also, what does null log-likelihood stand for?
>>>
>>> Our method is based on comparing log-likelihoods over different data
>>> sets (time series for different genes), which is slightly trickier than
>>> usual comparison of log-likelihoods over the same data.
>>>
>>> The log-likelihood measures how well the data fit a model assuming
>>> regulation, so a higher log-likelihood should be counted as evidence
>>> for being a target.
>>>
>>> That said, some time series are easy to fit and get a high likelihood
>>> under practically any model. To catch these, we also fit a baseline or
>>> null model (which is just a time-independent Gaussian). We can then
>>> filter out genes that fit the null model as well as or better than the
>>> real model.
>>>
>>> Finally, even though one might consider the likelihood ratio of real
>>> vs. null a useful statistic, it is actually not good for ranking. This
>>> is because the range of null model likelihoods is much larger, so the
>>> ranking would be determined by how badly the null model fits instead of
>>> how well the real model fits, and would tell you nothing about the
>>> regulation.
>>>
>>> In summary, you should:
>>> 1. *Filter* by likelihood ratio real/null: only keep genes where
>>>   log-likelihood > null-log-likelihood
>>> 2. *Rank* remaining genes by log-likelihood
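
The two steps in the summary above can be sketched in base R. This is only an illustration: the data frame `scores` and its columns `loglik` and `null_loglik` are hypothetical names filled with made-up numbers, not output from tigre.

```r
# Hypothetical scores: one row per candidate target gene.
# The values are made up purely for illustration.
scores <- data.frame(
  gene        = c("geneA", "geneB", "geneC", "geneD"),
  loglik      = c(-12.0, 5.5, -3.0, 8.0),   # fit of the regulation (real) model
  null_loglik = c(-20.0, 6.0, -9.5, 2.0),   # fit of the time-independent null model
  stringsAsFactors = FALSE
)

# Step 1: filter. Keep only genes that the real model fits strictly
# better than the null model.
kept <- scores[scores$loglik > scores$null_loglik, ]

# Step 2: rank. Order the remaining genes by log-likelihood, best first
# (not by the real/null likelihood ratio).
ranked <- kept[order(kept$loglik, decreasing = TRUE), ]
print(ranked$gene)
```

In this toy example geneB is filtered out because the null model fits it better. Ranking the rest by the difference between the two log-likelihoods would instead put geneA first, even though its real-model fit is the worst of the three, which is the problem with ratio-based ranking described above.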
>>>
>>>>> I think this means that my data lacks calculated variances. As I
>>>>> understand from your user guide, you process Affymetrix datasets using
>>>>> the mmgmos command from the puma package, which automatically
>>>>> calculates the variances for you. However, when I try to run my
>>>>> expression value matrix through this mmgmos command, it doesn't work
>>>>> and gives me this error:
>>>>> "unable to find an inherited method for function ‘probeNames’ for
>>>>> signature ‘"ExpressionTimeSeries"’"
>>>
>>> You should run mmgmos on the original AffyBatch object, not on an
>>> ExpressionTimeSeries object.
>>>
>>>
>>> Hope this helps,
>>>
>>> Antti
>>>
>>> --
>>> Antti Honkela
>>> antti.honkela at hiit.fi   -   http://www.hiit.fi/u/ahonkela/
>>>
>>
>
>-- 
>Antti Honkela
>antti.honkela at hiit.fi   -   http://www.hiit.fi/u/ahonkela/
>


