[BioC] Tigre Package question 2

Antti Honkela antti.honkela at hiit.fi
Mon Feb 10 10:38:42 CET 2014


On 2014-02-09 18:49 , Solanki, Anisha wrote:

Dear Anisha,

> I have now solved the previous error by adding variances independently to
> the expression Dataset.

The error variances are critical to the accuracy of the method, so you 
should never just impute values there without careful consideration. 
See the end of this message for a better way to obtain them.

> I just had another quick question. The targets are
> ranked by the log-likelihood. Does this mean that the higher the
> log-likelihood the greater the probability of the gene being a target or
> vice versa? Also what does null log likelihood stand for?

Our method is based on comparing log-likelihoods over different data 
sets (time series for different genes), which is slightly trickier than 
the usual comparison of log-likelihoods over the same data.

The log-likelihood measures how well the data fit a model that assumes 
regulation, so a higher log-likelihood should be counted as evidence 
for the gene being a target.

That said, some time series are easy to fit and get a high likelihood 
under practically any model. To catch these, we also fit a baseline or 
null model (which is just a time-independent Gaussian). We can then 
filter out genes that fit the null model as well as or better than the 
true model.
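
As a rough illustration of what a "time-independent Gaussian" null
model means, here is a minimal Python sketch. It assumes the mean and
variance are set to their maximum-likelihood estimates and it ignores
the per-observation error variances, so it is not the tigre
implementation; the function name and the example data are made up.

```python
import math

def null_log_likelihood(values):
    """Log-likelihood of a time series under a constant-mean Gaussian.

    Assumption: mean and variance are plugged in at their
    maximum-likelihood estimates. The actual tigre null model may be
    parameterized differently (e.g. it uses the error variances).
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    var = max(var, 1e-12)  # guard against a perfectly flat series
    # At the ML estimates the quadratic term sums to exactly n.
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)

# Hypothetical data: a nearly flat profile and a strongly varying one.
flat = [5.0, 5.1, 4.9, 5.0, 5.1, 4.9]
varying = [1.0, 3.0, 6.0, 8.0, 5.0, 2.0]

print(null_log_likelihood(flat))
print(null_log_likelihood(varying))
```

The nearly flat series gets a much higher null log-likelihood than the
varying one, which is why a gene with a flat profile can look well
fitted without showing any evidence of regulation.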

Finally, even though one might consider the likelihood ratio of real vs. 
null a useful statistic, it is actually not good for ranking. This is 
because the null model likelihoods vary over a much larger range than 
the real model likelihoods. A ranking by the ratio would therefore be 
determined by how badly the null model fits rather than by how well the 
real model fits, and would tell you nothing about the regulation.

In summary, you should:
1. *Filter* by likelihood ratio real/null: only keep genes where
  log-likelihood > null-log-likelihood
2. *Rank* remaining genes by log-likelihood
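
The two steps above can be sketched in a few lines of Python. The gene
names and log-likelihood values are invented for illustration, and the
tuple layout is an assumption, not the format tigre returns. The sketch
also ranks by the log-ratio for comparison, to show how the two
orderings can disagree.

```python
def filter_and_rank(genes):
    """genes: list of (name, log_likelihood, null_log_likelihood)."""
    # Step 1: keep only genes the real model fits better than the null.
    kept = [g for g in genes if g[1] > g[2]]
    # Step 2: rank the survivors by the real-model log-likelihood.
    return sorted(kept, key=lambda g: g[1], reverse=True)

# Hypothetical values, not output from a real analysis.
genes = [
    ("geneA", -10.0, -25.0),
    ("geneB", -5.0, -4.0),   # null fits better, so it is filtered out
    ("geneC", -8.0, -12.0),
    ("geneD", -30.0, -90.0), # poor real fit, even worse null fit
]

ranked = filter_and_rank(genes)
names = [g[0] for g in ranked]

# For comparison only: ranking the same genes by the log-ratio.
by_ratio = sorted(ranked, key=lambda g: g[1] - g[2], reverse=True)
ratio_names = [g[0] for g in by_ratio]

print(names)
print(ratio_names)
```

In this toy data the log-ratio ordering puts geneD first purely because
its null model fits so badly, although its real-model fit is the worst
of the three genes that pass the filter.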

>> I think this means that my Data lacks calculated variances. As I
>> understand from your User guide you process affymetrix Datasets using the
>> mmgmos command from the PUMA package which automatically calculates the
>> variances for you. However, when I try to run my expression value matrix
>> through this mmgmos command it doesn't work and gives me this error
>> "unable to find an inherited method for function 'probeNames' for
>> signature '"ExpressionTimeSeries"'"

You should run mmgmos on the original AffyBatch object, not on an 
ExpressionTimeSeries object.


Hope this helps,

Antti

-- 
Antti Honkela
antti.honkela at hiit.fi   -   http://www.hiit.fi/u/ahonkela/


