[R] cpquery problem

Sun Jul 31 10:11:16 CEST 2016

Hi Marco

Thanks for your prompt reply.

First, I have been using the eval(parse()) convention because I saw it
used in some example code for running cpquery, but I am happy to drop this
practice.

I have tried running cpquery in debug mode, and found that it typically
reports the following in instances where the conditional probability is
returned as 0:

   > event matches 0 samples out of 0 (p = 0)

Am I right in understanding that the Monte Carlo sampling has been unable
to create any cases that match the query?  If so, why would this be if the
evidence used is very typical of an average case in the data used to train
the network?
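
To illustrate the situation, here is a minimal base-R sketch of logic
(rejection) sampling on a toy model. The model, the helper name
logic_sampling and the thresholds are all hypothetical and are not taken
from the fitted network; the sketch only shows how an evidence condition
that the model considers nearly impossible leaves no samples to count.

```r
# Toy model (hypothetical, not the fitted network from this thread):
#   X ~ N(0, 1),  Y = 2 * X + noise
set.seed(42)
n <- 1e5
x <- rnorm(n)
y <- 2 * x + rnorm(n)

# Logic (rejection) sampling: keep only the samples that satisfy the
# evidence, then count how many of those also satisfy the event.
logic_sampling <- function(event, evidence) {
  kept <- sum(evidence)
  matched <- sum(event & evidence)
  c(matched = matched, kept = kept)
}

# Evidence the model generates often: plenty of samples are kept.
common <- logic_sampling(event = y >= 1, evidence = x > 0)

# Evidence the model essentially never generates: nothing is kept,
# so the estimate is based on 0 samples out of 0.
rare <- logic_sampling(event = y >= 1, evidence = x > 10)

print(common)
print(rare)
```

In the second case no sample satisfies the evidence, so both an event and
its complement would be estimated from an empty set. Evidence can look
typical in the data and still be rare under the fitted model, if the model
describes those variables poorly.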

Also, I have run predict on this network and get very good correlations
between the predicted and actual observations (r-squared 0.8 - 0.9).  Why
would the network return such good predictions for a test set while the
conditional probabilities remain at zero when exploring the same data set?

I think that I must be missing something in my deployment of the package
or the interpretation of the output.

Many thanks for your help.

Ross

On Fri, July 29, 2016 7:34 pm, Marco Scutari wrote:
> Hi Ross,
>
>
> first, I have a side question: is there a particular reason why you are
> using eval(parse()) in your queries? I know sometimes there is no other
> solution if you only use exported functions, but you should try not to. It
> makes for brittle code that breaks easily depending on how variables are
> scoped.
>
> On 29 July 2016 at 07:37,  <ross.chapman at ecogeonomix.com> wrote:
>
>> However, if I replace EST=='x' with EST=='z' or EST=='y' I get 0
>> probability of obtaining a value for ABW that is either greater or less
>> than the threshold.
>>
>> For example:
>>
>>
>> > cpquery(fitted, event = (ABW >= 11),
>>           evidence = eval(parse(text = "(EST=='y' & TR>9 & BU>15819 & RF>2989)")),
>>           n = 10^6)
>>
>>
>> [1] 0
>>
>>
>> and
>>
>> > cpquery(fitted, event = (ABW <= 11),
>>           evidence = eval(parse(text = "(EST=='y' & TR>9 & BU>15819 & RF>2989)")),
>>           n = 10^6)
>>
>>
>> [1] 0
>>
>
> From this output, my guess is that the evidence has probability
> (exactly or close to) zero. You can check by running cpdist(): if the
> evidence has probability zero, no random samples will be returned. Turning
> on the debugging output in cpquery() should also highlight what the
> problem is.
>
> If that turns out to be the case, switch from the default method =
> "ls" to method = "lw" in cpdist(). (Note that the syntax changes
> slightly, check the documentation for examples.)
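
A minimal sketch of the two checks suggested above, assuming the bnlearn
package is installed. It uses the learning.test data set bundled with
bnlearn as a stand-in, so the nodes A and B and their levels are not the
variables from the network discussed in this thread.

```r
library(bnlearn)

# learning.test ships with bnlearn; it stands in for the real data here.
data(learning.test)
fitted <- bn.fit(hc(learning.test), learning.test)

# Logic sampling (the default, method = "ls"): evidence is an expression.
# The number of rows returned shows how many samples matched the evidence.
kept <- cpdist(fitted, nodes = "B", evidence = (A == "a"))
nrow(kept)

# Likelihood weighting (method = "lw"): evidence is a named list of
# values instead of an expression.
p_lw <- cpquery(fitted, event = (B == "b"), evidence = list(A = "a"),
                method = "lw")
p_lw
```

If nrow(kept) is zero or very small for the real evidence, the evidence has
negligible probability under the fitted model. How inequality evidence such
as TR>9 can be passed with method = "lw" may depend on the bnlearn version,
so ?cpquery is the place to check.
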
>
>> My own knowledge from the data is that these classes should both
>> typically return a value for ABW that is very much higher than the
>> threshold value.
>
> That may be, but much depends on the specific sample the model was
> fitted from. What does the fitted network look like?
>
> Cheers,
> Marco
>
>
> --
> Marco Scutari, Ph.D.
> Lecturer in Statistics, Department of Statistics
> University of Oxford, United Kingdom
>
>