Christiaan Pauw cjpauw at gmail.com
Wed Oct 16 20:18:55 CEST 2013

I have a large dataset (questionnaire results) of mostly categorical
variables. I have tested for dependency between the variables using
chi-square tests, and the number of significant dependencies is too
large to interpret. I used the chaid() function in the CHAID package to
detect interactions and separate out (what I hope to be) the underlying
structure of these dependencies for each variable. What typically
happens is that the chi-square tests reveal a large number of
dependencies (say 10-20) for a variable and the chaid() function
reduces this to something much more comprehensible (say 3-5). What I
want to do is to extract the names of those variables that were shown
to be relevant in the chaid() results.
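
A minimal sketch of the kind of pairwise chi-square screening described
above. It uses toy data built from the base R mtcars dataset rather than
the questionnaire data, and the 0.05 cut-off and the names df, pairs,
pvals and dependent are assumptions made only for illustration:

```r
# Toy categorical data (illustration only, not the questionnaire data).
df <- data.frame(lapply(mtcars[c("cyl", "gear", "am", "vs")], factor))

# All unordered pairs of variable names.
pairs <- combn(names(df), 2, simplify = FALSE)

# p-value of a chi-square test of independence for each pair.
# Warnings about small expected counts are suppressed for this toy data.
pvals <- sapply(pairs, function(p) {
  suppressWarnings(chisq.test(table(df[[p[1]]], df[[p[2]]]))$p.value)
})
names(pvals) <- sapply(pairs, paste, collapse = " x ")

# Pairs flagged as dependent at the (assumed) 5% level.
dependent <- names(pvals)[pvals < 0.05]
print(dependent)
```

With many variables the number of pairs grows quadratically, which is
why the list of flagged dependencies quickly becomes hard to read.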

The chaid() output is in the form of a constparty object. My question
is how to extract the variable names associated with the nodes in such
an object.

Here is a self-contained code example:

library(evtree) # for the ContraceptiveChoice dataset
library(CHAID)  # for chaid()
library(MASS)   # for loglm()

longform <- formula(contraceptive_method_used ~ wifes_education +
                 husbands_education +  wifes_religion + wife_now_working +
                 husbands_occupation + standard_of_living_index +
z <- chaid(longform, data = ContraceptiveChoice)
# plot(z)
# This is the part I want to do programmatically
shortform <- formula(contraceptive_method_used ~ wifes_education +
# The thing I want is a programmatic way to extract 'shortform' from 'z'

# Example of use of 'shortform'
loglm(shortform, data = ContraceptiveChoice)
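
One direction that might work, sketched under assumptions: the partykit
accessors nodeids(), nodeapply(), split_node() and varid_split() can be
used to walk the inner nodes of a party object and collect the split
variables. The example uses ctree() on iris as a stand-in, since it also
returns a constparty object. The helper split_vars() is hypothetical
(not part of partykit or CHAID), and it assumes that the variable ids
stored in the splits index the columns of tree$data:

```r
library(partykit)  # party/constparty classes and their accessors

# Stand-in tree; ctree() returns a constparty object.
tree <- ctree(Species ~ ., data = iris)

# Hypothetical helper: names of the variables used in any split.
split_vars <- function(tree) {
  # Inner nodes are all nodes that are not terminal nodes.
  inner <- setdiff(nodeids(tree), nodeids(tree, terminal = TRUE))
  # Column index of the split variable at each inner node.
  varids <- unlist(nodeapply(tree, ids = inner,
                             FUN = function(node) varid_split(split_node(node))))
  # Assumption: varid refers to the columns of tree$data.
  names(tree$data)[unique(varids)]
}

vars <- split_vars(tree)

# Rebuild a reduced formula from the split variables.
shortform <- reformulate(vars, response = "Species")
print(shortform)
```

If the same assumption holds for the object returned by chaid(), then
reformulate(split_vars(z), response = "contraceptive_method_used")
should give the reduced formula, but I have not verified that.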

Thanks in advance
Christiaan Pauw
Nova Institute

