[R] Trouble with mChoice() in the Hmisc package.

Jim Java jjava at priscian.com
Fri Apr 9 18:53:24 CEST 2010


Hi All:--

I've started using the Hmisc reporting facilities recently, mostly
successfully. I'm having some trouble with mChoice() multiple-choice
objects, though. Here's some example code and output from R-help a
couple of years ago:

library(Hmisc)
Symptom1 <- c("Headache", "Headache", NA)
Symptom2 <- c(NA, "Anxiety", NA)
Symptoms <- mChoice(Symptom1, Symptom2)
summary(~ Symptoms, method="reverse")

[Output:]

Descriptive Statistics  (N=3)

+-------------------+-------+
|                   |       |
+-------------------+-------+
|Symptom1 : Headache|67% (2)|
+-------------------+-------+
|    NA             |67% (2)|
+-------------------+-------+
|    Anxiety        |33% (1)|
+-------------------+-------+

However, if I try this same example using R 2.10.1 and the latest
version of Hmisc, I get the following output table:

+-------------------+--------+
|                   |        |
+-------------------+--------+
|Symptom1 : Headache|100% (3)|
+-------------------+--------+
|    NA             |100% (3)|
+-------------------+--------+
|    Anxiety        |100% (3)|
+-------------------+--------+

Further, as.double(Symptoms) incorrectly returns

     Headache <NA> Anxiety
[1,]        1    1       1
[2,]        1    1       1
[3,]        1    1       1

instead of

     Headache <NA> Anxiety
[1,]        1    1       0
[2,]        1    0       1
[3,]        0    1       0

N.B.:
> levels(Symptoms)
[1] "Headache" NA         "Anxiety"
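
For reference, the expected indicator matrix can be reproduced in base
R without going through Hmisc at all. This is only a sketch:
expected_indicator() is a hypothetical helper, not an Hmisc function,
and it assumes NA is treated as a level of its own, as in the levels()
output above.

```r
# Hypothetical helper (not part of Hmisc): build the 0/1 indicator matrix
# by hand from the raw columns, treating NA as its own level.
expected_indicator <- function(..., lev) {
  cols <- cbind(...)                       # one row per observation
  out <- sapply(lev, function(l) {
    # For the NA level, a "hit" is a missing value; otherwise an exact match.
    hit <- if (is.na(l)) is.na(cols) else !is.na(cols) & cols == l
    as.integer(rowSums(hit) > 0)           # 1 if any column matched
  })
  colnames(out) <- ifelse(is.na(lev), "<NA>", lev)
  out
}

Symptom1 <- c("Headache", "Headache", NA)
Symptom2 <- c(NA, "Anxiety", NA)
m <- expected_indicator(Symptom1, Symptom2,
                        lev = c("Headache", NA, "Anxiety"))
print(m)
```

Comparing this against as.double(Symptoms) makes it easy to see which
cells differ.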

The problems with as.double.mChoice() appear related to inmChoice(),
which calls match.mChoice(). The Hmisc docs say, "inmChoice() creates
a logical vector the same length as x whose elements are TRUE when the
observation in x contains at least one of the codes or value labels in
the second argument," but inmChoice() doesn't seem to be returning a
vector the same length as x; it incorrectly returns, e.g.,

> inmChoice(Symptoms, 1) # "Headache"
[1] TRUE

instead of

[1]  TRUE  TRUE FALSE
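
The documented behaviour quoted above can be sketched in base R as
follows. expected_in_choice() is a hypothetical stand-in written for
illustration, not Hmisc's implementation; it assumes numeric codes
index into the levels vector shown earlier.

```r
# Hypothetical stand-in for the documented behaviour (not Hmisc's code):
# TRUE where an observation contains at least one of the requested levels.
expected_in_choice <- function(cols, lev, codes) {
  # Numeric codes are looked up in lev; character values are used as-is.
  wanted <- if (is.numeric(codes)) lev[codes] else codes
  # %in% drops the matrix shape, so restore it before summing by row.
  hit <- matrix(cols %in% wanted, nrow = nrow(cols))
  rowSums(hit) > 0
}

cols <- cbind(c("Headache", "Headache", NA), c(NA, "Anxiety", NA))
lev  <- c("Headache", NA, "Anxiety")
res  <- expected_in_choice(cols, lev, 1)   # code 1 is "Headache"
print(res)
```

The result has one element per observation, which is what the quoted
documentation describes.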

I tried rewriting inmChoice() to avoid the call to match.mChoice(),
which fixed as.double.mChoice() but did not change the output from
summary.formula(). The R function match.mChoice() uses .Call to invoke
the C function do_mchoice_match(SEXP x, SEXP table, SEXP nomatch),
which may be the source of these problems, but I haven't had time to
debug it yet. In the meantime, I thought I'd post here in the hope
that someone recognizes the problem. I get the same results on
several Windows systems: XP (x86), Vista (x64), and Windows 7 (x86).

Thanks to anyone who can help. I really like Hmisc reporting, so if
I'm able to make the changes myself I'll update this thread.

 -- Jim Java

> R.Version()
$platform
[1] "i386-pc-mingw32"

$arch
[1] "i386"

$os
[1] "mingw32"

$system
[1] "i386, mingw32"

$status
[1] ""

$major
[1] "2"

$minor
[1] "10.1"

$year
[1] "2009"

$month
[1] "12"

$day
[1] "14"

$`svn rev`
[1] "50720"

$language
[1] "R"

$version.string
[1] "R version 2.10.1 (2009-12-14)"


