[R] car::deltaMethod() fails when a particular combination of categorical variables is not present

John Fox
Tue Sep 26 15:49:44 CEST 2023


Dear Michael,

You're testing a linear hypothesis, so there's no need to use the delta 
method, but the linearHypothesis() function in the car package also 
fails in your case:

 > linearHypothesis(minimal_model, "bt2 + csent + bt2:csent = 0")
Error in linearHypothesis.lm(minimal_model, "bt2 + csent + bt2:csent = 0") :
there are aliased coefficients in the model.
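
The aliasing can be checked directly on the fitted model. The following is a minimal sketch in base R, assuming the data from the original post quoted below; the names cell_counts and aliased are introduced here only for illustration:

```r
# Data from the original post (quoted below)
outcomes <- c(91,2,60,53,38,78,48,33,97,41,64,84,64,8,66,41,52,18,57,34)
persontype <- c("t2","t2","t2","t2","t2","t2","t2","t1","t2","t2",
                "t2","t2","t1","t1","t2","t2","t1","t2","t2","t2")
arm_letter <- c("unsent","unsent","unsent","unsent","sent","unsent","unsent",
                "sent","unsent","unsent","other","unsent","unsent","sent",
                "unsent","other","unsent","sent","sent","unsent")
df <- data.frame(a = outcomes, b = persontype, c = arm_letter)
minimal_model <- lm(a ~ b * c, data = df)

# The empty cell appears as a zero count in the cross-tabulation
cell_counts <- table(df$b, df$c)
cell_counts

# lm() reports an aliased (inestimable) coefficient as NA
aliased <- names(coef(minimal_model))[is.na(coef(minimal_model))]
aliased

# alias() shows how the aliased term depends on the estimated ones
alias(minimal_model)
```

With six columns in the model matrix and only five non-empty cells, one coefficient cannot be estimated, which is what linearHypothesis() is complaining about.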

One work-around is to ravel the two factors into a single factor with 5 
levels:

 > df$bc <- factor(with(df, paste(b, c, sep=":")))
 > df$bc
  [1] t2:unsent t2:unsent t2:unsent t2:unsent t2:sent   t2:unsent
  [7] t2:unsent t1:sent   t2:unsent t2:unsent t2:other  t2:unsent
[13] t1:unsent t1:sent   t2:unsent t2:other  t1:unsent t2:sent
[19] t2:sent   t2:unsent
Levels: t1:sent t1:unsent t2:other t2:sent t2:unsent

 > m <- lm(a ~ bc, data=df)
 > summary(m)

Call:
lm(formula = a ~ bc, data = df)

Residuals:
     Min      1Q  Median      3Q     Max
-57.455 -11.750   0.439  14.011  37.545

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)    20.50      17.57   1.166   0.2617
bct1:unsent    37.50      24.85   1.509   0.1521
bct2:other     32.00      24.85   1.287   0.2174
bct2:sent      17.17      22.69   0.757   0.4610
bct2:unsent    38.95      19.11   2.039   0.0595

Residual standard error: 24.85 on 15 degrees of freedom
Multiple R-squared:  0.2613,	Adjusted R-squared:  0.06437
F-statistic: 1.327 on 4 and 15 DF,  p-value: 0.3052

Then the hypothesis is tested directly by the t-value for the 
coefficient bct2:sent.
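
If a confidence interval is wanted in addition to the test, the raveled model can supply one. The sketch below uses only base R (confint() and predict()); it assumes the data from the original post, and newdat and pred are illustrative names that do not appear in the original code:

```r
# Data from the original post (quoted below)
outcomes <- c(91,2,60,53,38,78,48,33,97,41,64,84,64,8,66,41,52,18,57,34)
persontype <- c("t2","t2","t2","t2","t2","t2","t2","t1","t2","t2",
                "t2","t2","t1","t1","t2","t2","t1","t2","t2","t2")
arm_letter <- c("unsent","unsent","unsent","unsent","sent","unsent","unsent",
                "sent","unsent","unsent","other","unsent","unsent","sent",
                "unsent","other","unsent","sent","sent","unsent")
df <- data.frame(a = outcomes, b = persontype, c = arm_letter)

# Ravel the two factors into one factor with 5 levels and refit
df$bc <- factor(with(df, paste(b, c, sep = ":")))
m <- lm(a ~ bc, data = df)

# t-test and 95% confidence interval for the bct2:sent coefficient
coef(summary(m))["bct2:sent", ]
confint(m, "bct2:sent")

# 95% confidence interval for the fitted mean of the t2:sent cell
newdat <- data.frame(bc = factor("t2:sent", levels = levels(df$bc)))
pred <- predict(m, newdata = newdat, interval = "confidence")
pred
```

Because every level of bc corresponds to a non-empty cell, this model has no aliased coefficients, and the fitted value for each level is simply the mean of that cell.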

I hope that this helps,
  John

-- 
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/

On 2023-09-26 1:12 a.m., Michael Cohn wrote:
> I'm running a linear regression with two categorical predictors and their
> interaction. One combination of levels does not occur in the data, and as
> expected, no parameter is estimated for it. I now want to significance test
> a particular combination of levels that does occur in the data (i.e., I want
> to get a confidence interval for the total prediction at given levels of
> each variable).
> 
> In the past I've done this using car::deltaMethod() but in this dataset
> that does not work, as shown in the example below: The regression model
> gives the expected output, but deltaMethod() gives this error:
> 
> error in t(gd) %*% vcov. : non-conformable arguments
> 
> I believe this is because there is no parameter estimate for when the
> predictors have the values 't1' and 'other'. In the df_fixed dataframe,
> putting one person into that combination of categories causes deltaMethod()
> to work as expected.
> 
> I don't know of any theoretical reason that missing one interaction
> parameter estimate should prevent getting a confidence interval for a
> different combination of predictors. Is there a way to use deltaMethod() or
> some other function to do this without changing my data?
> 
> Thank you,
> 
> - Michael Cohn
> Vote Rev (http://voterev.org)
> 
> 
> Demonstration:
> ------
> 
> library(car)
> # create dataset with outcome and two categorical predictors
> outcomes <- c(91,2,60,53,38,78,48,33,97,41,64,84,64,8,66,41,52,18,57,34)
> persontype <-
> c("t2","t2","t2","t2","t2","t2","t2","t1","t2","t2","t2","t2","t1","t1","t2","t2","t1","t2","t2","t2")
> arm_letter <-
> c("unsent","unsent","unsent","unsent","sent","unsent","unsent","sent","unsent","unsent","other","unsent","unsent","sent","unsent","other","unsent","sent","sent","unsent")
> df <- data.frame(a = outcomes, b=persontype, c=arm_letter)
> 
> # note: there are no records with the combination 't1' + 'other'
> table(df$b,df$c)
> 
> 
> #regression works as expected
> minimal_formula <- formula("a ~ b*c")
> minimal_model <- lm(minimal_formula, data=df)
> summary(minimal_model)
> 
> # use deltaMethod() to get a prediction for individuals with the combination
> # 't2' and 'sent'
> # deltaMethod() fails with "error in t(gd) %*% vcov. : non-conformable
> # arguments."
> deltaMethod(minimal_model, "bt2 + csent + `bt2:csent`", rhs=0)
> 
> # duplicate the dataset and change one record to be in the previously empty
> # cell
> df_fixed <- df
> df_fixed[c(13),"c"] <- 'other'
> table(df_fixed$b,df_fixed$c)
> 
> #deltaMethod() now works
> minimal_model_fixed <- lm(minimal_formula, data=df_fixed)
> deltaMethod(minimal_model_fixed, "bt2 + csent + `bt2:csent`", rhs=0)