[R] R glm function ignores some predictor variables

Gabrielle Perron gabrielle.perron at mail.mcgill.ca
Mon Mar 27 17:23:07 CEST 2017


Hi,


This is my first time using this mailing list. I have looked at the posting guide, but please do let me know if I should be doing something differently.


Here is my question. I apologize in advance for not being able to provide example data: I am using very large tables, and what I am trying to do works fine with simpler examples, so a small example would not reproduce the problem. This approach has always worked for me until now, so I am just trying to get your ideas on what might be the issue. If there is any way I could provide more information, do let me know.


I have a vector corresponding to a response variable and a table of predictor variables. The response vector is numeric, and the predictor variables (the columns of the table) are binary (0s and 1s).


I am running the glm function with its default gaussian family (that is, a multiple linear regression) using the response vector and the table of predictors:


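    # Fit with the default gaussian family, i.e. an ordinary linear model
    fit <- glm(response ~ as.matrix(predictors), na.action = na.exclude)

    # Column 4 of the summary table holds the p-values; [-1] drops the intercept
    coeff <- as.vector(coef(summary(fit))[, 4])[-1]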


When I did this in the past, I would extract one value per predictor from the summary table (column 4, as in the code above) and use that vector for further analysis.


The problem is that the regression now returns a vector that is missing some values. Essentially, some predictor variables are not assigned a coefficient at all by glm, yet there are no error messages.
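
For illustration only, here is a minimal sketch of this symptom on made-up data (not the actual tables). The names x1, x2, x3 and the sizes are hypothetical. It assumes one situation known to produce a missing coefficient without an error, an exact duplicate column, which may or may not be what is happening with the real predictors:

```r
# Made-up data: three binary predictors, where x3 is an exact copy of x1
set.seed(1)
n <- 50
predictors <- data.frame(
  x1 = rep(c(0, 1), length.out = n),
  x2 = rep(c(0, 0, 1, 1), length.out = n)
)
predictors$x3 <- predictors$x1
response <- rnorm(n)

fit <- glm(response ~ as.matrix(predictors), na.action = na.exclude)

# coef(fit) keeps one entry per term; a term that cannot be estimated gets NA
full_coef <- coef(fit)
print(full_coef)

# coef(summary(fit)) omits rows for coefficients that could not be estimated,
# so a vector built from it is shorter than the number of predictors
summary_table <- coef(summary(fit))
print(dim(summary_table))
```

In this sketch, a vector taken from coef(summary(fit)) no longer lines up one-to-one with the columns of the predictor table, whereas coef(fit) keeps a placeholder for every term.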


The summary of the model looks normal, except that some predictor variables are missing, as I mentioned. The other predictors have their statistics reported (coefficient, p-value, etc.).

About 30 of the 200 predictors are missing from the model.


I have tried using different response variables (vectors), but I am getting the same issue, although the missing predictors vary depending on the response vector...


Any ideas on what might be going on? I think this can happen if some variables have zero variance, but I have checked, and that is not the case here. There are also no NA or otherwise missing values in the tables.
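
The checks mentioned above can be sketched as follows on a small made-up table. The object name predictors matches the call earlier in this message, but the data and the column names p1, p2, p3 are hypothetical:

```r
# Made-up binary predictor table standing in for the real one
predictors <- data.frame(
  p1 = c(0, 1, 0, 1, 1, 0),
  p2 = c(1, 1, 0, 0, 1, 0),
  p3 = c(0, 0, 0, 0, 0, 0)   # constant column, included to show a positive hit
)
X <- as.matrix(predictors)

# Columns with zero variance (all 0s or all 1s)
zero_var <- colnames(X)[apply(X, 2, var) == 0]

# TRUE if there is any NA anywhere in the predictor table
has_na <- anyNA(X)

print(zero_var)
print(has_na)
```

On the real tables, an empty zero_var and a FALSE has_na would correspond to the two conditions described above.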


What could cause glm to ignore/remove some predictor variables?
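
One way to see exactly which predictors were left without an estimate is to look at the NA entries of coef(fit) and to compare the rank of the fit with the number of terms. A minimal sketch on made-up data follows. The column names are hypothetical, and the redundancy built into p3 (it equals 1 - p1, so it is determined by the intercept and p1 even though its variance is not zero) is only an assumed example of something that lowers the rank, not a diagnosis of the real tables:

```r
# Made-up data: p3 is binary and non-constant, but redundant given the intercept
set.seed(42)
n <- 40
predictors <- data.frame(
  p1 = rep(c(0, 1), length.out = n),
  p2 = rep(c(0, 0, 1, 1), length.out = n)
)
predictors$p3 <- 1 - predictors$p1
response <- rnorm(n)

fit <- glm(response ~ as.matrix(predictors), na.action = na.exclude)

# Names of the terms that glm could not estimate
dropped <- names(coef(fit))[is.na(coef(fit))]
print(dropped)

# Rank of the fitted model versus the number of terms in the formula
model_rank <- fit$rank
n_terms <- length(coef(fit))
print(c(rank = model_rank, terms = n_terms))

# The same rank computed directly from the design matrix (intercept + predictors)
design_rank <- qr(cbind(1, as.matrix(predictors)))$rank
print(design_rank)
```

If the rank is smaller than the number of terms, the difference equals the number of terms reported as NA.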


Any suggestion is welcome!


Thank you,


Gabrielle






