[R] Getting INDIVIDUAL effects of multiple qualitative variables (ordered and unordered factors)
Rafael Costa
rafaelcarneirocosta.rc at gmail.com
Thu May 7 20:43:52 CEST 2015
Dear R users,
I have data from a questionnaire and I want to estimate the individual
effect of each explanatory variable (all are qualitative) on the dependent
variable (continuous). However, with the default coding each estimated
coefficient is the difference between that level and the reference level,
whose value is absorbed into the intercept. Each qualitative
variable relates to a characteristic of a particular activity and the
continuous variable is the time taken to perform this activity. I emphasize
that the reference level of each factor relates to the case where none of
the options for that factor was marked. The data are at
http://www.datafilehost.com/d/c7f0d342. I did not include them in the script
because I do not yet know how to do this; I hope this is not a
problem, and I apologize. I did not post just a sample of the
data because with a sample there were singular matrix problems.
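One common way to put data directly into a post is dput(), which prints an object as R code that readers can paste back into their session. The sketch below is only an illustration: the data frame toy and its values are made up, and only the column names pontoefetivo and p1 come from the script.

```r
# Made-up stand-in for a few rows of the questionnaire data.
toy <- data.frame(pontoefetivo = c(12.5, 3, 40),
                  p1 = factor(c("a", "none", "b")))

# dput() writes the object as R code; capture it as text for pasting into a post.
txt <- paste(capture.output(dput(toy)), collapse = "\n")
cat(txt, "\n")

# A reader rebuilds the object by evaluating the pasted text.
restored <- eval(parse(text = txt))
print(all.equal(restored, toy))
```

For a large table, the same call on a subset, for example dput(head(tabela1.1, 30)), would keep the post short.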
First (and main) issue - In order to obtain the individual effect of the
levels of each factor, I assumed that the reference group has zero
effect and took the following steps:
# Since the file is not loaded in the script, it is assumed here that it
# was downloaded from the internet and is already loaded in R as tabela1.1.
# I will fit a quantile regression, so the quantreg package is needed.
install.packages("quantreg")
library(quantreg)
# Turning each factor into a matrix of 0/1 indicator columns (one per level):
p_1 <- table(1:length(tabela1.1$p1), as.factor(tabela1.1$p1))
p_21 <- table(1:length(tabela1.1$p21), as.factor(tabela1.1$p21))
p_22 <- table(1:length(tabela1.1$p22), as.factor(tabela1.1$p22))
p_23 <- table(1:length(tabela1.1$p23), as.factor(tabela1.1$p23))
p_24 <- table(1:length(tabela1.1$p24), as.factor(tabela1.1$p24))
p_25 <- table(1:length(tabela1.1$p25), as.factor(tabela1.1$p25))
p_34 <- table(1:length(tabela1.1$p34), as.ordered(tabela1.1$p34))
p_5 <- table(1:length(tabela1.1$p5), as.ordered(tabela1.1$p5))
p_6 <- table(1:length(tabela1.1$p6), as.ordered(tabela1.1$p6))
p_7 <- table(1:length(tabela1.1$p7), as.ordered(tabela1.1$p7))
p_8 <- table(1:length(tabela1.1$p8), as.ordered(tabela1.1$p8))
p_9 <- table(1:length(tabela1.1$p9), as.ordered(tabela1.1$p9))
# Fitting the model without an intercept, i.e. fixing the reference group at 0.
# The reference group means that none of the options was marked (if none was
# marked, I believe that the time taken to perform the activity is
# practically zero).
qrModel <- rq(pontoefetivo ~ 0 + p_1[, -1] + p_21[, -1] + p_22[, -1] +
              p_23[, -1] + p_24[, -1] + p_25[, -1] + p_34[, -1] + p_5[, -1] +
              p_6[, -1] + p_7[, -1] + p_8[, -1] + p_9[, -1],
              tau = 0.5, data = tabela1.1)
summary(qrModel)
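The indicator matrices built with table() in the script can also be obtained with model.matrix(). The following is a minimal sketch on made-up data (toy, ind_table and ind_mm are hypothetical names, and "none" stands for the reference level), meant only to show what the columns look like once the reference level is dropped.

```r
# Toy factor; "none" is listed first so that it is the reference level.
toy <- data.frame(p1 = c("none", "a", "b", "a", "none", "b"))
toy$p1 <- factor(toy$p1, levels = c("none", "a", "b"))

# Construction used in the script: one row per case, one 0/1 column per level.
ind_table <- table(seq_along(toy$p1), toy$p1)

# Same indicators from model.matrix(); "~ 0 + p1" keeps a column for every level.
ind_mm <- model.matrix(~ 0 + p1, data = toy)

# Dropping the first column removes the reference level in both versions.
same <- all(unclass(ind_table)[, -1] == ind_mm[, -1])
print(same)
print(ind_mm[, -1])
```

Rows that belong to the reference level are all zeros, so in a model without an intercept their fitted value is fixed at 0.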
My idea was that, since the effect of the reference group is zero, the
estimated coefficient of each level is precisely the individual effect of
that level. Is my idea right? If not, what should I do to get
these individual effects?
Problem 2 - Assuming all of the above is right, the ordered factors do not
have increasing effects [see summary(qrModel)]. Should they not? If
so, what should I do to ensure such an effect?
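Separate 0/1 columns per level carry no information about the order of the levels, so nothing in that coding forces increasing effects. One possible alternative, sketched here on made-up values, is a cumulative ("staircase") coding in which column k is 1 whenever the observed level is at least k. The names p5, stair and increments, and the numbers used, are assumptions for illustration only.

```r
# Hypothetical ordered factor with levels 0 < 1 < 2 < 3 (0 = nothing marked).
p5 <- factor(c(0, 1, 2, 3, 2, 1), levels = 0:3, ordered = TRUE)

# Column k is 1 when the observed level is at least k (k = 1, 2, 3).
lev <- as.integer(p5) - 1L
stair <- outer(lev, 1:3, FUN = ">=") * 1
colnames(stair) <- paste0("p5_ge", 1:3)

# With this coding each coefficient is the increment from level k-1 to level k,
# so the effect of level k is the sum of the first k coefficients.
increments <- c(2, 0.5, 1)   # made-up nonnegative increments
effects <- drop(stair %*% increments)
print(effects)
```

Under this coding the effects increase with the level exactly when every increment is nonnegative, so the ordering question becomes a sign restriction on the coefficients.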
Problem 3 - Again assuming that everything is correct, I expect that none
of the estimated coefficients (individual effects on the time taken by the
activity) is negative. Am I right about that? If so, what should I do to
ensure that no value is negative?
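A sign restriction of this kind can be written as a set of linear inequality constraints on the coefficients. The sketch below assumes that the installed quantreg version provides rq.fit.fnc(), which its help page describes as fitting a quantile regression subject to constraints of the form R %*% beta >= r. The data are simulated and all object names are hypothetical.

```r
library(quantreg)

set.seed(1)
n <- 200
f <- factor(sample(c("none", "a", "b"), n, replace = TRUE),
            levels = c("none", "a", "b"))
X <- model.matrix(~ 0 + f)[, -1]      # indicators without the reference level
y <- drop(X %*% c(1, 3)) + rexp(n)    # simulated times, not the real data

# Constraints have the form R %*% beta >= r; the identity matrix with r = 0
# asks for every coefficient to be nonnegative.
R <- diag(ncol(X))
r <- rep(0, ncol(X))

# Median regression (tau = 0.5) without intercept, subject to the constraints.
fit <- rq.fit.fnc(X, y, R = R, r = r, tau = 0.5)
print(fit$coefficients)
```

Whether such a restriction is appropriate for the questionnaire data is a modelling decision; the sketch only shows how the constraint could be expressed.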
I look forward to any help.
Thanks in advance,
Rafael Costa.