[R] Linear regression with a rounded response variable

Thu Oct 22 14:48:48 CEST 2015

> Yes, and I think that the suggestion in another post to look at censored regression is more in the right direction.

I think this is right, and perhaps a better pathway to pursue than considering this within the framework of measurement error (ME). Of course there *is* ME in the observed walking time, since the observed value is only one draw from the distribution of potential times that could have been observed for each individual.

But the typical econometric correction for ME requires that we have an observed value and an estimate of its variance. Theoretically, I would imagine this variance to be heteroscedastic and to vary by individual. In Ravi's regression with the observed value on the LHS, ME in the outcome does not bias the regression coefficients as long as it is uncorrelated with the regressors; it is simply absorbed into the error term, so the standard errors of the coefficients would be inflated. If such a conditional variance were available, you could treat the reciprocal of the variance as a weight in WLS, so that values with less ME have greater weight in the estimation, and there would also exist a closed-form way to correct the standard errors.
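To make the WLS idea concrete, here is a minimal sketch on simulated data. Everything in it is an assumption for illustration: the variable names (x, y, me_sd), the true coefficients, and in particular the premise that each individual's ME standard deviation is known and that ME is the only source of noise in y. None of this comes from Ravi's data.

```r
# Hypothetical simulated data; all names and values are made up for illustration.
set.seed(1)
n     <- 200
x     <- runif(n, 0, 10)
me_sd <- runif(n, 0.5, 3)                    # assumed known ME standard deviation per individual
y     <- 2 + 0.5 * x + rnorm(n, sd = me_sd)  # outcome observed with heteroscedastic ME

fit_ols <- lm(y ~ x)                         # ignores the unequal variances
fit_wls <- lm(y ~ x, weights = 1 / me_sd^2)  # weight = reciprocal of the ME variance

# Compare estimates and standard errors from the two fits
summary(fit_ols)$coefficients
summary(fit_wls)$coefficients
```

Both fits should recover similar coefficients; the point of the weights is that observations with less ME count for more. In practice the per-individual variance is rarely available, which is part of why this route looks less promising here.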

This, however, is not the problem as I understand it from Ravi. Instead, he observes x only up to a known interval, x_l < x < x_u, where x_l and x_u denote the lower and upper limits for the observed values.

At first this threw me for a loop, because censoring in my work is typically at the extremes, with left- or right-censored data. But there is also a package in R for interval censoring (called interval), though I have not used it before. Some googling on this topic led me to some good worked examples that I think fit the framework Ravi is working in.
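As a rough illustration of what an interval-censored regression looks like, here is a minimal sketch. Note that it uses survreg() from the survival package with an interval-type Surv() response, not the interval package mentioned above, which I have not used. The data are simulated, and the rounding unit (nearest 5), the variable names, and the true coefficients are all assumptions, not anything taken from Ravi's problem.

```r
library(survival)

# Hypothetical simulated data; all names and values are made up for illustration.
set.seed(2)
n      <- 500
x      <- runif(n, 0, 10)
y_true <- 20 + 0.5 * x + rnorm(n, sd = 2)  # latent outcome, never observed exactly

# Assume the outcome is only recorded rounded to the nearest 5 units, so the
# true value is known to lie between y_round - 2.5 and y_round + 2.5.
y_round <- 5 * round(y_true / 5)
lower   <- y_round - 2.5
upper   <- y_round + 2.5

# Naive regression that treats the rounded value as if it were exact
fit_naive <- lm(y_round ~ x)

# Gaussian regression with an interval-censored response
fit_int <- survreg(Surv(lower, upper, type = "interval2") ~ x, dist = "gaussian")

coef(fit_naive)
coef(fit_int)
fit_int$scale   # estimated residual standard deviation of the latent outcome
```

With dist = "gaussian" the model is an ordinary linear model for the latent outcome, and the likelihood uses only the fact that each true value falls between lower and upper. Whether this matches Ravi's rounding scheme is something he would need to check.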

So perhaps Ravi's question really involves two issues, one of which might be solvable. There is ME in the outcome value, y, but perhaps that is ignorable. The censoring is perhaps not ignorable but, better yet, solvable?