[R] Dequantizing

Richard.Cotton at hsl.gov.uk Richard.Cotton at hsl.gov.uk
Thu Nov 20 18:23:45 CET 2008

> I have some data measured with a coarsely-quantized clock.  Let's say
> the real data are
>       q<- sort(rexp(100,.5))
> The quantized form is floor(q), so a simple quantile plot of one
> against the other can be calculated using:
>       plot(q,type="l"); points(floor(q),col="red")
> which of course shows the characteristic stair-step.  I would like to
> smooth the quantized form back into an approximation of the underlying
> data.
> The simplest approach I can think of adds a uniform random variable of
> the size of the quantization:
>       plot(q,type="l"); points(floor(q),col="red");
> points(floor(q)+runif(100,0,1),col="blue")
> This gives pretty good results for uniform distributions, but less
> good for others (like exponential).  Is there a better
> interpolation/smoothing function 

I'm not convinced that adding a random amount to the floor values to 'make 
up' the underlying data is very meaningful. 

If you know what the underlying distribution is, then you are best off 
using this distribution to generate plots and extra pretend data.

If you know the distribution is exponential, then you can estimate the 
rate by treating the true values as interval censored data, somewhere 
between floor and floor+1.

q <- sort(rexp(100,.5))
qq <- floor(q)
qq[qq==0] <- 0.00001     #survreg doesn't like values that are exactly 
ss <- Surv(qq, qq+1, type="interval2") 
model <- survreg(ss ~ 1, dist="exponential")
rate <- 1/exp(model$coefficients["(Intercept)"]); rate   #hopefully 
something close to 0.5

If you don't know the underlying distribution either, then things get 
trickier.  Try a histogram/kernel density plot/boxplot to see what it 
looks like.


Mathematical Sciences Unit


This message contains privileged and confidential inform...{{dropped:20}}

More information about the R-help mailing list