[R] Computing a reliability index of a statistic with missing data

Chaouch, Aziz achaouch at NRCan.gc.ca
Fri May 26 14:36:54 CEST 2006


Thanks Spencer, that is interesting, but I must say I'm a bit lost with
the terminology. I'll try to catch up, but I'm not sure I need a
complicated model (MC sounds complicated to me, but it may not be). I
plan to use this reliability index only as an indication, and I need to
compute it in batch for several different charts, so I am trying to keep
the statistic as simple as possible, yet efficient.
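
To make "as simple as possible" concrete, here is a minimal sketch of one
way such an index could be computed in R. It is only an assumption, not
something settled in this thread: reliability_index() is a hypothetical
helper, and instead of the 1/(n*var(T)) form it uses a bounded variant,
(1 - fraction missing) * 1/(1 + CV^2), where CV is the coefficient of
variation of the time to next record. Dividing var(T) by mean(T)^2 is one
possible way to make it unit-free.

```r
# Hypothetical helper: reliability index in [0, 1] for a series with NAs.
# x     : vector of observations (NA = missing)
# times : observation times in hours (default: a regular hourly grid)
reliability_index <- function(x, times = seq_along(x)) {
  ok <- !is.na(x)
  if (!any(ok)) return(0)              # no data at all: RI = 0
  p_missing <- mean(!ok)               # fraction of missing values
  gaps <- diff(times[ok])              # time to next record, in hours
  if (length(gaps) < 2) {
    regularity <- 1                    # too few gaps to measure their spread
  } else {
    cv2 <- var(gaps) / mean(gaps)^2    # squared coefficient of variation
    regularity <- 1 / (1 + cv2)        # 1 when evenly spaced, smaller otherwise
  }
  (1 - p_missing) * regularity
}

# Same amount of missing data, different spacing of the available data
ri_regular <- reliability_index(c(1, NA, 3, NA, 5, NA, 7, NA))
ri_clumped <- reliability_index(c(1, 2, 3, NA, NA, NA, NA, 8))

# Batch use over several series (the list and its names are illustrative)
charts <- list(a = c(1, NA, 3, NA, 5, NA, 7, NA),
               b = c(1, 2, 3, NA, NA, NA, NA, 8))
ri_all <- sapply(charts, reliability_index)
```

With this form, a complete series gives RI = 1 and an empty one gives
RI = 0. One limitation of the sketch: a block of missing values at the very
start or end of the series only lowers the index through the missing
fraction, because no gap is measured there.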

Aziz

-----Original Message-----
From: Spencer Graves [mailto:spencer.graves at pdf.com] 
Sent: May 25, 2006 8:12 PM
To: Chaouch, Aziz
Cc: R-help at stat.math.ethz.ch
Subject: Re: [R] Computing a reliability index of a statistic with
missing data

	  Have you considered some kind of binary time series model? 
'RSiteSearch("binary time series")' produced 150 hits.  One of the first
20 mentioned "continuous-time hidden Markov chains" 
(http://finzi.psych.upenn.edu/R/library/repeated/html/chidden.html).  I
don't know if this will help you or not, but it might be worth
examining.
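
	  As a rough illustration of how small such a model can be, the
sketch below treats the missing/available indicator as a two-state,
first-order Markov chain in discrete time and estimates its transition
probabilities with base R. This is an assumption made for illustration
only: it is much simpler than the continuous-time hidden Markov models
linked above, and missing_transitions() is a hypothetical name.

```r
# Assumption: missingness is modelled as a two-state, first-order Markov
# chain observed on a regular time grid.
missing_transitions <- function(x) {
  state <- factor(ifelse(is.na(x), "missing", "available"),
                  levels = c("available", "missing"))
  n <- length(state)
  # Count transitions from each observation to the next one
  counts <- table(from = state[-n], to = state[-1])
  # Convert counts to row-wise proportions (NaN if a state never occurs)
  prop.table(counts, margin = 1)
}

P <- missing_transitions(c(1, NA, 3, NA, 5, NA, 7, NA))
```

A high probability of staying in the "missing" state would indicate long
runs of missing data, which is the irregular-sampling case.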

	  hope this helps.
	  Spencer Graves

Chaouch, Aziz wrote:
> Hi All,
> 
> I'd like to compute a kind of reliability index (RI) that would, in a
> sense, measure the reliability of a statistic (histogram, etc.)
> computed on a time series with missing values. The final goal is that:
> 
> RI=1 for a perfect reliability
> RI=0 for a total unreliability (no data at all as an extreme case...)
> 
> The percentage of missing data is one indication: the more missing
> data, the less confidence we can have in the statistic. But the
> distribution of missing data throughout the data series is important as
> well: independently of the amount of missing data, if the available data
> are regularly spaced in time, the RI should be higher than if they are
> irregularly spaced. As a measure of sampling regularity, I thought about
> computing the time to next record and then taking its variance over the
> time interval on which the statistic is computed. The variance of the
> time to next record would be a measure of sampling regularity, so that
> the final RI could be of the form:
> 
> RI = 1 when n = 0
> RI ~ 1/(n*var(T)) otherwise
> 
> with
> n=% of missing data
> T=time to next record (in hours)
> 
> However, I need to "normalize" var(T) in order to use it in the RI. Does
> anyone have an idea on how to do this (or another proposal for computing
> the RI)?
> 
> Thanks,
> 
> Aziz