[R] R-help; generating censored data

(Ted Harding) Ted.Harding at wlandres.net
Thu Apr 12 00:23:43 CEST 2012


On 11-Apr-2012 16:28:31 Christopher Kelvin wrote:
> Hello,
> can i implement this as 10% censored data where t gives me
> failure and x censored.
> Thank you
> 
> p=2;b=120
> n=50
> 
> set.seed(132);
> r<-sample(1:50,45)
> t<-rweibull(r,shape=p,scale=b)
> t
> set.seed(123);
> cens <- sample(1:50, 5)
> x<-runif(cens,shape=p,scale=b)
> x
> 
> Chris Guure
> Researcher,
> Institute for Mathematical Research
> UPM

This query is obscure!

First, its approach does not seem to conform to the standard
notion of "censored data". This refers to a situation where,
for each item observed, either (a) its value falls within a
certain range (which may itself depend on the item), in which
case its value is recorded as a value in the data; or (b) its
value falls outside that range, in which case that fact is
recorded but the value is not recorded (thus being "censored").

Example: Patients who have been admitted to hospital for a
particular disease are subsequently monitored for a period
of time (days/months/years) which may vary from patient to
patient. The reason for the time limitation may be that the
design of the investigation set a limit, or may be haphazard
as a result of the patient moving away and no longer being
accessible. The value recorded (if available) is the time
from admission to death. If not available, then all that can
be recorded is that the event occurred later than the upper
time limit for that patient.

As far as I can see, no element of your code above corresponds
to this notion of "censored".
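
For comparison, here is a minimal sketch of what censoring in the
above sense might look like in R. It assumes a single fixed cutoff
common to all items, placed at the 90% quantile of the Weibull
distribution so that about 10% of values are censored on average.
The cutoff rule and the names "lifetime", "cutoff", "observed" and
"status" are my own illustration, not taken from the query.

```r
# Illustrative only: right-censoring at a fixed cutoff (an assumption).
p <- 2; b <- 120; n <- 50

set.seed(132)
lifetime <- rweibull(n, shape = p, scale = b)      # true times to failure

# Cutoff chosen so that P(lifetime > cutoff) = 0.10 under this Weibull
cutoff <- qweibull(0.90, shape = p, scale = b)

observed <- pmin(lifetime, cutoff)                 # what actually gets recorded
status   <- as.integer(lifetime <= cutoff)         # 1 = failure seen, 0 = censored

table(status)                                      # counts of censored vs. failed
```

The number censored in any one sample is random here; only its
expected proportion is 10%.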

Next, your "r<-sample(1:50,45)" selects 45 different values
from (1:50), and then your "t<-rweibull(r,shape=p,scale=b)"
generates 45 values sampled from the Weibull distribution,
** regardless of the 45 values from (1:50) in r ** -- See
under '?rweibull' where it says:

  "n: number of observations. If 'length(n) > 1', the length
   is taken to be the number required."

So it would seem that your "r<-sample(1:50,45)" is superfluous,
and you could simply have written "t<-rweibull(45,shape=p,scale=b)".
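
A quick check of this, as a sketch: with the same seed set
immediately before each call, the two forms should give the same
45 values. The names "t1" and "t2" and the seed 1 are arbitrary
choices of mine.

```r
p <- 2; b <- 120

set.seed(132)
r <- sample(1:50, 45)          # 45 distinct integers; only length(r) matters below

set.seed(1)
t1 <- rweibull(r, shape = p, scale = b)    # first argument is a vector of length 45

set.seed(1)
t2 <- rweibull(45, shape = p, scale = b)   # first argument is the count itself

length(t1)                     # number of values generated from the vector form
identical(t1, t2)              # compares the two samples element by element
```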

Similar comments apply to your

  cens <- sample(1:50, 5)
  x<-runif(cens,shape=p,scale=b)

where you could have equivalently written "x<-runif(5,shape=p,scale=b)".
Also, the parameters "shape" and "scale" would not be recognised
by runif(), whose parameters are as in "runif(n, min=..., max=...)".
Maybe you meant to write "x<-rweibull(cens,shape=p,scale=b)",
but then you would simply be sampling a further 5 values from the
same Weibull distribution, along with your original 45.
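
The point about runif() can be checked with a short sketch like
the following. The error is caught with tryCatch() so that the
script carries on; the range 0 to 120 in the second call is purely
an illustrative assumption, not something stated in the query.

```r
p <- 2; b <- 120

# runif() has no 'shape' or 'scale' arguments, so this call fails;
# tryCatch() returns the error message instead of stopping the script.
res <- tryCatch(runif(5, shape = p, scale = b),
                error = function(e) conditionMessage(e))
res

# A valid call uses 'min' and 'max' (the values here are hypothetical).
x <- runif(5, min = 0, max = b)
x
```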

So how does censoring come into this?

If you would explain, in plain words, what you are seeking to do,
it would help to remove this obscurity and confusion!

Hoping this helps,
Ted.

-------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at wlandres.net>
Date: 11-Apr-2012  Time: 23:23:36
This message was sent by XFMail


