[R] Cox model -missing data.

Michael Dewey info at aghmed.fsnet.co.uk
Fri Dec 19 12:37:39 CET 2014


Comment inline

On 19/12/2014 11:17, aoife doherty wrote:
> Many thanks, I appreciate the response.
>
> When I convert the missing values to NA and run the Cox model as described
> in the previous post, the model seems to remove every row with a missing
> value: the number of rows "n" in the Cox output is the same whether I
> change the missing values to NA or delete the incomplete rows myself.
>
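
That is the expected default, as far as I know: the fitting functions in
survival use the na.action set in options(), which is na.omit unless you
have changed it, so incomplete rows are dropped before the model is fitted.
A minimal sketch with invented data (the data frame "toy" and its values are
made up, and I use plain coxph() rather than coxme() to keep it short):

```r
library(survival)

# Invented toy data: 8 subjects, V3 is unknown for two of them
toy <- data.frame(
  Survival = c(4, 5, 6, 7, 9, 11, 12, 15),
  Event    = c(1, 1, 1, 0, 1, 1, 0, 1),
  V2       = c(13, 20, 40, 35, 28, 51, 44, 30),
  V3       = c("WTHomo", NA, "Variant", "WTHomo",
               "Variant", NA, "WTHomo", "Variant")
)

# Default na.action (na.omit): rows with an NA in a model variable are dropped
fit_default <- coxph(Surv(Survival, Event) ~ V2 + factor(V3), data = toy)

# Deleting the incomplete rows by hand gives the same fit
fit_manual <- coxph(Surv(Survival, Event) ~ V2 + factor(V3),
                    data = na.omit(toy))

n_used    <- fit_default$n                  # rows actually used in the fit
n_dropped <- length(fit_default$na.action)  # rows removed because of NA
```

Here n_used is 6 and n_dropped is 2, which is why the two outputs you
compared report the same "n".
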
> What I had been hoping to do is not completely remove a row with missing
> data for a co-variable, but rather somehow censor or estimate a value for
> the missing value?

I think you are searching for some form of imputation here. A full 
answer would be way beyond the scope of this list as it depends on so 
many things including the mechanism driving the missingness.

Have a look at
http://missingdata.lshtm.ac.uk/
and see whether that helps.
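
To make "some form of imputation" a little more concrete, here is a minimal
sketch of multiple imputation with the mice package from CRAN, followed by a
Cox fit on each completed data set and pooling with Rubin's rules. Everything
in it is an assumption for illustration: the data are simulated, V3 is made
missing completely at random, and the model is a plain coxph() rather than
your coxme() model. Whether this is defensible for your data depends on why
V3 is missing.

```r
library(survival)
library(mice)   # assumption: mice is installed from CRAN

set.seed(1)
n <- 200

# Simulated covariates (names borrowed from the example file)
dat <- data.frame(
  V2 = rnorm(n, mean = 30, sd = 10),
  V3 = factor(sample(c("WTHomo", "Variant"), n, replace = TRUE))
)

# Simulated exponential survival times that depend on V3, with random censoring
rate        <- 0.1 * exp(0.5 * (dat$V3 == "Variant"))
event_time  <- rexp(n, rate)
censor_time <- rexp(n, 0.05)
dat$Survival <- pmin(event_time, censor_time)
dat$Event    <- as.numeric(event_time <= censor_time)

# Make V3 unknown for 65 subjects, completely at random
dat$V3[sample(n, 65)] <- NA

# Five imputed data sets; by default every other column, including
# Survival and Event, is used as a predictor of the missing V3 values
imp <- mice(dat, m = 5, seed = 1, printFlag = FALSE)

# Fit the Cox model in each completed data set, then pool the estimates
fits   <- with(imp, coxph(Surv(Survival, Event) ~ V2 + V3))
pooled <- pool(fits)
summary(pooled)
```

All 200 subjects contribute to the pooled estimates, rather than only the
complete cases. The random family effect in coxme() is a further complication
that this sketch ignores.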

>
> In reality, I have ~600 people with survival data and say 6 variables
> attached to them. After I incorporate a 7th variable (for which the
> information isn't available for every individual), I have 400 people left.
> Since I still have survival data and almost all of the information for the
> other 200 people (the only thing missing is information about that 7th
> variable), it seems a waste to remove all of the survival data for 200
> people over one co-variate. So I was hoping, instead of completely removing
> the rows, to somehow acknowledge in the model that the data for this
> particular co-variate is missing. Is it possible to incorporate that into
> the model I described above?
>
> Thanks
>
>
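
Before choosing a method it may help to tabulate how many rows each
covariate costs you. A base R sketch on invented data (the data frame "dat"
and the column name V7 are hypothetical stand-ins for your 600 subjects and
the 7th variable):

```r
set.seed(42)
n <- 600

# Invented data: complete survival information and one complete covariate
dat <- data.frame(
  Survival = rexp(n, 0.1),
  Event    = rbinom(n, 1, 0.7),
  V2       = rnorm(n),
  V7       = rnorm(n)
)
dat$V7[sample(n, 200)] <- NA   # the 7th covariate is unknown for 200 subjects

# Number of missing values in each column
missing_per_column <- colSums(is.na(dat))

# Rows a complete-case fit would keep, with and without V7 in the model
n_complete   <- sum(complete.cases(dat))
n_without_v7 <- sum(complete.cases(dat[, setdiff(names(dat), "V7")]))
```

With these made-up numbers n_complete is 400 and n_without_v7 is 600, the
same loss you describe.
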
>
> On Fri, Dec 19, 2014 at 10:21 AM, Ted Harding <Ted.Harding at wlandres.net>
> wrote:
>>
>> Hi Aoife,
>> I think that if you simply replace each "*" in the data file
>> with "NA", then it should work ("NA" is usually interpreted
>> as "missing" for those functions for which missingness is
>> relevant). How you subsequently deal with records which have
>> missing values is another question (or many questions ... ).
>>
>> So your data should look like:
>>
>> V1       V2          V3               Survival       Event
>> ann      13          WTHomo           4                1
>> ben      20          NA               5                1
>> tom      40          Variant          6                1
>>
>> Hoping this helps,
>> Ted.
>>
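
Rather than editing the file, the same effect can be had at read time with
the na.strings argument of read.table(). A small sketch using an inline copy
of the three example rows (the text= argument stands in for the file name
"Test.cox"):

```r
# The three example rows, with "*" marking the unknown V3 value
txt <- "V1 V2 V3 Survival Event
ann 13 WTHomo 4 1
ben 20 * 5 1
tom 40 Variant 6 1"

# na.strings lists the strings that read.table should turn into NA
death.dat <- read.table(text = txt, header = TRUE,
                        na.strings = c("NA", "*"),
                        stringsAsFactors = FALSE)

v3_missing <- is.na(death.dat$V3)   # TRUE only for ben
```
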
>> On 19-Dec-2014 10:12:00 aoife doherty wrote:
>>> Hi all,
>>>
>>> I have a data set like this:
>>>
>>> Test.cox file:
>>>
>>> V1        V2         V3               Survival       Event
>>> ann      13          WTHomo           4                1
>>> ben      20          *                5                1
>>> tom      40          Variant          6                1
>>>
>>>
>>> where "*" indicates that I don't know what the value is for V3 for Ben.
>>>
>>> I've set up a Cox model to run like this:
>>>
>>> #!/usr/bin/Rscript
>>> library(bdsmatrix)
>>> library(kinship2)
>>> library(survival)
>>> library(coxme)
>>> death.dat <- read.table("Test.cox",header=T)
>>> deathdat.kmat <-2*with(death.dat,makekinship(famid,ID,faid,moid))
>>> sink("Test.cox.R.Output")
>>> Model <- coxme(Surv(Survival,Event) ~ strata(factor(V1)) +
>>> strata(factor(V2)) + factor(V3) +
>>> (1|ID),data=death.dat,varlist=deathdat.kmat)
>>> Model
>>> sink()
>>>
>>>
>>>
>>> As you can see from the Test.cox file, I have a missing value "*". How and
>>> where do I tell the R script "treat * as a missing value"? If I can't
>>> incorporate missing values into the model, I assume the alternative is to
>>> remove all of the rows with missing data, which will greatly reduce my data
>>> set, as most rows have at least one missing value.
>>>
>>> Thanks
>>>
>>>        [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> -------------------------------------------------
>> E-Mail: (Ted Harding) <Ted.Harding at wlandres.net>
>> Date: 19-Dec-2014  Time: 10:21:23
>> This message was sent by XFMail
>> -------------------------------------------------
>>

-- 
Michael
http://www.dewey.myzen.co.uk


