[R] C-statistic comparison with partially paired datasets

Thu Aug 13 14:52:27 CEST 2009

Frank, thank you very much for your answer.
Yes, I also estimated splines as you suggested, but I wanted to add "more
custom" analyses as well. Your answer does help me, however; thanks.
Hanneke
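
For readers of the archive, the spline approach referred to above (and
suggested by Frank in the quoted message below) might look roughly like the
following. This is only a sketch under assumptions: the data are simulated,
the names bmi and muac are hypothetical, the 4-df natural splines and the use
of coxph() are arbitrary choices, and none of it is code from the thread.
Because both models are fitted to the same subjects, rcorrp.cens() can be
used for the paired comparison.

```r
library(survival)  # Surv(), coxph()
library(splines)   # ns()
library(Hmisc)     # rcorr.cens(), rcorrp.cens()

set.seed(1)
n    <- 500
bmi  <- rnorm(n, mean = 25, sd = 4)               # hypothetical predictor 1
muac <- 30 + 0.8 * (bmi - 25) + rnorm(n, sd = 2)  # hypothetical predictor 2

# Simulated U-shaped hazard in bmi, with random censoring
lp     <- 0.03 * (bmi - 25)^2
time   <- rexp(n, rate = 0.05 * exp(lp))
cens   <- runif(n, 0, 20)
status <- as.integer(time <= cens)
obs    <- pmin(time, cens)
S      <- Surv(obs, status)

# One spline model per measure, each fitted on the full sample
fit_bmi  <- coxph(S ~ ns(bmi, df = 4))
fit_muac <- coxph(S ~ ns(muac, df = 4))

# rcorr.cens() treats larger x as predicting longer survival,
# so the Cox linear predictor (log relative hazard) is negated
p_bmi  <- -predict(fit_bmi)
p_muac <- -predict(fit_muac)

c_bmi  <- unname(rcorr.cens(p_bmi,  S)["C Index"])
c_muac <- unname(rcorr.cens(p_muac, S)["C Index"])

# Paired comparison of the two sets of predictions on the same subjects
cmp <- rcorrp.cens(p_bmi, p_muac, S)

print(c(C_bmi = c_bmi, C_muac = c_muac))
print(cmp)
```

Note that these C indexes are computed on the data used for fitting, so they
are likely to be somewhat optimistic.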

> Hanneke Wijnhoven wrote:
>> Frank,
>>
>> Thank you for your quick response!
>>
>> I want to compare the discriminative capacity of different
>> anthropometric measures in predicting mortality, focusing on the "thin"
>> side of these measures.
>> Since these associations are not linear (U-shaped for BMI and inversely
>> J-shaped for mid-upper arm circumference) and I do not want to include
>> the prediction by "obesity", I am using all values below the median of
>> each separate measure to calculate a C-statistic (below the median, the
>> association is approximately linear).
>> As a result, some different and some overlapping cases are included.
>> I understand your point though.
>>
>> Any suggestion is welcome.
>>
>> Hanneke
>
> Subsetting the data will make the two task difficulties unequal, I fear.
> This would make it difficult to compare predictive discrimination
> indexes.
>
> I think it would be better to fit splines to the continuous predictors,
> to allow for a unified analysis over the whole range.  Then everything
> is paired.
>
> Frank
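
The concern about subsetting can be illustrated with a small simulation.
This is a sketch only: the data are simulated, the variable names are
hypothetical, and the effect size is an arbitrary assumption. The same
predictor with the same true effect is evaluated once on the full sample and
once on the below-median subset; with a restricted range the subjects are
harder to tell apart, so one would typically expect a smaller C index in the
subset even though nothing about the predictor has changed.

```r
library(survival)  # Surv()
library(Hmisc)     # rcorr.cens()

set.seed(2)
n <- 2000
x <- rnorm(n)                        # hypothetical continuous predictor

# Monotone effect: larger x means lower hazard and longer survival
time   <- rexp(n, rate = exp(-x))
cens   <- runif(n, 0, 5)             # random censoring times
status <- as.integer(time <= cens)
obs    <- pmin(time, cens)

# C index on the full range of x
c_full <- unname(rcorr.cens(x, Surv(obs, status))["C Index"])

# C index using only subjects below the median of x
low   <- x < median(x)
c_low <- unname(rcorr.cens(x[low], Surv(obs[low], status[low]))["C Index"])

print(c(full = c_full, below_median = c_low))
```

Two measures subsetted at their own medians select different subjects with
different spreads, so a difference between their C indexes may reflect the
subsets rather than the measures.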
>
>>
>> Frank E Harrell Jr wrote:
>>> Hanneke Wijnhoven wrote:
>>>> Does anyone know of an R function or method to compare two
>>>> C-statistics (Harrell's C, from rcorr.cens) obtained from two different
>>>> models in partially paired datasets (i.e. some shared and some
>>>> different cases), with one continuous independent variable in each
>>>> separate model, in a survival analysis context?
>>>> I have noticed that the rcorrp.cens function can be used for paired
>>>> data.
>>>>   Thanks for any help,
>>>>
>>>> Hanneke Wijnhoven
>>>>
>>>
>>> Hanneke,
>>>
>>> I'm having trouble seeing how the unpaired observations can contribute
>>> information in general.  If for example all of the observations were
>>> unpaired, one C-statistic might be larger because it came from a
>>> dataset with more extreme observations that were easier to
>>> discriminate.
>>>
>>> Frank
>>>
>>
>>
>
>
> --
> Frank E Harrell Jr   Professor and Chair           School of Medicine
>                       Department of Biostatistics   Vanderbilt University
>