[R] SAS-like method of recoding variables?

Frank E Harrell Jr f.harrell at vanderbilt.edu
Tue Jun 23 16:02:16 CEST 2009


David Freedman wrote:
> Frank, would you feel comfortable giving us the reference to the NEJM article
> with the 'missing vs <' error ?  I'm sure that things like this happen
> fairly often, and I'd like to use this example in teaching
> 
> thanks, david freedman

@ARTICLE{gus93int,
   author = {{The GUSTO Investigators}},
   year = 1993,
   title = {An international randomized trial comparing four thrombolytic
           strategies for acute myocardial infarction},
   journal = NEJM,
   volume = 329,
   pages = {673-682},
   annote = {GUSTO; t-PA; mega-trials}
}

The error was in the incidence of the secondary endpoint of death or
stroke (the union of the two).  The incidence is slightly wrong because
the secondary endpoint was computed by setting the event indicator to
one if the time until stroke or death was less than the follow-up time.
Some patients had a missing time until stroke or death and, because SAS
treats a missing value as less than any number, were counted as having
the event.  Although the statistical team was alerted to this error
after publication, no correction was issued.
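
To make the mechanism concrete, here is a minimal R sketch on made-up data. The data frame `d` and its columns `event_time` and `followup` are hypothetical names chosen for illustration; this is not the GUSTO data or the code used in the trial. It contrasts R's NA propagation with an emulation of the SAS ordering of missing values.

```r
# Hypothetical toy data, one row per patient (not the GUSTO data).
d <- data.frame(
  event_time = c(5, 40, NA, 12),   # days until stroke or death; NA = not recorded
  followup   = c(30, 30, 30, 30)   # days of follow-up
)

# R: a comparison involving NA yields NA, so the gap stays visible.
d$event_r <- ifelse(d$event_time < d$followup, 1, 0)

# Emulating the SAS rule that a missing value is less than any number:
# the row with the missing time is silently counted as an event.
d$event_sas_like <- ifelse(is.na(d$event_time) | d$event_time < d$followup, 1, 0)

print(d)

# Incidence under the SAS-like rule (the missing row counts as an event).
incidence_sas_like <- mean(d$event_sas_like)

# Incidence among patients whose event time is actually known.
incidence_known <- mean(d$event_r, na.rm = TRUE)
```

In R the missing time surfaces as an NA in `event_r`, which forces an explicit decision (for example `na.rm = TRUE`, or checking `is.na()` first) rather than quietly inflating the event count.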

Frank

> 
> 
> Frank E Harrell Jr wrote:
>> Dieter Menne wrote:
>>>
>>> P.Dalgaard wrote:
>>>>> IF TYPE='TRUCK' and count=12 THEN VEHICLES=TRUCK+((CAR+BIKE)/2.2);
>>>> vehicles <- ifelse(TYPE=='TRUCK' & count==12, TRUCK+((CAR+BIKE)/2.2), NA)
>>>>
>>>>
>>> Read both versions to an audience, and you will have to admit that
>>> this is one of the cases where SAS is superior.
>> Here's a case where SAS is clearly not superior:
>>
>> IF type='TRUCK' AND count<12 THEN vehicles=truck+(car+bike)/2.2;
>>
>> If count is missing, the statement is considered TRUE and the THEN is 
>> executed.  This is because SAS considers a missing as less than any 
>> number.  This resulted in a significant error, never corrected, in a 
>> widely cited New England Journal of Medicine paper.
>>
>> Frank
>>
>>> Dieter
>>>
>>>
>>
>> -- 
>> Frank E Harrell Jr   Professor and Chair           School of Medicine
>>                       Department of Biostatistics   Vanderbilt University
>>
> 


-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University



