[R] Green and Byar (1980) Prostate Cancer Data set from Andrews and Herzberg - Data

Rolf Turner r.turner at auckland.ac.nz
Tue Mar 24 22:50:01 CET 2009


On 25/03/2009, at 10:04 AM, Frank E Harrell Jr wrote:

> Ravi Varadhan wrote:
>> Hi,
>>
>> I am looking for a data set containing the information from a
>> randomized trial evaluating the effect of DES (diethylstilbestrol)
>> on multiple time-to-event endpoints: prostate cancer, CVD, and
>> other causes.  The original source of this data is Green and Byar
>> (1980).  This is a popular competing risks problem that has
>> subsequently been discussed in a number of statistical papers,
>> including Kay (1986).
>>
>> Does anyone have a digital version of this data set?
>>
>> This data is also presented in Andrews, D. F. and Herzberg, A. M.
>> (1985), Data.  Does a digital version of all the data sets in A &
>> H exist?
>>
>> Thanks very much,
>> Ravi.
>
> An R binary dataset is at http://biostat.mc.vanderbilt.edu/Datasets
>
> Note that there is something strange about the AP variable, with a
> lot of ties at some value near 1.0.  I have never been able to find
> any documentation about this problem.  If you find any, please let
> me know.

Out of idle curiosity I went to have a look at this data set.

I had problems.

(1) The given URL didn't work for me; when I clicked on it, I got an
error 404.  But when I went to http://biostat.mc.vanderbilt.edu I found
a link to ``Datasets'', and clicking on that got me to some data sets.

(2) Scrolling down to ``Byar and Green prostate cancer data'' appeared
to get me to the right place.  But I couldn't see any sign of any
``R binary files''.

The available formats appear to be *.sav (SPSS?), *.sdd (???), and *.xls.

(3) I downloaded the prostate.xls file O.K.  But when I tried to read
it in with the read.xls() function from the gdata package, I got the
following error:

> X <- read.xls("prostate.xls")
Converting xls file to csv file... Done.
Reading csv file... Error in read.table(file = file, header = header, sep = sep, quote = quote,  :
  no lines available in input

I was able to ``open'' the prostate.xls file with the version of Excel
available on my Mac, save it as a *.csv file, and then read *that* in
with read.csv().
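
For anyone wanting to follow the same route, here is a minimal sketch
of the manual workaround.  The message ``no lines available in input''
is what read.table() gives when its input has no lines at all, so my
guess (and it is only a guess) is that the intermediate csv file came
out empty.  The helper name read_exported_csv() and the file name
"prostate.csv" are my own inventions for illustration; neither comes
from gdata or from the Vanderbilt site.

```r
# Hypothetical helper (not part of gdata or base R): read a CSV file that
# was exported by hand from a spreadsheet, failing early with a clear
# message if the file is missing or empty.
read_exported_csv <- function(path) {
  if (!file.exists(path)) {
    stop("file not found: ", path)
  }
  # An empty file is one way to get read.table()'s
  # "no lines available in input" error, so check for it up front.
  if (file.info(path)$size == 0) {
    stop("file is empty, so the export step probably failed: ", path)
  }
  read.csv(path, stringsAsFactors = FALSE)
}

# Usage, assuming the Excel export was saved as "prostate.csv" in the
# working directory (the file name is an assumption):
# X <- read_exported_csv("prostate.csv")
# str(X)
```

Checking the file size first only turns a cryptic error into a more
informative one; it does not explain why the conversion step would
produce an empty file in the first place.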

What am I missing?  *Are* there ``R binary'' files lurking about that
I am somehow not seeing?  Why won't read.xls() work on this data set?
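
One thing that might be worth ruling out is whether the *.sav files
are in fact files written by R's save() rather than SPSS files.  This
is purely an assumption on my part; I have not verified it.  A minimal
sketch of such a check follows.  The helper name try_load_rdata() and
the file name "prostate.sav" are hypothetical.

```r
# Hypothetical helper: try to load() a file as an R save() image.
# Returns a named list of the objects it contained, or NULL if R
# cannot read the file as a save() image.
try_load_rdata <- function(path) {
  env <- new.env()
  ok <- tryCatch({
    # load() warns and then errors on files it does not recognise;
    # the warning is suppressed and the error is caught.
    suppressWarnings(load(path, envir = env))
    TRUE
  }, error = function(e) FALSE)
  if (!ok) return(NULL)
  as.list(env)
}

# Usage, assuming a downloaded file called "prostate.sav" (an assumption):
# objs <- try_load_rdata("prostate.sav")
# if (is.null(objs)) cat("not an R save() file\n") else print(names(objs))
```

Loading into a fresh environment keeps whatever is in the file from
overwriting objects in the workspace.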

	cheers,

		Rolf Turner
