[Rd] Re: PR#886: Error (?) in documentation of 'swiss'
Sun, 22 Apr 2001 00:18:53 +0200 (MET DST)
No, you're quite right, I was wrong. I must have been very confused when I
put up the first 'bug' report, #886. I noticed my error a couple of days
after and put up a comment to that effect on R-bugs, but it seems to have
vanished (or alternatively I did that wrong as well). I agree with you on
the criterion for the choice of districts as well. There remains a minuscule
discrepancy between the Mosteller and Tukey description and the Princeton
one: M&T say that the 4th variable is the % of the population with
education beyond primary school, whereas the Princeton source says it's the
percentage of 'draftees' with this level of education. Given the date, the
differences between male and female education levels at the time, and the
(presumed) fact that the 'draftees' are all male, this might make a
difference to interpretation, I suppose.
It did occur to me to wonder why Mosteller and Tukey chose these particular
variables out of all those given in the source...
Department of Statistics
The Open University
----- Original Message -----
From: "Prof Brian D Ripley" <email@example.com>
To: "k.j.mcconway" <firstname.lastname@example.org>
Cc: "R-bugs" <>
Sent: 21 April 2001 13:09
Subject: Re: PR#886: Error (?) in documentation of 'swiss'
> > From email@example.com Wed Mar 28 16:46:49 2001
> > Hardly crucial, but I've come upon a potential error in the documentation
> > of the 'swiss' data frame in the R base package. The description accurately
> > matches what is said in the Mosteller and Tukey source quoted, but
> > according to the data archived at Princeton (links from
> > http://opr.princeton.edu/archive/eufert/switz.html), the variable that
> > Mosteller and Tukey report as infant mortality is actually the proportion
> > of 'draftees' with education beyond primary school. Infant mortality is on
> > the archived file, but the values are quite a lot different. Of course,
> > it's possible that Mosteller and Tukey were right and the people who did
> > the archiving at Princeton, later, got it wrong.
>
> It seems it is you that got it wrong! In the file sw1888.dat in their
> switz.zip, the variable
>
>   36  USCHOOL  321-330  3  Prop. draftees with > primary educ.
>
> corresponds to `Education' and
>
>   22  INFMORT  181-190  3  Infant Mortality Rate
>
> corresponds to `Infant Mortality'.
> Apart from guessing factors of 10, the data correspond to the 47 districts
> with > 50% French speakers to the accuracy they are given.
> If you still disagree, can you please explain how you did this? I used
> > read.fwf("sw1888.dat", widths=c(rep(10, 45),1,2,2,2,8,1,15)) ->sw0
> > ind <- sw0[, 27] > 50000
> > sw <- sw0[ind, ]
> > sw[, c(5, 12, 34, 36, 9, 22)-3]
> the first column in the file being variable 4.
> Thanks for the source: as from 1.3.0 I have added the district names.
> Brian D. Ripley, firstname.lastname@example.org
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272860 (secr)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
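
The column arithmetic in the quoted message can be checked without the data file itself. What follows is only a minimal sketch, assuming (as the message states) that the first column read from the file is variable 4, and that each of the first 45 fields is 10 characters wide, as in the `rep(10, 45)` part of the `widths` argument. The names `var_no`, `col_pos`, `first_char`, `last_char` and `layout` are illustrative and do not come from the original code.

```r
# Variable numbers as used in the quoted call
#   sw[, c(5, 12, 34, 36, 9, 22) - 3]
var_no <- c(5, 12, 34, 36, 9, 22)

# Assumption from the message: the first column in the file is variable 4,
# so variable v sits in column v - 3 of the data frame.
col_pos <- var_no - 3

# Assumption: the first 45 fields are each 10 characters wide,
# so column p spans characters 10 * (p - 1) + 1 through 10 * p.
first_char <- 10 * (col_pos - 1) + 1
last_char  <- 10 * col_pos

layout <- data.frame(var_no, col_pos, first_char, last_char)
print(layout)

# Variables 36 (USCHOOL) and 22 (INFMORT) should land on the character
# ranges quoted from the codebook: 321-330 and 181-190 respectively.
stopifnot(
  layout$first_char[layout$var_no == 36] == 321,
  layout$last_char[layout$var_no == 36] == 330,
  layout$first_char[layout$var_no == 22] == 181,
  layout$last_char[layout$var_no == 22] == 190
)
```

Under those assumptions the computed ranges agree with the codebook entries quoted above, which is consistent with the "- 3" offset in the original call.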