[Rd] Re: PR#886: Error (?) in documentation of 'swiss'

k.j.mcconway@open.ac.uk
Sun, 22 Apr 2001 00:18:53 +0200 (MET DST)


No, you're quite right, I was wrong. I must have been very confused when I
put up the first 'bug' report, #886. I noticed my error a couple of days
after and put up a comment to that effect on R-bugs, but it seems to have
vanished (or alternatively I did that wrong as well). I agree with you on
the criterion for the choice of districts as well. There remains a minuscule
discrepancy between the Mosteller and Tukey description and the Princeton
one: M&T say that the 4th variable is the % of the population with
education beyond primary school, whereas the Princeton source says it is the
percentage of 'draftees' with this level of education. Given the date, the
differences between male and female education levels at the time, and the
(presumed) fact that the 'draftees' were all male, this might make a
difference to interpretation, I suppose.

It did occur to me to wonder why Mosteller and Tukey chose these particular
variables out of all those given in the source...

Regards,

Kevin McConway
Department of Statistics
The Open University
k.j.mcconway@open.ac.uk
----- Original Message -----
From: "Prof Brian D Ripley" <ripley@stats.ox.ac.uk>
To: "k.j.mcconway" <k.j.mcconway@open.ac.uk>
Cc: "R-bugs" <>
Sent: 21 April 2001 13:09
Subject: Re: PR#886: Error (?) in documentation of 'swiss'


>
> From k.j.mcconway@open.ac.uk  Wed Mar 28 16:46:49 2001
>
> Hardly crucial, but I've come upon a potential error in the documentation
> of the 'swiss' data frame in the R base package. The description accurately
> matches what is said in the Mosteller and Tukey source quoted, but
> according to the data archived at Princeton (links from
> http://opr.princeton.edu/archive/eufert/switz.html), the variable that
> Mosteller and Tukey report as infant mortality is actually the proportion
> of 'draftees' with education beyond primary school. Infant mortality is on
> the archived file, but the values are quite a lot different. Of course,
> it's possible that Mosteller and Tukey were right and the people who did
> the archiving at Princeton, later, got it wrong.
>
> It seems it is you that got it wrong! In the file sw1888.dat in their
> switz.zip, the variable
>
>  36 USCHOOL        321-330       3  Prop. draftees with > primary educ.
>
> corresponds to `Education' and
>
>  22 INFMORT        181-190       3  Infant Mortality Rate
>
> corresponds to `Infant Mortality'.
>
> Apart from guessing factors of 10, the data correspond to the 47 districts
> with > 50% French speakers to the accuracy they are given.
>
> If you still disagree, can you please explain how you did this?  I used
>
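> > # 45 fields of width 10, then 7 narrower trailing fields
> > sw0 <- read.fwf("sw1888.dat", widths=c(rep(10, 45),1,2,2,2,8,1,15))
> > ind <- sw0[, 27] > 50000
> > sw <- sw0[ind, ]
> > # variable numbers minus 3 give column indices
> > sw[, c(5, 12, 34, 36, 9, 22)-3]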
>
> the first column in the file being variable 4.
>
>
> Thanks for the source: as from 1.3.0 I have added the district names.
>
> --
> Brian D. Ripley,                  ripley@stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272860 (secr)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>
>

