[R] getting strata/cluster level values with survey package?

Thomas Lumley tlumley at u.washington.edu
Wed Feb 15 22:01:39 CET 2006


On Tue, 7 Feb 2006, Jeff D. Hamann wrote:

> First, I apologise for the rookie question, but...
>
> I'm trying to obtain standard errors, confidence intervals, etc. from a
> sample design and have been having trouble getting the results for
> anything other than the basic total or mean for the overall survey from
> the survey package.

You want svyby() and then perhaps ftable() for formatting. (?svyby, 
?ftable.svyby).
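
As a minimal sketch of the svyby() suggestion (an illustration, not part of
the original reply): the example below uses a small subset of the quoted
data, two clusters from each of strata A and B, so that it runs on its own.
No weights are supplied, so svydesign() warns and assumes equal probability;
the object name by_strata is hypothetical.

```r
library(survey)

# Subset of the quoted data: two clusters per stratum, two plots per cluster.
stand.data <- data.frame(
  strata  = factor(c("A", "A", "A", "A", "B", "B", "B", "B")),
  cluster = c(1, 1, 2, 2, 6, 6, 7, 7),
  vol     = c(18.58556192, 12.55175443, 13.9344623, 17.13104821,
              13.40892677, 14.3956207, 14.74334178, 16.55125245)
)

# Stratified cluster design, as in the second svydesign() call in the question.
# Assumption: equal weights, because the question supplies none.
srv1 <- svydesign(ids = ~cluster, strata = ~strata, data = stand.data)

# Totals of vol within each stratum, with standard errors.
by_strata <- svyby(~vol, ~strata, srv1, svytotal)
print(by_strata)

# Confidence intervals for the per-stratum totals.
print(confint(by_strata))

# The same pattern with svymean gives per-stratum means.
print(svyby(~vol, ~strata, srv1, svymean))
```

Replacing ~strata with ~cluster in the svyby() call should give cluster-level
values in the same way; with realistic sampling weights the totals would
scale accordingly.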

(You also want to send only one copy of the email message, not three).

 	-thomas



>
> For example, using the following dataset,
>
> strata,cluster,vol
> A,1,18.58556192
> A,1,12.55175443
> A,1,21.65882438
> A,1,17.11172946
> A,1,15.41713348
> A,2,13.9344623
> A,2,17.13104821
> A,2,14.6806479
> A,2,14.68357291
> A,2,18.86017714
> A,2,20.67642515
> A,2,15.15295351
> A,2,13.82121102
> A,2,12.9110477
> A,2,14.83153677
> A,2,21.90772687
> A,3,18.69795427
> A,3,18.45636428
> A,3,15.77175793
> A,3,15.54715217
> A,3,20.31948393
> A,3,19.26391445
> A,3,15.54750775
> A,3,19.18724018
> A,4,12.89572151
> A,4,12.92047701
> A,4,12.64958757
> A,4,19.85888418
> A,4,19.64057669
> A,4,19.19188964
> A,4,18.81619298
> A,4,21.73670878
> A,5,15.99430802
> A,5,18.66666517
> A,5,21.80441654
> A,5,14.22081904
> A,5,16.01576433
> A,5,14.92497202
> A,5,17.95123218
> A,5,19.82027165
> A,5,19.35698273
> A,5,19.10826519
> B,6,13.40892677
> B,6,14.3956207
> B,6,13.82113391
> B,6,16.37338569
> B,6,19.70159575
> B,7,14.74334178
> B,7,16.55125245
> B,7,12.38329798
> B,7,18.16472408
> B,7,16.32938475
> B,7,16.06465494
> B,7,12.63086062
> B,7,14.46114813
> B,7,21.90134013
> B,7,13.81025827
> B,7,15.85805494
> B,7,20.18195326
> B,8,19.05120792
> B,8,12.83856639
> B,8,12.61360139
> B,8,21.30434314
> B,8,14.19960469
> B,8,17.38397826
> B,8,15.66477339
> B,8,22.07182834
> B,8,12.07487394
> B,8,20.36357359
> B,8,20.2543677
> B,9,14.44499362
> B,9,17.77235228
> B,9,13.01620902
> B,9,18.10976359
> B,10,18.22350661
> B,10,18.41504728
> B,10,17.94735486
> B,10,18.39173938
> B,10,14.21729704
> B,10,16.95753684
> B,10,21.11643087
> B,10,16.09688752
> B,10,19.54707452
> B,10,22.00450065
> B,10,15.15308873
> B,10,14.72488972
> B,10,17.65280737
> B,10,14.61615255
> B,10,12.89525607
> B,11,22.35831089
> B,11,18.0853187
> B,11,22.12815791
> B,11,17.74562214
> B,11,21.45724242
> B,11,20.57933779
> B,11,19.97397415
> B,11,16.34967424
> B,12,22.14385376
> B,12,17.82816113
> B,12,18.37056381
> B,12,16.13152759
> B,12,22.06764318
> B,12,12.80924472
> B,12,18.95522175
> B,13,20.40554286
> B,13,19.72951878
> C,14,15.51581
> C,14,15.4836358
> C,14,13.35882363
> C,14,13.16072916
> C,14,21.69168971
> C,14,19.09686303
> C,14,14.47450457
> C,14,12.04870424
> C,14,13.33096141
> C,14,17.38388981
> C,14,16.29015289
> C,14,16.32707754
> C,14,16.2784054
> C,15,15.0170597
> C,15,14.95767365
> C,15,15.20739614
> C,15,22.10458509
> C,15,12.3362457
> C,15,19.87895753
> C,15,18.8363682
> C,15,16.43738666
> C,15,12.84570744
> C,15,15.99869357
> C,15,14.42551321
> C,15,13.63489872
> C,15,15.67179885
> C,16,14.61700901
> C,16,14.64864676
> C,16,14.13014582
> C,16,21.7637441
> C,16,20.66825543
> C,16,17.05977818
> C,16,17.80118916
> C,16,15.16641698
>
> where this is read into stand.data. When I use the following survey designs,
>
> srv1 <- svydesign(ids=~1, strata=~strata, data=stand.data )
>
> or,
>
> srv1 <- svydesign(ids=~cluster, strata=~strata, data=stand.data )
>
> with,
>
> print( svytotal( ~vol, srv1 ) )
>
> I only obtain the total,
>
>> print( svytotal( ~vol, srv1 ) )
>    total     SE
> vol  2377 34.464
>
> or worse,
>
> print( svytotal( ~vol + strata, srv1 ) )
>         total     SE
> vol     2377.0 34.464
> strataA   42.0  0.000
> strataB   64.0  0.000
> strataC   34.0  0.000
>
> which reports the number of observations in each of the strata. I'm sure
> this is an RTFM question, but I just need a start. The size of each "plot"
> is 0.04 units (hectares) and I want to be able to quickly examine working
> up each sample with and without clusters (this is going to be part of a
> larger simulation study).
>
> I'm trying to not use SAS for this and hate to admit defeat.
>
> Thanks,
> Jeff.
>
>
>
>

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle



