[R] Three Stage Sampling of categorical variable using 'survey' in R

Kristof bostoen at irc.nl
Tue Sep 25 14:45:03 CEST 2012


For a sanitation project in Bangladesh I need to design a three-stage sample
survey to be representative of around 40 million people.  I suddenly find
myself with several challenges that I am struggling with and would be
grateful for any help. As the questions are linked I kept them together
rather than creating multiple posts.

1) SURVEY DESIGN
So far I have mainly designed two-stage cluster surveys and have never done a
three-stage cluster survey design. It seems that the analysis takes only the
PSU and the enumeration area into account, so whatever happens at the second
stage seems irrelevant to the analysis, which seems odd to me.
Our intention was to use PPS at the first and the second stage and to have
equal-size takes in each enumeration area.
The design would be to select 50 out of 150 Upazilas (sub-districts) as
PSUs using probability proportionate to size (PPS).
The second stage would select 6 village-groups out of an average of 250
village-groups per Upazila, again using PPS.
The third stage would use SRS to select 26 households in each of the 6
selected village-groups per Upazila. Total sample size: 7800 households.
The household is the BSU, and where we need to calculate information at the
individual level we are confident we can correct the sample weights
for that.
In other projects with two-stage sampling I managed to optimise the sample
design based on cost, but this seems more difficult with three-stage
sampling.
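
To make the numbers in the design above concrete, here is a minimal R sketch of the arithmetic. The household counts (M_total, M_upazila, M_vgroup) are made-up placeholders, and it assumes the PPS measure of size is the number of households, which the description above does not fix. Under those assumptions the three stage probabilities multiply out to the same value for every household, so the sample would be roughly self-weighting.

```r
# Design constants taken from the description above
n_psu <- 50    # Upazilas selected (stage 1, PPS)
n_ssu <- 6     # village-groups per selected Upazila (stage 2, PPS)
n_hh  <- 26    # households per selected village-group (stage 3, SRS)

total_sample <- n_psu * n_ssu * n_hh   # households in the whole sample

# Hypothetical household counts (assumptions for illustration only)
M_total   <- 8e6     # households in the whole study area
M_upazila <- 60000   # households in one selected Upazila
M_vgroup  <- 240     # households in one selected village-group

# Inclusion probability at each stage (valid while each value stays below 1)
p1 <- n_psu * M_upazila / M_total     # PPS selection of the Upazila
p2 <- n_ssu * M_vgroup / M_upazila    # PPS selection of the village-group
p3 <- n_hh / M_vgroup                 # SRS selection of the household

p_household <- p1 * p2 * p3           # overall inclusion probability
weight      <- 1 / p_household        # base sampling weight

print(c(total_sample = total_sample, p_household = p_household, weight = weight))
```

The sizes of the Upazila and the village-group cancel in the product, leaving total_sample / M_total. In practice the weights would drift away from this if the size measures used for selection differ from the household counts found in the field.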

2) CATEGORICAL VARIABLES
So far I have worked mainly with binary data, but now we are collecting ranked
categorical variables and I'm not sure how to treat these.  The categorical
variables form a scale of adherence to a certain level of sanitation, but the
scale is not linear.
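
One possible treatment, sketched in base R under loud assumptions: keep the categories as a factor with ordered levels rather than scoring them 1, 2, 3, and report the weighted share in each category plus the share at or above each level. The level names, the data and the weights below are invented for illustration, and this gives point estimates only, with no design-based standard errors.

```r
# Invented example data: one row per household (assumption, not survey data)
levels_ladder <- c("none", "shared", "basic", "improved")  # hypothetical scale
hh <- data.frame(
  ladder = factor(c("none", "basic", "shared", "improved", "basic", "none"),
                  levels = levels_ladder),
  wt     = c(1000, 1200, 900, 1100, 1000, 800)   # hypothetical sampling weights
)

# Weighted share of households in each category (empty levels get 0)
wt_by_level <- sapply(split(hh$wt, hh$ladder), sum)
share <- wt_by_level / sum(wt_by_level)

# Share at or above each level: uses only the ordering, not the spacing
share_at_or_above <- rev(cumsum(rev(share)))

print(round(share, 3))
print(round(share_at_or_above, 3))
```

Because the cumulative shares rely only on the order of the categories, they do not require the scale to be linear.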

3) Using "R" instead of STATA
While I have always wanted to learn "R", I have always found it hard to get my
head around it, even with RStudio and R Commander installed.  I installed the
"survey" package and tried to read up on how to use it, but failed to.  Is
there anybody willing to help?
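
A minimal sketch of the kind of example in question, using the "survey" package on simulated data. The column names (upazila, vgroup, hh, wt, improved, ladder) and all values are hypothetical. According to the package documentation, when no fpc is given svydesign() treats the first stage as sampled with replacement and uses only the first-stage clusters for variance estimation, which may be why the second stage appears to drop out of the analysis.

```r
library(survey)

set.seed(1)
# Simulated sample: 50 Upazilas x 6 village-groups x 26 households
d <- expand.grid(hh = 1:26, vgroup = 1:6, upazila = 1:50)
d$vgroup   <- interaction(d$upazila, d$vgroup, drop = TRUE)  # unique group ids
d$hh       <- seq_len(nrow(d))                                # unique household ids
d$wt       <- 1000                                            # hypothetical equal weights
d$improved <- rbinom(nrow(d), 1, 0.4)                         # made-up binary outcome
d$ladder   <- factor(sample(c("none", "shared", "basic"), nrow(d), replace = TRUE),
                     levels = c("none", "shared", "basic"))   # made-up ranked outcome

# Declare the three sampling stages: Upazila, village-group, household
des <- svydesign(ids = ~upazila + vgroup + hh, weights = ~wt, data = d)

# Proportion with the binary outcome, with a design-based standard error
est_bin <- svymean(~improved, des)

# Proportion of households in each category of the ranked variable
est_cat <- svymean(~ladder, des)

print(est_bin)
print(est_cat)
```

The package also provides svyolr() for proportional-odds models on ordered outcomes; whether that model suits a non-linear sanitation scale is a separate question.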

While I can get my head around the basic probability principles in survey
sampling, I'm not a statistician, so I'll swallow my pride and ask you to
explain it as if I were a 10-year-old, just to be sure I get it.  Any good
reference material is always welcome, but direct answers would help me the
most due to time constraints, as we have to finish the design in the coming
days.

Thanks a lot in advance for all your help
Kristof





