[R] svyboxplot - library (survey)

Muhuri, Pradip (SAMHSA/CBHSQ) Pradip.Muhuri at samhsa.hhs.gov
Fri Oct 19 02:56:18 CEST 2012


Hi Dr. Lumley,


Further thoughts: To get a histogram of age with proportions (relative frequencies) on the y-axis for each subgroup, I probably need to rescale the weight separately within each subgroup, so that the rescaled weights sum to 1 within that subgroup. Am I correct?
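
A minimal sketch of the per-subgroup rescaling described above, in base R. The data frame tor_demo, its values, and the column name grp_wt are made up for illustration; only the names xspd2 and wt8 are borrowed from this thread.

```r
# Toy data standing in for the NHIS extract; all values are hypothetical.
tor_demo <- data.frame(
  xspd2 = c("SPD", "SPD", "No SPD", "No SPD", "No SPD"),
  wt8   = c(120, 80, 300, 500, 200)
)

# ave() applies the function within each level of xspd2 and returns a
# vector aligned with the original rows.
tor_demo$grp_wt <- ave(tor_demo$wt8, tor_demo$xspd2,
                       FUN = function(w) w / sum(w))

# Check: the rescaled weights add up to 1 within each subgroup.
grp_sums <- tapply(tor_demo$grp_wt, tor_demo$xspd2, sum)
print(grp_sums)
```

A design built with weights=~grp_wt and then subset to one subgroup would, under this assumption, have weights that sum to 1 in that subgroup.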

Thanks,

Pradip Muhuri


________________________________________
From: Muhuri, Pradip (SAMHSA/CBHSQ)
Sent: Thursday, October 18, 2012 4:45 PM
To: 'Thomas Lumley'
Cc: Anthony Damico; R help; Muhuri, Pradip (SAMHSA/CBHSQ)
Subject: RE: [R] svyboxplot - library (survey)

Hello Dr. Lumley,

Thank you for your advice/suggestions.

I have rescaled the weight (i.e., "original weight" divided by "total weighted count" averaged across 8 surveys - NHIS). As can be seen below (R console), the new weight sums to 1.

I have used the freq=TRUE argument in the svyhist() function along with a new svydesign object that includes the rescaled weight. There are two issues:

        1) I am getting a warning message: In plot.histogram(h, ..., freq = freq, xlab = xlab, main = main) : the AREAS in the plot are wrong -- rather use freq=FALSE.

        2) The scale of two graphs looks different (please see the attachment).

Any thoughts on how to resolve these issues?

Regards,

Pradip Muhuri

###### R console is appended below ######
> options (width=120)
> sum (tor$new_wt)
[1] 1
>
> # object with survey design variables and data with new_wt (rescaled) that sums to 1
> xnhis <- svydesign (id=~psu,strat=~stratum, weights=~new_wt, data=tor, nest=TRUE)
>
> MyBreaks <- c(18, 25, 35, 45, 55, 65, 75, 85, 95)
>
> par(mfrow=c(2,2))
> # Chart 1
>
> options( survey.lonely.psu = "adjust" )
> svyhist (~age_p,
+          subset (xnhis, xspd2=='SPD'), breaks=MyBreaks,
+           #ylim = c(0,0.040),
+          main= " ", freq=TRUE,
+          col="red",
+          xlab="Age at Interview (SPD Category)"
+          )
Warning message:
In plot.histogram(h, ..., freq = freq, xlab = xlab, main = main) :
  the AREAS in the plot are wrong -- rather use freq=FALSE
> #lines (svysmooth(~age_p, bandwidth=5,subset(nhis, xspd2=='SPD')), lwd=2)
>
> #Chart 2
>
> options( survey.lonely.psu = "adjust" )
>  svyhist (~age_p,
+          subset (xnhis, xspd2=='No SPD'), breaks=MyBreaks,
+          #ylim = c(0,0.040),
+          main= " ", freq=TRUE,
+          col="yellow", xlab="Age at Interview (No SPD Category)"
+          )
Warning message:
In plot.histogram(h, ..., freq = freq, xlab = xlab, main = main) :
  the AREAS in the plot are wrong -- rather use freq=FALSE
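
A note on the warning in the transcript above: plot.histogram() raises it when freq=TRUE is combined with bins of unequal width, and MyBreaks starts with a 7-year bin (18 to 25) followed by 10-year bins. The sketch below reproduces this with base hist() on simulated ages. The helper freq_warnings and the equal-width breaks are illustrative assumptions, not something proposed in the thread.

```r
set.seed(1)
age_demo <- runif(500, min = 18, max = 95)   # simulated ages, not survey data

uneven <- c(18, 25, 35, 45, 55, 65, 75, 85, 95)  # first bin is 7 years wide
even   <- seq(15, 95, by = 10)                   # every bin is 10 years wide

# Hypothetical helper: draw the histogram with freq = TRUE and collect
# the text of any warnings raised while plotting.
freq_warnings <- function(x, breaks) {
  msgs <- character(0)
  withCallingHandlers(
    hist(x, breaks = breaks, freq = TRUE, main = ""),
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  msgs
}

warn_uneven <- freq_warnings(age_demo, uneven)  # unequal bin widths
warn_even   <- freq_warnings(age_demo, even)    # equal bin widths
print(warn_uneven)
print(warn_even)
```

If equal-width age groups are acceptable for the analysis, passing such breaks to svyhist() should avoid the warning; otherwise the density scale (freq=FALSE) is what the message recommends.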




Pradip K. Muhuri
Statistician
Substance Abuse & Mental Health Services Administration
The Center for Behavioral Health Statistics and Quality
Division of Population Surveys
1 Choke Cherry Road, Room 2-1071
Rockville, MD 20857

Tel: 240-276-1070
Fax: 240-276-1260
e-mail: Pradip.Muhuri at samhsa.hhs.gov



-----Original Message-----
From: Thomas Lumley [mailto:tlumley at uw.edu]
Sent: Wednesday, October 17, 2012 11:13 PM
To: Muhuri, Pradip (SAMHSA/CBHSQ)
Cc: Anthony Damico; R help
Subject: Re: [R] svyboxplot - library (survey)

On Thu, Oct 18, 2012 at 2:04 PM, Muhuri, Pradip (SAMHSA/CBHSQ)
<Pradip.Muhuri at samhsa.hhs.gov> wrote:
> Hello,
>
> I understand that svyhist() provides density histograms with density values on the y-axis (R code shown below). Is there a way to get relative frequency histograms with relative frequencies on the y-axis?

You get frequencies just by asking for them with freq: compare
   svyhist(~enroll, dstrat, main="Survey weighted",col="purple",freq=TRUE)
   svyhist(~enroll, dstrat, main="Survey weighted",col="purple")

If you mean that you want the heights of the bars to sum to 1, the
simplest way I know of is to rescale the weights to sum to 1 and use
freq=TRUE
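
The suggestion above can be sketched with the api example data that ships with the survey package. The names pw1, dstrat1, h and bar_total are introduced here for illustration, and the check assumes that the object svyhist() returns invisibly holds the plotted bar heights in its counts component.

```r
library(survey)
data(api)

# Rescale the sampling weights so that they sum to 1 over the whole sample.
apistrat$pw1 <- apistrat$pw / sum(apistrat$pw)

# Stratified design as in the package example, but with the rescaled weights.
# (The fpc argument is left out to keep the sketch short.)
dstrat1 <- svydesign(id = ~1, strata = ~stype, weights = ~pw1, data = apistrat)

# The default breaks give equal-width bins, so freq = TRUE is appropriate here.
h <- svyhist(~enroll, dstrat1, main = "Survey weighted", col = "purple",
             freq = TRUE)

# Total height of the bars; with weights summing to 1 this should be 1.
bar_total <- sum(h$counts)
print(bar_total)
```

Note that if such a design is subset afterwards, the weights in the subset no longer sum to 1, so the bars for a subgroup would sum to that subgroup's weighted share rather than to 1.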

   -thomas


> Any advice/help would be appreciated.
>
> Thanks,
>
> Pradip Muhuri
>
>
>
>
>
> ###### svyhist - Density Histogram
>
> options( survey.lonely.psu = "adjust" )
> svyhist (~age_p,
>          subset (nhis, xspd2=='SPD'), breaks=MyBreaks,
>           ylim = c(0,0.040),
>          main= " ",
>          col="red",
>          xlab="Age at Interview (SPD Category)"
>          )
> lines (svysmooth(~age_p, bandwidth=5,subset(nhis, xspd2=='SPD')), lwd=2)
>
>
> ________________________________________
> From: Anthony Damico [ajdamico at gmail.com]
> Sent: Monday, October 01, 2012 10:07 AM
> To: Muhuri, Pradip (SAMHSA/CBHSQ)
> Cc: R help
> Subject: Re: [R] svyboxplot - library (survey)
>
> using a slight modification of the example shown in ?svyboxplot
>
>
> # load survey library
> library(survey)
>
> # load example data
> data(api)
>
> # create an example svydesign
> dstrat <- svydesign(id = ~1, strata = ~stype, weights = ~pw, data = apistrat,
>     fpc = ~fpc)
>
> # set the plot window to display 1 plot x 2 plots
> par(mfrow=c(1,2))
>
> # generate two example boxplots
> svyboxplot(enroll~stype,dstrat,all.outliers=TRUE)
> svyboxplot(enroll~1,dstrat)
>
> # done
>
>
>
> # alternative: not as nice
>
> # set the plot window to display 2 plots x 1 plot
> par(mfrow=c(2,1))
>
> # generate two example boxplots
> svyboxplot(enroll~stype,dstrat,all.outliers=TRUE)
> svyboxplot(enroll~1,dstrat)
>
> # done
>
>
>
>
>
>
>
> On Mon, Oct 1, 2012 at 9:50 AM, Muhuri, Pradip (SAMHSA/CBHSQ) <Pradip.Muhuri at samhsa.hhs.gov> wrote:
> Hello,
>
> I have used the survey package for boxplots with the following code.
>
> Could anyone please tell me why I am getting only 1 boxplot instead of 2 boxplots (1-SPD, 2-No SPD)?
>
> What changes in the following code would be required to get 2 boxplots in the same plot frame?
>
> Thanks,
>
> Pradip
>
> ###################################################
> nhis <- svydesign (id=~psu, strat=~stratum, weights=~wt8,
>             data=tor, nest=TRUE)
>
> svyboxplot (dthage~xspd2, subset (nhis, mortstat==1), col="gray80",
>              varwidth=TRUE, ylab="Age at Death", xlab="SPD Status: 1-SPD, 2=No SPD")
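
One possible explanation, not confirmed anywhere in this thread: as far as I understand the svyboxplot documentation, the grouping variable on the right-hand side must be a factor, so a numerically coded xspd2 (1, 2) might be treated as a single group. The sketch below uses the api example data; the columns stype_num and stype_f and the object dstrat_f are hypothetical names introduced only to mimic a numerically coded group.

```r
library(survey)
data(api)

# Hypothetical numeric coding of school type (1, 2, 3), mimicking a
# grouping variable that is stored as numbers rather than as a factor.
apistrat$stype_num <- as.numeric(apistrat$stype)

# Convert the numeric codes to a labelled factor before building the design.
apistrat$stype_f <- factor(apistrat$stype_num, levels = 1:3,
                           labels = levels(apistrat$stype))

dstrat_f <- svydesign(id = ~1, strata = ~stype, weights = ~pw,
                      data = apistrat, fpc = ~fpc)

# With a factor on the right-hand side, one box is drawn per level.
svyboxplot(enroll ~ stype_f, dstrat_f, all.outliers = TRUE)
```

For the NHIS design the analogous step would be to convert xspd2 to a factor in the data before calling svydesign(), assuming it is currently numeric.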
>
>
> Pradip K. Muhuri
> Statistician
> Substance Abuse & Mental Health Services Administration
> The Center for Behavioral Health Statistics and Quality
> Division of Population Surveys
> 1 Choke Cherry Road, Room 2-1071
> Rockville, MD 20857
>
> Tel: 240-276-1070
> Fax: 240-276-1260
> e-mail: Pradip.Muhuri at samhsa.hhs.gov
>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



--
Thomas Lumley
Professor of Biostatistics
University of Auckland



