[R] cut ()

Muhuri, Pradip (SAMHSA/CBHSQ) Pradip.Muhuri at samhsa.hhs.gov
Tue Jan 1 02:54:39 CET 2013


Dear David,

Thank you so much for catching my careless mistake.  Sorry about that.

Happy New Year.

Pradip
________________________________________
From: David L Carlson [dcarlson at tamu.edu]
Sent: Monday, December 31, 2012 6:18 PM
To: Muhuri, Pradip (SAMHSA/CBHSQ); 'R help'
Subject: RE: [R] cut ()

A misplaced right parenthesis caused the problem: include.lowest=TRUE ended
up inside the call to quantile() rather than the call to cut():

p1_st_data$ob_mrj_cat <- cut (p1_st_data$obt_mrj_p, quantile
(p1_st_data$obt_mrj_p, (0:5/5), include.lowest=TRUE))

It should be:

p1_st_data$ob_mrj_cat <- cut (p1_st_data$obt_mrj_p, quantile
(p1_st_data$obt_mrj_p, (0:5/5)), include.lowest=TRUE)
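
To see why the misplaced parenthesis yields exactly one NA, here is a minimal sketch. The vector x is a made-up stand-in for p1_st_data$obt_mrj_p, and the names brks, wrong and fixed are only illustrative. With cut()'s defaults the intervals are open on the left, so the value equal to the lowest break falls outside the first interval unless include.lowest=TRUE reaches cut() itself.

```r
# Hypothetical stand-in for p1_st_data$obt_mrj_p (values are made up).
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

# Quintile break points: the first break is min(x), the last is max(x).
brks <- quantile(x, 0:5/5)

# Original form: include.lowest=TRUE is handed to quantile(), which does
# not use it, so cut() runs with its default include.lowest=FALSE.
wrong <- cut(x, quantile(x, 0:5/5, include.lowest = TRUE))

# Corrected form: include.lowest=TRUE is an argument of cut().
fixed <- cut(x, brks, include.lowest = TRUE)

sum(is.na(wrong))    # 1: only the minimum is left unclassified
which(is.na(wrong))  # position of the minimum of x
anyNA(fixed)         # FALSE
table(fixed)         # every value now falls in one of the five groups
```

In the state data the minimum is Utah's 42, which is why it was the only row with <NA>.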

---------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Muhuri, Pradip (SAMHSA/CBHSQ)
> Sent: Monday, December 31, 2012 4:25 PM
> To: R help
> Subject: [R] cut ()
>
> Hello List,
>
>
> My goal is to create a five-category variable (p1_st_data$ob_mrj_cat)
> based on the p1_st_data$obt_mrj_p variable, using the following code
> for the 50 States and the District of Columbia (N=51).
>
> p1_st_data$ob_mrj_cat <- cut (p1_st_data$obt_mrj_p, quantile
> (p1_st_data$obt_mrj_p, (0:5/5), include.lowest=TRUE))
>
> The issue is that, for Utah, I am getting an <NA> instead of (42,48.7]
> in the ob_mrj_cat column.
>
> Is there a way to tweak the code (i.e., programmatically) to resolve
> the issue?
>
> I would appreciate receiving your help.
>
> Happy New Year and Best Wishes to R Expert-members, who have been so
> kind and helpful to beginner R users like me.
>
> Thanks and regards,
>
> Pradip Muhuri
>
>
>
> ########## Console output, followed by the reproducible example ##########
> > table(p1_st_data$ob_mrj_cat)
>
>   (42,48.7] (48.7,50.9] (50.9,52.8] (52.8,54.2] (54.2,58.7]
>          10          10          10          10          10
>
> > p1_st_data [p1_st_data$state =="Utah",] [, 1:4]
>    state obt_mrj_p obt_mrj_se ob_mrj_cat
> 45  Utah        42       1.49       <NA>    # I expected this to be (42,48.7] instead of <NA>.
>
>
> ### The Reproducible Example (data and code) is shown below:
>
>
> # read estimates of risk factors for substance use (ages 12-17) by
> # State, obtained from SUDAAN output
> p1_st_data <-read.table (text="
> Alabama,  49.60,               1.37
> Alaska,  55.00,    1.41
> Arizona,  52.50, 1.56
> Arkansas,            50.50,    1.22
> California,            51.10,    0.65
> Colorado,            55.10,    1.26
> Connecticut,      56.30,    1.28
> Delaware,           53.60,    1.30
> District of Columbia,  53.50,         1.22
> Florida,  52.70,   0.67
> Georgia,               52.50,    1.15
> Hawaii, 49.40,    1.33
> Idaho,   48.30,    1.23
> Illinois,  52.70,    0.63
> Indiana,                49.60,    1.16
> Iowa,     46.30,    1.37
> Kansas, 44.30,    1.43
> Kentucky,            52.90,    1.37
> Louisiana,            49.70,    1.23
> Maine,  55.60,    1.44
> Maryland,           53.90,    1.46
> Massachusetts,                55.40,    1.41
> Michigan,            52.40,    0.62
> Minnesota,         51.50,    1.20
> Mississippi,         43.20,    1.14
> Missouri,             48.70,    1.20
> Montana,            56.40,    1.16
> Nebraska,           45.70,    1.51
> Nevada,               54.20,    1.17
> New Hampshire,              56.10,    1.30
> New Jersey,       53.20,    1.45
> New Mexico,     57.60,    1.34
> New York,           53.70,    0.67
> North Carolina, 52.20,    1.26
> North Dakota,   48.60,    1.34
> Ohio,     50.90,    0.61
> Oklahoma,          47.20,    1.42
> Oregon,               54.00,    1.35
> Pennsylvania,    53.00,    0.63
> Rhode Island,    57.20,    1.20
> South Carolina, 50.50,    1.21
> South Dakota,   43.40,    1.30
> Tennessee,        48.90,    1.35
> Texas,   48.70,    0.62
> Utah,     42.00,    1.49
> Vermont,            58.70,    1.24
> Virginia,                51.80,    1.18
> Washington,      53.50,    1.39
> West Virginia,    52.80,    1.07
> Wisconsin,          49.90,    1.50
> Wyoming,           49.20,    1.29",
> sep=  "," , col.names = c("state" ,   "Obt_mrj_p" ,  "Obt_mrj_se" ),
> colClasses = c( "character" ,  "numeric" , "numeric" )
> )
>
> # change the names to lowercase
> names(p1_st_data) <- tolower (names(p1_st_data))
>
> # create five equal-sized groups for the perceived ease of obtaining
> # marijuana variable
> p1_st_data$ob_mrj_cat <- cut (p1_st_data$obt_mrj_p, quantile
> (p1_st_data$obt_mrj_p, (0:5/5), include.lowest=TRUE))
>
> p1_st_data
> dim (p1_st_data)
> table(p1_st_data$ob_mrj_cat)
> p1_st_data [p1_st_data$state =="Utah",] [, 1:4]
>
>
>
> Pradip K. Muhuri, PhD
> Statistician
> Substance Abuse & Mental Health Services Administration
> The Center for Behavioral Health Statistics and Quality
> Division of Population Surveys
> 1 Choke Cherry Road, Room 2-1071
> Rockville, MD 20857
>
> Tel: 240-276-1070
> Fax: 240-276-1260
> e-mail: Pradip.Muhuri at samhsa.hhs.gov
>