[R] Help: formatting the result of 'cut' function

Gabor Grothendieck ggrothendieck at gmail.com
Wed Apr 21 18:15:49 CEST 2010


gsubfn is like gsub, except that instead of a replacement string it
takes a replacement function: the function receives the string matched
by the regular expression, and its return value replaces the match.
The replacement function can optionally be written as a formula, as we
do here.  If the formula has no left-hand side, its arguments are taken
to be the free variables on the right-hand side, here just x.  Finally,
we use sub to replace the comma in each level with a comma followed by
a space.  See http://gsubfn.googlecode.com for more.

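library(gsubfn)
# zero-pad every run of digits in the level labels to width 3
levels(c1) <- gsubfn("\\d+", ~ sprintf("%03d", as.numeric(x)), levels(c1))
# each label contains one comma; follow it with a space
levels(c1) <- sub(",", ", ", levels(c1))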
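If installing gsubfn is not an option, roughly the same effect can be
sketched in base R with gregexpr and regmatches<-.  The helper name
pad_levels and its width argument are made up for this illustration;
like the gsubfn version, it assumes the breaks are non-negative whole
numbers (a label such as "(2.5,5]" would be padded piecewise and come
out wrong).

```r
# Data and breaks as in the original question
set.seed(1)
x  <- c(rnorm(1e3, mean = 10, sd = 1), 50, 100)
c1 <- cut(x, breaks = seq(0, 110, 10), right = TRUE)
c2 <- cut(x, breaks = seq(0, 110, 10), right = FALSE)

# Hypothetical helper: zero-pad each run of digits in the labels,
# then put a space after the comma.
pad_levels <- function(lv, width = 3) {
  m   <- gregexpr("[0-9]+", lv)            # locate every run of digits
  fmt <- paste0("%0", width, "d")          # e.g. "%03d"
  regmatches(lv, m) <- lapply(regmatches(lv, m),
                              function(d) sprintf(fmt, as.integer(d)))
  sub(",", ", ", lv, fixed = TRUE)         # one comma per label
}

levels(c1) <- pad_levels(levels(c1))
levels(c2) <- pad_levels(levels(c2))
levels(c1)
levels(c2)
```

table(c1) and table(c2) then pick up the reformatted labels, since
table uses the factor levels as column names.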


On Wed, Apr 21, 2010 at 10:24 AM, Jose Claudio Faria
<joseclaudio.faria at gmail.com> wrote:
> Dear list,
>
> I would like to format the result of the 'cut' function to perform a subsequent
> frequency distribution table (fdt) suitable for publications.
> Below is a reproducible example:
>
> set.seed(1)
> x <- c(rnorm(1e3, mean=10, sd=1), 50, 100)
>
> start <- 0
> end   <- 110
> h     <- 10
>
> c1 <- cut(x, br=seq(start, end, h), right=TRUE)
> levels(c1)
> # I get:
> # [1] "(0,10]"    "(10,20]"   "(20,30]"   "(30,40]"
> # [5] "(40,50]"   "(50,60]"   "(60,70]"   "(70,80]"
> # [9] "(80,90]"   "(90,100]"  "(100,110]"
>
> # I need (observe digits and space after the comma):
> # [1] "(000, 010]"  "(010, 020]"  "(020, 030]"  "(030, 040]"
> # [5] "(040, 050]"  "(050, 060]"  "(060, 070]"  "(070, 080]"
> # [9] "(080, 090]"  "(090, 100]"  "(100, 110]"
>
> c2 <- cut(x, br=seq(start, end, h), right=FALSE)
> levels(c2)
> # I get:
> # [1] "[0,10)"    "[10,20)"   "[20,30)"   "[30,40)"
> # [5] "[40,50)"   "[50,60)"   "[60,70)"   "[70,80)"
> # [9] "[80,90)"   "[90,100)"  "[100,110)"
>
> # I need (observe digits and space after the comma):
> # [1] "[000, 010)"  "[010, 020)"  "[020, 030)"  "[030, 040)"
> # [5] "[040, 050)"  "[050, 060)"  "[060, 070)"  "[070, 080)"
> # [9] "[080, 090)"  "[090, 100)"  "[100, 110)"
>
> # Making fdt:
> table(c1)
> # I get:
> # c1
> #    (0,10]   (10,20]   (20,30]   (30,40]   (40,50]   (50,60]
> #       518       482         0         0         1         0
> #   (60,70]   (70,80]   (80,90]  (90,100] (100,110]
> #         0         0         0         1         0
>
> # I need (observe digits and space after the comma):
> # c1
> #  (000, 010]  (010, 020]  (020, 030]  (030, 040]  (040, 050]  (050, 060]
> #         518         482           0           0           1           0
> #  (060, 070]  (070, 080]  (080, 090]  (090, 100]  (100, 110]
> #           0           0           0           1           0
>
> table(c2)
> # I get:
> # c2
> #    [0,10)   [10,20)   [20,30)   [30,40)   [40,50)   [50,60)
> #       518       482         0         0         0         1
> #   [60,70)   [70,80)   [80,90)  [90,100) [100,110)
> #         0         0         0         0         1
>
> # I need (observe digits and space after the comma):
> # c2
> #  [000, 010)  [010, 020)  [020, 030)  [030, 040)  [040, 050)  [050, 060)
> #         518         482           0           0           0           1
> #  [060, 070)  [070, 080)  [080, 090)  [090, 100)  [100, 110)
> #           0           0           0           0           1
>
>
> Is it possible? Any tip will be welcome!
>
> Thanks in advance,
> --
> ///\\\///\\\///\\\///\\\///\\\///\\\///\\\///\\\
> Jose Claudio Faria
> Estatistica - prof. Titular
> UESC/DCET/Brasil
> joseclaudio.faria at gmail.com
> ///\\\///\\\///\\\///\\\///\\\///\\\///\\\///\\\