[R] Split values in vector

Johannes Radinger JRadinger at gmx.at
Thu Jan 19 16:07:53 CET 2012


Hi,

Just to explain it a little further, here is a small
sample data frame (similar to the one I am working with).


var1 <-seq(1,5)
var2 <-c("A","B","C","D","E")
var3 <-c("00","01-1;02-3;04-1","01-2;02-1","01-0;04-12",NA)

x <- data.frame(var1,var2,var3)

The final data frame should look like the table below.
When the category is "00", the column "00" should be 1 and
all others 0. Otherwise the values should follow the input,
and any category that is not listed gets 0. That probably
sounds a little confusing, but hopefully the example makes
it easier to understand.

var1  var2  var3_00   var3_01   var3_02   var3_04
1     A     1         0         0         0
2     B     0         1         3         1
3     C     0         2         1         0
4     D     0         0         0         12
5     E     NA        NA        NA        NA
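
For reference, one possible way to build the table above in base R. This is
only a sketch under stated assumptions: a bare category such as "00" is
treated as a count of 1, a row that is NA gives NA in every new column, and
the names parts, cats, counts and result are made up for illustration.

```r
x <- data.frame(
  var1 = 1:5,
  var2 = c("A", "B", "C", "D", "E"),
  var3 = c("00", "01-1;02-3;04-1", "01-2;02-1", "01-0;04-12", NA),
  stringsAsFactors = FALSE
)

# Split each row into its "category-count" pieces; an NA row stays NA.
parts <- strsplit(as.character(x$var3), ";", fixed = TRUE)

# Every category that occurs anywhere in the column.
cats <- sort(unique(sub("-.*$", "", na.omit(unlist(parts)))))

# Build one named vector of counts per input row and stack them row-wise.
counts <- do.call(rbind, lapply(parts, function(p) {
  out <- setNames(rep(0, length(cats)), cats)
  if (length(p) == 1 && is.na(p)) {
    out[] <- NA                      # NA input: all categories NA
    return(out)
  }
  for (k in strsplit(p, "-", fixed = TRUE)) {
    # Assumption: a category without "-count" (e.g. "00") counts as 1.
    out[k[1]] <- if (length(k) == 2) as.numeric(k[2]) else 1
  }
  out
}))
colnames(counts) <- paste0("var3_", cats)

result <- cbind(x[c("var1", "var2")], counts)
print(result)
```

Categories are collected from the data itself, so a category that never
occurs (e.g. 03 here) gets no column unless cats is set by hand.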


When I try the recommended approach, I get an error when
table() is executed, and I am not sure whether I would get
exactly the result I want.

X <- unlist(strsplit(as.character(x$var3), split = ";", fixed = TRUE))
X <- strsplit( X, split = "-", fixed = TRUE)

X <- sapply( X, function( x)
			if( length(x) == 2)
				rep( x[1], as.numeric( x[2])) else x[1]
)

table(X, useNA = "always")
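
A likely cause of the error, as far as I can tell: the pieces returned by the
function have different lengths, so sapply() gives back a list rather than a
vector, and table() refuses a list whose elements differ in length. The sketch
below flattens the list with unlist() first. Note that this only yields totals
over the whole column, not one row per case, and that the name totals is just
illustrative.

```r
var3 <- c("00", "01-1;02-3;04-1", "01-2;02-1", "01-0;04-12", NA)

X <- unlist(strsplit(var3, split = ";", fixed = TRUE))
X <- strsplit(X, split = "-", fixed = TRUE)

# lapply() always returns a list, which makes the later unlist() explicit.
X <- lapply(X, function(p)
  if (length(p) == 2) rep(p[1], as.numeric(p[2])) else p[1])

# Flatten before tabulating; table() needs a single vector here.
totals <- table(unlist(X), useNA = "always")
print(totals)
```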

Thank you for your help, I really don't know how this can be handled.

best regards,
johannes


-------- Original Message --------
> Date: Thu, 19 Jan 2012 13:42:24 +0100 (MET)
> From: Gerrit Eichner <Gerrit.Eichner at math.uni-giessen.de>
> To: Johannes Radinger <JRadinger at gmx.at>
> CC: R-help at r-project.org
> Subject: Re: [R] Split values in vector

> Hi, Johannes,
> 
> maybe
> 
> X <- unlist( strsplit( as.character( x$ART), split = ";", fixed = TRUE))
> X <- strsplit( X, split = "-", fixed = TRUE)
> 
> X <- sapply( X, function( x)
>                   if( length(x) == 2)
>                    rep( x[1], as.numeric( x[2])) else x[1]
>              )
> 
> table(X, useNA = "always")
> 
> 
> comes close to what you want.
> 
>   Hth  --  Gerrit
> 
> 
> On Thu, 19 Jan 2012, Johannes Radinger wrote:
> 
> > Hello,
> >
> > I have a vector which looks like
> >
> > x$ART
> > ...
> 
> > [35415] 00                        01-1;02-1;05-1;
> > [35417] 01-1;                     01-1;02-1;
> > [35419] 01-1;                     00
> > [35421] 01-1;04-1;                05-1;
> > [35423] 02-1;                     01-1;02-1;
> > [35425] 01-1;02-1;                <NA>
> > [35427] 01-1;                     <NA>
> > ...
> >
> >
> > This is a vector I got in this format. To explain it:
> > there are several categories (00, 01, 02, etc.) and their counts (the
> > values after "-"). So I have to split each value and create new data
> > frame columns/vectors, one column per category, with the value in the
> > corresponding cell. I know that this vector has 7 categories (00-06)
> > and NA values, but not every case (row) has all the categories (as you
> > can see). How can I do such a split?
> >
> > In the end I should get:
> > x$ART_00, x$ART_01, x$ART_03,... with their values. In the case of <NA>,
> > all the categories should also be <NA>.
> >
> > Maybe someone can help.
> >
> > Thank you,
> >
> > Best regards
> >
> > Johannes
> >
> >
> >
> > -- 
> > "Feel free" - 10 GB Mailbox, 100 FreeSMS/Monat ...
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
