[R] dataframe, transform, strsplit

Gabor Grothendieck ggrothendieck at gmail.com
Mon Oct 25 19:30:13 CEST 2010


On Mon, Oct 25, 2010 at 1:20 PM, Matthew Pettis
<matthew.pettis at gmail.com> wrote:
> Thanks Gabor and Jim,
> Both solutions worked equally well for me (now I have an embarrassment of
> riches for a solution :-) ).
> Now that my main problem is solved, I am happy, but I was wondering if
> anyone would care to comment as to why my 'strsplit' solution doesn't behave
> the way I think it should...
> Thank you both again,
> Matt
>
> On Mon, Oct 25, 2010 at 12:09 PM, Gabor Grothendieck
> <ggrothendieck at gmail.com> wrote:
>>
>> On Mon, Oct 25, 2010 at 12:53 PM, Matthew Pettis
>> <matthew.pettis at gmail.com> wrote:
>> > Hi,
>> >
>> > I have a dataframe that has a column of vectors that I need to extract off
>> > the character string before the first '.' character and put it into a
>> > separate column.  I thought I could use 'strsplit' for it within
>> > 'transform', but I can't seem to get the right invocation.  Here is a sample
>> > dataframe that has what I have, what I want, and what I get.  Can someone
>> > tell me how to get what is in the 'want' column from the 'have' column
>> > programmatically?
>> >

1. split = "." is interpreted as a regular expression, in which "."
matches any single character, so every character is treated as a split
character, not just the dot.

2. Even once this is corrected, picking off [[1]] selects the first
component of the list that strsplit returns, which would be
c("a", "b", "c"), whereas we want the first element of each component
of the result, not the first component overall.
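
To make the two points concrete, here is a small sketch. The vector x is
a made-up stand-in for the 'have' column (the original sample data is not
reproduced here), so treat the values as assumptions:

```r
# Hypothetical stand-in for the 'have' column
x <- c("a.b.c", "d.e.f")

# Point 1: "." as a regular expression matches every character, so each
# character is consumed as a separator and only empty strings are left.
regex_split <- strsplit(x, ".")

# With fixed = TRUE the dot is taken literally.
parts <- strsplit(x, ".", fixed = TRUE)

# Point 2: [[1]] returns the whole first component of the list ...
first_component <- parts[[1]]

# ... whereas sapply with "[" takes the first piece of every component.
firsts <- sapply(parts, "[", 1)

print(regex_split)
print(first_component)
print(firsts)
```

Escaping the dot in the pattern (split = "\\.") should behave the same
as fixed = TRUE here.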

A corrected version using the same approach looks like this:

   transform(df,
     want = sapply(strsplit(as.character(have), ".", fixed = TRUE), "[", 1))
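
For anyone who wants to run it end to end, here is a self-contained
sketch. The data frame below is invented for illustration and is not the
sample from the original post:

```r
# Hypothetical data; only the column name 'have' comes from the thread
df <- data.frame(have = c("a.b.c", "d.e.f", "g.h.i"))

# as.character() guards against 'have' being a factor, which strsplit()
# does not accept; fixed = TRUE makes "." a literal dot.
out <- transform(df,
  want = sapply(strsplit(as.character(have), ".", fixed = TRUE), "[", 1))

print(out)
```

The as.character() call is harmless when the column is already
character, so it can stay in either way.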

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


