[R] vectorizing problem

Gabor Grothendieck ggrothendieck at gmail.com
Mon Oct 4 23:19:34 CEST 2010


On Mon, Oct 4, 2010 at 2:39 PM, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:
> On Mon, Oct 4, 2010 at 1:54 PM, Dylan Miracle <dylan.miracle at gmail.com> wrote:
>> Hello,
>>
>> I have a two column dataframe that
>> has entries that look like this:
>>
>> 2315100       NR_024005,NR_024004,AK093685
>> 2315106       DQ786314
>>
>> and I want to change this to look like this:
>>
>> 2315100       NR_024005
>> 2315100       NR_024004
>> 2315100       AK093685
>> 2315106       DQ786314
>>
>> I can do this with the following "for" loop but the dataframe (GPL)
>> has ~140,000 rows and this takes about 15 minutes:
>
> Try this assuming that the columns of GPL are character.  You may need
> to use as.character first if they are factor:
>
> library(reshape2)
> V2 <- strsplit(GPL$V2, ",")
> names(V2) <- GPL$V1
> melt(V2)

Alternatively, we could use stack or lattice's make.groups, so the solution becomes:

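# as.character() guards against V2 being a factor; it is a no-op for character
V2 <- strsplit(as.character(GPL$V2), ",")
# name each list element by its key so the key is carried into the long result
names(V2) <- GPL$V1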

# followed by one of these three

stack(V2)

library(reshape2)
melt(V2)

library(lattice)
do.call("make.groups", V2)
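
For illustration, here is a minimal sketch of the stack variant run on the two sample rows from the question. The toy GPL data frame is only a stand-in built from those rows, and the name `long` is made up for this example. stack returns its columns as values and ind, so they are reordered to put the key first.

```r
# Toy stand-in for GPL, built from the two sample rows in the question
GPL <- data.frame(
  V1 = c("2315100", "2315106"),
  V2 = c("NR_024005,NR_024004,AK093685", "DQ786314"),
  stringsAsFactors = FALSE
)

# Split the comma-separated IDs and name each list element by its key
V2 <- strsplit(GPL$V2, ",")
names(V2) <- GPL$V1

# stack() gives a data frame with columns "values" and "ind";
# reorder so the key (ind) comes first, as in the desired output
long <- stack(V2)
long <- long[, c("ind", "values")]
print(long)
```

This should give four rows, one per ID, with the key repeated for each ID that shared a row in the original data.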

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


