[R] faster unlist,strsplit,gsub,for

Romain Francois romain at r-enthusiasts.com
Fri Sep 10 11:58:19 CEST 2010


Hi,

You can leverage read.table using a textConnection:

 > txt <- "x,y,z,a,b,c,d<d>a,b,c,d,e,f,g<d>"
 > con <- textConnection( gsub( "<d>", "\n", txt ) )
 > read.table( con, sep = "," )
   V1 V2 V3 V4 V5 V6 V7
1  x  y  z  a  b  c  d
2  a  b  c  d  e  f  g
 > close( con )
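
If you have many such strings, the per-row assignment into the data.frame is likely to be the expensive part, so it may help to split everything up front and build the data.frame only once. The following is only a sketch: `strings` is a made-up vector standing in for your data, and it assumes every row has the same number of fields. Whether it is actually faster on your data is something to check with Rprof or system.time.

```r
# Hypothetical input: a character vector of records shaped like the one above.
strings <- c("x,y,z,a,b,c,d<d>a,b,c,d,e,f,g<d>",
             "h,i,j,k,l,m,n<d>o,p,q,r,s,t,u<d>")

# Split all strings on the literal "<d>" in one call.
# fixed = TRUE treats the separator as plain text rather than a regex.
rows <- unlist(strsplit(strings, "<d>", fixed = TRUE))

# Split each row on commas, then bind the pieces into a character matrix.
# This assumes every row has the same number of fields.
fields <- strsplit(rows, ",", fixed = TRUE)
m <- do.call(rbind, fields)

# Convert to a data.frame once, instead of filling it row by row.
df <- as.data.frame(m, stringsAsFactors = FALSE)
```

Unlike read.table, this keeps every column as character, so any type conversion would have to be done afterwards.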

Romain

On 10/09/10 06:41, rajesh j wrote:
>
> OK. These operations are on a string, and the result is added to a
> data.frame.
> I have strings of the form
> "x,y,z,a,b,c,d<d>a,b,c,d,e,f,g<d>"
> essentially comma-separated values delimited by a <d>.
> I first do a
> unlist(strsplit(string,split="<d>"))
> and then a
> strsplit(string,split=",")
>
> The list of vectors I end up with is added row by row to a preallocated
> data.frame like this:
> df[i,]<-list[[i]]
>
> All of this is in a for loop that runs at least 1000 times, and the
> strings are 7000 to 8000 characters in length.
>
>
>
> On Fri, Sep 10, 2010 at 9:14 AM, jim holtman <jholtman at gmail.com> wrote:
>
>> The first thing to do is to use Rprof to profile your code to see where
>> the time is being spent; then you can decide what to change. Are you
>> carrying out the operations on a data frame? If so, can you change it
>> to a matrix for some of the operations? You have provided no idea of
>> what your code or data looks like, or how often each of the operations
>> is being done.
>>
>> There are probably many ways of speeding up the code, but with no idea
>> of what the code is, no solutions can be specified.
>>
>> On Thu, Sep 9, 2010 at 11:09 PM, rajesh j <akshay.rajesh at gmail.com> wrote:
>>> Hi,
>>>
>>> I perform the operations unlist, strsplit, gsub and the for loop on a
>>> lot of strings, and it's heavily slowing down the overall system. Is
>>> there some way for me to speed up these operations, maybe alternate
>>> versions that use multiple processors, etc.?
>>>
>>> --
>>> Rajesh.J




