[R] splitting strings efficiently

Andrew Roberts andrew at thinkingbone.org
Sun Jan 8 22:10:37 CET 2012


Thanks Enrico & Jim,

The following finished the job in under a minute!

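## Split every address in one vectorized call and flatten the result
res <- unlist(strsplit(data[["ComputerName"]], "\\."))
## Each address yields 4 parts, so part A of row k sits at 4*(k-1)+1
ii <- seq(1, nrow(data) * 4, by = 4)
data$IPA <- res[ii]     ## A
data$IPB <- res[ii + 1] ## B
data$IPC <- res[ii + 2] ## C
data$IPD <- res[ii + 3] ## D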
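
One caveat with the stride-4 indexing: it relies on every address splitting into exactly four parts, and a single malformed entry would shift every later row without any error. Below is a minimal sketch of a variant that checks this first and then reshapes the parts into a matrix. The example data frame is made up for illustration, and converting to integer is my assumption (it makes later range comparisons numeric rather than character).

```r
# Hypothetical example data; only the column name comes from the thread.
data <- data.frame(
  ComputerName = c("10.1.2.3", "192.168.0.17", "172.16.254.1"),
  stringsAsFactors = FALSE
)

# fixed = TRUE treats "." as a literal dot, so no regex escaping is needed.
parts <- strsplit(data[["ComputerName"]], ".", fixed = TRUE)

# Stop early if any address does not have exactly four parts.
n_parts <- vapply(parts, length, integer(1))
stopifnot(all(n_parts == 4L))

# One row per address, one column per part (filled row by row).
octets <- matrix(as.integer(unlist(parts)), ncol = 4, byrow = TRUE)

data$IPA <- octets[, 1]
data$IPB <- octets[, 2]
data$IPC <- octets[, 3]
data$IPD <- octets[, 4]
```

If the check fails, which(n_parts != 4L) gives the row numbers of the offending entries.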

Andrew

On 08/01/2012 13:11, Enrico Schumann wrote:
>
> Hi Andrew,
>
> strsplit is vectorized: you can pass it the whole character vector, so
> you do not have to call it for every element data$ComputerName[i].
>
> If I understand correctly, maybe something like this helps:
>
> > ip <- "123.456.789.321"  ## example data
> > df <- data.frame(ip = rep(ip, 9), stringsAsFactors=FALSE)
> > df
>                ip
> 1 123.456.789.321
> 2 123.456.789.321
> 3 123.456.789.321
> 4 123.456.789.321
> 5 123.456.789.321
> 6 123.456.789.321
> 7 123.456.789.321
> 8 123.456.789.321
> 9 123.456.789.321
>
> >
> > res <- unlist(strsplit(df[["ip"]], "\\."))
> > ii <- seq(1, nrow(df)*4, by = 4)
> > res[ii]   ## A
> [1] "123" "123" "123" "123" "123" "123" "123"
> [8] "123" "123"
> > res[ii+1] ## B
> [1] "456" "456" "456" "456" "456" "456" "456"
> [8] "456" "456"
> > res[ii+2] ## C
> [1] "789" "789" "789" "789" "789" "789" "789"
> [8] "789" "789"
> > res[ii+3] ## D
> [1] "321" "321" "321" "321" "321" "321" "321"
> [8] "321" "321"
>
>
> Regards,
> Enrico
>
>
> On 08.01.2012 11:06, Andrew Roberts wrote:
>> Folks,
>>
>> I have a data frame with 4861469 rows, one column of which contains an
>> IP address of the form xxx.xxx.xxx.xxx. I want to assign a site to each
>> row based on IP ranges. To do this I have a loop that splits the IP
>> address, held as character, into its A, B, C and D components. It works
>> but is horribly slow. I can't quite see how one of the
>> l/s/m/t/apply functions could be brought to bear on the problem. Does
>> anyone have any thoughts?
>>
>> for(i in 1:4861469)
>>     {
>>     lst<-unlist(strsplit(data$ComputerName[i], "\\."))
>>     data$IPA[i]<-lst[[1]]
>>     data$IPB[i]<-lst[[2]]
>>     data$IPC[i]<-lst[[3]]
>>     data$IPD[i]<-lst[[4]]
>>     rm(lst)
>>     }
>>
>> Andrew
>>
>> Andrew Roberts
>> Children's Orthopaedic Surgeon
>> RJAH, Oswestry, UK
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide 
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>