[R] gsub with regular expression

Gabor Grothendieck ggrothendieck at gmail.com
Fri Jun 25 17:56:40 CEST 2010


On Fri, Jun 25, 2010 at 11:11 AM, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:
> On Fri, Jun 25, 2010 at 10:48 AM, Sebastian Kruk
> <residuo.solow at gmail.com> wrote:
>> If I have a text with 7 words per line and I would like to put first
>> and second word joined in a vector and the rest of words one per
>> column in a matrix how can I do it?
>>
>> First 2 lines of my text file:
>> "2008/12/31 12:23:31 numero 343.233.233 Rodeo Vaca Ruido"
>> "2010/02/01 02:35:31 palabra 111.111.222 abejorro Rodeo Vaca"
>>
>> Results:
>>
>> Vector:
>> 2008/12/31 12:23:31
>> 2010/02/01 02:35:31
>>
>> Matrix
>> "numero" 343.233.233 "Rodeo"   "Vaca"   "Ruido"
>> "palabra" 111.111.222 "abejorro" "Rodeo" "Vaca"
>>
>
> Here are two solutions.  Both solutions are three statements long
> (read in the data, display the vector, display the matrix).  Replace
> textConnection(Lines) with "myfile.dat", say, in each.
>
> 1. Here is a sub solution:
>
> L <- readLines(textConnection(Lines))
> sub("(\\S+ \\S+) .*", "\\1", L)
> sub("\\S+ \\S+ ", "", L)

The last line of solution 1 returns a character vector rather than a
matrix.  It should be:

as.matrix(read.table(textConnection(sub("\\S+ \\S+ ", "", L)), as.is = TRUE))
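
For anyone who wants to run solution 1 end to end, here is a minimal
sketch.  The thread never shows how Lines is defined, so the definition
below (the two sample lines from the question, as a character vector) is
an assumption, as are the names con, stamp and words.  The connections
are closed explicitly only to avoid warnings about unused connections.

```r
# Sample input copied from the question; how 'Lines' was defined in the
# original post is not shown, so this definition is an assumption.
Lines <- c("2008/12/31 12:23:31 numero 343.233.233 Rodeo Vaca Ruido",
           "2010/02/01 02:35:31 palabra 111.111.222 abejorro Rodeo Vaca")

# Read the text line by line, closing the connection afterwards.
con <- textConnection(Lines)
L <- readLines(con)
close(con)

# Vector: keep only the first two space-separated fields of each line.
stamp <- sub("(\\S+ \\S+) .*", "\\1", L)

# Matrix: drop the first two fields, then parse the rest as a table.
con <- textConnection(sub("\\S+ \\S+ ", "", L))
words <- as.matrix(read.table(con, as.is = TRUE))
close(con)

print(stamp)
print(words)
```

Because 343.233.233 is not a valid number, read.table keeps that column
as character, so the result is a character matrix.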

3. And a third solution, which is perhaps the most obvious:

DF <- read.table(textConnection(Lines), as.is = TRUE)
paste(DF[, 1], DF[, 2]) # vector
as.matrix(DF[-(1:2)]) # matrix
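
The same solution reading from a file, as suggested earlier in the
thread, might look like the sketch below.  The temporary file stands in
for "myfile.dat"; its contents are the two sample lines from the
question, and the names path, stamp and words are only illustrative.

```r
# Write the two sample lines from the question to a temporary file that
# stands in for "myfile.dat" (an assumption made for this illustration).
path <- tempfile(fileext = ".dat")
writeLines(c("2008/12/31 12:23:31 numero 343.233.233 Rodeo Vaca Ruido",
             "2010/02/01 02:35:31 palabra 111.111.222 abejorro Rodeo Vaca"),
           path)

# Whitespace-separated fields become columns V1..V7, all kept as character.
DF <- read.table(path, as.is = TRUE)

stamp <- paste(DF[, 1], DF[, 2])   # vector: date and time joined by a space
words <- as.matrix(DF[-(1:2)])     # matrix: the remaining five columns

unlink(path)                       # remove the temporary file

print(stamp)
print(words)
```

Here the matrix keeps the column names V3 to V7 from the data frame,
whereas solution 1 yields V1 to V5, because the first two fields were
removed before read.table saw the text.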


