[R] gsub,regex and replacing

David Winsemius dwinsemius at comcast.net
Wed Apr 28 14:30:28 CEST 2010


On Apr 28, 2010, at 8:04 AM, arnaud Gaboury wrote:

> Sorry Jim, here is my data.frame:
>
> avprix <-
> structure(list(DESCRIPTION = c("CORN Jul/10", "CORN May/10",
> "ROBUSTA COFFEE (10) Jul/10", "SOYBEANS Jul/10",
> "SPCL HIGH GRADE ZINC USD Jul/10",
> "STANDARD LEAD USD Jul/10"), prix = c(-1.5, -1082, 11084, 1983.5,
> -2464, -118), quantity = c(0, -3, 8, 2, -1, 0)),
> .Names = c("DESCRIPTION", "prix", "quantity"),
> row.names = c(NA, -6L), class = "data.frame")
>
>
I have already replied to an earlier posting (earlier, at least, on my mail client).
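
For the data frame posted above, a minimal sketch along the lines of the sub() call Jim suggests in the quoted message below might look like this. The data frame is rebuilt with data.frame() only to keep the example self-contained. The trailing "$" anchor and the name `cleaned` are my own assumptions, not part of Jim's code. The anchor limits the match to a month/year suffix at the very end of the string.

```r
# Rebuild the posted data frame (same values as the dput output above).
avprix <- data.frame(
  DESCRIPTION = c("CORN Jul/10", "CORN May/10", "ROBUSTA COFFEE (10) Jul/10",
                  "SOYBEANS Jul/10", "SPCL HIGH GRADE ZINC USD Jul/10",
                  "STANDARD LEAD USD Jul/10"),
  prix = c(-1.5, -1082, 11084, 1983.5, -2464, -118),
  quantity = c(0, -3, 8, 2, -1, 0),
  stringsAsFactors = FALSE
)

# "USD", a space, letters (the month), "/", digits (the year), then end of string.
# The "$" anchor is an assumption added here; Jim's pattern does not include it.
pattern <- "USD [[:alpha:]]+/[[:digit:]]+$"

# Work on a copy (hypothetical name) so the original data frame stays intact.
cleaned <- avprix
cleaned$DESCRIPTION <- sub(pattern, "USD", avprix$DESCRIPTION)

# Rows without "USD" do not match the pattern and are returned unchanged.
print(cleaned)
```

sub() is enough here because each description contains at most one such suffix; gsub() would give the same result on this data.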

-- 
David.
>
>
> From: jim holtman [mailto:jholtman at gmail.com]
> Sent: Wednesday, April 28, 2010 1:31 PM
> To: arnaud Gaboury
> Cc: r-help at r-project.org
> Subject: Re: [R] gsub,regex and replacing
>
> Next time you include a data frame that has spaces in the fields, please
> consider using 'dput' to provide the data. It is hard to read in
> otherwise.
>
> Will this do it:
>
>> x <- read.table(textConnection("                     DESCRIPTION    prix quantity
> + 1                     'CORN Jul/10'    -1.5        0
> + 2                     'CORN May/10' -1082.0       -3
> + 3      'ROBUSTA COFFEE (10) Jul/10' 11084.0        8
> + 4                 'SOYBEANS Jul/10'  1983.5        2
> + 5 'SPCL HIGH GRADE ZINC USD Jul/10' -2464.0       -1
> + 6        'STANDARD LEAD USD Aug/10'  -118.0        0"), header=TRUE,
> + as.is=TRUE)
>>
>> x$DESCRIPTION <- sub("USD [[:alpha:]]+/[[:digit:]]+", "USD",
> + x$DESCRIPTION)
>>
>> x
>                  DESCRIPTION    prix quantity
> 1                CORN Jul/10    -1.5        0
> 2                CORN May/10 -1082.0       -3
> 3 ROBUSTA COFFEE (10) Jul/10 11084.0        8
> 4            SOYBEANS Jul/10  1983.5        2
> 5   SPCL HIGH GRADE ZINC USD -2464.0       -1
> 6          STANDARD LEAD USD  -118.0        0
>
> On Wed, Apr 28, 2010 at 7:13 AM, arnaud Gaboury <arnaud.gaboury at gmail.com> wrote:
> Dear group,
>
> I need to modify some characters in a data frame. I want to use gsub and the
> regex functionality to do this.
>
> Here is the data frame (df):
>
>                      DESCRIPTION    prix quantity
> 1                     CORN Jul/10    -1.5        0
> 2                     CORN May/10 -1082.0       -3
> 3      ROBUSTA COFFEE (10) Jul/10 11084.0        8
> 4                 SOYBEANS Jul/10  1983.5        2
> 5 SPCL HIGH GRADE ZINC USD Jul/10 -2464.0       -1
> 6        STANDARD LEAD USD Aug/10  -118.0        0
>
>
> For each df$DESCRIPTION element containing "USD", I want to remove the last
> part of it (i.e. Jul/10, or Aug/10...).
> I was thinking of something like this:
>
>> df$DESCRIPTION <- gsub("USD", "new name without last part", df$DESCRIPTION)
>
> to get this following result:
>
>
>                      DESCRIPTION    prix quantity
> 1                     CORN Jul/10    -1.5        0
> 2                     CORN May/10 -1082.0       -3
> 3      ROBUSTA COFFEE (10) Jul/10 11084.0        8
> 4                 SOYBEANS Jul/10  1983.5        2
> 5        SPCL HIGH GRADE ZINC USD -2464.0       -1
> 6               STANDARD LEAD USD  -118.0        0
>
> My problem is that I have no idea how to write the regular expression in my
> command line.
>
> Any help would be appreciated.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
>
> -- 
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?
>

David Winsemius, MD
West Hartford, CT


