[R] Removing values containing a specific character

Uwe Ligges ligges at statistik.tu-dortmund.de
Sun Jan 27 15:45:24 CET 2013



On 27.01.2013 07:11, ypodeswa wrote:
> Actually, it worked perfectly for my sample data, but my actual data has
> 5.5 million rows, and grep doesn't seem to work with over a million rows.
> Any idea on a workaround?


I suspect this is a matter of available memory rather than of grep() itself.
Try to reduce the number of copies of your data, e.g. by not creating
the interim copy df2.
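
For illustration, a minimal sketch of that idea. It reuses the df, names and
emails names from the quoted example, but the sample values are made up and
the approach is only one possible way to avoid the second copy: build a
logical index with grepl() and overwrite df itself instead of keeping df2.

```r
# Hypothetical sample data standing in for the real 5.5-million-row data frame.
df <- data.frame(
  names  = c("bob", "joe", "craig@gmail.com", "emily", "jane@yahoo.com"),
  emails = c("bobj@cup.com", "joesmith@gmail.com", "craig@gmail.com",
             "emily2@yahoo.com", "jane@yahoo.com"),
  stringsAsFactors = FALSE  # columns stay character, so no lapply() pass is needed
)

# TRUE where the name contains a literal "@".
# fixed = TRUE treats the pattern as a plain substring, not a regular expression.
has_at <- grepl("@", df$names, fixed = TRUE)

# To blank the names instead of dropping the rows, one would use:
#   df$names[has_at] <- ""

# Drop the offending rows, overwriting df rather than creating a second object.
df <- df[!has_at, , drop = FALSE]

# The index vector is no longer needed; remove it so its memory can be reclaimed.
rm(has_at)
```

A logical index also behaves sensibly when no name contains "@", whereas a
negative grep() index would select zero rows in that case.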

Best,
Uwe Ligges



>
> On Sat, Jan 26, 2013 at 9:37 PM, Yasha Podeswa <ypodeswa at gmail.com> wrote:
>
>> Awesome, thanks Arun, that's exactly what I was looking for!
>>
>>
>> On Sat, Jan 26, 2013 at 9:21 PM, arun kirshna [via R] <
>> ml-node+s789695n4656749h63 at n4.nabble.com> wrote:
>>
>>> Hi,
>>> Try this:
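>>> df[] <- lapply(df, as.character)         # convert every column to character
>>> df2 <- df                                # keep a copy for the 2nd part
>>> df[, 1][grep("@", df$names)] <- ""       # blank out names containing "@"
>>> df
>>> #  names               emails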
>>> #1   bob       bobj at cup.com
>>> #2   joe joesmith at gmail.com
>>> #3          craig at gmail.com
>>> #4 emily   emily2 at yahoo.com
>>> #5           jane at yahoo.com
>>>
>>> #2nd part:
>>>
>>> df2[!grepl("@", df2$names), ]   # logical index: also safe when nothing matches
>>> #  names               emails
>>> #1   bob       bobj at cup.com
>>> #2   joe joesmith at gmail.com
>>> #4 emily   emily2 at yahoo.com
>>> A.K.