[R] HOW TO FILTER DATA

Leilei Ruan ruanleilei at gmail.com
Wed Jan 3 21:54:44 CET 2018


Try the code below:


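library(readr)  # provides read_delim()

df <- read_delim("C:/Users/lruan1/Desktop/1112.csv", delim = "|",
                 escape_double = FALSE, trim_ws = TRUE)

# Keep only the rows whose IPC matches one of the wanted codes
df_new <- subset(df, IPC == 'H04M001/02' | IPC == 'C07K016/26')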

You can add more conditions with "|" in the subset function. Good luck!
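
If the list of codes grows, chaining many "|" conditions gets unwieldy. One possible alternative is to put the codes in a vector and test membership with %in%. The sketch below is only an illustration: it uses base R and a few sample rows copied from the question instead of the real file, and the names txt and wanted are made up for the example.

```r
# A few sample rows from the question, used here in place of the real file.
txt <- "Appln_id|Prio_Year|App_year|IPC
1|1999|2000|H04Q007/32
1|1999|2000|H04M001/02
2|1991|1992|C12N015/62
2|1991|1992|C07K016/26"

# Read the pipe-delimited text; keep IPC as character, not factor.
df <- read.delim(text = txt, sep = "|", stringsAsFactors = FALSE)

# Codes to keep (hypothetical name; extend the vector as needed).
wanted <- c("H04M001/02", "C07K016/26")

# Keep only the rows whose IPC is one of the wanted codes.
df_new <- df[df$IPC %in% wanted, ]
print(df_new)
```

The same df$IPC %in% wanted condition should also work inside subset() on the data frame returned by read_delim.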

On Wed, Jan 3, 2018 at 2:53 PM, Saptorshee Kanto Chakraborty <
chkstr at unife.it> wrote:

> Hello,
>
> I have patent data from the OECD in delimited text format, with IPC as
> one column. I want to filter the data by keeping only certain IPC codes in
> that column and deleting the rows that do not have my required IPCs. Can
> anybody please guide me in doing this? The IPC codes are string variables.
>
> The data looks somewhat like the sample below, but it's a huge dataset
> containing more than 11 million rows.
>
>
> Appln_id|Prio_Year|App_year|IPC
> 1|1999|2000|H04Q007/32
> 1|1999|2000|G06K019/077
> 1|1999|2000|H01R012/18
> 1|1999|2000|G06K017/00
> 1|1999|2000|H04M001/2745
> 1|1999|2000|G06K007/00
> 1|1999|2000|H04M001/02
> 1|1999|2000|H04M001/275
> 2|1991|1992|C12N015/62
> 2|1991|1992|C12N015/09
> 2|1991|1992|C07K019/00
> 2|1991|1992|C07K016/26
>
>
>
> Thanking You
