[R] How do I use R to build a dictionary of proper nouns?

θ " yarmi1224 at hotmail.com
Fri May 5 07:58:14 CEST 2017


θ " has shared OneDrive files with you. To view the files, click the links below.


2.corpus_patent text.PNG<https://1drv.ms/u/s!Aq27nOPOP5izgVRRxXomVBv0YV0j>

3ontology_proper nouns keywords.PNG<https://1drv.ms/u/s!Aq27nOPOP5izgVURiS7MbYH6hJzo>

1.patents.PNG<https://1drv.ms/u/s!Aq27nOPOP5izgVYuRVxM1OyzIPzF>




Hi,

I want to do text mining on patents in R.
I need to build a dictionary from the proper nouns of a domain ontology,
then use that dictionary to analyze my corpus of patent files.
I want to count the proper nouns and get the frequency with which each one appears in each file.

So far I have preprocessed the corpus and extracted the proper nouns from the domain ontology.
But I have no idea how to build a proper-noun dictionary and use it to analyze my corpus.
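
To make the goal concrete, here is a rough base-R sketch of the kind of per-file count table I am after. The file names, texts, and terms are made up for illustration, and count_term is a hypothetical helper, not something from a package. It uses plain literal matching, which may well be too naive for a real patent corpus.

```r
# Hypothetical dictionary of proper nouns taken from the domain ontology.
dictionary <- c("lithium", "graphene", "fuel cell")

# Hypothetical preprocessed patent texts, one element per file.
docs <- c(
  patent1.txt = "a graphene electrode for a lithium battery with a graphene coating",
  patent2.txt = "a fuel cell stack and a second fuel cell module",
  patent3.txt = "a method for welding steel plates"
)

# Count non-overlapping occurrences of one term in one text.
# fixed = TRUE treats the term as a literal string, so multi-word terms work,
# but it also matches inside longer words (there is no word-boundary check).
count_term <- function(term, text) {
  hits <- gregexpr(term, text, fixed = TRUE)[[1]]
  sum(hits > 0)  # gregexpr returns -1 when there is no match
}

# Build a matrix with one row per file and one column per dictionary term.
freq <- sapply(dictionary, function(term) {
  vapply(docs, function(text) count_term(term, text), integer(1))
})
print(freq)
```

I understand the tm package can restrict a DocumentTermMatrix to a given term list through the dictionary entry of its control argument, but I do not know whether that is the right approach for multi-word proper nouns.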

The attachments show my texts, the corpus preprocessing, and the proper nouns.

Thanks.

	[[alternative HTML version deleted]]


