[R] PROBLEM USING DICTIONARY WITH TM PACKAGE

Patrick Casimir patrcasi at nova.edu
Fri May 19 16:12:45 CEST 2017


Dear Members & Experts,


Since the Dictionary() function is no longer available in the tm package, how can I use other functions to do the same as below? I want to capture a list of specific terms from a corpus. For example, my corpus has 102 files, and I want to see a list with the occurrences of prostatic, adenocarcinoma, and grade in all 102 files. When I use the Dictionary() function, I get the error: Error: could not find function "Dictionary"


> d <- Dictionary(c("prostatic", "adenocarcinoma", "grade"))
> inspect(DocumentTermMatrix(docs, list(dictionary = d)))


But if I use the code below with inspect(), the output only shows the terms for 10 files instead of 102. I need a way to get my dictionary to capture and return those terms, or whatever other terms I select, for all 102 files. I know I am close, but inspect() is not the right function.


> myTerms <- c("prostatic", "adenocarcinoma", "grade")
> inspect(DocumentTermMatrix(docs, list(dictionary = myTerms)))

 <<DocumentTermMatrix (documents: 102, terms: 3)>>
 Non-/sparse entries: 292/14
 Sparsity           : 5%
 Maximal term length: 14
 Weighting          : term frequency (tf)
 Sample             :
                Terms
 Docs            adenocarcinoma grade prostatic
   Patient14.txt             11     6         3
   Patient15.txt              7    12         2
   Patient16.txt             13    16         4
   Patient19.txt              5    13         2
   Patient24.txt             11    12         4
   Patient25.txt              8     9         4
   Patient41.txt              8    10         4
   Patient46.txt              8    10         3
   Patient8.txt               9    12         2
   Patient9.txt               8    23         2
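
A possible way around this, as a minimal sketch: the header above reports 102 documents and the printout is labelled "Sample", which suggests that inspect() only prints a subset of the rows while the matrix itself is complete. Assuming that is the case, converting the matrix with as.matrix() should expose one row per document. The corpus built here (texts, 12 short made-up documents) is a hypothetical stand-in for the real files, and counts is just an illustrative name.

```r
library(tm)

# Hypothetical stand-in for the real corpus: 12 tiny documents,
# more than the 10 rows shown in the inspect() printout above.
texts <- rep(c("prostatic adenocarcinoma grade grade",
               "adenocarcinoma of high grade",
               "no relevant terms here"), times = 4)
docs <- VCorpus(VectorSource(texts))

# A plain character vector is passed as the dictionary; no Dictionary() call
myTerms <- c("prostatic", "adenocarcinoma", "grade")
dtm <- DocumentTermMatrix(docs, control = list(dictionary = myTerms))

# Convert the sparse document-term matrix to an ordinary dense matrix,
# which has one row per document and one column per dictionary term
counts <- as.matrix(dtm)
print(dim(counts))
print(counts)

# Total occurrences of each term across the whole corpus
print(colSums(counts))
```

With the real corpus, the same as.matrix() call on the 102-document matrix could then be passed to as.data.frame() or write.csv() if the counts need to be saved.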


Thanks



Patrick Casimir, PhD
Health Analytics, Data Science, Big Data Expert & Independent Consultant
C: 954.614.1178


