[R] Extracting information from text data

Peter Ehlers ehlers at ucalgary.ca
Mon Jan 24 14:26:21 CET 2011


On 2011-01-23 19:28, Deb Midya wrote:
> Hi R-Users,
>
> Thanks in advance.
>
> I am using R-2.12.0 on Windows XP.
>
> I am trying to produce an n x m matrix from text data stored in different files, where n is the number of words (say w1, w2, …, wn) and m is the number of documents (say d1, d2, …, dm).
>
> A. Using package tm
>
> I am using package tm to do the job. I have provided the code below:
>
>> my.corpus<- Corpus(DirSource(my.path), readerControl = list (reader=readPlain))
>
> In readLines(y, encoding = x$Encoding) :
>    incomplete final line found on 'M:\textmine/slr.txt'
>

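That is only a warning: the last line of your slr.txt file does not
end with a newline. The file is still read. Inspect it with your
editor and add a final line break if you want the warning to go away.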
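
If you would rather check the file from within R, here is a minimal
sketch in base R that tests for a missing final newline. The helper
names ends_with_newline() and fix_final_line() are my own (hypothetical,
not part of R or tm), and the temporary file only stands in for slr.txt.

```r
# Sketch only: both helpers are hypothetical names, not part of R or tm.

# TRUE if the last byte of the file is a line feed (or the file is empty).
ends_with_newline <- function(path) {
  size <- file.info(path)$size
  if (is.na(size)) stop("file not found: ", path)
  if (size == 0) return(TRUE)               # empty file: nothing to fix
  bytes <- readBin(path, what = "raw", n = size)
  bytes[length(bytes)] == as.raw(0x0a)      # 0x0a is the line feed byte
}

# Append a line break only when the final line is incomplete.
fix_final_line <- function(path) {
  if (!ends_with_newline(path)) cat("\n", file = path, append = TRUE)
  invisible(path)
}

# Demonstration on a throwaway file that lacks the final line break.
tmp <- tempfile(fileext = ".txt")
cat("first line\nlast line", file = tmp)
ends_with_newline(tmp)   # FALSE
fix_final_line(tmp)
ends_with_newline(tmp)   # TRUE
```

Calling fix_final_line() on the real file would append the missing line
break, after which readLines() should no longer warn about that file.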

>> x<- TermDocMatrix(my.corpus)
> Error: could not find function "TermDocMatrix"

Where did you get the idea that package tm has this function?
The function I see is TermDocumentMatrix(). As the error message
shows, R provides a very helpful reminder that you should
check the name of the function.
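
For illustration, a minimal sketch of the call with the full function
name. It assumes tm is installed, and the two sample texts are made up;
with your files you would keep your own
Corpus(DirSource(my.path), ...) call instead of VectorSource().

```r
# Sketch only: assumes package tm is installed. The sample texts are
# invented for illustration and do not come from the original files.
if (requireNamespace("tm", quietly = TRUE)) {
  library(tm)
  texts <- c("the cat sat on the mat", "the dog sat")
  corp <- Corpus(VectorSource(texts))
  tdm <- TermDocumentMatrix(corp)   # full name, not TermDocMatrix
  m <- as.matrix(tdm)               # rows = terms, columns = documents
  print(dim(m))
  print(m)
}
```

as.matrix() turns the result into an ordinary matrix with terms as rows
and documents as columns, which is the n x m layout you describe. As far
as I know, recent versions of tm drop terms shorter than three
characters by default, so a short word such as "on" may not appear.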

Peter Ehlers


>
> B. Using package(s) other than tm
>
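
If you want to avoid tm altogether, a rough base R sketch follows. The
function name term_doc_matrix() is hypothetical, the sample texts are
made up, and the tokenizer (lower-case everything, split on anything
that is not a letter a-z) is a deliberate simplification, not what tm
does.

```r
# Sketch only: term_doc_matrix() is a hypothetical helper, base R only.
# 'texts' is a character vector with one element per document.
term_doc_matrix <- function(texts) {
  # Split each document into lower-case words made of the letters a-z.
  tokens <- lapply(texts, function(txt) {
    words <- unlist(strsplit(tolower(txt), "[^a-z]+"))
    words[nzchar(words)]                    # drop empty strings
  })
  terms <- sort(unique(unlist(tokens)))     # the n distinct words
  # Count every term in every document, keeping the same term order.
  counts <- lapply(tokens, function(w) {
    as.integer(table(factor(w, levels = terms)))
  })
  matrix(unlist(counts), nrow = length(terms), ncol = length(texts),
         dimnames = list(terms, names(texts)))
}

texts <- c(d1 = "The cat sat on the mat.", d2 = "The dog sat.")
m <- term_doc_matrix(texts)
m
```

Each row is a word and each column a document. To use it on your files,
you could read each one with readLines() and paste its lines into a
single string before passing the vector of strings to the function.
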
> Once again, thank you very much for the time you have given.
>
> Regards,
>
> Deb
>
> The code:
>
> library(tm)
> my.path<- 'M:\\textmine'
> my.corpus<- Corpus(DirSource(my.path), readerControl = list (reader=readPlain))
> x<- TermDocMatrix(my.corpus)
> x


