[R] Document Term Matrix will not maintain decimal places of numbers or capture all terms

Will Ebert willebert34 at gmail.com
Tue Mar 14 18:46:59 CET 2017


Before I updated my version of RStudio (1.0.136), everything worked great.
Since the update, DocumentTermMatrix in the 'tm' package behaves
differently. I want to create a document-term matrix whose terms are
numbers, with their decimal places kept. For instance, if I have a .csv
with one column as shown below:

x
1.01
11.21
123.35
212.11

I want the column names in my term matrix to look like this:

1.01 11.21 123.35 212.11
   1     0      0      0
   0     1      0      0
   0     0      1      0
   0     0      0      1

But instead it looks like this:

123 212
  0   0
  0   0
  1   0
  0   1

Here's the code that used to work:

library(tm)

corpus = Corpus(VectorSource(x))         # x: the values from the .csv column
dtm = DocumentTermMatrix(corpus)
dtm_df = as.data.frame(as.matrix(dtm))   # terms become the column names
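
To pin down the target output independently of 'tm', here is a minimal
base R sketch that builds the table I am after by splitting on whitespace
only. It assumes x is a character vector holding the four values from the
column above; the names tokens, terms and counts are only for illustration,
and this is not meant to describe how 'tm' tokenizes internally.

```r
# Assumed input: the four values from the .csv column, kept as strings
x <- c("1.01", "11.21", "123.35", "212.11")

# Split each document on whitespace only, so "1.01" stays one term
tokens <- strsplit(x, "[[:space:]]+")

# Distinct terms, in order of first appearance
terms <- unique(unlist(tokens))

# Count each term per document: rows are documents, columns are terms
counts <- t(vapply(
  tokens,
  function(tok) as.integer(table(factor(tok, levels = terms))),
  integer(length(terms))
))
colnames(counts) <- terms

dtm_df <- as.data.frame(counts)
print(dtm_df)
```

The point of the sketch is only that the decimal point has to survive
tokenization and that short terms such as 1.01 must not be dropped.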

I have tried uninstalling and reinstalling everything, and tried older
versions (RStudio 0.99.489 & R 3.3.1), but I get the same results. I asked
others to test it and it works for them. However, I also had someone
download R, Rtools, and RStudio from scratch to test this, and they got the
same results I did. I have no idea what has happened and would greatly
appreciate help on this matter, as it is extremely urgent.

Thanks in advance

Will
