[R] Data Structure to Unnest_tokens in tidytext package

Tue Dec 10 17:09:21 CET 2019

Hi--I'm fairly new to R and trying to do a text mining project on a novel
using the tidytext package. The novel is saved as a plain text document and
I can import it into RStudio just fine. For reference I'm trying to do
something similar to section 1.3 of this tidy text tutorial
<https://www.tidytextmining.com/tidytext.html>, except I'm working with one
novel instead of many. So I import the novel and then run:

tidy_novel <- quicksandr %>%
  unnest_tokens(word, text)

I get the following error:

Error in check_input(x) :
  Input must be a character vector of any length or a list of character
  vectors, each of which has a length of 1.

typeof(quicksandr) returns "list" and str(quicksandr) returns

Classes ‘spec_tbl_df’, ‘tbl_df’, ‘tbl’ and 'data.frame': 955 obs. of  1 variable:
 $ FOR E. S. I.: chr  "FOR E. S. I." "My old man died in a fine big house. My ma died in a shack. I wonder where I'm gonna die, Being neither white nor black?'" "LANGSTON HUGHES" "ONE" ...
 - attr(*, "problems")=Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 8 obs. of  5 variables:
  ..$ row     : int  530 726 733 836 853 886 889 942
  ..$ col     : chr  NA NA NA NA ...
  ..$ expected: chr  "1 columns" "1 columns" "1 columns" "1 columns" ...
  ..$ actual  : chr  "2 columns" "2 columns" "2 columns" "2 columns" ...
  ..$ file    : chr  "'quicksandr.txt'" "'quicksandr.txt'" "'quicksandr.txt'" "'quicksandr.txt'" ...
 - attr(*, "spec")=
  .. cols(
  ..   `FOR E. S. I.` = col_character()
  .. )
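
Reading the str() output, the only column appears to be named
"FOR E. S. I." (the first line of the file seems to have been taken as a
header), so there is no column called text for unnest_tokens to find. A
minimal sketch of that check follows. It uses a small made-up data frame,
toy_novel, which is a hypothetical stand-in and not the real import:

```r
# Toy stand-in for the imported object: one character column whose name
# came from the first line of the file (hypothetical data, not the novel).
toy_novel <- data.frame(
  `FOR E. S. I.` = c("FOR E. S. I.", "LANGSTON HUGHES", "ONE"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# List the column names, then test whether a column called "text" exists.
names(toy_novel)
has_text_before <- "text" %in% names(toy_novel)

# Renaming the single column gives unnest_tokens(word, text) a column to find.
names(toy_novel)[1] <- "text"
has_text_after <- "text" %in% names(toy_novel)
```

Renaming the column would only address the column name. It would not
recover the first line that was consumed as a header.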

I'm just importing the text file and then running unnest_tokens, so maybe
I'm missing a step in between? It seems I need my text in a different
structure, so I would appreciate any advice on how to get it there.
Thanks, and let me know if I need to provide more info!
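
One possible version of the missing step, sketched under assumptions: read
the file as plain lines with readLines, then build a data frame with a
column that really is called text. The temporary file, the two sample
lines, and the names path, lines and novel_df are illustrative only. With
the real novel the path would point at the actual file.

```r
library(dplyr)     # provides %>% and re-exports tibble()
library(tidytext)  # provides unnest_tokens()

# Write a tiny stand-in file so the sketch is self-contained
# (hypothetical content; replace `path` with the real file location).
path <- tempfile(fileext = ".txt")
writeLines(c("My old man died in a fine big house.",
             "My ma died in a shack."), path)

# readLines() returns a plain character vector, one element per line.
# No line is treated as a header and nothing is split into columns.
lines <- readLines(path)

# Put the lines in a column named `text`, with a line number alongside.
novel_df <- tibble(line = seq_along(lines), text = lines)

# One row per word; unnest_tokens() lowercases and strips punctuation
# by default.
tidy_novel <- novel_df %>%
  unnest_tokens(word, text)
```

Reading line by line may also avoid the parsing problems listed in the
str() output, where some rows were read as 2 columns instead of 1, though
that is a guess about their cause.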
