
Update: this is fixed now, thanks to the suggestions in the comments.

I'm following the tutorial at https://www.tidytextmining.com/ngrams.html

What I want to do is create a bigram network graph from the review text in a CSV file.

Here is the link to the dataset: https://app.box.com/s/y6nsmji4ir7nbggmbncnhf21xla96nml

Here is the code:

library(dplyr)
library(tidyr)
library(tidytext)
library(ggplot2)
library(igraph)
library(ggraph)
library(stringr)

kjv <- read.csv(file.choose())

count_bigrams <- function(dataset) {
  dataset %>%
    unnest_tokens(bigram, commentText, token = "ngrams", n = 2) %>%
    separate(bigram, c("word1", "word2"), sep = " ") %>%
    filter(!word1 %in% stop_words$word,
           !word2 %in% stop_words$word) %>%
    count(word1, word2, sort = TRUE)
}

visualize_bigrams <- function(bigrams) {
  set.seed(2016)
  a <- grid::arrow(type = "closed", length = unit(.15, "inches"))

  bigrams %>%
    graph_from_data_frame() %>%
    ggraph(layout = "fr") +
    geom_edge_link(aes(edge_alpha = n), show.legend = FALSE, arrow = a) +
    geom_node_point(color = "lightblue", size = 5) +
    geom_node_text(aes(label = name), vjust = 1, hjust = 1) +
    theme_void()
}

kjv_bigrams <- kjv %>% 
  count_bigrams()

The call to count_bigrams() fails with the following error:

Error in summarise_impl(.data, dots) : Evaluation error: argument...should be a character vector (or an object coercible to).

This is what the dataset looks like:

[screenshot of the dataset]

Thanks for reading!

user709413
  • Your `count_bigrams` function is expecting a dataframe as input but you are only passing it a single column. – Andrew Gustar Mar 27 '18 at 10:12
  • Thanks. I updated the code, now getting another error. – user709413 Mar 27 '18 at 10:14
  • I guess the headings or structure of `kjv` do not match what your function is expecting. It is hard to say more without seeing an example of your data. Try running the code in the function line by line to see where it is failing. – Andrew Gustar Mar 27 '18 at 10:21
  • Try using `read_csv` rather than `read.csv` to read in your data - the problem might just be that it is converting the text to factors. – Andrew Gustar Mar 27 '18 at 10:24
  • Tried that as well. Same issue. – user709413 Mar 27 '18 at 10:27
  • Some of your lines have commas in the text. Perhaps try `read_tsv()` (you will need the `readr` package loaded for these). – Andrew Gustar Mar 27 '18 at 10:30
  • Found it - the second line of your function is using the wrong column name - you need `unnest_tokens(bigram, commentText, token = "ngrams", n = 2)` (this was after using `read_csv` to read the data) – Andrew Gustar Mar 27 '18 at 10:37
  • How silly of me! Thanks -- it is solved now. – user709413 Mar 27 '18 at 10:42
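
For later readers, the two suggestions in the comments (make sure the text column is character rather than factor, and pass the real column name to `unnest_tokens`) can be sketched roughly as below. This is only a minimal sketch under stated assumptions: the `reviews` data frame and its two sentences are a hypothetical stand-in for the CSV, and `stringsAsFactors = TRUE` is set on purpose to mimic what `read.csv` did by default in R versions before 4.0.

```r
library(dplyr)
library(tidyr)
library(tidytext)

# Hypothetical stand-in for the CSV file; only the commentText column
# name is taken from the question. stringsAsFactors = TRUE mimics the
# read.csv() default in R < 4.0, which turns text into factors.
reviews <- data.frame(
  commentText = c("great battery life", "battery life is great"),
  stringsAsFactors = TRUE
)

# Convert the factor column to character before tokenizing.
# Reading with readr::read_csv() (or read.csv(..., stringsAsFactors = FALSE))
# would avoid the factor in the first place.
reviews$commentText <- as.character(reviews$commentText)

# Same pipeline as count_bigrams() in the question, using the
# column name that actually exists in the data.
bigram_counts <- reviews %>%
  unnest_tokens(bigram, commentText, token = "ngrams", n = 2) %>%
  separate(bigram, c("word1", "word2"), sep = " ") %>%
  filter(!word1 %in% stop_words$word,
         !word2 %in% stop_words$word) %>%
  count(word1, word2, sort = TRUE)

print(bigram_counts)
```

Running `str(kjv)` or `names(kjv)` on the real data right after reading it is a quick way to check both the column names and whether the text came in as character or factor.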
