I tried to use tidytext to do sentiment analysis:

library(tidytext)
get_sentiments("nrc")

but it gives me an error:

Error: '~/Library/Caches/textdata/nrc/NRC-Emotion-Lexicon/NRC-Emotion-Lexicon-v0.92/NRC-Emotion-Lexicon-Wordlevel-v0.92.txt' does not exist.

I then tried installing the following packages from GitHub:

library(remotes)
install_github("EmilHvitfeldt/textdata")
install_github("juliasilge/tidytext")

but I still get the same error. Can anyone help me with this? Thank you!

Kexin Ni
  • It sounds like something went wrong with downloading and then unpacking the NRC sentiment lexicon. You could try going to that path and deleting the `NRC-Emotion-Lexicon` directory, and then attempting to use the lexicon again to re-prompt the download afresh. – Julia Silge Nov 16 '21 at 05:36
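
The suggestion in the comment above could be sketched in R roughly as follows. This is a minimal sketch, not an official textdata or tidytext procedure: `reset_lexicon_cache` is a hypothetical helper name, and the path is assumed from the error message in the question (a macOS cache location). Whether deleting the folder is enough to trigger a fresh download may depend on the installed textdata version.

```r
# Remove a partially unpacked lexicon folder so the next call can prompt the download again.
# `reset_lexicon_cache` is a hypothetical helper, not part of textdata or tidytext.
reset_lexicon_cache <- function(cache_dir) {
  cache_dir <- path.expand(cache_dir)  # resolve "~" explicitly
  if (!dir.exists(cache_dir)) {
    return(FALSE)  # nothing to delete
  }
  unlink(cache_dir, recursive = TRUE)
  !dir.exists(cache_dir)  # TRUE if the folder is gone
}

# Path assumed from the error message in the question
reset_lexicon_cache("~/Library/Caches/textdata/nrc/NRC-Emotion-Lexicon")

# Then, in an interactive session, request the lexicon again:
# tidytext::get_sentiments("nrc")
```

The helper returns `FALSE` when the folder does not exist, so it is safe to call more than once.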

1 Answer

I had the same error because the file was downloaded to a different folder than the one the processing subfunction expects. Copying the file to the expected path solved it for me.

library(tidyverse)
library(tidytext)
library(textdata)
library(readr)
library(utils)

# reproduce the error
get_sentiments("nrc") # choose 1 at the prompt: this throws the error, but the data has still been downloaded
# where is the file, then?
textdata::lexicon_nrc(return_path = TRUE) # it's here
folder_path <- "~/Library/Caches/textdata/nrc"

# the problem is that the file is not at the path textdata expects, so create that folder and copy the file into it
dir.create(file.path(folder_path, "NRC-Emotion-Lexicon/NRC-Emotion-Lexicon-v0.92"), recursive = TRUE, showWarnings = FALSE)
file.copy(
  file.path(folder_path, "NRC-Emotion-Lexicon/NRC-Emotion-Lexicon-Wordlevel-v0.92.txt"),
  file.path(folder_path, "NRC-Emotion-Lexicon/NRC-Emotion-Lexicon-v0.92")
)

# now we have to process the nrc data using a slightly modified version of the subfunction detailed in the original function from the textdata-package: https://github.com/EmilHvitfeldt/textdata/blob/main/R/lexicon_nrc.R
name_path <- file.path(folder_path, "NRCWordEmotion.rds")
# slightly modified version:
process_nrc <- function(folder_path, name_path) {
  data <- read_tsv(file.path(
    folder_path,
    "NRC-Emotion-Lexicon/NRC-Emotion-Lexicon-v0.92/NRC-Emotion-Lexicon-Wordlevel-v0.92.txt"
  ),
  col_names = FALSE, col_types = cols(
    X1 = col_character(),
    X2 = col_character(),
    X3 = col_double()
  )
  )
  data <- data[data$X3 == 1, ]
  data <- tibble(
    word = data$X1,
    sentiment = data$X2
  )
  write_rds(data, name_path)
}

process_nrc(folder_path, name_path) # process

# check if you now have access to the lexicon
get_sentiments("nrc") 
# now you can load it with tidytext :)
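
To make the processing step easier to follow, here is a small base-R sketch of what `process_nrc()` does to the word-level file: read three tab-separated columns, keep the rows flagged with 1, and return word/sentiment pairs. The sample rows are made up for illustration and are not taken from the lexicon, and base R's `read.delim` stands in for `read_tsv` so the snippet runs without extra packages.

```r
# Toy rows in the three-column, tab-separated layout that process_nrc() reads:
# word, sentiment, 0/1 flag. These rows are invented for illustration only.
sample_txt <- "abandon\tfear\t1\nabandon\tjoy\t0\nsunny\tjoy\t1\nsunny\tanger\t0"

raw <- read.delim(
  text = sample_txt,
  header = FALSE,
  col.names = c("X1", "X2", "X3"),
  colClasses = c("character", "character", "numeric"),
  quote = ""
)

# Keep only the word/sentiment pairs flagged with 1, as process_nrc() does
kept <- raw[raw$X3 == 1, ]
lexicon <- data.frame(
  word = kept$X1,
  sentiment = kept$X2,
  stringsAsFactors = FALSE
)
print(lexicon)
```

Rows with a flag of 0 are dropped, so each remaining row is one word paired with one sentiment it is associated with.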
Alix