Questions tagged [tidytext]

The tidytext package provides tools for text mining using tidy data principles in R.

The R tidytext package, developed by Julia Silge and David Robinson, provides functions and supporting data sets to allow conversion of text to and from tidy formats, and to switch seamlessly between tidy tools and existing text mining packages. When text is in a tidy data structure, tools from the R tidyverse ecosystem like dplyr can be used for effective data handling and analysis.

Other resources

Text Mining with R: A Tidy Approach

Related tags

R's tm, quanteda, dplyr, tidyr, and broom packages

294 questions

votes

1 answer

Loading Loughran finance sentiment into Tidytext

I'm using the sentiment tools in Tidytext for the first time, and would like to use the Loughran dictionary. After several attempts, the closest I get is this error: get_sentiments("loughran") Error in get_sentiments("loughran") : could not…

tidytext

asked Apr 07 '17 at 16:13

LCC

votes

1 answer

R code suddenly stopped working in tidy text

I am trying to do word analysis on some data in R. I imported one column of data containing text responses from a survey into R using read.csv. I named one of the columns "text". This code was working fine a few days ago and now it suddenly is…

r csv text-mining tidytext

asked Apr 07 '17 at 02:32

Laura Albrecht

votes

0 answers

R's tm_map is creating non-existing words

I'm using the tm package to find associations between words in a text. This is what I did (I'm also using the tidytext package): book <- Corpus(VectorSource(c(part1,part2,part3,part4,part5))); book <- tm_map(book, content_transformer(tolower)); book <-…

r tm tidytext

asked Dec 03 '16 at 16:14

pachadotdev (reputation 3,345; badges: 6 gold, 33 silver, 60 bronze)

-1

votes

1 answer

Extract proper nouns from text in R?

Is there any better way of extracting proper nouns (e.g. "London", "John Smith", "Gulf of Carpentaria") from free text? That is, a function like proper_nouns <- function(text_input) { # ... } such that it would extract a list of proper nouns from…

r nlp tidytext

asked Apr 25 '21 at 04:54

stevec (reputation 41,291; badges: 27 gold, 223 silver, 311 bronze)

-1

votes

1 answer

Warning Message: pairwise_count Function

I'm attempting to follow this tutorial on using the pairwise_count function in the widyr package. In particular, consider this line of code, where data is a tibble which includes the columns "word" and "section": data %>% pairwise_count(word,…

r tidyverse tidytext

asked Sep 20 '20 at 09:31

dext

-1

votes

1 answer

How can I convert a character object (parsed web page) to a tidy object in R?

Using the code library(htm2txt); url <- 'https://en.wikipedia.org/wiki/Alan_Turing'; clear.text <- gettxt(url), I'm getting: clear.text [1] "Alan Turing\n\nFrom Wikipedia, the free encyclopedia\n\nJump to navigation\tJump to search\n\n\"Turing\" redirects…

r character tidyr tidytext

asked Nov 29 '18 at 14:50

kwadratens

-2

votes

1 answer

R - delete length-one strings and stopwords (using tidytext) in character

If I have a df:

  Class  sentence
1 Yes    there is p beaker on the table
2 Yes    they t the frown
3 Yes    so Z it was asleep

How do I remove the length-one strings within the "sentence" column to remove things like "t", "p" and "Z", and then do a…

r extract gsub tidytext

asked Jul 17 '21 at 19:39

aurelius_37809

-2

votes

1 answer

Why are these stop words not being removed from my data?

Tokenization of the data: tidy_text <- data %>% unnest_tokens(word, q_content). Removal of stop words: data("stop_words"); stop_words; tidy_text <- tidy_text %>% anti_join(stop_words, by = "word"); tidy_text %>% count(word, sort = TRUE). Output…

r text tidyverse stop-words tidytext

asked Apr 15 '21 at 23:26

Scot Garrison

-2

votes

1 answer

R Loop over IDs

I'd like to run pairwise_count in a loop, and my input looks like the table in the image. Each ID stands for a text, and the rows contain the sentences of that text. My attempt at a for loop doesn't work. Does anyone have an idea how that loop could…

r rapidminer tidytext

asked Mar 03 '18 at 18:35

Tobias Nehrig
