Questions tagged [tidytext]

The tidytext package provides tools for text mining using tidy data principles in R.

The R tidytext package, developed by Julia Silge and David Robinson, provides functions and supporting data sets to allow conversion of text to and from tidy formats, and to switch seamlessly between tidy tools and existing text mining packages. When text is in a tidy data structure, tools from the R tidyverse ecosystem like dplyr can be used for effective data handling and analysis.

Other resources

Text Mining with R: A Tidy Approach

Related tags

R's tm, quanteda, dplyr, tidyr, and broom packages

294 questions

1 answer

Delete rows with blank values after performing unnest_tokens and remove stopwords?

Here is my df: df <- structure(list(id = 1:50, strain_id = c(6L, 6L, 7L, 12L, 19L, 35L, 81L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 202L, 202L, 202L, 202L,…

r text nlp tidytext

asked Aug 18 '19 at 09:09

SteveS

1 answer

Remove rows with character(0) from a data.frame before proceeding to dtm

I'm analyzing a data frame of product reviews that contains some empty entries or text written in a foreign language. The data also contain some customer attributes which can be used as "features" in later analysis. To begin with, I will first convert…

text-mining lda topic-modeling tidytext

asked Jul 30 '19 at 19:30

Chris T.

4 answers

sentiments dataset in R throwing error with AFINN lexicon

I'm trying to access the sentiments data set for the "AFINN" lexicon using the function get_sentiments("afinn"). R code: library(textdata) get_sentiments("afinn") This throws the error message below: Do you want to download: Name: AFINN-111 Error in…

r tidytext

asked Jul 26 '19 at 18:15

sam

1 answer

How to fix "no package called textdata" error?

I am trying to run sentiment analysis in R. I have installed tidytext and it is in the correct library with all other packages. However, when I run get_sentiments("afinn") I get the following error: Error in loadNamespace(name) : there is no…

r sentiment-analysis tidytext

asked Jul 18 '19 at 12:58

user10643490

1 answer

Manually inserting topic-specific stopwords

I'm using tidytext's built-in anti_join(get_stopwords()) command to clean documents from a data set of customer reviews of tech products, but I found that the output corpus consists primarily of tech specifications (e.g., Windows 10, 720p Camera, 380.6 x…

dplyr text-mining stop-words tidytext

asked Jul 16 '19 at 00:40

Chris T.

1 answer

Error when importing csv data into R for text mining

I keep getting this error when trying to import a csv document into R and trying to develop a corpus for topic modeling. I have used this approach successfully on 4 other projects but cannot get past this error. My data source has a doc_id column…

r tm tidytext

asked May 20 '19 at 10:44

Matthew Lawrence

1 answer

Combine tidy text with synonyms to create dataframe

I have a sample data frame as below: quoteiD <- c("q1","q2","q3","q4", "q5") quote <- c("Unthinking respect for authority is the greatest enemy of truth.", "In the middle of difficulty lies opportunity.", "Intelligence is the ability to…

r tidytext qdap

asked Apr 03 '19 at 11:34

R noob

1 answer

Reading text files into tidytext and adding metadata

I have several thousand .txt files in a directory and would like to read them all into tidytext where I would then add columns of metadata. The filenames themselves contain all of the metadata and I have been successful in using substr to parse the…

tidytext

asked Mar 01 '19 at 21:28

AlanS

0 answers

How to encode text correctly when importing Word documents into R?

I am trying to import the content of multiple Word documents into the same object in R. I am following Julia Silge and David Robinson's guide (see here: https://www.tidytextmining.com/usenet.html). I am unable to figure out how to encode "text" column…

r tidyr tidytext

asked Feb 22 '19 at 23:25

Anavir

1 answer

Determine the temporality of a sentence with POS tagging

I want to find out, from a series of sentences, whether an action has been carried out or will be carried out. For example: "I will prescribe this medication" versus "I prescribed this medication" or "He had already taken the stuff" versus "he may…

r text-mining tidytext

asked Feb 18 '19 at 13:22

Sebastian Zeki

0 answers

R Widyr Package (Correlation values NaN)

I am working on analyzing the pairwise correlation of words appearing in user reviews and plotting them as a correlation network graph. My sample data is as follows: review_corwords Label Rating word 1 …

r tidyr tidytext

asked Feb 03 '19 at 01:06

IronMaiden

1 answer

Extract text based on character position returned from gregexpr

I'm working in R, trying to prepare text documents for analysis. Each document is stored in a column (aptly named "document") of a data frame called "metaDataFrame." The documents are strings containing articles and their BibTeX citation info. Data…

r regex nlp tidytext

asked Feb 01 '19 at 18:37

JessJones

3 answers

Apply Math calculation to all rows of DF by Column Values

I want to apply the calculation (Occ_1+1)/(Totl_1+Unique_words), (Occ_2+1)/(Totl_2+Unique_words), and (Occ_3+1)/(Totl_3+Unique_words) to every row and store the results in new columns Probability_1, Probability_2, and Probability_3. Right now I am doing every…

r dplyr tidyverse tidyr tidytext

asked Jan 14 '19 at 11:06

james joyce

2 answers

Count the occurrence of a word, total words, and total unique words in R

I have a huge df with a doc_id and a word column. Every word can belong to multiple classes (Class_1, Class_2, Class_3): if a word is in a class I put 1 there, otherwise 0. Sample df: doc_id word Class_1 Class_2 Class_3 104 saturn…

r dplyr data.table tidyverse tidytext

asked Jan 10 '19 at 12:49

james joyce

2 answers

Error: No tidy method for objects of class LDA_VEM

I am literally following the steps as presented in chapter 6 of the "Text Mining with R: A Tidy Approach" book. See: https://www.tidytextmining.com/topicmodeling.html #import libraries library(topicmodels) library(tidytext) #access…

r tidytext topicmodels

asked Dec 11 '18 at 16:35

Vasino
