I am trying to load many email files so that R can learn to classify them as spam or ham. I first create a corpus and then try to build a document-term matrix from it, but I get an error. How can I fix it?
setwd("C:/ham_spam/")
library(tm)
library(stringr)
email_corpus <- Corpus(VectorSource(NA))
folders <- c("easy_ham/", "spam_2/")
for(n in 1:2){
  folder <- folders[n]
  for(i in 1:length(list.files(folder))){
    email <- list.files(folder)[i]
    tmp <- readLines(str_c(folder, email))
    tmp <- str_c(tmp, collapse = "")
    tmp_corpus <- Corpus(VectorSource(tmp))
    email_corpus <- c(email_corpus, tmp_corpus)
  }
}
dtm_email <- DocumentTermMatrix(email_corpus)
Here is the error I received:
Error in UseMethod("TermDocumentMatrix", x) : no applicable method for 'TermDocumentMatrix' applied to an object of class "list"
Below is an excerpt of email_corpus, which is a list of data frames:
$meta
$language
[1] "en"
attr(,"class")
[1] "CorpusMeta"
$dmeta
data frame with 0 columns and 1 row
$content
[1] "From Steve_Burt@cursor-system.com Thu Aug 22 12:46:39 2002Return-Path: <Steve_Burt@cursor-system.com>Delivered-To: zzzz@localhost.netnoteinc.comReceived: from localhost (localhost [127.0.0.1])\tby phobos.labs.netnoteinc.com (Postfix) with ESMTP id BE12E43C34\tfor... <truncated>
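
One possible way around this, offered as a sketch rather than a definitive fix: the error suggests that c() returned a plain list instead of a corpus, so DocumentTermMatrix has no method for it. Reading every file into a character vector first and building the corpus once avoids combining corpora altogether, and it also avoids the placeholder NA document. The temporary folders and the two tiny messages below are hypothetical stand-ins for C:/ham_spam/, and the names paths, texts and labels are made up for illustration.

```r
library(tm)

# Hypothetical stand-in for the real data: two temp folders with one tiny file each.
base_dir <- tempfile("ham_spam_")
folders <- file.path(base_dir, c("easy_ham", "spam_2"))
for (f in folders) dir.create(f, recursive = TRUE)
writeLines(c("Hello team,", "meeting at noon"), file.path(folders[1], "msg1"))
writeLines(c("Win cash now", "click here"), file.path(folders[2], "msg2"))

# List all files in both folders, keeping the folder as part of each path.
paths <- list.files(folders, full.names = TRUE)

# Read each file into a single string; one vector element per email.
texts <- vapply(
  paths,
  function(p) paste(readLines(p, warn = FALSE), collapse = " "),
  character(1),
  USE.NAMES = FALSE
)

# Build the corpus once from the whole vector, so no c() on corpora is needed.
email_corpus <- VCorpus(VectorSource(texts))
dtm_email <- DocumentTermMatrix(email_corpus)

# The parent folder name can serve as the class label for each document.
labels <- basename(dirname(paths))
```

Note that the lines are joined with a space here instead of an empty string. With collapse = "" the last word of one line fuses with the first word of the next, which is visible in the output above ("2002Return-Path:"), and that would distort the terms in the matrix.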