I apologize if the formatting of this question isn’t correct; this is my first time posting in the community, and I will do my best. I have been working on this problem for a while but have struggled to resolve it. I am following the book “Text Mining with R: A Tidy Data Approach” and am on the part that uses the ‘tm.plugin.webmining’ package to run a sentiment analysis on financial articles. The initial problem was that when I attempted to load the package with library(), it reported the following error.

Error: package or namespace load failed for ‘tm.plugin.webmining’: .onLoad failed in loadNamespace() for ‘rJava’, details: call: dyn.load(file, DLLpath = DLLpath, …) error: unable to load shared object ‘/Library/Frameworks/R.framework/Versions/3.4/Resources/library/rJava/libs/rJava.so’: dlopen(/Library/Frameworks/R.framework/Versions/3.4/Resources/library/rJava/libs/rJava.so, 6): Library not loaded: @rpath/libjvm.dylib Referenced from: /Library/Frameworks/R.framework/Versions/3.4/Resources/library/rJava/libs/rJava.so Reason: image not found

After doing some research, I found out that this had to do with the way R and Java communicate on macOS High Sierra. To fix it, I followed this article, and it appeared to work. Once I fixed the issue with Java and R, I was finally able to load the ‘tm.plugin.webmining’ package. But when I tried to run the example from the book to load the corpus, I got the following error.

StartTag: invalid element name Extra content at the end of the document Error in mutate_impl(.data, dots) : Evaluation error: 1: StartTag: invalid element name 2: Extra content at the end of the document

I cannot find information on this anywhere and do not have enough experience to fix the issue myself, so any insight or ideas I could try would be greatly appreciated. Below is the code I ran that produced the error. Thank you in advance.

library(tm.plugin.webmining)
library(purrr)
library(dplyr)

company <- c("Microsoft", "Apple", "Google", "Amazon",
             "Facebook", "IBM", "Yahoo", "Netflix")
symbol <- c("MSFT", "AAPL", "GOOG", "AMZN", "FB", "IBM", "YHOO",
            "NFLX")

download_articles <- function(symbol) {
  WebCorpus(GoogleFinanceSource(paste0("NASDAQ:", symbol)))
}

stock_articles <- data_frame(company = company, symbol = symbol) %>%
  mutate(corpus = map(symbol, download_articles))

  • Why is this tagged with [tag:java]? – Joe C Dec 26 '17 at 20:30
  • Dylan, any problems you have already resolved (like the R-Java interaction) should be removed from the question, as they are irrelevant. – Claus Wilke Dec 26 '17 at 21:44
  • I posted the same question 10 days ago with no answer... https://stackoverflow.com/questions/47790148/text-mining-with-tm-plugin-webmining-package-using-googlefinancesource-function – Scipione Sarlo Dec 26 '17 at 22:25
  • @joeC I tagged it as java because the original problem had to do with the Mac and Java interaction, and I assumed that the continued problem had to do with the way rJava interacts with the 'tm.plugin.webmining' package – Dylan Edmonds Dec 27 '17 at 16:01
  • @DylanEdmonds that's ok, the tag can be removed. Java is not the best tag since it's reserved for issues directly associated with the language, which is not the case here. – Necronet Feb 06 '20 at 21:39

1 Answer

I had the same problem when running the code and found a workaround, shown below:

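library(tm.plugin.webmining)
library(purrr)
library(dplyr)  # needed for data_frame(), mutate() and %>%

# Company names and their matching ticker symbols (same order in both vectors)
company <- c("Microsoft", "Apple", "Google",
             "Amazon", "Facebook", "Twitter",
             "IBM", "Yahoo", "Netflix")

symbol <- c("MSFT", "AAPL", "GOOG", "AMZN", "FB",
            "TWTR", "IBM", "YHOO", "NFLX")

# Download recent articles for one symbol, using Yahoo Finance as the source
download_articles <- function(symbol) {
  WebCorpus(YahooFinanceSource(paste0("NASDAQ:", symbol)))
}

# One row per company, with the downloaded corpus stored in a list column
stock_articles <- data_frame(company = company,
                             symbol = symbol) %>%
  mutate(corpus = map(symbol, download_articles))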

The only change from the book's code is inside the WebCorpus() call: use YahooFinanceSource() instead of GoogleFinanceSource().
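
If some symbols still fail after switching sources, one option is to keep a single bad feed from aborting the whole mutate() call. The following is only a minimal sketch, not part of the original answer: it wraps the downloader in purrr's possibly(), and it uses a hypothetical stand-in for download_articles() (failing on "YHOO" purely for illustration) so that it runs without a network connection or the tm.plugin.webmining package.

```r
library(purrr)
library(dplyr)

# Hypothetical stand-in for the real downloader so this sketch runs offline.
# In practice this would be the WebCorpus(...) call shown in the answer.
download_articles <- function(symbol) {
  if (symbol == "YHOO") stop("StartTag: invalid element name")
  paste("corpus for", symbol)
}

# possibly() returns the `otherwise` value (here NULL) instead of raising an error
safe_download <- possibly(download_articles, otherwise = NULL)

stock_articles <- tibble(
  company = c("Microsoft", "Yahoo"),
  symbol  = c("MSFT", "YHOO")
) %>%
  mutate(corpus = map(symbol, safe_download),
         ok     = !map_lgl(corpus, is.null))  # flag rows whose download worked
```

Filtering on the ok column afterwards shows which symbols need a closer look, while the successfully downloaded corpora remain usable.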

Anamitra Musib