I have a script that scrapes tweets using the rtweet package in R. I am using the following code.

rt <- search_tweets(
  q = "اجرک",
  n = 5000, 
  include_rts = FALSE,
  geocode = lookup_coords(),
  parse = TRUE,
  lang = 'ur',
  retryonratelimit = TRUE, 
  token = create_token()
)

The code works fine in RStudio (create_token and lookup_coords take their respective inputs, which are removed here). I get a few hundred tweets containing the search query. The aim is to run this script with Windows Task Scheduler. However, when the same script is run from the command line, e.g.

Rscript -e "source('path\\to\\script.R')"

the script runs, but the resulting data frame has zero rows. With my limited debugging experience, I traced the problem to the query passed to the function above. If I use Latin characters, for example 'ajrak', the command-line run does return a data frame with tweets. In short, the script behaves differently in RStudio than on the Windows command line, and the cause appears to be the non-ASCII (UTF-8) query. After a lot of searching I could not find a solution. Is there any way to fix this?

Shakir
  • Does `Sys.getlocale("LC_CTYPE")` return the same locale in RStudio and in R run from CMD? – Mako212 Feb 13 '19 at 18:14
  • You could try setting `Sys.setlocale("LC_CTYPE", "en_US.UTF-8")` in the beginning of your script. – Mako212 Feb 13 '19 at 18:15
  • Both RStudio and the command line (type `R` in the Windows console, then at the R prompt type `Sys.getlocale("LC_CTYPE")`) return `"English_United States.1252"`. I added `Sys.setlocale("LC_CTYPE", "en_US.UTF-8")` at the start of the script, with no effect. – Shakir Feb 13 '19 at 19:25

1 Answer

  1. Use Linux or macOS.
  2. Use escaped Unicode characters instead of literal UTF-8 text.
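
A minimal sketch of option 2, assuming the query consists of the code points U+0627, U+062C, U+0631 and U+06A9 (check this against your own string). Writing the query with `\u` escapes keeps the script file pure ASCII, so it no longer depends on how Rscript decodes the file under the 1252 locale. The helper name `to_unicode_escapes` is hypothetical and only there to generate the escaped form; it handles code points up to U+FFFF.

```r
# Query written with \u escapes; the code points below are an assumption,
# verify them with utf8ToInt() on your own string in a session that reads it correctly.
query <- "\u0627\u062C\u0631\u06A9"

# Hypothetical helper: turn any string into its \uXXXX escaped form.
# Run it once in RStudio and paste the result into the script.
to_unicode_escapes <- function(x) {
  code_points <- utf8ToInt(enc2utf8(x))
  paste0(sprintf("\\u%04X", code_points), collapse = "")
}

# Prints the escaped form of the query
cat(to_unicode_escapes(query), "\n")

# The escaped query then replaces the literal in the original call, e.g.
# rt <- search_tweets(q = query, n = 5000, include_rts = FALSE, lang = "ur", ...)
```

The rest of the `search_tweets` call stays as in the question; only the `q` argument changes.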
Shakir