
I have fetched Facebook posts in R using the Facebook functions, and I have the output. I now want to save this result directly into Hive tables. Is there any way to connect R's output to Hive? FYI, I don't want to write my R result to a CSV file and then load that CSV into Hive tables.

1 Answer


Update: The approach below works for small amounts of data (see also this answer), but for a scalable solution see the response of @Samson.

It is very simple to load data from R into Hive.

The typical way one would do this is via a JDBC connection.

I recall doing this once; I believe it was with the RJDBC package.
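
As a rough sketch of what the JDBC route might look like with RJDBC: this assumes the Hive JDBC driver jar is available locally, HiveServer2 is reachable, the target table already exists, and the Hive version accepts `INSERT ... VALUES` (0.14 or later, as far as I know). The helper names `build_insert_sql` and `write_df_to_hive`, the jar path, the host and the table name are all made up for illustration. It is only sensible for small data frames.

```r
# Hypothetical helper: turn a data frame into one multi-row INSERT statement.
# Quoting is deliberately simple (numbers as-is, NA as NULL, strings quoted).
build_insert_sql <- function(df, table) {
  quote_value <- function(x) {
    if (is.na(x)) return("NULL")
    if (is.numeric(x)) return(as.character(x))
    # Escape single quotes with a backslash inside the string literal
    paste0("'", gsub("'", "\\'", as.character(x), fixed = TRUE), "'")
  }
  rows <- vapply(seq_len(nrow(df)), function(i) {
    vals <- vapply(df[i, , drop = FALSE], quote_value, character(1))
    paste0("(", paste(vals, collapse = ", "), ")")
  }, character(1))
  paste0("INSERT INTO TABLE ", table, " VALUES ", paste(rows, collapse = ", "))
}

# Hypothetical wrapper: open a JDBC connection, run the INSERT, then disconnect.
write_df_to_hive <- function(df, table, jar, url, user = "", password = "") {
  drv <- RJDBC::JDBC("org.apache.hive.jdbc.HiveDriver", classPath = jar)
  conn <- DBI::dbConnect(drv, url, user, password)
  on.exit(DBI::dbDisconnect(conn))
  RJDBC::dbSendUpdate(conn, build_insert_sql(df, table))
}

# Example call (placeholder jar path, host and table):
# write_df_to_hive(posts, "fb_posts",
#                  jar = "/path/to/hive-jdbc-standalone.jar",
#                  url = "jdbc:hive2://hive-host:10000/default")
```

Every such statement is executed as a separate Hive job, so this gets slow quickly as the number of rows grows.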

Dennis Jaheruddin
  • I am not sure if this should be an answer if it is just a suggestion with no exact steps of how it is to be done. – Ronak Shah Feb 07 '17 at 12:24
  • @RonakShah After your comment I have added the relevant package, but for this question it is hard to be more specific, as 'it depends'. Hopefully mentioning JDBC and the relevant package is enough to guide someone away from CSV dump-upload methods. – Dennis Jaheruddin Feb 07 '17 at 12:29
  • I strongly doubt that it is "very simple" to load data to Hive with JDBC, because **Hive does not support transactions nor INSERT VALUES by default**. The setup for transactions in Hive is a complicated mess, it has to be done per table, and the result is neither elegant nor fast. Better create an EXTERNAL TABLE and push CSV files directly to the underlying HDFS directory, with `rhdfs` (which is also a complicated mess to set up) or via the WebHDFS REST API (i.e. plain HTTP sessions). – Samson Scharfrichter Mar 03 '17 at 09:52
  • @SamsonScharfrichter I updated my answer, as it seems that I indeed oversimplified the situation. I admit that the bulk insert solutions are not too elegant, but as that's what we have, I would like to invite you to upgrade your comment to an answer. – Dennis Jaheruddin Mar 07 '17 at 09:07
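
For reference, the external-table route suggested in the comments might be sketched in R roughly as follows, using the two-step WebHDFS `CREATE` call over plain HTTP via `httr`. This is only a sketch under assumptions: the helper names `webhdfs_create_url` and `push_df_to_hdfs`, the host, the HDFS path and the user are placeholders, the port default of 50070 is the usual NameNode HTTP port on Hadoop 2 (it differs on other versions), the cluster is assumed to use simple (non-Kerberos) authentication, and the EXTERNAL TABLE is assumed to be declared with comma-separated fields on that directory.

```r
# Hypothetical helper: build the WebHDFS CREATE URL for a target file.
webhdfs_create_url <- function(namenode, hdfs_path, user, port = 50070) {
  paste0("http://", namenode, ":", port, "/webhdfs/v1", hdfs_path,
         "?op=CREATE&overwrite=true&user.name=", user)
}

# Hypothetical helper: send a data frame as delimited text into the
# directory backing an EXTERNAL TABLE, without writing a local file.
push_df_to_hdfs <- function(df, namenode, hdfs_path, user, port = 50070) {
  # Serialise in memory; free-text columns containing commas would need
  # a different delimiter or proper quoting on the Hive side.
  csv_text <- paste(
    utils::capture.output(
      utils::write.table(df, sep = ",", quote = FALSE,
                         row.names = FALSE, col.names = FALSE)
    ),
    collapse = "\n"
  )
  # Step 1: ask the NameNode where to write; it answers with a redirect
  # whose Location header points at a DataNode.
  step1 <- httr::PUT(webhdfs_create_url(namenode, hdfs_path, user, port),
                     httr::config(followlocation = 0L))
  datanode_url <- httr::headers(step1)$location
  # Step 2: send the actual file content to that DataNode URL.
  step2 <- httr::PUT(datanode_url, body = csv_text)
  httr::stop_for_status(step2)
  invisible(step2)
}

# Example call (placeholder host, path and user):
# push_df_to_hdfs(posts, "namenode-host", "/data/fb_posts/part-0001.csv", "hive")
```

Once the file is in the table's directory, Hive sees the new rows on the next query, with no INSERT statements involved.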