
I am using GCP BigQuery to store streaming news data, which is ingested by a Google Cloud Function and saved into BigQuery.

How can I run a Python script that uses the data from BigQuery and finally writes the results for score and magnitude back to the related dataset?

I could not find anything in the Google documentation about this, only how to run the sentiment analysis, but not how to get data from BigQuery in and the results back out to BigQuery.

Thanks a lot for your support.

Move_On
  • https://cloud.google.com/bigquery/docs/reference/libraries#client-libraries-install-python should help you get you started – rtenha Nov 25 '19 at 21:55

1 Answer


You didn't give us enough specifics for a specific answer, so let me show you my general way of approaching this:

First, let's get the sentiment analysis of one arbitrary sentence with the gcloud CLI:

gcloud --format json ml language analyze-entity-sentiment --content "It's time we just let this thing go - it was a pretty good bad idea, wasn't it though? -- Bad Idea, Sara Bareilles" | jq -c . > sentiments.json

Please notice that I compacted the output JSON with jq (the -c flag) and stored the result in a file.

To load this file, potentially with multiple JSON lines (one per sentence), into BigQuery:

bq load --autodetect --source_format=NEWLINE_DELIMITED_JSON temp.sentiments sentiments.json 

The question asks how to "stream into BigQuery", but it might make more sense to batch the load as shown here.

Now we have a table with the results in BigQuery:

SELECT * FROM `fh-bigquery.temp.sentiments` LIMIT 1000

[screenshot: query results showing the loaded sentiment rows in the BigQuery UI]

Btw, I added "Sara Bareilles" to the sentence to make sure that BigQuery got a full schema for auto-detection when creating the table for the first time.

If you want to stream data into BigQuery, then look at the streaming-inserts documentation. In this answer I wanted to isolate the basics of getting Cloud NLP data into BigQuery and looking at it; the rest is just the basics of working with it.

Felipe Hoffa
  • Yes, sorry, I'd like to process the data per batch. The data comes in via a Google Cloud Function using a news API and is stored in BigQuery in a table called "news"; the field I'd like to run the sentiment analysis on is called "content". How can this be integrated into the script so that the field "content" is processed initially and for every new record in the table? gcloud --format json ml language analyze-entity-sentiment --?bigquery table.field? Thanks so much – Move_On Nov 26 '19 at 22:58