
I am trying to find a way to automatically update a BigQuery table using this link: https://www6.sos.state.oh.us/ords/f?p=VOTERFTP:DOWNLOAD::FILE:NO:2:P2_PRODUCT_NUMBER:1

This link is updated with new data every week, and I want to replace the BigQuery table with the new data each time. I have researched that you can export spreadsheets to BigQuery, but that is not a streamlined approach.

How would I go about writing a script that imports the data and feeds it to BigQuery?

  • I can access the link... what does this link contain? – GiovaniSalazar Dec 24 '19 at 16:26
  • Sorry, I meant I can't access it... is it JSON data? – GiovaniSalazar Dec 24 '19 at 16:49
  • @GiovaniSalazar It contains a txt file. I just want the script to pull data from that link every week and upload it to Google Cloud automatically. – Data Science Acolyte Dec 24 '19 at 19:46
  • Can you add your data to your question? – GiovaniSalazar Dec 24 '19 at 19:51
  • @GiovaniSalazar the data is the link – Data Science Acolyte Dec 24 '19 at 23:21
  • It seems that you want to create a scheduled pipeline. There are many ways you can orchestrate this depending on the technologies you want to use. Cloud Composer, for example, might be quick and flexible. For something more robust you could simply use a Cloud Function triggered by Cloud Scheduler. In any case, you need a process that fetches data from an HTTP stream :) – Pievis Dec 29 '19 at 11:41
  • @Pievis thank you. Would you be willing to post an answer that lays out steps for this? – Data Science Acolyte Dec 29 '19 at 15:45
  • @Pievis just to be clear though, how would the function be designed? Could it be designed in any way you want or is there a predetermined format? The docs are not very helpful in explaining this and there are no practical examples to go off of. – Data Science Acolyte Dec 29 '19 at 15:48
  • @DataScienceAcolyte There are some restrictions, but in general they are very powerful :) The only "format" you have to follow is that an event triggers a specific function. – Pievis Dec 29 '19 at 16:49

2 Answers


I assume you already have a working script that parses the content of the URL and loads it into BigQuery. Based on that, I would recommend the following workflow:

  • Upload the script as a Google Cloud Function. If your script isn't written in a compatible language (e.g. Python, Node.js, Go), you can use Google Cloud Run instead. Set the Cloud Function to be triggered by a Pub/Sub message. In this scenario, the content of your Pub/Sub message doesn't matter (see the sketch after this list).
  • Set up a Google Cloud Scheduler job to (a) run at 12am every Saturday (or whatever time you wish) and (b) send a dummy message to the Pub/Sub topic that your Cloud Function is subscribed to.
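
To make this concrete, here is a minimal sketch of what such a Pub/Sub-triggered function could look like in Python. It is only a sketch: it assumes the link serves comma-delimited text that pandas can parse, and the table ID, topic name, and entry-point name are placeholders.

```python
# main.py -- deployed as a Pub/Sub-triggered Cloud Function, e.g.:
#   gcloud functions deploy refresh_table --runtime python39 --trigger-topic weekly-refresh
# requirements.txt: requests, pandas, pyarrow, google-cloud-bigquery
import io

import pandas as pd
import requests
from google.cloud import bigquery

SOURCE_URL = (
    "https://www6.sos.state.oh.us/ords/"
    "f?p=VOTERFTP:DOWNLOAD::FILE:NO:2:P2_PRODUCT_NUMBER:1"
)
TABLE_ID = "your-project.your_dataset.voter_file"  # placeholder

def refresh_table(event, context):
    """Entry point; the content of the Pub/Sub message is ignored."""
    # Download the weekly file (assumed to be comma-delimited text).
    response = requests.get(SOURCE_URL)
    response.raise_for_status()
    df = pd.read_csv(io.StringIO(response.text))

    # Replace the existing table contents with the fresh data.
    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE
    )
    client.load_table_from_dataframe(df, TABLE_ID, job_config=job_config).result()
```

The Cloud Scheduler job then publishes any message to the `weekly-refresh` topic on a cron schedule such as `0 0 * * 6` (midnight every Saturday).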
vinoaj
  • Please answer this question as I need further assistance. If you do so I will reward the bounty. https://stackoverflow.com/questions/59556262/creating-a-data-pipeline-to-bigquery-using-cloud-functions-and-cloud-scheduler – Data Science Acolyte Jan 01 '20 at 21:32

You can try making an HTTP request to the page using a programming language like Python with the Requests library, saving the data into a pandas DataFrame or a CSV file, and then using the BigQuery client libraries to push that data into a BigQuery table.
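
For instance, a rough sketch of that flow, assuming the link serves a comma-delimited text file and using the pandas-gbq integration (the project, dataset, and table names are placeholders):

```python
import io

import pandas as pd
import requests

SOURCE_URL = (
    "https://www6.sos.state.oh.us/ords/"
    "f?p=VOTERFTP:DOWNLOAD::FILE:NO:2:P2_PRODUCT_NUMBER:1"
)

# Fetch the weekly file over HTTP (assumes it parses as CSV).
response = requests.get(SOURCE_URL)
response.raise_for_status()
df = pd.read_csv(io.StringIO(response.text))

# Push the DataFrame to BigQuery, replacing any existing table.
# Requires the pandas-gbq package to be installed.
df.to_gbq("your_dataset.voter_file", project_id="your-project", if_exists="replace")
```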

Messier_31
  • Yes, I know that. I was asking about using the link as a way to automatically update the table every single week. Whether it's with Python or any other process, I do not care. I would just like to have a streamlined process, and I do not know how to create it. – Data Science Acolyte Dec 25 '19 at 03:21
  • Based on what you mentioned, there is no streamlined process. Another option is to schedule a cURL command in [Cloud Shell](https://cloud.google.com/shell/docs/) to save the data into a [Google Cloud Storage bucket](https://cloud.google.com/storage/docs/) and then run a [BigQuery load job](https://cloud.google.com/bigquery/docs/loading-data) to load that data into your table (see the sketch after these comments). – Messier_31 Dec 25 '19 at 18:08
  • Can you update your answer and show me what that would look like? – Data Science Acolyte Dec 25 '19 at 18:11
  • Please answer this question as I need further assistance. If you do so I will reward the bounty. https://stackoverflow.com/questions/59556262/creating-a-data-pipeline-to-bigquery-using-cloud-functions-and-cloud-scheduler – Data Science Acolyte Jan 01 '20 at 21:33
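
For reference, here is a rough Python equivalent of the cURL-to-bucket-to-load-job pipeline suggested in the comments above; the bucket, blob, and table names are placeholders, and the file is assumed to be CSV-parseable:

```python
import requests
from google.cloud import bigquery, storage

SOURCE_URL = (
    "https://www6.sos.state.oh.us/ords/"
    "f?p=VOTERFTP:DOWNLOAD::FILE:NO:2:P2_PRODUCT_NUMBER:1"
)
BUCKET = "your-staging-bucket"                      # placeholder
BLOB = "voter_file/latest.txt"                      # placeholder
TABLE_ID = "your-project.your_dataset.voter_file"   # placeholder

# Stage the downloaded file in a Cloud Storage bucket (the cURL step).
data = requests.get(SOURCE_URL).content
storage.Client().bucket(BUCKET).blob(BLOB).upload_from_string(data)

# Run a BigQuery load job from the staged file, replacing the table.
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    autodetect=True,
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
)
bigquery.Client().load_table_from_uri(
    f"gs://{BUCKET}/{BLOB}", TABLE_ID, job_config=job_config
).result()  # wait for the load job to finish
```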