-1

I have a Python script (written in a Jupyter notebook) that I would like to run in Azure. The script pulls data from an API source (which is updated every 24 hours) and writes it to a SQL database hosted in Azure. I want the script to run automatically so that the database table is updated on every run.

Can someone please help me with this?

Below is the Python code I have written:

import pyodbc
import requests
import pandas as pd

# Fetch the latest records from the API
responses = requests.get("https://data.buffalony.gov/resource/d6g9-xbgu.json")
crime_data = responses.json()

df = pd.DataFrame.from_dict(crime_data)

# Keep only the columns that are stored in the table
dff = df[['case_number','day_of_week','incident_datetime','incident_description','incident_id','incident_type_primary']].copy()

connection = pyodbc.connect('Driver={ODBC Driver 17 for SQL Server};Server=servername;Database=Databasename;UID=admin;PWD=admin')
cur = connection.cursor()

sql = '''\
INSERT INTO [dbo].[FF] ([case_number],[day_of_week],[incident_datetime],[incident_description],[incident_id],[incident_type_primary]) values (?,?,?,?,?,?)
'''

# Insert one row per record, then commit once at the end
for row in dff.values.tolist():
    cur.execute(sql, row)

connection.commit()
Joel
mavles

2 Answers

0

I don't use Azure or Jupyter Notebook, but I think I have a solution. If you can leave your computer running all night, change your code to this:

import time
import pyodbc
import requests
import json
import pandas as pd

while True:
    responses = requests.get("https://data.buffalony.gov/resource/d6g9-xbgu.json")
    crime_data = json.loads(responses.text)
    df = pd.DataFrame.from_dict(crime_data)

    dff = df[['case_number','day_of_week','incident_datetime','incident_description','incident_id','incident_type_primary']].copy()

    connection = pyodbc.connect('Driver={ODBC Driver 17 for SQL Server};Server=servername;Database=Databasename;UID=admin;PWD=admin')
    cur = connection.cursor()

    row = []
    for i in range(dff.shape[0]):
        row.append(dff.iloc[i].tolist())

    sql = '''\
    INSERT INTO [dbo].[FF] ([case_number],[day_of_week],[incident_datetime],[incident_description],[incident_id],[incident_type_primary]) values (?,?,?,?,?,?)
    '''

    for i in range(dff.shape[0]):
        cur.execute(sql, row[i])

    connection.commit()
    connection.close()  # a fresh connection is opened on the next pass
    time.sleep(86400)   # wait 24 hours before the next update

If not, create a new Python program in your startup folder like this:

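import time, os
path = "path/to/any_file.txt"  # stores the weekday of the last run
while True:
    now = time.ctime()  # e.g. "Wed Aug 15 02:41:00 2018"
    last_run = ""
    if os.path.exists(path):  # the file does not exist before the first run
        with open(path) as file:
            last_run = file.read()
    # Replace "update hour" with a two-digit hour such as "03"
    if now[11:13] >= "update hour" and now[0:4] != last_run:
        with open(path, "w") as file:
            file.write(now[0:4])
        os.system("python /path/to/file.py")
    time.sleep(60)  # check once a minute instead of busy-waiting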
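
The fixed time.sleep(86400) in the first version starts counting only after each run finishes, so the run time slowly drifts later. A minimal sketch of one way around that, assuming you want the update at a fixed local hour: compute how long to sleep until the next occurrence of that hour. The helper name seconds_until is made up for this illustration, and the sketch uses naive local time, so it ignores daylight saving changes.

```python
from datetime import datetime, timedelta


def seconds_until(hour, now=None):
    """Return the number of seconds from `now` until the next `hour`:00."""
    now = now or datetime.now()
    target = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if target <= now:
        # The hour has already passed (or is right now), so aim for tomorrow.
        target += timedelta(days=1)
    return (target - now).total_seconds()
```

Inside the loop you would then call time.sleep(seconds_until(3)) in place of time.sleep(86400), where 3 is only an example hour.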
Docaro
-1

A task scheduler like Azure WebJobs will do this for you.
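
Whichever scheduler you use, it helps if the script does one update and then exits, since the scheduler handles the timing. Below is a rough sketch of the question's script in that shape, not a definitive version. Assumptions: SQL_CONNECTION_STRING is a hypothetical environment variable name chosen here so the password stays out of the code, and the DataFrame step is swapped for a plain list comprehension, which also drops the pandas dependency.

```python
import os

COLUMNS = ["case_number", "day_of_week", "incident_datetime",
           "incident_description", "incident_id", "incident_type_primary"]

# Builds: INSERT INTO [dbo].[FF] ([case_number],...) VALUES (?,?,?,?,?,?)
SQL = ("INSERT INTO [dbo].[FF] ("
       + ",".join(f"[{c}]" for c in COLUMNS)
       + ") VALUES (" + ",".join("?" * len(COLUMNS)) + ")")


def to_rows(records):
    """Turn the API's list of dicts into tuples in column order.

    Keys missing from a record become None (NULL in the database).
    """
    return [tuple(rec.get(col) for col in COLUMNS) for rec in records]


def main():
    # Third-party imports live here so the helpers above can be used without them.
    import pyodbc
    import requests

    response = requests.get(
        "https://data.buffalony.gov/resource/d6g9-xbgu.json", timeout=30)
    response.raise_for_status()  # stop early on an HTTP error
    rows = to_rows(response.json())

    # Hypothetical variable name; set it wherever the job is configured.
    connection = pyodbc.connect(os.environ["SQL_CONNECTION_STRING"])
    try:
        cursor = connection.cursor()
        cursor.executemany(SQL, rows)  # one call for all rows
        connection.commit()
    finally:
        connection.close()
    return len(rows)
```

To use it as a scheduled script, end the file with a call to main() under the usual if __name__ == "__main__": guard. An unhandled exception then makes the process exit with a non-zero status and leaves a traceback in the job's log, which is usually what you want a scheduler to see.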

pfcodes
  • Thanks. I tried this approach, but when I run the WebJob its status shows "Failed". My code is valid and runs fine in Jupyter Notebook. What could be the reason? Do I need to install libraries such as pandas, since my script imports them? If so, where should I install these Python libraries? – mavles Aug 15 '18 at 02:41
  • Check the logs to see why it failed. It most likely involves the missing Python libraries. This question should help you figure out how to install them: https://stackoverflow.com/questions/45860272/python-libraries-on-web-job – pfcodes Aug 15 '18 at 03:33
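
If the logs are not conclusive, one quick check is to run a tiny script in the same environment as the job and see which of the imports it can actually find. This is only a sketch: missing_modules is a name invented for this example, and it handles top-level module names only.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The three third-party libraries the script in the question imports.
print("Missing:", missing_modules(["pyodbc", "requests", "pandas"]))
```

An empty list means all three libraries are installed for the interpreter that ran the check, so the failure lies elsewhere.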