1

I'm trying to fetch SQL Query to Pandas Data Frame using Jupyter Notebook.

I followed these instruction from beardc:

import pandas as pd

df = pd.read_sql(sql, cnxn)

cnxn = pyodbc.connect(connection_info) 
cursor = cnxn.cursor()
sql = """SELECT * FROM AdventureWorks2012.Person.Address 
WHERE City = 'Bothell' 
ORDER BY AddressID ASC"""
df = psql.frame_query(sql, cnxn)
cnxn.close()

However, whenever I run the code it shows:

NameError                                 
Traceback (most recent call last)
<ipython-input-5-4ea4efb152fe> in <module>()
  1 import pandas as pd
  2 
  3 df = pd.read_sql(sql, cnxn)
  4 
  5 cnxn = pyodbc.connect(connection_info)

NameError: name 'sql' is not defined

I'm using a monitored-network (company network if anyone asks).

There are some questions I want to ask:

  1. Do I have to change the connection_info into the info in my database?
  2. Is it important that I'm connected to a network that might have restriction on port connection? as the company sets up some of those.

I'm using the latest Anaconda Distribution.

girlvsdata
  • 1,596
  • 11
  • 21
aarnphm
  • 107
  • 4
  • 16

1 Answers1

1

The error you are receiving is being caused by the order of your code:

1  import pandas as pd
2  df = pd.read_sql(sql, cnxn)  ## You call the variable sql here, but don't assign it until line 6
3 
4  cnxn = pyodbc.connect(connection_info) 
5  cursor = cnxn.cursor()
6  sql = """SELECT * FROM AdventureWorks2012.Person.Address 
7  WHERE City = 'Bothell' 
8  ORDER BY AddressID ASC"""
9  df = psql.frame_query(sql, cnxn)
10 cnxn.close()
  • You are calling the variable sql on line 2, but you don't actually define the variable until line 6.
  • You are also missing a few libraries, and based off beardc's code it looks like you've meshed some of the wrong parts of his two answers together.

Try arranging the code like this:

(Please note this code is untested, and the other issues described below)

#Import the libraries
import pandas as pd
import pyodbc
#Give the connection info
cnxn = pyodbc.connect(connection_info) 
#Assign the SQL query to a variable
sql = "SELECT * FROM AdventureWorks2012.Person.Address WHERE City = 'Bothell' ORDER BY AddressID ASC"
#Read the SQL to a Pandas dataframe
df = pd.read_sql(sql, cnxn)

In answer to your questions:

  1. Yes, you need to change the connection_info to the info in your database. There is a good example of the text you need to put in there here
  2. This specific issue isn't being caused by your network restrictions.
girlvsdata
  • 1,596
  • 11
  • 21