Extract ADODB Recordset using Python and export as Dataframe

Question

I am trying to extract a Recordset object from an AS400 database and then export this as a Pandas DataFrame for further manipulation. Following a few guides on here and a few other forums that mentioned using the Excel Application features through Python I have something that almost works - but I am not sure if a better way exists.

So far I have accessed the database and have a ADODB.Recordset object in Python using the below:

`import win32com.client
import pandas as pd

# Initiate Excel Application
Excel = win32com.client.gencache.EnsureDispatch('Excel.Application')
win32c = win32com.client.constants

# Create and Open Connection
conn = win32com.client.Dispatch('ADODB.Connection')
cmd = win32com.client.Dispatch('ADODB.Command')
DSN = 'Provider=MSDataShape;Data Provider=IBMDA400;Data Source="source"; User Id="user"; Password="password";'
conn.Open(DSN)

# Define Command Properties
cmd.ActiveConnection = conn
cmd.CommandText = 'SELECT * FROM table LIMIT 100'

# Recordset without headers
#Create Recordset
RS = win32com.client.Dispatch('ADODB.Recordset')
RS.Open(cmd)
`

I've hit a blocker with formatting the Recordset as a usable table at this point.

I've been using the below link to try and make sense of the type of loop that I need to deploy to format the data as a table. https://www.w3schools.com/asp/ado_ref_recordset.asp

Lastly - looking at a guide I've see here (code below): Writing ADODB Recordset to Pivot Cache With Python for writing Recordset data to a PivotCache I can see the column headers are contained within the Pivot. Is there a way I can also retain them when exporting as a table/dataframe? Preferably using a dynamic set of columns, the dataset I'd like to extract could have all columns or reduced columns depending on the requirement.

#Recordset as a pivot with headers
# Create Recordset
RS = win32com.client.Dispatch('ADODB.Recordset')
RS.Open(cmd)

# Create Excel File
wb = Excel.Workbooks.Add()
Sheet1 = wb.Worksheets("Sheet1")

# Create Pivot Cache
PivotCache = wb.PivotCaches().Create(SourceType=win32c.xlExternal, Version=win32c.xlPivotTableVersion15)
PivotCache.Recordset = RS

# Create Pivot Table
Pivot = PivotCache.CreatePivotTable(TableDestination:=Sheet1.Cells(2, 2), TableName:='Python Test Pivot', DefaultVersion:=win32c.xlPivotTableVersion15)

# Close Connection
RS.Close()
conn.Close()

# View Excel
Excel.Visible = 1

Thanks!

I wouldn't call an ado `recordset` a useable table of any kind. It is a useable `recordset` (is this potato potahto?). Anyway you have the fields collections (that's your column names). and you could use GetRows() to dump the values into an array - maybe that would be enough for creating your dataframe? — topsail, Apr 26 '23 at 16:03
@topsail thanks very much for the advice! using `rCount = RecordCount` then `GetRows(rCount)` into `asarray` then `transpose` I have the data in a dataframe. The only item I am missing is the dynamic selection of columns - as a workaround I've hard-pasted the columns as static values and use this list to construct the dataframe. I'll leave the post open in case you have any more ideas on how to fix this, if not - thanks for getting me 90% of the way there. — Ryan Dray, Apr 27 '23 at 14:28

Extract ADODB Recordset using Python and export as Dataframe

0 Answers0