
I am trying to read data from an MS SQL table, and one of the columns contains varchars longer than 1024 characters. When I read the query into a dataframe, the strings are truncated at 1023 characters. See the code and output below. Does anyone know how I can get strings longer than 1023 characters into the dataframe? I've been looking around, including in the DataFrame docs, but I haven't found an answer.

The table test contains a column 's' in which one of the rows holds a string of length 1100. The column 'len_s' contains the length of the string 's', calculated in SQL Server.
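For reference, such a table could be set up with something like this (a minimal sketch: the table and column names come from the question, the varchar(max) type is mentioned in the answer below, and the sample values are assumptions chosen to match the output shown):

import pypyodbc

connection_string = 'Driver={SQL Server Native Client 11.0};Server= ....'
con = pypyodbc.connect(connection_string)
cur = con.cursor()

# One short row and one row with a 1100-character string.
cur.execute("create table test (a int, s varchar(max))")
cur.execute("insert into test values (1, '01234567890')")
cur.execute("insert into test values (2, ?)", ('0123456789' * 110,))
con.commit()

The query code that demonstrates the truncation: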

import pypyodbc
import pandas as pd

# Set the connection string (server details elided).
connection_string = 'Driver={SQL Server Native Client 11.0};Server= ....'

# Connect to SQL Server.
con = pypyodbc.connect(connection_string)

query = "select a, s, len(s) as len_s from test"

df = pd.read_sql_query(query, con)

print(df)
print(len(df['s'][1]))  # length of the long string as stored in the dataframe

The output is:

   a                                                  s  len_s
0  1                                        01234567890     11
1  2  0123456789012345678901234567890123456789012345...   1100
1023

So in the dataframe the string is truncated... Any suggestions?

Bas

1 Answer


I found a solution. It turned out to have nothing to do with DataFrames, pandas, or even Python: when I searched on 'pypyodbc' I found this solved problem:

how-to-get-entire-varcharmax-column-with-python-pypyodbc

They found that the problem was with the SQL Server Native Client driver, and recommended using the standard SQL Server driver instead.

So I also changed the driver in my ODBC connection string from SQL Server Native Client 11.0 to SQL Server, and it works perfectly: I now get the entire contents of the VARCHAR(MAX) column in my MS SQL table.
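In code, the fix amounts to changing only the Driver part of the connection string (a minimal sketch; the server details are elided as in the question):

import pypyodbc
import pandas as pd

# The plain 'SQL Server' driver instead of 'SQL Server Native Client 11.0'
# avoids the truncation at 1023 characters.
connection_string = 'Driver={SQL Server};Server= ....'

con = pypyodbc.connect(connection_string)
df = pd.read_sql_query("select a, s, len(s) as len_s from test", con)
print(len(df['s'][1]))  # should now print 1100, the full length of the string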
