3

I have a database in MS Access. I am trying to query one of its tables from Python using pypyodbc, and I get the following error message:

ValueError: could not convert string to float: E+6

The numbers in the table are fairly big, with up to ten significant figures. The error message suggests that MS Access is formatting them in scientific notation and Python is reading them as strings.

The fields in the table are formatted as Singles with two decimal places. When I view the numbers in the table in Access they are not displayed in scientific notation, but the error message seems to indicate that they are.

Furthermore, if I change the numbers in the table (at least for a test row) to small numbers (integers from 1 to 5), the query runs, which supports my theory that the problem is the scientific formatting of big numbers.

Any ideas on how to:

  1. write into the database table in a way that the numbers are not formatted in scientific notation, or
  2. make pypyodbc retrieve the values as numbers, ignoring any scientific notation?
Gord Thompson
David
  • Indeed. I just added 10 decimal digits to a number field in an Access table and used both pypyodbc and adodbapi for DSN and driver connections and could not recreate the issue. By default, Access does not truncate large numbers into scientific notation the way Excel does. Possibly you have the [field formatted](http://www.opengatesw.net/ms-access-tutorials/Microsoft-Access-Scientific-Notation.html) as such. – Parfait Nov 18 '15 at 03:28
  • @Parfait - *"Possibly you have the field formatted as [scientific notation]."* - Good thought, except that ODBC doesn't pay attention to the Formatting options in Access; it just returns the raw number. Still I'm glad to hear that I'm not the only one who cannot recreate this issue. I would really like to know what's going on in cases like this. – Gord Thompson Nov 19 '15 at 16:04
  • 1
    *"write into the database table in a way that the numbers are nor [sic] formatted in sci notation"* - Numbers are never *stored* in scientific notation; they are only *formatted* in scientific notation as a *string representation* of the number. That is, all `Single` values are stored in the database the same way, so there is never a case of some values being stored in scientific notation and others not. – Gord Thompson Nov 19 '15 at 16:11

3 Answers

4

This appears to be a compatibility issue between pypyodbc and the Access ODBC driver when retrieving "large" or "small" numbers from a Single or Double field (column), where "large" means

  • Single values with more than 6 significant digits to the left of the decimal point, or
  • Double values with more than 14 significant digits to the left of the decimal point

and "small" means

  • Single values with more than 6 zeros immediately to the right of the decimal point, or
  • Double values with more than 14 zeros immediately to the right of the decimal point

when the numbers are represented as "normal" decimals (i.e., not in scientific notation).

Code to recreate:

import pypyodbc
cnxn = pypyodbc.connect(
    r"DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};"
    r"DBQ=C:\Users\Public\Database1.accdb")
crsr = cnxn.cursor()
try:
    crsr.execute("DROP TABLE tblJunk")
except pypyodbc.ProgrammingError as pe:
    # ignore "table does not exist"
    if pe.value[0] != '42S02':
        raise
crsr.execute("CREATE TABLE tblJunk (ID INT PRIMARY KEY, DoubleField DOUBLE, SingleField SINGLE)")
crsr.execute("INSERT INTO tblJunk (ID, DoubleField) VALUES (1, 12345678.9)")
crsr.execute("SELECT DoubleField, SingleField FROM tblJunk WHERE ID=1")
row = crsr.fetchone()
print(row)
# prints: (12345678.9, None)
crsr.execute("UPDATE tblJunk SET SingleField = DoubleField WHERE ID=1")
crsr.execute("SELECT DoubleField, SingleField FROM tblJunk WHERE ID=1")
row = crsr.fetchone()
# ValueError: could not convert string to float: E+7

Workaround 1: For Single values, using the CDbl() function can avoid the error:

crsr.execute("SELECT DoubleField, CDbl(SingleField) AS foo FROM tblJunk WHERE ID=1")
row = crsr.fetchone()
print(row)
# prints: (12345678.9, 12345679.0)

Workaround 2: Use the CStr() function to return the value as a string and then convert it to a float afterwards (works for both Single and Double):

crsr.execute("SELECT DoubleField, CStr(SingleField) AS foo FROM tblJunk WHERE ID=1")
row = crsr.fetchone()
print(row)
# prints: (12345678.9, u'1.234568E+07')
print(float(row[1]))
# prints: 12345680.0

Workaround 3: Use pyodbc instead of pypyodbc.
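
A minimal sketch of the equivalent query using pyodbc, assuming the same driver string, database path, and tblJunk table as the repro code above; pyodbc should return the Single value as a Python float directly, with no conversion needed:

import pyodbc
cnxn = pyodbc.connect(
    r"DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};"
    r"DBQ=C:\Users\Public\Database1.accdb")
crsr = cnxn.cursor()
crsr.execute("SELECT DoubleField, SingleField FROM tblJunk WHERE ID=1")
row = crsr.fetchone()
print(row)
# expected to print something like: (12345678.9, 12345679.0)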

Gord Thompson
  • Thanks Gord, this is very informative and useful. My own "solution" is really just another workaround, for cases where I can use the Double rather than the Single type. I was just glad the thing was working and moved on without giving it much thought. But I will definitely keep this pypyodbc limitation in mind. Regarding pyodbc vs pypyodbc, the only reason for using the latter is that it is "pure Python". I am working in a corporate system and installing pyodbc is a bit of a nightmare. – David Nov 25 '15 at 23:51
0

As I was putting together test files for you to try to reproduce, I noticed that two of the fields in the table were set to the Single type rather than Double. I changed them to Double and that solved the problem. Sorry for the bother and thanks for the help.
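
If anyone wants to make the same Single-to-Double change without opening the table in Design View, Access SQL should also accept an ALTER COLUMN statement over ODBC; here is a minimal sketch, reusing the pypyodbc connection and tblJunk table from Gord's answer (back up the database before altering columns):

# change the problematic Single column to Double in place
crsr.execute("ALTER TABLE tblJunk ALTER COLUMN SingleField DOUBLE")
cnxn.commit()  # commit the schema change
crsr.execute("SELECT SingleField FROM tblJunk WHERE ID=1")
print(crsr.fetchone())
# the column is now a Double, so the value should be fetched without the ValueError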

David
0

Adding to Gord's answer, I suggest a 4th possible workaround if you don't mind losing some of the floating point precision, as was the case for me.

Actually, I should have done this anyway to reduce my dataframe size, because 6 decimals were more than enough for my pandas calculations, so I just used the Round() function in my SQL query:

SELECT Round(my_field,6) AS rounded_field FROM mytable
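
A minimal sketch of running that query through pypyodbc, where my_field and mytable are placeholders for your own Double column and table, and crsr is a cursor like the one in Gord's answer:

crsr.execute("SELECT Round(my_field, 6) AS rounded_field FROM mytable")
for row in crsr.fetchall():
    print(row[0])  # e.g. 15.511111 instead of 15.51111111111...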

As for the origin of the error, my problematic fields were latitude and longitude (stored in degrees defined as "Double with automatic number of decimals"). But they were calculated with a VBA conversion from integer degrees+minutes+seconds, which sometimes results in a recurring decimal number.

e.g. 15º 30' 40'' results in: 15 + 30/60 + 40/60/60 = 15.51111111111...

... and with a Round(x, 6) conversion it becomes 15.511111 in my dataframe.
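
For comparison, here is a small sketch of the same degrees/minutes/seconds conversion and rounding done in plain Python (dms_to_decimal is just an illustrative helper, not part of the original VBA):

def dms_to_decimal(degrees, minutes, seconds):
    # convert degrees, minutes, seconds to decimal degrees
    return degrees + minutes / 60.0 + seconds / 3600.0

value = dms_to_decimal(15, 30, 40)
print(value)            # 15.511111111111111 (recurring decimal)
print(round(value, 6))  # 15.511111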

If I didn't want to lose precision, I should instead store the original numbers in my database table rather than converting them to floats.

abu