2

Reading Excel file into Pandas dataframe like this:

import pandas as pd

df = pd.read_excel(open('C:/somedir/somefile.xlsx','rb'), sheet_name=0)
print(df)

If a value in a cell starts with ' (single-quote, apostrophe) - pandas omits it. For example, it converts '123 into 123. But I need to get it as is, '123.

I realize that Excel considers single quote as text qualifier. But, maybe there's some workaround to make pandas preserve single-quotes at the start of a cell value?

Tried fiddling with dtype - but still no luck.

  • As you say, I suspect you are running into Excel's separation of the `value` of a cell and how it is displayed. The `value` is always 123, and that is what pandas is picking up. Without wishing to discourage, I ran into a similar issue a while back, and my recollection is that I never solved it! – DS_London Aug 16 '23 at 17:02
  • You might need to use openpyxl directly (or maybe xlwings as an alternative) to do this. – BigBen Aug 16 '23 at 17:14
  • I've tried openpyxl directly - but **openpyxl.load_workbook(path).active.cell(row=2, column=1).value** also omits single quote at the beginning. In VBA we can use **Cells(2,1).PrefixCharacter** to get single quote - but couldn't find how to get respective property in Python. – Arseny Yershov Aug 17 '23 at 08:08
  • @ArsenyYershov, can you explain the reason why you need to keep those apostrophes ? A meaningful example/usecase would be great ;) – Timeless Aug 17 '23 at 09:16
  • It's the users' requirement - they need to distinguish between 123 and '123 (and to mark '123 as an error). – Arseny Yershov Aug 17 '23 at 10:19

1 Answers1

0

You can try using :

#pip install xlwings
import xlwings as xw
    
with xw.App(visible=False) as app:
    wb = xw.Book("file.xlsx")
    
    df = (
        wb
         .sheets["Sheet1"].used_range
         .options(pd.DataFrame, index=False, header=True)
         .value
    )

Output :

>>> print(df)

    col1
0    123
1 123.00

>>> df.values.ravel().tolist()

# ['123', 123.0]

Input used (file.xlsx) :

enter image description here

Timeless
  • 22,580
  • 4
  • 12
  • 30
  • Thanks, bro! Not exactly what I expected in output ('123), but a good solution anyway. Sadly, it requires Excel installation - and my final code needs to be running under Linux (in k8s Docker container) where there won't be any Excel. – Arseny Yershov Aug 17 '23 at 10:15