-1
from openpyxl import load_workbook

wb = load_workbook('file.xlsx')
ws = wb['Sheet1']

Is there any way to retrieve the xml code representing the ws object? NOTE: I want to avoid using the zipfile module. Instead, I'm trying to extract the xml directly from ws.

I read the openpyxl source code and was playing around with lxml - with no success.

kb2136
  • 47
  • 5
  • If `zipfile` would solve the problem, why don't you just use it? Avoiding a particular standard module seems like an odd requirement. – mzjn May 17 '23 at 13:36
  • First, I need to edit a workbook with `openpyxl`. Then, I want to replace a part of the xml file defining one of the worksheets. The problem is that `openpyxl` already uses `zipfile` in the background. Unzipping again after the xlsx has been "re-zipped" and saved is a very poor choice performance-wise. – kb2136 May 17 '23 at 14:09
  • There isn't any XML that represents the Worksheet object, which is why there are parsers and writers for it. – Charlie Clark May 19 '23 at 09:23

1 Answers1

-1

I figured it out myself. Instead of extracting the xml by unzipping the saved workbook, you can capture it while the workbook is being saved. The wb.save method makes use of the ExcelWriter class, which I modified to suit this purpose:

import openpyxl
from openpyxl.writer.excel import *

class MyExcelWriter(openpyxl.writer.excel.ExcelWriter):
    def write_worksheet(self, ws):
        ws._drawing = SpreadsheetDrawing()
        ws._drawing.charts = ws._charts
        ws._drawing.images = ws._images
        if self.workbook.write_only:
            if not ws.closed:
                ws.close()
            writer = ws._writer
        else:
            writer = WorksheetWriter(ws)
            writer.write()
        ws._rels = writer._rels
        
        # my addition starts here
        if ws.title == 'My Sheet':
            with open(writer.out, 'r', encoding='utf8') as file:
                xml_code = file.read()

            # my code continues here...
        # my addition ends here

        self._archive.write(writer.out, ws.path[1:])
        self.manifest.append(ws)
        writer.cleanup()

openpyxl.writer.excel.ExcelWriter = MyExcelWriter

The write_worksheet function creates a temporary xml file, whose path is stored in writer.out.
Remember that from <module> import * is considered bad practice – use with caution.

kb2136
  • 47
  • 5