I have an MS Word docx file and I need to extract its text page by page. I tried python-docx, but it only extracts the whole text, not page-wise. I also converted the docx to PDF and extracted the text from that. The problem is that the conversion changed the page structure of the docx. For example, the font size changed, and text that fit on one page of the docx took more than one page in the PDF.
I am looking for a stable solution that extracts text page-wise from a docx (doing it without converting to PDF would suit my overall solution better). Can somebody help me with this?
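
To make the goal concrete, here is a minimal sketch of the kind of page-wise split I mean. It uses only the Python standard library instead of python-docx. The function name `split_docx_by_page_markers` is hypothetical. It assumes the file contains explicit page breaks (`w:br` with `w:type="page"`) or the `w:lastRenderedPageBreak` hints that Word writes when it saves a document. Those hints can be missing or stale, for example in files produced by other tools, so this is only an approximation of the pages Word would show.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def split_docx_by_page_markers(source):
    """Split the body text of a .docx (path or file-like object) into pages.

    Assumption: pages are delimited by explicit page breaks or by the
    lastRenderedPageBreak hints saved by Word. Without them, everything
    ends up on a single "page".
    """
    with zipfile.ZipFile(source) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    body = root.find(W + "body")
    pages, current = [], []

    # Walk every element in document order.
    for node in body.iter():
        if node.tag == W + "p":
            # New paragraph: separate it from earlier text on the same page.
            if current:
                current.append("\n")
        elif node.tag == W + "t":
            current.append(node.text or "")
        elif node.tag == W + "tab":
            current.append("\t")
        elif node.tag == W + "lastRenderedPageBreak" or (
            node.tag == W + "br" and node.get(W + "type") == "page"
        ):
            # A page boundary: close the current page and start a new one.
            pages.append("".join(current).strip())
            current = []

    pages.append("".join(current).strip())
    return pages
```

Calling `split_docx_by_page_markers("report.docx")` (the file name is a placeholder) would return one string per detected page. Headers, footers, and footnotes live in other parts of the archive and are not read here.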