I am trying to convert a PDF to DOCX using soffice. The conversion produces a .docx file, but the content ends up inside text boxes, which I am unable to read using the docx API for Python. Is there a better way to read the file, or a better way to convert PDF to DOCX so that I do not get text boxes? The command I am using is:
soffice --infilter="writer_pdf_import" --convert-to docx "convert_this.pdf"
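
For reference, here is a minimal sketch of one possible way to read the text boxes, under some assumptions. It assumes the converted file stores text box content in `w:txbxContent` elements inside `word/document.xml`, which is how WordprocessingML represents text boxes, and that a legacy copy of each box may also appear under `mc:Fallback`. The function names `extract_textbox_text` and `extract_textbox_text_from_xml` are hypothetical helpers, not part of any library. It uses only the standard library and reads the XML directly instead of going through the docx API.

```python
import zipfile
import xml.etree.ElementTree as ET

# Standard WordprocessingML and markup-compatibility namespaces.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
MC_NS = "http://schemas.openxmlformats.org/markup-compatibility/2006"

W_P = f"{{{W_NS}}}p"
W_T = f"{{{W_NS}}}t"
W_TXBX = f"{{{W_NS}}}txbxContent"
MC_FALLBACK = f"{{{MC_NS}}}Fallback"


def _walk(elem, inside_box, out):
    """Collect paragraph text found inside text boxes, depth first."""
    if elem.tag == MC_FALLBACK:
        # Assumption: the fallback branch duplicates the same text box
        # in a legacy form, so it is skipped to avoid reading it twice.
        return
    if elem.tag == W_TXBX:
        inside_box = True
    if inside_box and elem.tag == W_P:
        # Join all text runs of this paragraph into one string.
        text = "".join(t.text or "" for t in elem.iter(W_T))
        if text:
            out.append(text)
        return
    for child in elem:
        _walk(child, inside_box, out)


def extract_textbox_text_from_xml(document_xml):
    """Return the non-empty text box paragraphs in a document.xml payload."""
    out = []
    _walk(ET.fromstring(document_xml), False, out)
    return out


def extract_textbox_text(docx_path):
    """Open a .docx (a zip archive) and read text boxes from its main part."""
    with zipfile.ZipFile(docx_path) as zf:
        return extract_textbox_text_from_xml(zf.read("word/document.xml"))
```

With this sketch, `extract_textbox_text("convert_this.docx")` would return a list of strings, one per non-empty paragraph found inside a text box. It does not recover reading order across boxes beyond the order in which they appear in the XML, and headers, footers, and other parts of the archive are not read.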