I want to convert a .docx file to .txt, and if the .docx contains tables I want them preserved in a readable form in the .txt file, so I am using pypandoc for this. Locally this works like a charm. When I zip it with all dependencies, upload it to S3, and run it via AWS Lambda, it fails with the error below:
No pandoc was found: either install pandoc and add it to your PATH or or call pypandoc.download_pandoc(...) or install pypandoc wheels with included pandoc
My code looks like this:
import boto3
import logging
import pypandoc
local_file_docx = '/tmp/'+prefix+'german-de.docx'
local_file_txt = '/tmp/'+prefix+'german-de.txt'
def lambda_handler(event, context):
    print(pypandoc.convert_file(local_file_docx, "plain+simple_tables", format="docx",
                                extra_args=(), encoding='utf-8', outputfile=local_file_txt))
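For diagnosis, here is a minimal sketch of a check that could run at the start of the handler. It assumes the likely cause is that the zip contains the pypandoc Python package but no pandoc executable that the Lambda runtime can see. The helper names locate_pandoc and point_pypandoc_at are hypothetical, and the candidate directories (/opt/bin for a layer, /var/task/bin for the function package) are assumptions about where a bundled binary might be placed. The sketch relies on pypandoc honoring the PYPANDOC_PANDOC environment variable, which should be set before the first conversion call.

```python
import os
import shutil


def locate_pandoc(candidate_dirs=("/opt/bin", "/var/task/bin")):
    """Return the path of a visible pandoc executable, or None.

    candidate_dirs is an assumption: adjust it to wherever the
    binary is actually bundled in the deployment package or layer.
    """
    # First check PATH as the running process sees it.
    found = shutil.which("pandoc")
    if found:
        return found
    # Then check the assumed bundle locations explicitly.
    for directory in candidate_dirs:
        candidate = os.path.join(directory, "pandoc")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def point_pypandoc_at(pandoc_path):
    """Tell pypandoc which binary to use via PYPANDOC_PANDOC."""
    os.environ["PYPANDOC_PANDOC"] = pandoc_path
```

If locate_pandoc() returns None inside Lambda, the binary is not in the package at all, which would match the error above. If it returns a path, calling point_pypandoc_at(path) before pypandoc.convert_file may be enough.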
Any help is appreciated. Thanks in advance.