I'm converting several file formats (.csv, .xlsx, .docx, .one) to .pdf using the Cloudmersive API (https://api.cloudmersive.com/docs/convert.asp). Their documentation does not say how the api_response returned by the conversion is encoded. I've tried several approaches to writing the api_response (output: str (byte)) to disk. The write to a .pdf file appears to succeed, but when I open the file, Adobe reports that it is corrupted.
I've also tried detecting the encoding, but chardet found none.
import os

import cloudmersive_convert_api_client
from cloudmersive_convert_api_client.rest import ApiException

configuration = cloudmersive_convert_api_client.Configuration()
configuration.api_key['Apikey'] = 'PUT YOUR KEY HERE'  # individual user-id linked to the account
# create an instance of the API class
api_instance = cloudmersive_convert_api_client.ConvertDocumentApi(cloudmersive_convert_api_client.ApiClient(configuration))
# Convert Document to PDF
if os.stat(input_file).st_size != 0:  # api does not work on empty files
    try:
        api_response = api_instance.convert_document_ppt_to_pdf(input_file)  # ONLY DIFFERENCE
        os.remove(input_file)
        output_file = os.path.splitext(input_file)[0] + ".pdf"
        with open(output_file, 'wb') as binary_file:
            binary_file.write(bytearray(str(api_response), encoding='utf-8'))
        print(input_file, 'was processed by ConvertDocumentPptToPdf.')
    except ApiException as e:
        print(input_file, 'was not processed.')
I've also tried the following, but it does not work either:
with open(output_file, 'wb') as binary_file:
    binary_file.write(bytearray(api_response))
Here is some sample output from the API response (api_response):
b'b\'%PDF-1.5\\n%\\xc3\\xa4\\xc3\\xbc\\xc3\\xb6\\xc3\\x9f\\n2 0 obj\\n<</Length 3 0 R/Filter/FlateDecode>>\\nstream\\nx\\x9c\\x85TM\\x8b\\xdc0\\x0c\\xbd\\xe7W\\xf8\\xbc\\x10\\xaf$
Also, when I try to detect the encoding, I get the following:
detection = chardet.detect(test.encode())
print(detection)
{'encoding': None, 'confidence': 0.0, 'language': None}
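Judging from the sample output above, the response may be a string holding the printed form of a bytes object (it starts with b'%PDF-1.5...) rather than the raw bytes themselves. If so, encoding it as UTF-8 writes the literal characters b' and the backslash escapes into the file, which would explain the corruption. The following is a minimal sketch under that assumption, not a confirmed description of how the Cloudmersive client behaves. The helper name response_to_bytes and the sample value are hypothetical.

```python
import ast


def response_to_bytes(api_response):
    """Return raw bytes from a response that is bytes or the text of a bytes literal.

    Hypothetical helper; it is not part of the Cloudmersive client.
    """
    if isinstance(api_response, (bytes, bytearray)):
        # Already binary data: nothing to decode.
        return bytes(api_response)
    text = str(api_response).strip()
    if text[:2] in ("b'", 'b"'):
        # The string looks like the repr of a bytes object.
        # literal_eval parses that literal safely and yields real bytes.
        value = ast.literal_eval(text)
        if isinstance(value, bytes):
            return value
    raise ValueError("response is neither bytes nor a bytes literal")


# Stand-in for api_response, built from made-up PDF-like bytes (assumption).
sample = str(b"%PDF-1.5\n%\xc3\xa4\xc3\xbc\n")
pdf_bytes = response_to_bytes(sample)
print(pdf_bytes[:8])
```

If the assumption holds, the result could then be written with open(output_file, 'wb') and binary_file.write(pdf_bytes), with no extra encoding step. A file written this way should begin with the bytes %PDF- when inspected.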