
I am new to Python and have been given the task of converting RTF files to PDF with it. I searched and found some code (not exactly for RTF to PDF), which I adapted to my requirements, but I cannot get it to work.

I used the code below:

import sys
import os
import comtypes.client
#import win32com.client
rtfFormatPDF = 17

in_file = os.path.abspath(sys.argv[1])
out_file = os.path.abspath(sys.argv[2])

rtf= comtypes.client.CreateObject('Rtf.Application')

rtf.Visible = True
doc = rtf.Documents.Open(in_file)
doc.SaveAs(out_file, FileFormat=rtfFormatPDF)
doc.Close()
rtf.Quit()

But it throws the following error:

Traceback (most recent call last):
  File "C:/Python34/Lib/idlelib/rtf_to_pdf.py", line 12, in <module>
    word = comtypes.client.CreateObject('Rtf.Application')
  File "C:\Python34\lib\site-packages\comtypes\client\__init__.py", line 227, in CreateObject
    clsid = comtypes.GUID.from_progid(progid)
  File "C:\Python34\lib\site-packages\comtypes\GUID.py", line 78, in from_progid
    _CLSIDFromProgID(str(progid), byref(inst))
  File "_ctypes/callproc.c", line 920, in GetResult
OSError: [WinError -2147221005] Invalid class string

Can anyone help me with this? I would really appreciate it if someone could suggest a better and faster way of doing it. I have around 200,000 files to convert.

Anisha

ani
  • Where did you get the information that "Rtf.Application" was a valid com object? I would guess you found some code for converting a Word document to PDF and just replaced "Word.Application" by "Rtf.Application". – Carsten Apr 14 '15 at 21:31
  • yes. That is true! tried finding a replacement for this, but no luck! – ani Apr 14 '15 at 21:34
  • Do you require a _python_ solution or just a solution for your 200,000 files? If python is not a requirement, try LibreOffice: `libreoffice --headless -convert-to pdf filename.rtf` – John1024 Apr 14 '15 at 21:34
  • @Carsten so that makes a point, what if the ProgID were set back to "Word.Application", think it would work? – Mark Ransom Apr 14 '15 at 21:35
  • Well, python is not mandatory, I can try using LibreOffice. So this means there is no solution in Python? – ani Apr 14 '15 at 21:38
  • @MarkRansom Yep, just tried it. Works like a charm if you change the com object back to "Word.Application" to let Word handle the conversion. It can open RTFs without problems. Also, OP refers to the same variable once as `rtfFormatPDF` and once as `wdFormatPDF` (not sure why) so that would have to be changed as well. – Carsten Apr 14 '15 at 21:47
  • Thanks @Carsten !! That was a typo. Sorry! I will try working with "Word.Application" and see how it goes with Rtfs'. – ani Apr 14 '15 at 21:53
  • It worked! thanks! Will add the working code! – ani Apr 15 '15 at 15:07
  • Please don't edit your working code into the question. We like to keep questions and answers separate. Please edit it into the answer you provided below instead. – skrrgwasme Nov 10 '16 at 16:35
  • ok. changed it! Thanks – ani Nov 10 '16 at 16:36

2 Answers


I used Mark's advice and changed it back to Word.Application, with my source pointing to RTF files. It works perfectly! The process was slow, but still faster than the Java application my team was using. The final code is below.

Final code, which drives the Word application:

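import os
import comtypes.client

wdFormatPDF = 17  # Word's WdSaveFormat value for PDF output

input_dir = 'input directory'
output_dir = 'output directory'

for subdir, dirs, files in os.walk(input_dir):
    for file in files:
        # Absolute paths avoid Word resolving them against its own working directory
        in_file = os.path.abspath(os.path.join(subdir, file))
        # splitext keeps any extra dots in the file name; join adds the path separator
        output_file = os.path.splitext(file)[0]
        out_file = os.path.abspath(os.path.join(output_dir, output_file + '.pdf'))
        word = comtypes.client.CreateObject('Word.Application')

        doc = word.Documents.Open(in_file)
        doc.SaveAs(out_file, FileFormat=wdFormatPDF)
        doc.Close()
        word.Quit()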
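
Since the process was slow, one thing that might help is starting Word once and reusing it for every file, instead of launching it per file. The following is only a sketch under that assumption, not something I have measured on 200,000 files. The function names `pdf_path_for` and `convert_all` are made up for illustration. It also skips files that do not end in .rtf and mirrors the input subdirectories under the output directory, so two files with the same name in different folders do not overwrite each other.

```python
import os


def pdf_path_for(in_file, input_dir, output_dir):
    """Map an input file to its PDF path, keeping the subdirectory layout."""
    rel = os.path.relpath(in_file, input_dir)
    base, _ext = os.path.splitext(rel)
    return os.path.abspath(os.path.join(output_dir, base + '.pdf'))


def convert_all(input_dir, output_dir):
    """Convert every .rtf file under input_dir with a single Word instance."""
    # Imported here so the path helper above stays usable without comtypes.
    import comtypes.client

    wd_format_pdf = 17  # same value as wdFormatPDF in the code above
    word = comtypes.client.CreateObject('Word.Application')
    try:
        for subdir, _dirs, files in os.walk(input_dir):
            for name in files:
                if not name.lower().endswith('.rtf'):
                    continue  # skip anything that is not an RTF file
                in_file = os.path.abspath(os.path.join(subdir, name))
                out_file = pdf_path_for(in_file, input_dir, output_dir)
                os.makedirs(os.path.dirname(out_file), exist_ok=True)
                doc = word.Documents.Open(in_file)
                try:
                    doc.SaveAs(out_file, FileFormat=wd_format_pdf)
                finally:
                    doc.Close()  # close the document even if saving fails
    finally:
        word.Quit()  # always shut Word down, even after an error
```

It would be called as `convert_all('input directory', 'output directory')`. The `try`/`finally` blocks are there so that a failed conversion does not leave a hidden Word process running.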
ani

If you have LibreOffice installed on your system, you already have the best solution.

import os

# The binary name depends on the platform and version (soffice, libreoffice, libreoffice6.3, ...)
os.system('soffice --headless --convert-to pdf filename.rtf')
# os.system('libreoffice --headless --convert-to pdf filename.rtf')
# os.system('libreoffice6.3 --headless --convert-to pdf filename.rtf')

The command name may vary across versions and platforms, but this is the best solution I have found.
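
The same call can also be made with `subprocess`, which passes the arguments as a list (so file names with spaces need no quoting) and can raise an error when the command fails. This is a rough sketch: `build_command` and `convert` are hypothetical helper names, the default binary name `soffice` is an assumption that may need changing on your system, and `--outdir` is the LibreOffice option for choosing the output directory.

```python
import os
import subprocess


def build_command(rtf_path, out_dir, binary='soffice'):
    """Build the LibreOffice argument list for one RTF-to-PDF conversion."""
    return [
        binary,
        '--headless',
        '--convert-to', 'pdf',
        '--outdir', out_dir,
        rtf_path,
    ]


def convert(rtf_path, out_dir, binary='soffice'):
    """Run the conversion and return the path where the PDF is expected."""
    # check=True raises CalledProcessError on a non-zero exit status.
    subprocess.run(build_command(rtf_path, out_dir, binary), check=True)
    base = os.path.splitext(os.path.basename(rtf_path))[0]
    return os.path.join(out_dir, base + '.pdf')
```

For example, `convert('filename.rtf', 'pdf_output')` should leave `filename.pdf` in `pdf_output`, provided LibreOffice is on the PATH under that binary name.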

Kuppusamy