
Years ago I began scanning my "important" documents using Pagis, the software that came with my HP scanner. Eventually I switched to scanning to PDF (once the scanner software supported it), but I still have many old XIF files. The Pagis software runs only on 32-bit Windows, which is becoming less and less common. In fact, I have kept a Win32 system alive just to retain access to the XIF files.

I can convert these files using Adobe Acrobat (or equivalent) "simply" by opening each one in the XIF viewer and printing it to the Adobe PDF "printer". Unfortunately, I have enough files that doing this by hand would take many years.

So, what's the best way to convert a large number of XIF files to PDF?

Bill Cohagan

1 Answer


I recently found SikuliX, a scripting tool intended mainly for GUI testing. It differs from most such tools I have seen (e.g., Selenium) in that it is purely image-based and doesn't care what the underlying technology is (HTML, XAML, etc.).

It took me about an hour to learn enough to write a script that opens the XIF viewer, selects the PDF "printer", clicks the button to print, fills in the desired output file name (the XIF viewer truncates it to a short name if left alone), and then waits for the print to complete. The script then moves on to the next XIF file. (I fed the script a file listing all of the XIF file paths on the drive.) I was using Nitro PDF rather than Adobe.

The script ran for a couple of days (I didn't say it was fast!), but it converted all but a few of the files. From time to time it would stall and I'd have to modify the script a bit (increase the wait time for the UI to change, etc.).
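For anyone wanting to try the same thing, here is a minimal sketch of the outer loop only: read the listing file, build the full output name, and retry with a longer wait when the UI is slow. It is not my actual script. `print_to_pdf` is a hypothetical callable standing in for the image-based GUI steps (open viewer, pick printer, type the name, wait), and the function names, the doubling of the wait, and the retry count are all assumptions made for illustration.

```python
import os


def read_path_list(list_file):
    """Return the non-empty lines of the listing file, one XIF path per line."""
    with open(list_file) as handle:
        return [line.strip() for line in handle if line.strip()]


def pdf_name_for(xif_path, out_dir):
    """Build the full output name so the viewer's short default is not used."""
    stem = os.path.splitext(os.path.basename(xif_path))[0]
    return os.path.join(out_dir, stem + ".pdf")


def convert_all(xif_paths, out_dir, print_to_pdf, wait_seconds=10, max_tries=3):
    """Convert each file in turn and return the paths that never succeeded.

    print_to_pdf(xif_path, pdf_path, wait_seconds) is a hypothetical callable
    that drives the GUI and returns True once the print has completed.
    """
    failed = []
    for xif_path in xif_paths:
        pdf_path = pdf_name_for(xif_path, out_dir)
        wait = wait_seconds
        for _ in range(max_tries):
            if print_to_pdf(xif_path, pdf_path, wait):
                break
            wait *= 2  # give a slow UI more time on the next attempt
        else:
            # every attempt failed; remember the file for a later rerun
            failed.append(xif_path)
    return failed
```

Returning the failed paths instead of stopping means one stubborn document does not hold up a run that takes days; the leftovers can be fed back in as a new list.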

There are probably not many folks facing this particular conversion problem, but I've been looking for a good solution literally for years. So, if you're in the same boat then this is a way to get to shore!

Bill Cohagan
  • Were you able to compare the input and output file lists to see which files had failed, so you could try them again after the run had completed? – ljs.dev Mar 20 '15 at 01:29
  • Most failures (that couldn't be solved by reruns) were due to NitroPro print driver's inability to handle certain documents (or certain pages within a doc). What appeared in the PDF in those cases was a solid black block. I intend to try Adobe Acrobat's driver on those at some point. Next on my agenda is to write another Sikuli script to iterate through the generated PDFs looking for these solid black blocks, but I haven't done that yet. – Bill Cohagan Mar 20 '15 at 15:59
  • Sounds like a fun problem. It seems that the XIF format is a composite of layers, likely causing the block when it doesn't handle them well. Just like Sikuli works at an optical level, you could try screen capturing those which fail - for if you can see it in Pagis, you can extract it, one way or another :) – ljs.dev Mar 20 '15 at 16:55
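
The input/output comparison asked about in the comments can be sketched in a few lines of Python. This is only an illustration: the function names are made up, and it assumes each PDF keeps the base name of its XIF file and that base names are unique across folders. It finds files with no PDF at all; it will not catch PDFs that were written but contain the solid black blocks described above.

```python
import os


def stem(path):
    """File name without directory or extension, lower-cased for comparison."""
    return os.path.splitext(os.path.basename(path))[0].lower()


def missing_conversions(xif_paths, pdf_paths):
    """Return the XIF paths that have no PDF with the same base name."""
    done = {stem(p) for p in pdf_paths}
    return [p for p in xif_paths if stem(p) not in done]
```

The returned list can be written back out as a new listing file and used as the input for a rerun.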