Questions tagged [docsplit]

Docsplit is a command-line utility and Ruby library that converts documents into PDFs and breaks them apart into images and text.

For more information, see the library documentation page and the GitHub project repository.

21 questions
0
votes
0 answers

Read document (.doc) with images

I need to read document texts with Ruby and then perform some operations on their contents. Some of these documents include images that I need to upload to my server and later show the data with images. Any idea on how I can achieve this? I'm…
llermaly
  • 2,331
  • 2
  • 16
  • 29
0
votes
1 answer

docsplit gem pdf to text

Well basically I have the same problems as discussed here: http://blog.joshsoftware.com/2014/08/13/pdf-to-plain-text-processing-using-docsplit/ But the solution that they propose in docsplit doesn't work. Docsplit.extract_text(filepath, {:pdf_opts…
Richardlonesteen
  • 584
  • 5
  • 18
0
votes
1 answer

docsplit conversion to PDF mangles non-ASCII characters in docx on Linux

My documentation management app involves converting a .docx file containing non-ASCII Unicode characters (Japanese) to PDF with docsplit (via the Ruby gem, if it matters). It works fine on my Mac. On my Ubuntu machine, the resulting PDF has square…
user663031
0
votes
1 answer

Docsplit works from console, not from Rails itself

I'm trying to figure out a strange issue with Docsplit. I have a Rails 2.3.14 application where users can upload PPTs/PDFs and the system should extract cover images with Docsplit. I have an after_save callback in the model with this…
Mich Dart
  • 2,352
  • 5
  • 26
  • 44
0
votes
1 answer

how to configure CID fonts for docsplit (ghostscript)?

I'm using the guide at the URL below as a reference. http://www.ghostscript.com/doc/9.06/Use.htm#CIDFonts But I don't think I'm following it correctly. What I'm trying to do is convert Office files to images using Docsplit. But some characters (Korean and Chinese) are…
Andrew
  • 11
  • 4
-1
votes
1 answer

Ghostscript error: Error: /rangecheck in --.dicttomark--

I am trying to use Ghostscript to convert a PDF to an image. The PDF is: http://www.coppernet.zm/MPLS.pdf $ sudo docsplit images -o /tmp/previews -p 1-5 -s 150,750,1000 -f png MPLS.pdf While reading gs_cidfm.ps: Error: /rangecheck in…
Natim
  • 17,274
  • 23
  • 92
  • 150