I'm trying to extract images from a Microsoft Word document with Docsplit, and it returns this error:

/home/deploy/.rvm/gems/ruby-2.1.2/gems/docsplit-0.7.5/lib/docsplit/transparent_pdfs.rb:22:in `initialize': No such file or directory @ rb_sysopen - example.doc (Errno::ENOENT)
from /home/deploy/.rvm/gems/ruby-2.1.2/gems/docsplit-0.7.5/lib/docsplit/transparent_pdfs.rb:22:in `open'
from /home/deploy/.rvm/gems/ruby-2.1.2/gems/docsplit-0.7.5/lib/docsplit/transparent_pdfs.rb:22:in `is_pdf?'
from /home/deploy/.rvm/gems/ruby-2.1.2/gems/docsplit-0.7.5/lib/docsplit/transparent_pdfs.rb:11:in `block in ensure_pdfs'
from /home/deploy/.rvm/gems/ruby-2.1.2/gems/docsplit-0.7.5/lib/docsplit/transparent_pdfs.rb:10:in `map'
from /home/deploy/.rvm/gems/ruby-2.1.2/gems/docsplit-0.7.5/lib/docsplit/transparent_pdfs.rb:10:in `ensure_pdfs'
from /home/deploy/.rvm/gems/ruby-2.1.2/gems/docsplit-0.7.5/lib/docsplit.rb:50:in `extract_images'
from test.rb:4:in `<main>'

This is the script:

require "docsplit"
Docsplit.extract_images('example.doc', :size => '1000x', :format => [:png, :jpg])

This is line 22 of transparent_pdfs.rb:

File.extname(doc).downcase == '.pdf' || File.open(doc, 'rb', &:readline) =~ /\A\%PDF-\d+(\.\d+)?/

I'm using CentOS 6 with all the libraries installed; on Mac OS X the same script works fine. Converting a PDF also works; it only fails with Office documents.

Any ideas?

Thanks,

Kerm1t
  • Have you checked the permissions on the file? Is it owned by the user which your program runs as? Does the user have read access to the file? – mcfinnigan Aug 29 '14 at 10:16
  • Yes, it has the right permissions. Maybe some path is failing? – Kerm1t Aug 29 '14 at 11:00
  • It just seems it cannot find the file. What happens if you do a regular `File.open('example.doc')` without the call to `Docsplit`? I assume that fails as well, which means the file is not in the current working directory. You can print the current directory with `puts Dir.getwd`. – Daniël Knippers Aug 29 '14 at 11:34
  • OK, now it can open the file, but it throws another error: /docsplit-0.7.5/lib/docsplit/info_extractor.rb:27:in `extract_all': Error: Couldn't open file '/tmp/docsplit/document.pdf': No such file or directory. (Docsplit::ExtractionFailed) – Kerm1t Aug 29 '14 at 11:53
  • It appears that it can't save to the /tmp/docsplit/ directory? I ran ls -la and there is nothing there. – Kerm1t Aug 29 '14 at 11:56
  • My guess is that `example.doc` is not the proper file path to the document. Try a `File.exist?('example.doc')` to see if it's the correct path. – Joshua Pinter Mar 10 '15 at 16:28
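
The path checks suggested in the comments above can be combined into one small sketch. It assumes the document sits next to the script; `resolve_document` is a hypothetical helper written for illustration and is not part of Docsplit. It only addresses the first error (the relative path not resolving against the current working directory), not the later `/tmp/docsplit/document.pdf` failure.

```ruby
# Hypothetical helper (not part of Docsplit): turn a possibly relative
# file name into an absolute path and fail early with a readable message
# if the file is missing or unreadable.
def resolve_document(name, base_dir = Dir.pwd)
  path = File.expand_path(name, base_dir)
  unless File.file?(path) && File.readable?(path)
    raise ArgumentError,
          "cannot read #{path} (working directory: #{Dir.pwd})"
  end
  path
end

# Possible usage in test.rb, assuming example.doc lives beside the script
# (__dir__ is the directory of the current file, available since Ruby 2.0):
#
#   require "docsplit"
#   doc = resolve_document('example.doc', __dir__)
#   Docsplit.extract_images(doc, :size => '1000x', :format => [:png, :jpg])
```

Passing an absolute path means the call no longer depends on which directory the script happens to be started from, which is one plausible reason the same script behaves differently on two machines.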

0 Answers