Questions tagged [document-conversion]

Document conversion is the process of converting a document from one format to another so that it can be read in more applications. Documents can be converted into other source document formats, consumer formats, or structured data.

70 questions
2
votes
1 answer

How to convert .docx to .doc using Apache POI

I need to know how to convert .docx to .doc using Apache POI, maybe using the XWPFDocument and HWPFDocument classes. If that is not achievable, please suggest alternative solutions.
Jalal Sordo
1
vote
1 answer

Converting Adobe InDesign to PPTX (is it even possible?)

I'm struggling to find a solution. I have a batch of Adobe InDesign files I'm trying to convert over as PDFs. I know you can export InDesign -> PDF, then from Acrobat PDF -> PPTX. This would work well if it were just one or two files. I don't want to…
1
vote
1 answer

Document Conversion Code 400

When I do this command: C:\curl -X POST -u "User":"Pass" -F config="{\"conversion_target\":\"answer_units\"}" -F file="D:\PATH\QeA.pdf;type=application/pdf"…
Marco Oliveira
1
vote
2 answers

Can the answer unit content array returned by the Watson Document Conversion service ever have more than one element?

I am writing a program that takes advantage of IBM Watson's Document Conversion service to convert documents of various types into answer units. Each answer unit that is returned by the service contains an array named content which is composed of…
David Powell
1
vote
1 answer

Creating classes for using Document Conversion and Concept Insights in Java

So I want to make classes for using Concept Insights on HTML documents converted from PDF thanks to Document Conversion. I am using an Eclipse IDE with a view of my Git directory. When I run it, I get no response. I want to keep it neat but make…
Tara E
1
vote
4 answers

Converting MS Office DOCX with good compatibility

After spending hours and hours on StackOverflow and programmers' forums, I've decided to use SyncFusion on our project. Our main goal is to convert existing DOC and DOCX files to PDF or print them directly. These documents can be quite complex (including…
sstassin
1
vote
1 answer

How to add a custom footer to PDFs created by Liferay DocumentConversionUtil (and OpenOffice)

I am trying to add a custom footer to PDFs created from DOCX files on my Liferay 6.2 installation. Specifically, I have linked up OpenOffice, and I am successfully converting the documents from DOCX to PDF to embed them in my portal, but I want to…
1
vote
0 answers

LibreOffice (4.4.3) headless PDF conversion issue for some MS Word documents

I am able to convert most Word documents (doc & docx) to PDF on Windows: "soffice.exe" --headless --convert-to pdf --outdir "C:\Ok" "C:\Ok\Test_Original.doc" But a few documents are not getting converted, and I see the following intermediate…
pingu
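
For anyone hitting the same thing across many files, here is a minimal Python sketch that runs the same headless command once per document and collects the ones that fail to produce a PDF. It only uses the flags quoted in the question (`--headless`, `--convert-to pdf`, `--outdir`). The function names, the `timeout` value, and the "output PDF exists" check are assumptions made for illustration, not part of LibreOffice or of the original post.

```python
import subprocess
from pathlib import Path


def build_convert_command(soffice, source, outdir):
    """Return the argument list for one headless PDF conversion.

    Mirrors the command line quoted in the question; `soffice` is the
    path to the LibreOffice executable (hypothetical parameter name).
    """
    return [
        str(soffice),
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(outdir),
        str(source),
    ]


def convert_all(soffice, sources, outdir, timeout=120):
    """Convert each file in turn and return the sources that failed.

    A source counts as failed if the process cannot be started, exits
    with a non-zero status, times out, or leaves no PDF in `outdir`.
    """
    failed = []
    for source in sources:
        cmd = build_convert_command(soffice, source, outdir)
        try:
            subprocess.run(cmd, check=True, timeout=timeout)
        except (subprocess.CalledProcessError,
                subprocess.TimeoutExpired,
                OSError):
            failed.append(source)
            continue
        # Assumed convention: the output keeps the source's base name.
        expected = Path(outdir) / (Path(source).stem + ".pdf")
        if not expected.exists():
            failed.append(source)
    return failed
```

Running one document per process makes it easier to see which input is the problem, at the cost of starting LibreOffice repeatedly.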
1
vote
2 answers

How does Apache commons IO convert my XML header from UTF-8 to UTF-16?

I’m using Java 6. I have an XML template, which begins like so However, I notice when I parse and output it with the following code (using Apache Commons-io 2.4) … Document doc = null; InputStream in…
Dave A
1
vote
4 answers

Which PHP API or library is the best for converting from HTML to PDF and DOCX?

First, I tried to use Cloudconvert. It can convert between many file types, but its PHP API causes memory leaks almost all the time. The second I tried was Pdfcrowd. It works perfectly, but it can only convert HTML to PDF. The third I tried was…
aleskva
1
vote
0 answers

Formatting lost after converting PDF file to DOCX file

I am using the following code snippet to convert a PDF file into an MS Word document. import java.io.FileOutputStream; import org.apache.poi.xwpf.usermodel.BreakType; import org.apache.poi.xwpf.usermodel.XWPFDocument; import…
Bhagyesh Jain
1
vote
1 answer

Convert Word doc to ASPX?

Is there a simple way to do this that preserves formatting?
zsharp
1
vote
1 answer

Which is the best approach (JODConverter + OpenOffice or Apache POI HWPF + iText) to convert Microsoft Word to PDF in Java?

In my application I have to send automatic emails to the customer when the customer's status changes. I need to attach a document to that email, which should be in PDF format. I have to create this attached PDF document from an existing Word…
SRy
1
vote
5 answers

Command line software to batch convert TIFF to indexable PDF

I need a utility to batch convert TIFF files to indexable PDFs. The software needs to run on Linux and must work from the command line. The software does not need to be open source. I've tried the conversion using tesseract and hocr2pdf, however…
William Seemann
1
vote
2 answers

Doc conversion using OpenOffice SDK

I need to allow users to export their .doc files (which they upload) to a variety of formats. I got started with the OO SDK, and I also set up some custom filters using XSLT. Everything works well and I am able to export Word docs…
philly77