Questions tagged [doc]

Questions about the old Microsoft Word file format and how to use it.

The "doc" format is a proprietary file format that Microsoft Word used as its default format for text documents from the 1980s until 2007. Since the release of Word 2007, "docx" has been the default format.

All questions about the older (pre-.docx) Word files should use this tag. Questions about .docx files can use this tag, but should use "docx" primarily. Questions about .odf, .txt, or .rtf files should not use this tag.

Wikipedia article on "doc": http://en.wikipedia.org/wiki/DOC_%28computing%29

Documentation of the compound document file format: http://sc.openoffice.org/compdocfileformat.pdf
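
Since the tag separates the older binary "doc" container from the newer "docx" one, a minimal sketch of telling them apart by their leading bytes may be useful. It assumes the well-known signatures: binary .doc files are OLE2 compound documents starting with D0 CF 11 E0 A1 B1 1A E1, and .docx files are ZIP archives starting with "PK\x03\x04". The function names are hypothetical, and a match only identifies the container (an OLE2 file could also be .xls or .ppt, and a ZIP could be any archive), so treat the result as a hint rather than proof.

```python
# Leading-byte signatures (assumed from the OLE2 compound document and ZIP formats).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # binary .doc, .xls, .ppt, ...
ZIP_MAGIC = b"PK\x03\x04"                        # .docx, .odt, plain .zip, ...


def sniff_container(header: bytes) -> str:
    """Return 'ole2', 'zip', or 'unknown' based on the first bytes of a file."""
    if header.startswith(OLE2_MAGIC):
        return "ole2"
    if header.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read only the first 8 bytes of the file at `path` and classify them."""
    with open(path, "rb") as handle:
        return sniff_container(handle.read(8))
```

A result of "ole2" suggests the file is a candidate for a binary .doc parser, and "zip" suggests trying a .docx parser; anything else is probably not a Word document at all.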

851 questions
6
votes
2 answers

PageBreak in converting HTML to doc and docx format - PHP

I have converted HTML to DOC format using PHP. Kindly view the screenshot below. My problem is that I need to print the 1st table on the first page and the 2nd table on the second page. Here is my code. header("Content-type:…
Fero
  • 12,969
  • 46
  • 116
  • 157
6
votes
1 answer

What Linux/Unix software to use to convert html or pdf to doc?

I need to convert CSS-styled (X)HTML or PDF to doc as accurately as possible and do it on Linux (and if possible also on Mac) from the CLI. Unfortunately OpenOffice can't handle the layout. Is there any such software or library, commercial or free?…
jcmunt
6
votes
5 answers

Convert html to doc in java

I would like to convert either an html or xhtml document (preferably with styles) to Microsoft .doc and/or .docx format. There seem to be plenty of examples for doing this the other way around but I haven't found any useful examples for converting…
Edd
  • 8,402
  • 14
  • 47
  • 73
6
votes
1 answer

Getting "Ole::Storage::FormatError: OLE2 signature is invalid" when trying to get content out of a Word doc

I'm using Rails 5. I want to get text out of a Word document (.doc) so I'm using this code text = nil MSWordDoc::Extractor.load(file_location) do |contents| text = contents.whole_contents end but I'm getting the…
Dave
  • 15,639
  • 133
  • 442
  • 830
6
votes
3 answers

Parse .doc & .docx to get all text using golang?

How can I parse word documents ".doc", ".docx" to get all the text using golang?
Alexander Barac
  • 125
  • 1
  • 1
  • 5
6
votes
5 answers

How to convert ODT to DOC/RTF without openoffice.org

Is there any way to convert odt documents to doc or rtf on Linux without OpenOffice or any library that relies on having OpenOffice installed?
ionelmc
  • 720
  • 2
  • 10
  • 29
6
votes
3 answers

How to read or copy text from .docx/.odt/.doc files

In my application, I want to read a document file (.doc or .odt or .docx) and store that text in a string. For that, I am using the code below: string text; using (var streamReader = new StreamReader(@"D:\Sample\Demo.docx",…
User805
  • 113
  • 1
  • 1
  • 11
6
votes
2 answers

Determine if the document is DOC or DOCX in Java app without knowing its extension

There is a constraint in the content management system that requires all Word documents to be stored with a specific extension (different from DOC or DOCX). However, when outputting the document to the user we need to know if it is a DOC or DOCX file in order…
Andriy
  • 61
  • 1
  • 2
6
votes
7 answers

How to load text of MS Word document in C# (.NET)?

How do I load an MS Word document (.doc and .docx) into memory (a variable) without doing this?: wordApp.Documents.Open I don't want to open MS Word, I just want the text inside. You gave me an answer for DOCX, but what about DOC? I want free and high…
Skuta
  • 5,830
  • 27
  • 60
  • 68
6
votes
2 answers

Programmatically convert docx file to doc

What options do I have to convert .docx documents to .doc document programmatically using C#? I'm looking to do this as cheaply as possible. Ideally I want to do this directly in code via libraries within the .net framework or via a well establish…
Peanut
  • 18,967
  • 20
  • 72
  • 78
6
votes
3 answers

Convert doc to pdf using Apache POI

I am trying to convert doc to pdf using Apache POI, but the resulting pdf document contains only text; it does not keep any formatting such as images, table alignment, etc. How can I convert doc to pdf while keeping all formatting such as tables, images,…
user1710922
6
votes
3 answers

C#/ASP.NET - Get thumbnail from PDF/DOC files

I have an ASP.NET WebForms application (written in C#) that allows users to upload files using the FileUpload control. What'd be great is if I could automatically generate thumbnails from files. Images such as JPG/PNG are trivial of course, but…
Chris
  • 7,415
  • 21
  • 98
  • 190
6
votes
2 answers

Mark a checkbox as checked in a word (.docx) form

I'm using ruby/nokogiri to parse a Word form and fill in the fields. I've already managed to fill in the text fields, but I'm having difficulty checking a checkbox. I've looked at the document.xml and didn't notice any different tags when the checkbox is…
Adriano Bacha
  • 1,114
  • 2
  • 13
  • 22
6
votes
2 answers

Java - Convert doc/docx file to chm file

I have an idea of converting Word document (.doc/.docx) files to Help file (.chm) format. I want to use Java for the conversion of files. My formula is simple. To make the Table of Contents page and other links in word document, as package explorer or…
Avadhani Y
  • 7,566
  • 19
  • 63
  • 90
5
votes
1 answer

Can I convert between .doc and .docx using PHP only?

I have searched for this on SO but all posts relating to the issue seem to require the installation of software (like the Zend framework or PHPdocx) on the server - which I am not able to do. I need to be able to read and update text in templates from…
Joshua Bambrick
  • 2,669
  • 5
  • 27
  • 35