3

I am trying to replace strings in a Word document by reading the file into a variable $content and then using str_ireplace() to change the string. I can read the content from the file, but str_ireplace() does not seem to be able to replace the string. I assumed it would because the function is 'binary safe' according to the PHP documentation. Sorry, I am a beginner with PHP file manipulation, so all this is quite new to me.

This is what I have written.

copy('jack.doc' , 'newFile.doc');
$handle = fopen('newFile.doc','rb');
$content = '';

while (!feof($handle))
{
    $content .= fread($handle, 1);
}
fclose($handle);

$handle = fopen('newFile.doc','wb');
$content = str_ireplace('USING_ICT_BOX', 'YOUR ICT CONTENT', $content);
fwrite($handle, $content);
fclose($handle);

When I download the new file, it opens as it should in MS Word but it shows the old string and not the one that should be replaced.

Can I fix this issue? Is there any better tool I can use for replacing strings in MS Word through PHP?
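
One possible explanation, stated here as an assumption rather than a confirmed diagnosis: the binary .doc format can store text either as single-byte characters or as UTF-16LE, where every ASCII character is followed by a NUL byte. Such text still looks readable when echoed or viewed in a text editor, but a plain ASCII search string will not match it. The sketch below uses two hypothetical helpers (`asciiToUtf16le` and `countNeedleEncodings`, neither of which comes from the original post) to check which form a placeholder appears in.

```php
<?php
// Hypothetical helper: encode an ASCII-only string as UTF-16LE by
// following every character with a NUL byte.
function asciiToUtf16le(string $ascii): string
{
    return implode("\0", str_split($ascii)) . "\0";
}

// Hypothetical helper: count how often the needle occurs in each encoding.
// Note that substr_count() is case-sensitive, unlike str_ireplace().
function countNeedleEncodings(string $content, string $needle): array
{
    return [
        'single_byte' => substr_count($content, $needle),
        'utf16le'     => substr_count($content, asciiToUtf16le($needle)),
    ];
}

// Synthetic stand-in for file contents: the placeholder is stored as UTF-16LE.
$sample = "header" . asciiToUtf16le('USING_ICT_BOX') . "trailer";
print_r(countNeedleEncodings($sample, 'USING_ICT_BOX'));
```

Running `countNeedleEncodings(file_get_contents('newFile.doc'), 'USING_ICT_BOX')` on the real file would show which encoding applies. Even when a match is found, replacing text with a string of a different length may invalidate offsets stored elsewhere in the .doc file, so a same-length replacement is the safer experiment.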

axiomer
Joshua Bambrick
  • You won't be able to replace contents in a DOC file this way. DOC is a proprietary file format that doesn't expose clear text like this. Can you use a more recent variation of the MS Word format (i.e. docx)? It's probably possible there *somehow* because it's XML based (it won't be as easy as this though) – Pekka Feb 27 '12 at 18:23
  • I looked at the file in a plain text editor and can see that text strings are stored in plain text. The same is not true of .docx – Joshua Bambrick Feb 27 '12 at 18:30
  • Also when I echo `$content` I can see the text string I want to replace. – Joshua Bambrick Feb 27 '12 at 18:31
  • @Josh'Bambi'Bambrick - what happens when you write back to your file? Is your doc not corrupted? – afuzzyllama Feb 27 '12 at 18:36
  • no it works fine. It's just that the text has not been replaced – Joshua Bambrick Feb 27 '12 at 18:37
  • Even if it may be possible under some circumstances, it's really, really unsafe to replace plain text in Word documents. Handling a docx file would be much better - it's a renamed ZIP file with XML contents in it, so replacing isn't trivial but there are libraries for it. But to address your issue: Can you try using a different file name for the target file? – Pekka Feb 27 '12 at 18:42
  • I actually closed the file and reopened it in write mode (remember I said I was a noob :)). The problem with .docx is that my application is for teachers who (in Northern Ireland - where I am) all have Office 2003 on their school computers – Joshua Bambrick Feb 27 '12 at 18:43
  • @Pekka I am taking a look at PHPWord www.phpword.codeplex.com - do you think that will work? – Joshua Bambrick Feb 27 '12 at 19:04
  • @Josh I'm not sure: it seems to be a *generating* class only, not one for replacing values in an existing document. It describes a "templating" feature that may do what you need but it doesn't go into any more detail so it's impossible to tell without a closer look – Pekka Feb 27 '12 at 19:06
  • that may actually be exactly what I am looking for – Joshua Bambrick Feb 27 '12 at 19:12
  • If I could convert to and from .doc this would be good - I think teachers can open .docx but the computers definitely don't allow saving to .docx – Joshua Bambrick Feb 27 '12 at 19:12
  • google docs allows you to open .doc and save to .docx - I assume there is no way to use that functionality from PHP? – Joshua Bambrick Feb 27 '12 at 19:15

6 Answers

2

I had the same requirement to edit a Word file using PHP, and I found a solution for it. Note that it works on .docx files, which are ZIP archives, and not on binary .doc files. I have written a post about it: http://www.onlinecode.org/update-docx-file-using-php/

// This approach only works for .docx files, which are ZIP archives;
// a binary .doc file cannot be opened with ZipArchive.
copy('jack.docx', 'newFile.docx');
$full_path = 'newFile.docx';
$zip_val = new ZipArchive();

// open() returns true on success and an integer error code on failure,
// so the result must be compared strictly.
if ($zip_val->open($full_path) === true)
{
    // In the Open XML word-processing format, the document body is stored
    // in the document.xml file located in the word directory.
    $key_file_name = 'word/document.xml';
    $message = $zip_val->getFromName($key_file_name);

    $timestamp = date('d-M-Y H:i:s');

    // Replace the placeholders with actual values.
    $message = str_replace("client_full_name",      "onlinecode org",       $message);
    $message = str_replace("client_email_address",  "ingo@onlinecode.org",  $message);
    $message = str_replace("date_today",            $timestamp,             $message);
    $message = str_replace("client_website",        "www.onlinecode.org",   $message);
    $message = str_replace("client_mobile_number",  "+1999999999",          $message);

    // Replace the old content with the new content created above.
    $zip_val->addFromString($key_file_name, $message);
    $zip_val->close();
}
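
The same idea can be wrapped in a reusable function. The following is only a sketch under a few assumptions: the function names `fillPlaceholders` and `replaceInDocx` are hypothetical, the zip extension is available, and each placeholder sits in the XML as one unbroken string (Word sometimes splits text across several runs, in which case a plain string replacement will not find it). Values are XML-escaped so that characters such as `&` or `<` cannot break document.xml.

```php
<?php
// Hypothetical helper: substitute placeholders in an XML string, escaping
// each value so that characters such as & and < cannot break the markup.
function fillPlaceholders(string $xml, array $values): string
{
    $escaped = [];
    foreach ($values as $placeholder => $value) {
        $escaped[$placeholder] = htmlspecialchars($value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
    }
    // strtr() replaces all keys in one pass without re-scanning inserted text.
    return strtr($xml, $escaped);
}

// Hypothetical wrapper: apply fillPlaceholders() to word/document.xml
// inside a .docx file. Returns false if any step fails.
function replaceInDocx(string $path, array $values): bool
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {   // open() returns an error code on failure
        return false;
    }
    $xml = $zip->getFromName('word/document.xml');
    if ($xml === false) {
        $zip->close();
        return false;
    }
    $zip->addFromString('word/document.xml', fillPlaceholders($xml, $values));
    return $zip->close();
}

// Example on a plain XML fragment (no archive needed):
echo fillPlaceholders('<w:t>client_full_name</w:t>', ['client_full_name' => 'Smith & Sons']);
```

A call such as `replaceInDocx('newFile.docx', ['date_today' => date('d-M-Y H:i:s')])` would then update a copy of the template in place.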
JON
0

If you can reach a web service, look at Docmosis Cloud services, since it can mail-merge a doc file with your data and give you back a doc, PDF or other format. You make the request with an HTTPS POST to the service, so it is pretty straightforward from PHP.

Paul Jowett
0

There are many ways to handle Word document files on Linux:

  1. antiword - not very effective, as it only converts to plain text.
  2. PyODConverter
  3. OpenOffice or LibreOffice - through UNO
  4. unoconv utility - needs installation permission on the server

There is a Python script that is the most usable for online file conversion, but you need to convert the files through the command line.

There is no specific, satisfactory solution for handling Word files using only PHP code.

I searched for a long time before arriving at this suggestion.
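
As an illustration of the command-line route, here is a minimal PHP sketch that shells out to LibreOffice in headless mode. It assumes LibreOffice is installed on the server and that its `soffice` binary is on the PATH, which may not hold on shared hosting; the function names `buildConvertCommand` and `convertDocument` are hypothetical.

```php
<?php
// Hypothetical helper: build the shell command for a headless LibreOffice
// conversion. Assumes a `soffice` binary is available on the PATH.
function buildConvertCommand(string $input, string $outDir, string $format = 'docx'): string
{
    return sprintf(
        'soffice --headless --convert-to %s --outdir %s %s',
        escapeshellarg($format),
        escapeshellarg($outDir),
        escapeshellarg($input)
    );
}

// Hypothetical wrapper: run the conversion and check that the output file,
// which keeps the input's base name with the new extension, was created.
function convertDocument(string $input, string $outDir, string $format = 'docx'): bool
{
    exec(buildConvertCommand($input, $outDir, $format), $output, $exitCode);
    $expected = rtrim($outDir, '/') . '/' . pathinfo($input, PATHINFO_FILENAME) . '.' . $format;
    return $exitCode === 0 && is_file($expected);
}

// Show the command that would be run for a sample file.
echo buildConvertCommand('jack.doc', '/tmp'), PHP_EOL;
```

Calling `convertDocument('jack.doc', '/tmp')` would attempt a .doc to .docx conversion; passing `'doc'` as the third argument with a .docx input would go the other way.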

Prabhu
0

Maybe this would point you to the right direction: http://davidwalsh.name/read-pdf-doc-file-php

Churk
  • thanks, that's clever. I took a look but I don't think I can use that tool to write to a .doc and unfortunately I am not able to install new software onto the server I use – Joshua Bambrick Feb 27 '12 at 18:31
  • This won't help the OP - antiword can turn a DOC file into plain text, but not the other way round. @Josh – Pekka Feb 27 '12 at 18:36
0

Solutions I've found so far (not tested though):
Docvert - works for Doc, free, but not directly usable
PHPWordLib - works for Doc, not free
PHPDocX - DocX only, needs Zend.

axiomer
  • None of these can replace a placeholder in a Word document – Pekka Feb 27 '12 at 18:37
  • Clever. Unfortunately, I cannot install any software onto the server I use. – Joshua Bambrick Feb 27 '12 at 18:39
  • Pekka: PHPDocX can generate XHTML, in which you can do the replacement, and it can then convert the result back as well. But that sadly works only with DocX, as I said already. I doubt there are free and easy solutions for this for Doc in PHP. – axiomer Feb 27 '12 at 18:39
0

I am going to opt for PHPWord (www.phpword.codeplex.com), as I believe teachers are going to get Office 2007 next year. I will also try to find some way to convert between .docx and .doc through PHP to support them in the meantime.

Joshua Bambrick