With LibreOffice, I have designed and written a text document (ODT format). Now I want to find certain placeholders programmatically and replace them with text from a database.
I know there are some ODT libraries for PHP, but as ODT files are just ZIP files containing XML files (among others), I think this should be possible with basic PHP and without any libraries, shouldn't it?
So I've written a short script which unzips the ODT file, modifies the content.xml and then zips the folder again. You can see the full code below.
The unzip, replace, and zip steps work when I perform them manually, but not when I let the PHP script below do the work. LibreOffice then tells me that it cannot open the document and offers to repair it (which does not work either).
Are there any special requirements that I need to pay attention to? Do I have to modify any meta files apart from the content.xml?
if (unzipFolder('Template.odt', 'temp')) {
    $source = file_get_contents('temp'.DIRECTORY_SEPARATOR.'content.xml');
    $source = str_replace('XXXplaceholder1XXX', 'Example Value #1', $source);
    $source = str_replace('XXXplaceholder2XXX', 'Example Value #2', $source);
    file_put_contents('temp'.DIRECTORY_SEPARATOR.'content.xml', $source);
    zipFolder('temp', 'output/Document.odt');
}

function unzipFolder($zipInputFile, $outputFolder) {
    $zip = new ZipArchive;
    $res = $zip->open($zipInputFile);
    if ($res === true) {
        $zip->extractTo($outputFolder);
        $zip->close();
        return true;
    }
    else {
        return false;
    }
}

function zipFolder($inputFolder, $zipOutputFile) {
    if (!extension_loaded('zip') || !file_exists($inputFolder)) {
        return false;
    }
    $zip = new ZipArchive();
    // open() returns true on success and an integer error code otherwise
    if ($zip->open($zipOutputFile, ZipArchive::CREATE) !== true) {
        return false;
    }
    $inputFolder = str_replace('\\', DIRECTORY_SEPARATOR, realpath($inputFolder));
    if (is_dir($inputFolder) === true) {
        $files = new RecursiveIteratorIterator(new RecursiveDirectoryIterator($inputFolder), RecursiveIteratorIterator::SELF_FIRST);
        foreach ($files as $file) {
            $file = str_replace('\\', DIRECTORY_SEPARATOR, $file);
            // skip the "." and ".." entries
            if (in_array(substr($file, strrpos($file, '/')+1), array('.', '..'))) {
                continue;
            }
            $file = realpath($file);
            if (is_dir($file) === true) {
                $dirName = str_replace($inputFolder.DIRECTORY_SEPARATOR, '', $file.DIRECTORY_SEPARATOR);
                $zip->addEmptyDir($dirName);
            }
            else if (is_file($file) === true) {
                $fileName = str_replace($inputFolder.DIRECTORY_SEPARATOR, '', $file);
                $zip->addFromString($fileName, file_get_contents($file));
            }
        }
    }
    else if (is_file($inputFolder) === true) {
        $zip->addFromString(basename($inputFolder), file_get_contents($inputFolder));
    }
    return $zip->close();
}
Edit #1: The code above does not even work if you just unzip and re-zip the contents of the ODT file, i.e. if you comment out all the data manipulation. Is something wrong with the format of PHP's ZipArchive output?
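To narrow down what differs in the ZipArchive output, the entries of the original template and of the generated file could be listed side by side. The following is only a minimal sketch: listZipEntries is a hypothetical helper name that is not part of the script above, and it assumes the zip extension is loaded.

```php
<?php
// Hypothetical helper (not part of the original script): returns the entries
// of a ZIP/ODT archive in the order in which they are stored.
function listZipEntries(string $zipFile): array
{
    $zip = new ZipArchive();
    if ($zip->open($zipFile) !== true) {
        throw new RuntimeException("Cannot open archive: $zipFile");
    }

    $entries = [];
    for ($i = 0; $i < $zip->numFiles; $i++) {
        // statIndex() returns name, size, comp_method etc. for one entry
        $stat = $zip->statIndex($i);
        $entries[] = [
            'name'       => $stat['name'],
            // true if the entry is compressed, false if it is merely stored
            'compressed' => $stat['comp_method'] !== ZipArchive::CM_STORE,
        ];
    }

    $zip->close();
    return $entries;
}
```

Calling print_r(listZipEntries('Template.odt')) and print_r(listZipEntries('output/Document.odt')) would then show any differences in entry names, entry order, or compression between the working and the broken archive.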
Edit #2: More specifically, it is the zipFolder(...) function that breaks everything. You can let PHP do the unzipping, and the string manipulation (str_replace(...)) works fine as well, but when the zipFolder(...) function creates the archive, it cannot be opened, whereas it works fine if you create the archive manually (e.g. with 7-Zip).
Edit #3: I even got it working just by replacing the re-zipping part in PHP with a call to 7-Zip via exec(...). So the problem is definitely creating a proper ZIP archive here. For better portability and fewer dependencies, it would be better, of course, if the solution with PHP's ZipArchive worked and we didn't need 7-Zip.
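For reference, the 7-Zip workaround looks roughly like the sketch below. The function names build7zCommand and zipFolderWith7z are hypothetical, and the sketch assumes that a 7z executable is reachable on the PATH; the exact command line used may differ.

```php
<?php
// Hypothetical helper: builds the 7-Zip command line for re-zipping a folder.
// Assumption: a `7z` executable is available on the PATH.
function build7zCommand(string $inputFolder, string $zipOutputFile): string
{
    // `a` adds files to an archive, `-tzip` forces the ZIP format.
    // The trailing wildcard adds the folder's contents rather than the folder
    // itself, so the entries are stored without the folder name as a prefix.
    return '7z a -tzip ' . escapeshellarg($zipOutputFile) . ' '
        . escapeshellarg($inputFolder . DIRECTORY_SEPARATOR . '*');
}

// Hypothetical replacement for zipFolder(...) that delegates to 7-Zip.
function zipFolderWith7z(string $inputFolder, string $zipOutputFile): bool
{
    exec(build7zCommand($inputFolder, $zipOutputFile), $output, $exitCode);
    // 7-Zip exits with code 0 when no error occurred
    return $exitCode === 0;
}
```

In the script above, zipFolder('temp', 'output/Document.odt') would then become zipFolderWith7z('temp', 'output/Document.odt'). Note that `7z a` updates an existing archive, so a stale output file should be deleted beforehand.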