8

I am trying to merge two or more PostScript files into one. Simple concatenation does not work, because each PostScript file may have a different resource header.

Has anyone done this before? Are there any libraries (commercial or open source) for this? I do not mind C++, C# or even Java libraries.

Edit: These are large PostScript files (more than 200 MB), and they are intended only for color printing (not for online viewing).

Conclusion

  1. ps2write is not the answer, as it does not support DSC.
  2. pswrite, as pipitas has correctly pointed out, produces Level 1 (L1) output. It is not the solution.
  3. Using pdfwrite is workable. With this option, we convert the two PS files to one PDF and then convert the merged PDF back to PS. Some information may be lost during the conversion, and the extra conversion steps take additional resources and time.
  4. If we do not need to view the output file, concatenating the two PostScript files with the line "false 0 startjob pop" inserted between them is also a solution. (See also this link)

In conclusion, the interim solution for merging two PostScript files is option 3 or option 4.
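
For reference, option 3 can be sketched as a two-step Ghostscript run. This is only a minimal sketch under assumptions: the helper name `merge_via_pdf` and the `GS` variable are made up for illustration, and `ps2write` is assumed for the second step because the conclusion does not name the device used to go back to PostScript.

```shell
#!/bin/sh
# Sketch of option 3: PostScript inputs -> one merged PDF -> PostScript.
# GS names the Ghostscript executable; this variable is an assumption.
GS=${GS:-gs}

# merge_via_pdf OUT.ps IN.ps...   (hypothetical helper, not a Ghostscript tool)
merge_via_pdf() {
    out_ps=$1
    shift
    tmp_pdf=${out_ps%.ps}.merged.pdf
    # Step 1: interpret all inputs in order and write a single PDF.
    "$GS" -o "$tmp_pdf" -sDEVICE=pdfwrite "$@" || return 1
    # Step 2: convert the merged PDF back to PostScript (device is assumed).
    "$GS" -o "$out_ps" -sDEVICE=ps2write "$tmp_pdf" || return 1
    # Remove the intermediate PDF once both steps succeeded.
    rm -f "$tmp_pdf"
}
```

Usage would look like `merge_via_pdf merged.ps first.ps second.ps`, with `GS` set to the Ghostscript executable on your system.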

Syd

6 Answers

13

Here is an example Ghostscript command line that converts and merges two (or more) PostScript files into one PDF in one go:

 gswin32c.exe ^
   -o c:/path/to/output.pdf ^
   -sDEVICE=pdfwrite ^
   -dPDFSETTINGS=/screen ^
   [...more desired parameters (optional)...] ^
   /path/to/first.ps ^
   /path/to/second.ps ^
   /path/to/third.pdf

Edit: my first shot had falsely assumed PDF input files. It works of course with PostScript as well (or even a mix of PS/PDF)... And the output may also be PS.

Kurt Pfeifle
  • Hi pipitas. Thanks for the tips (I give you 1 point); however, I cannot use the pdf approach as it would need me to convert from postscript to pdfs and then merge pdfs back to postscript before converting the merged pdf to postscript. I have gone down this path before and it was very, very slow (the process did not complete; I stopped it after 6 hours). Note: I have very large postscript files. Are you aware of a pure postscript approach? – Syd Aug 10 '10 at 00:35
  • You've got quite a few "conversions" too many here: *would need me to convert from postscript to pdfs and then merge pdfs back to postscript before converting the merged pdf to postscript* ?? -- It also depends on your print service providers. Sometimes these guys **DO** prefer PDF (and PDF is smaller, since it is compressed), and sometimes they even have printers which consume PDF (not PostScript). -- If your process did not complete, maybe you didn't know the switches to tweak Ghostscript for performance and higher RAM allowances? – Kurt Pfeifle Aug 10 '10 at 00:50
  • @papitas.. typo in my previous comment. The pdf approach is 1) convert each postscript to pdf 2) merge the pdfs into one pdf 3) convert the merged pdf to postscript. – Syd Aug 10 '10 at 00:58
  • @papitas. Thanks for the updated answer. I remember I have tried this before, but I lost the high resolution in the merged file --> gswin32c.exe -o C:\Temp\pscript\test\output.ps -sDEVICE=pswrite %1 %2. I also remember setting the "-r" option, but the output size became far too large. Are there any parameters I should be using to keep the high res of the original document? – Syd Aug 10 '10 at 01:25
  • @Syd: `pswrite` is the PostScript device that by default may write out PostScript Level 1 data. And PS L1 doesn't know some high-level operators which Adobe added to the language later, at PS L2 and PS L3. Hence the bigger output sizes (e.g. by pixelization of some font types). You could try to add `-dLanguageLevel=3` to your command. (However, currently Ghostscript's level 3 generates the same output as its level 2 does...) – Kurt Pfeifle Aug 11 '10 at 12:46
  • @Syd: the `ps2write` output device also converts fonts into bitmap fonts if those are not available otherwise (e.g. if embedding is forbidden by the font's license). I don't have much time now, but later tonight I may be able to work out a complete command line with all the options that may be useful for you to minimize the PS output file size without compromising the print quality... – Kurt Pfeifle Aug 11 '10 at 12:57
  • @Papitas, I will mark your answer as the best answer. From my research, ps2write does not support DSC. pswrite, as you have pointed out, produces L1 output. The large file is caused by the rasterizing. At this point, there are two options: 1) use pdfwrite to convert the two ps to a PDF and then convert the merged PDF to a ps; 2) concatenate the two postscript files (with "false 0 startjob pop" between the files). Neither is ideal, but both will at least produce a working output. Until there is a better solution, your advice has given me at least an interim solution. Thank you very much :) – Syd Aug 19 '10 at 07:03
  • Hi pipitas, I am so sorry. I just realised that I have been mistyping your name all this while. My bad. My sincere apologies :) – Syd Aug 20 '10 at 00:47
4

Of course you can also merge various input files (PS, PDF or a mix of them) into one PostScript file. The next example command line includes one more tweaking parameter, which increases the RAM allowance for Ghostscript by 800 MB (provided your machine has that much memory):

 gswin32c.exe ^
   -o c:/path/to/output.ps ^
   -sDEVICE=ps2write ^
   [...more desired parameters (optional)...] ^
   -c "800000000 setvmthreshold" ^
   -f ^
   /path/to/first.ps ^
   /path/to/second.ps ^
   /path/to/third.ps

You should state which application created your PostScript files, and with what kind of settings. Only then can you expect more specific advice. Your PostScript files may, for example, include hi-res pictures (e.g. at 1200 dpi) whereas your print device may only be capable of 600 dpi. In that case downsampling to 600 dpi would make the files considerably smaller without necessarily imposing quality penalties.
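
As a rough illustration of the downsampling idea, the sketch below passes Ghostscript's image downsampling parameters while merging. The helper name `downsample_merge` and the `GS` variable are assumptions for illustration, and 600 dpi is just the figure from the example above. The sketch writes PDF via `pdfwrite`; whether the same parameters give the desired result with `ps2write` should be checked against the Ghostscript documentation for your version.

```shell
#!/bin/sh
# GS names the Ghostscript executable; this variable is an assumption.
GS=${GS:-gs}

# downsample_merge OUT.pdf DPI IN...   (hypothetical helper)
# Merges the inputs and asks Ghostscript to downsample color and gray
# images to the given target resolution.
downsample_merge() {
    out_file=$1
    dpi=$2
    shift 2
    "$GS" -o "$out_file" -sDEVICE=pdfwrite \
        -dDownsampleColorImages=true \
        -dColorImageDownsampleType=/Bicubic \
        -dColorImageResolution="$dpi" \
        -dDownsampleGrayImages=true \
        -dGrayImageDownsampleType=/Bicubic \
        -dGrayImageResolution="$dpi" \
        "$@"
}
```

A call such as `downsample_merge merged.pdf 600 first.ps second.ps` would then target 600 dpi for embedded images. Compare the printed result against the original before relying on it.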

Kurt Pfeifle
  • Thanks pipitas. Saw your answer with ps2write instead of pswrite. Add one more point to you for the second answer. I will try that at the next convenience. Oops. regarding your question, the application that produces the postscript is a third party vendor product. – Syd Aug 10 '10 at 01:29
  • @papitas - Added my comments below. – Syd Aug 10 '10 at 02:50
2

FYI I've found that this does not work right in one case - if any file but the first file has links in it, they will not be correct in the final merged PDF. In particular, if say the second PDF has a link to its second page, it will end up being a link to the second page of the merged document, which is not right...

Note that pdftk (which can be downloaded free) will get the links right.
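
Assuming the PostScript files have already been converted to individual PDFs, a pdftk merge might look like the sketch below. The helper name `merge_pdfs`, the `PDFTK` variable and the file names are placeholders for illustration.

```shell
#!/bin/sh
# PDFTK names the pdftk executable; this variable is an assumption.
PDFTK=${PDFTK:-pdftk}

# merge_pdfs OUT.pdf IN.pdf...   (hypothetical helper)
# Concatenates the input PDFs in the given order using pdftk's "cat" operation.
merge_pdfs() {
    out_pdf=$1
    shift
    "$PDFTK" "$@" cat output "$out_pdf"
}
```

For example, `merge_pdfs merged.pdf first.pdf second.pdf` runs `pdftk first.pdf second.pdf cat output merged.pdf`.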

1

Ghostscript on Linux comes with a Perl script called psmerge (which is installed into the /usr/bin directory). After some simple trials, it appears that this program takes resource definitions into account. It does rely on your PostScript programs strictly conforming to the Adobe DSC. The contents of the merge script are reproduced here with consideration to the license:

© Angus J. C. Duggan 1991–1995

#!/usr/bin/perl
eval 'exec perl -S $0 "$@"'
    if $running_under_some_shell;

# psmerge: merge PostScript files produced by same application and setup
# usage: psmerge [-oout.ps] file1.ps file2.ps ...
#
# Copyright (C) Angus J. C. Duggan 1991-1995
# See file LICENSE for details.

use strict;
$^W = 1;
my $prog = ($0 =~ m,([^/\\]*)$,) ? $1 : $0;
my $outfile = undef;

usage() unless @ARGV;

while ($ARGV[0] =~ /^-/) {
   $_ = shift;
   if (/^-o(.+)/) {
      $outfile = $1;
   } elsif (/^-t(horough)?$/) {
      # This doesn't do anything, but we leave it for backward compatibility.
   } else {
      usage();
   }
}

my $gs = find_gs();
if (defined $gs)
{
   # Just invoke gs
   $outfile = '/dev/stdout' unless defined $outfile;
   exec +(qw(gs -q -dNOPAUSE -dBATCH -sDEVICE=pswrite),
      "-sOutputFile=$outfile", '-f', @ARGV);
   die "$prog: exec /usr/bin/gs failed\n";
}
else
{
   warn +("$prog: /usr/bin/gs not found; falling back to old," .
      " less functional behavior\n");
}

if (defined $outfile)
{
   if (!close(STDOUT) || !open(STDOUT, ">$outfile")) {
      print STDERR "$prog: can't open $outfile for output\n";
      exit 1;
   }
}

my $page = 0;
my $first = 1;
my $nesting = 0;

my @header = ();
my $header = 1;

my @trailer = ();
my $trailer = 0;

my @pages = ();
my @body = ();

my @resources = ();
my $inresource = 0;

while (<>) {
   if (/^%%BeginFont:/ || /^%%BeginResource:/ || /^%%BeginProcSet:/) {
      $inresource = 1;
      push(@resources, $_);
   } elsif ($inresource) {
      push(@resources, $_);
      $inresource = 0 if /^%%EndFont/ || /^%%EndResource/ || /^%%EndProcSet/;
   } elsif (/^%%Page:/ && $nesting == 0) {
      $header = $trailer = 0;
      push(@pages, join("", @body)) if @body;
      $page++;
      @body = ("%%Page: ($page) $page\n");
   } elsif (/^%%Trailer/ && $nesting == 0) {
      push(@trailer, $_);
      push(@pages, join("", @body)) if @body;
      @body = ();
      $trailer = 1;
      $header = 0;
   } elsif ($header) {
      push(@trailer, $_);
      push(@pages, join("", @body)) if @body;
      @body = ();
      $trailer = 1;
      $header = 0;
   } elsif ($trailer) {
      if (/^%!/ || /%%EOF/) {
         $trailer = $first = 0;
      } elsif ($first) {
         push(@trailer, $_);
      }
   } elsif (/^%%BeginDocument/ || /^%%BeginBinary/ || /^%%BeginFile/) {
      push(@body, $_);
      $nesting++;
   } elsif (/^%%EndDocument/ || /^%%EndBinary/ || /^%%EndFile/) {
      push(@body, $_);
      $nesting--;
   }
}

print @trailer;

sub find_gs
{
   my $path = $ENV{'PATH'} || "";
   my @path = split(':', $path);
   foreach my $dir (@path)
   {
      return "$dir/gs" if -x "$dir/gs";
   }
   undef;
}

sub usage
{
   print STDERR "Usage: $prog [-oout] file...\n";
   exit 1;
}
dreamlax
  • Thanks for your response. psmerge is not available in the Windows world (it is not part of the ghost utilities). It is probably available in one of the Cygwin toolsets. Having said this, thanks for pointing out that strict conformance to the DSC format is required before it can be used (1 point). My research has shown that many users do not have much success with psmerge. Maybe I am better off with just the "false 0 startjob pop" command in between the postscript files as an interim solution. – Syd Aug 10 '10 at 04:53
  • @Syd: How about a simple `/saveobj save def` at the beginning of each document and `saveobj restore` at the end? I'm not sure whether that has an equivalent effect. – dreamlax Aug 10 '10 at 05:11
  • It does not make any difference compared to "false 0 startjob pop", but thanks for the suggestion (+1 to your comment). – Syd Aug 11 '10 at 22:42
  • psmerge can be obtained in ubuntu in the psutils package – stdcall Mar 29 '13 at 19:32
1

As OP mentioned in the question's conclusions, concatenating the files with the line

false 0 startjob pop

in between should do the trick. So in bash, one could write something like

mkdir -p merge
for ps in *.ps; do
    cat "$ps" >> merge/output.ps
    echo "false 0 startjob pop" >> merge/output.ps
done

However, as the question also mentions, this is only useful for printing (or PDF conversion); a viewer will probably fail to display all but the first PS file. Some more details can be found here.
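
A slightly more defensive variant of the loop above, as a sketch only: it writes the separator between files rather than after every file, and makes sure the separator starts on its own line. The function name `merge_ps` is hypothetical and not part of Ghostscript or psutils.

```shell
#!/bin/sh
# merge_ps OUT.ps IN.ps...   (hypothetical helper)
# Concatenates PostScript jobs, inserting "false 0 startjob pop"
# between consecutive files.
merge_ps() {
    out_ps=$1
    shift
    : > "$out_ps" || return 1          # create or truncate the output file
    first=1
    for ps in "$@"; do
        if [ "$first" -eq 0 ]; then
            # Job separator taken from the question's conclusion.
            printf '%s\n' 'false 0 startjob pop' >> "$out_ps"
        fi
        cat "$ps" >> "$out_ps" || return 1
        # If the file did not end with a newline, add one so the next
        # separator is not glued to the last PostScript token.
        if [ -n "$(tail -c 1 "$ps")" ]; then
            printf '\n' >> "$out_ps"
        fi
        first=0
    done
}
```

For example, `mkdir -p merge && merge_ps merge/output.ps *.ps` mirrors the loop above. As with that loop, the result is meant for printers or converters, not for viewers.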

Tobias Kienzler
  • It doesn't work for me; opening the concatenated document with `evince` shows only the first doc. Btw, congrats on your creation of the Physics SE. – peterh Jun 26 '18 at 18:27
  • @peterh Why thanks, I was merely in the right place at the right time (for once). I added a disclaimer to my answer - indeed this won't work with most viewers, but printers and PDF converters should be able to cope with it. Nonetheless the accepted answer will probably be more reliable; this one's just a quick hack I also only tested once... – Tobias Kienzler Jun 27 '18 at 14:22
1

I've been able to successfully merge 100+ PostScript files (1500+ pages) together using both the %%BeginDocument / %%EndDocument and the false 0 startjob pop methods.

The problem I'm having is that, when printing the merged file, the printer pauses for 20 to 45 seconds between the merged files.

Has anyone had similar issues?

Abe