If you use the built-in print-to-PDF feature of Mac OS X for an MS Word document with multiple sections, you will get multiple PDFs, one per section. You will then have to combine them manually using a tool like Adobe Acrobat, PDFLab or Combine PDFs.
Hitting the “Preview” button instead of “Save to PDF”, as explained in this tip, might work; unfortunately it didn’t for me.
Having to merge the per-section PDFs with the tools mentioned above is admittedly convenient enough and a minor issue, but it does get old if you have to do it often. (Actually, I just wanted something to tinker with :-) I thought this would be a perfect job for Mac OS X’s PDF Workflow feature, so I wrote this little AppleScript. In case you didn’t know, PDF Workflows let you send PDF files from the standard print dialog directly to your own code, which can be written in AppleScript or in UNIX scripting languages like Perl, bash or Python. Apple has some example code.
I compiled the following code in my AppleScript editor and saved it in compiled script form as “~/Library/PDF Services/Merge Word Sections.scpt”. Now I can choose it from the PDF drop down menu in the print dialog. It will churn for a bit and produce a PDF with all sections neatly merged into one file on the desktop. It picks up the output file name from the document automatically, replacing the extension with “.pdf”.
Because Word runs a separate print job for each section, each with a fresh invocation of this script, there is no clean way to find out whether the current print job is the first or a later one in a series. The script simply checks for an existing output file that has been modified in the last 10 seconds. If it finds one, the current print job is appended to it; otherwise it creates or overwrites the output file, and the following job in the series will append to it. What this means is that you don’t have a lot of time to confirm the dialogs Word sometimes pops up when it switches sections, for example when it warns you about narrow page margins.
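The append-or-overwrite decision described above can be sketched in Python roughly like this. It is only an illustration, not the script from this post: the names `output_path_for`, `should_append` and `FRESH_WINDOW_SECONDS` are made up for the example, and it assumes that the print job title is the document's file name.

```python
import os
import time

# The 10-second window described above; raise it if Word needs
# longer between sections.
FRESH_WINDOW_SECONDS = 10


def output_path_for(title, out_dir):
    """Derive the merged PDF's path from the print job title.

    Assumption: the title is the document's file name,
    e.g. "report.doc" becomes "<out_dir>/report.pdf".
    """
    stem, _ext = os.path.splitext(os.path.basename(title))
    return os.path.join(out_dir, stem + ".pdf")


def should_append(path, window=FRESH_WINDOW_SECONDS, now=None):
    """Return True if `path` exists and was modified within `window` seconds."""
    if now is None:
        now = time.time()
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        # No output file yet: this is the first job of a series.
        return False
    return (now - mtime) <= window
```

A job that sees `should_append(...)` return `True` would add its pages to the existing file; any other job would start the output file from scratch. Making the window a parameter keeps it easy to lengthen for slow documents.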
The script is unfortunately a bit ugly because I embedded a Python script for the actual PDF merge. I embedded it to make the AppleScript totally self-contained. I originally kept the Python script in my home directory’s “bin” directory as “pdfmerge.py”. In case you’re interested in the Python code alone, this is what it looks like:
Now that everything works, let’s rip it all apart and rewrite it completely in Python
I originally wrote this PDF workflow in AppleScript because it used some GUI / user interaction features that are no longer present in the version above. Without the need for AppleScript’s GUI features, we can just as well write everything in Python, which is a lot cleaner:
You can store this in a .py file in the same location, e.g. “Merge Word Sections.py” and it will show up in the print dialog PDF services popup menu.
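For orientation, here is a minimal sketch of how a Python script in “PDF Services” might read its inputs. It assumes that the print system passes three arguments in this order: the job title, the print options, and the path of the spooled PDF. That order is my understanding of the convention and is not taken from this post, so check it against Apple's example code. `PrintJob` and `parse_job` are hypothetical names.

```python
from collections import namedtuple

# Hypothetical container for what the print dialog hands to the script.
PrintJob = namedtuple("PrintJob", ["title", "options", "pdf_path"])


def parse_job(argv):
    """Split a PDF service's command line into its three assumed parts."""
    if len(argv) < 4:
        raise ValueError("expected: script title options pdf_path")
    # argv[0] is the script itself; the order of the rest is an assumption.
    return PrintJob(title=argv[1], options=argv[2], pdf_path=argv[3])
```

A script would call `parse_job(sys.argv)` once at startup, then use `title` to name the output file and `pdf_path` as the pages to merge.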
Thanks! What a fantastic solution. Had to put the timer on 30secs though because it took over 10secs between some sections I used. Still, that was pretty damn easy due to your explanation. So, once again, many many thanks!
This tool is great. Is there any reason why a landscape section sandwiched between two portrait sections would disappear? Any help greatly appreciated.
Thanks,
Harold
So, if I have 3 Word documents (e.g. f1.doc, f2.doc, and f3.doc), how can I merge them into a single PDF or Word document?
nifty; I had no idea you could merge PDFs by importing CoreGraphics right into python. Thanks!