1

Background:
I am the developer of IguanaTex, a PowerPoint add-in to insert LaTeX displays/equations into PowerPoint on Windows and Mac.
Many Mac users use another software, LatexIt, to insert PDF generated from LaTeX into PowerPoint (and other applications); the PDF stores the LaTeX source as metadata in a rather complex way, and this metadata is still accessible in the EMF file that PowerPoint uses internally to store the PDF (which can be obtained by unzipping the .pptx file). It is also preserved in a PDF created by using "Save picture as PDF" on the Mac.
I would like to allow IguanaTex users on Windows to retrieve that LaTeX information so that they can modify slides created by LatexIt users on the Mac.

Issue:
I thought I could extract the internal EMF corresponding to the inserted PDF using "Save picture as .emf", and parse it. LatexIt's developer kindly prepared a Windows executable that can retrieve that information from an EMF file such as the one stored internally by PowerPoint.
Unfortunately, I realized that using "Save picture as .emf" on Windows to get an EMF file from the picture obtained by inserting a PDF on Mac did not lead to the same EMF file that PowerPoint uses internally, and the LatexIt metadata is lost in the process.

I am pessimistic, but does anyone see a way around this? Either to get access to the internal EMF file somehow, or save as EMF using some other procedure?

To clarify the situation: I have an open .pptx file on Windows potentially with dozens of pictures/shapes/etc in it, one selected shape in that file which I know is internally an EMF picture; how can I extract that EMF file using VBA?

leilu
  • 367
  • 3
  • 10
  • I have a Windows-only macro that can do this, but your target is macOS. On a Mac, you would have to use AppleScript to access the command line version of the zip utility in macOS (The GUI Archive Utility is buggy and can't unzip Office files). Then you can unzip the file and get the EMF. Very kludgey, unfortunately. – John Korchok Mar 08 '21 at 01:09
  • My target is actually Windows, so that would be great! On the Mac, I can save as PDF. In VBA, if I use .export to save as emf, I get the same problem as with "save as emf" (they're the same thing, right?). Could you share that macro? Also, I would need to do this while the presentation is open. Is the idea to create a new presentation with only that picture in it, save it, then rename, unzip, and extract the media file? – leilu Mar 08 '21 at 01:31
  • Windows can natively unzip files; When I search on "vba unzip" (w/o quote marks) Google's first answer points to Ron DeBruin's site, a goldmine of useful code and the best source of "How to make it work on the Mac" vba info I've run across. There are also commercial code libraries that can extract files from Zips. – Steve Rindsberg Mar 08 '21 at 15:13
  • Thanks for the pointer! It just hadn't occurred to me that extracting the file by unzipping while the presentation is open was possible, but if I guess correctly, the idea would be to create a new presentation with that file in it. I'll try that. I'd love to hear from John if that's what his Macro does. – leilu Mar 08 '21 at 19:03
  • The macro has to work on a closed file. If the file is open in PowerPoint, you could save a copy with VBA, then run the macro on that to extract the EMF. – John Korchok Mar 08 '21 at 21:24
  • Thanks, John, that's exactly what I ended up doing. I was able to implement the procedure I described above, but instead of relying on Ron DeBruin's unzipping code, which briefly shows a file copy dialog (as he mentions), I simply invoked the command line "unzip", which actually also allowed me to filter for the unique file I was looking for, "image1.emf". I will clean the code and post as answer. Thanks everyone!! – leilu Mar 08 '21 at 22:24
  • If you need to extract more emf content you can use [POIs utility class](http://poi.apache.org/components/slideshow/ppt-wmf-emf-renderer.html) with the `-extractEmbedded` option on the unziped .emf. – kiwiwings Mar 11 '21 at 06:59
  • Thanks for the pointer. I'm already all set on the analysis of the unzipped .emf itself. – leilu Mar 11 '21 at 15:50

1 Answers1

0

Here is code to extract the file in the presentation's internal storage corresponding to a selected picture Shape (msoPicture, here of type EMF, but it can be any supported type: JPG, PNG, GIF, ...):

Option Explicit

Sub ExtractShapeImageFromZip() 
    ' This can be easily modified to take input arguments:
    ' ExtractShapeImageFromZip(vSh As Shape, Optional ImageType As String = "EMF")
    
    Dim Sel As Selection
    Set Sel = Application.ActiveWindow.Selection
    Dim vSh As Shape
    Set vSh = Sel.ShapeRange(1)
    
    Dim ImageType As String
    ImageType = "EMF" ' <- change the type here
    Dim ImageExt As String
    ImageExt = "." & LCase$(ImageType)
    Dim ImageFilter As String
    ImageFilter = "ppt\media\image1" & ImageExt
    
    Dim StartFolder As String
    StartFolder = ActivePresentation.Path

    Dim FilePrefix As String
    FilePrefix = StartFolder & "\ExtractFromZip_tmp"
        
    ' Variables for the Shell execution call
    Dim TimeOutTimeString As String
    TimeOutTimeString = "20" ' Wait N seconds for the processes to complete
    Dim TimeOutTime As Long
    TimeOutTime = Val(TimeOutTimeString) * 1000
    Dim debugMode As Boolean
    debugMode = False
    Dim RetVal As Long
    
    Dim fs As Object
    Set fs = CreateObject("Scripting.FileSystemObject")
    
    If vSh.Type = msoPicture Then
        ' Copy/Paste shape to new presentation
        Dim NewPres As Presentation
        Set NewPres = Presentations.Add(msoFalse)
        Dim NewSlide As Slide
        Set NewSlide = NewPres.Slides.Add(Index:=1, Layout:=ppLayoutBlank)
        Dim NewShape As Shape
        vSh.Copy
        NewPres.Slides(1).Shapes.Paste
        NewPres.SaveAs (FilePrefix & ".pptx")
        NewPres.Close
        Set NewPres = Nothing
        fs.CopyFile FilePrefix & ".pptx", FilePrefix & ".zip", True
        fs.DeleteFile FilePrefix & ".pptx"
        RetVal& = Execute("unzip -o " & FilePrefix & ".zip" & " " & ImageFilter _
                            & " -d " & FilePrefix, StartFolder, debugMode, TimeOutTime)
        If fs.FileExists(FilePrefix & ".zip") Then
            fs.DeleteFile FilePrefix & ".zip"
        End If
        If fs.FileExists(FilePrefix & "\" & ImageFilter) Then
            fs.CopyFile FilePrefix & "\" & ImageFilter, FilePrefix & ImageExt
            Dim picPath As String
            picPath = FilePrefix & ImageExt
            MsgBox "File of type " & ImageType & " successfully extracted to " & picPath
        End If
        If fs.FolderExists(FilePrefix) Then
            fs.DeleteFolder FilePrefix
        End If
    End If
    
End Sub

To run unzip, the above code uses the Shell execution code as implemented in IguanaTex (remove the ClipBoard call, which is unnecessary here and comes from another module), largely borrowed from Terry Kreft's "Shell and Wait" .

The code can be easily modified to extract any media file by changing the filter used in the unzip command to "ppt\media\*.*", but then the cleanup to move files from down the folder tree becomes a bit more tedious.

leilu
  • 367
  • 3
  • 10