Does PhpPowerpoint have the capacity to get the text from an existing pptx? I want to read the text of an existing pptx into PHP. Is there a way to achieve this?

2 Answers

Yes, it does.

PhpPresentation, formerly PHPPowerPoint, has several readers: PowerPoint2007, PowerPoint97 and ODPresentation. These readers let you extract shapes along with their content and formatting.
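As a rough sketch of how that can look in practice, the snippet below loads a pptx with the PowerPoint2007 reader, walks every slide, and collects the text of each text shape into an array keyed by slide number. It assumes PhpPresentation was installed with Composer; the helper name `collectShapeTexts`, the autoload path, and the file name `example.pptx` are placeholders made up for illustration, and only text shapes (`RichText`) and groups are handled.

```php
<?php
// Assumption: PhpPresentation was installed via Composer (phpoffice/phppresentation).
require __DIR__ . '/vendor/autoload.php';

use PhpOffice\PhpPresentation\IOFactory;
use PhpOffice\PhpPresentation\Shape\Group;
use PhpOffice\PhpPresentation\Shape\RichText;

/**
 * Hypothetical helper: returns the plain text of every text shape in
 * $shapes, descending into grouped shapes.
 */
function collectShapeTexts(iterable $shapes): array
{
    $texts = [];
    foreach ($shapes as $shape) {
        if ($shape instanceof Group) {
            // A group holds its own shape collection, so recurse into it.
            $texts = array_merge($texts, collectShapeTexts($shape->getShapeCollection()));
        } elseif ($shape instanceof RichText) {
            $text = trim($shape->getPlainText());
            if ($text !== '') {
                $texts[] = $text;
            }
        }
    }
    return $texts;
}

$reader = IOFactory::createReader('PowerPoint2007');
$presentation = $reader->load(__DIR__ . '/example.pptx'); // placeholder path

$textBySlide = [];
foreach ($presentation->getAllSlides() as $index => $slide) {
    // Slides are indexed from 0; store them under a 1-based slide number.
    $textBySlide[$index + 1] = collectShapeTexts($slide->getShapeCollection());
}

print_r($textBySlide);
```

Shapes of other types (tables, charts, images) are skipped here; they would need their own handling if their text matters.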

Progi1984
  • What type of functions do I need to use? I don't have much background in PhpPowerpoint. Thank you :))) – Maria Gabriel Jul 13 '16 at 08:03
  • Also, how can I link the existing pptx? Do I need to specify the path? If so, how can I do it? Pardon me for having so many questions; I really do appreciate your answers. – Maria Gabriel Jul 13 '16 at 08:20
  • @MariaGabriel: Look at Sample12: https://github.com/PHPOffice/PHPPresentation/blob/develop/samples/Sample_12_Reader_PowerPoint2007.php – Progi1984 Jul 13 '16 at 08:28
  • Oh, thank you! One last thing: is there a function getText() or something similar that gets all the text of the existing pptx? – Maria Gabriel Jul 13 '16 at 08:54
  • No, you need to iterate over each slide and, in each slide, call the toString function of each shape – Progi1984 Jul 13 '16 at 12:50
  • Thanks! I am planning to get all the text on each slide of the existing pptx and put it in an array. Is that possible? And how can I do it? I'm very sorry, I only have a little knowledge of PhpPowerpoint. – Maria Gabriel Jul 13 '16 at 23:42

Here is a procedure that works even with Unicode characters. It converts the line breaks in the pptx text to CRLF so the output can be read. Once done (the output is saved as AllUniB.TXT), open the file in Word, choose the Unicode encoding, clean up the paragraph breaks (replace ^l with ^p, then ^p^p with ^p) and you're ready to go.

To convert, add this code to a macro in the pptx and run it:

Sub ExportTextUnicodeBin()
  Dim oPres As Presentation
  Dim oSlides As Slides
  Dim oSld As Slide         'Slide Object
  Dim oShp As Shape         'Shape Object
  Dim iFile As Integer      'File handle for output
  iFile = FreeFile          'Get a free file number
  Dim adoStream As ADODB.Stream

  Dim PathSep As String
  Dim FileNum As Integer
  Dim sTempString As String
  Dim bytes() As Byte

  #If Mac Then
    PathSep = ":"
  #Else
    PathSep = "\"
  #End If

  Set adoStream = New ADODB.Stream
  Set oPres = ActivePresentation
  Set oSlides = oPres.Slides

  FileNum = FreeFile

  'Open output file
  ' NOTE:  errors here if file hasn't been saved
  '  Open oPres.Path & PathSep & "AllText.TXT" For Output As FileNum
  adoStream.Charset = "Unicode" 'or any string listed in registry HKEY_CLASSES_ROOT\MIME\Database\Charset

  'Open stream
  adoStream.Open
  adoStream.Type = adTypeBinary

  For Each oSld In oSlides    'Loop thru each slide
    ' Include the slide number (the number that will appear in slide's
    ' page number placeholder; you could also use SlideIndex
    ' for the ordinal number of the slide in the file
    bytes = StrConv("Slide:" & vbTab & CStr(oSld.SlideNumber) & vbCrLf, vbFromUnicode)
    adoStream.Write bytes

    'Print #iFile, "Slide:" & vbTab & CStr(oSld.SlideNumber)

    For Each oShp In oSld.Shapes                'Loop thru each shape on slide
      'Check to see if shape has a text frame and text
      If oShp.HasTextFrame And oShp.TextFrame.HasText Then
        If oShp.Type = msoPlaceholder Then
            Select Case oShp.PlaceholderFormat.Type
                Case Is = ppPlaceholderTitle, ppPlaceholderCenterTitle
                    bytes = StrConv("Title:" & vbTab & Strings.Replace(oShp.TextFrame.TextRange, vbCr, vbCrLf) & vbCrLf, vbFromUnicode)
                    adoStream.Write bytes
                Case Is = ppPlaceholderBody
                    bytes = StrConv("Body:" & vbTab & Strings.Replace(oShp.TextFrame.TextRange, vbCr, vbCrLf) & vbCrLf, vbFromUnicode)
                    adoStream.Write bytes
                Case Is = ppPlaceholderSubtitle
                    bytes = StrConv("SubTitle:" & vbTab & Strings.Replace(oShp.TextFrame.TextRange, vbCr, vbCrLf) & vbCrLf, vbFromUnicode)
                    adoStream.Write bytes
                Case Else
                    bytes = StrConv("Other Placeholder:" & vbTab & Strings.Replace(oShp.TextFrame.TextRange, vbCr, vbCrLf) & vbCrLf, vbFromUnicode)
                    adoStream.Write bytes
            End Select
        Else

            bytes = StrConv("NoS:" & vbTab & Strings.Replace(oShp.TextFrame.TextRange, vbCr, vbCrLf) & vbCrLf, vbFromUnicode)
            adoStream.Write bytes
        End If  ' msoPlaceholder
      Else  ' it doesn't have a textframe - it might be a group that contains text so:
        If oShp.Type = msoGroup Then
            sTempString = TextFromGroupShape(oShp)
            If Len(sTempString) > 0 Then
                bytes = StrConv("Group: " & vbTab & Strings.Replace(sTempString, vbCr, vbCrLf) & vbCrLf, vbFromUnicode)
                adoStream.Write bytes
            End If
        End If
      End If    ' Has text frame/Has text

    Next oShp
  Next oSld

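  'Close output file
  'Close #iFile
  'adSaveCreateOverWrite replaces AllUniB.TXT if it already exists;
  'without it, SaveToFile raises an error on a second run
  adoStream.SaveToFile oPres.Path & PathSep & "AllUniB.TXT", adSaveCreateOverWrite

  adoStream.Close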

End Sub


Function TextFromGroupShape(oSh As Shape) As String
' Returns the text from the shapes in a group
' and recursively, text within shapes within groups within groups etc.

    Dim oGpSh As Shape
    Dim sTempText As String

    If oSh.Type = msoGroup Then
        For Each oGpSh In oSh.GroupItems
            With oGpSh
                If .Type = msoGroup Then
                    sTempText = sTempText & TextFromGroupShape(oGpSh)
                Else
                    If .HasTextFrame Then
                        If .TextFrame.HasText Then
                            sTempText = sTempText & "(Gp:) " & .TextFrame.TextRange.Text & vbCrLf
                        End If
                    End If
                End If
            End With
        Next
    End If

    TextFromGroupShape = sTempText

NormalExit:
    Exit Function

Errorhandler:
    Resume Next

End Function

Remember to add a reference to the ADODB library (Microsoft ActiveX Data Objects Library, under Tools > References in the VBA editor) or you'll get an error when you run the macro.