Coding a PDF Text Parser in Swift

Asked Jan 17 '17 at 15:34

Active Jul 26 '18 at 14:49

Viewed 407 times

I'm currently developing a PDF text parser written entirely in Swift. I was looking through PDFKitten's code and found this in the stringwithpdfstring method (in SimpleFont.m), which takes a CGPDFStringRef as its parameter.

    const unsigned char *bytes = CGPDFStringGetBytePtr(pdfString);
    NSUInteger length = CGPDFStringGetLength(pdfString);

    // Translate to Unicode
    for (int i = 0; i < length; i++)
    {
        unichar cid = bytes[i];
        unichar uni = [self.toUnicode unicodeCharacter:cid];
    }

As I understand it, *bytes is a CChar. What exactly is this method iterating through? When I translate this code to Swift, I get the error that Type UnsafePointer? has no subscript members. What is the equivalent of that Objective-C code in Swift?

asked Jan 17 '17 at 15:34

Michael Schmid

Code that reads from a file often works with bytes and byte pointers. This code simply points at the beginning of the data, loops over each byte, and decodes that byte into a Unicode character. – pbush25 Jan 17 '17 at 17:31
So it's basically just calling UnsafePointer's *advanced(by:)* method? – Michael Schmid Jan 17 '17 at 19:55
You will need to parse and embed CMaps (character maps). Here's a tutorial: https://pspdfkit.com/blog/2018/pdf-text-extraction/ – steipete Oct 20 '18 at 07:16
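
For what it's worth, here is a minimal sketch of how the loop from the question might look in Swift. In Swift, CGPDFStringGetBytePtr returns an optional UnsafePointer<UInt8>?, and an optional has no subscript, which is most likely where the error comes from. Once the pointer is unwrapped, base[i] works and reads the same byte as base.advanced(by: i).pointee. The function name decode and the ToUnicodeMap dictionary are hypothetical stand-ins (the dictionary replaces PDFKitten's toUnicode object), and falling back to the raw code when no mapping exists is an assumption, not PDFKitten's behaviour. The sample feeds in a plain byte array so it runs without Core Graphics.

```swift
// Hypothetical stand-in for PDFKitten's toUnicode lookup: maps a character
// code (CID) to a UTF-16 code unit. A real parser would build this table
// from the font's ToUnicode CMap.
typealias ToUnicodeMap = [UInt16: UInt16]

/// Decodes `length` bytes starting at `bytes`, treating each byte as one
/// character code, as the Objective-C loop in the question does.
func decode(bytes: UnsafePointer<UInt8>?, length: Int, toUnicode: ToUnicodeMap) -> String {
    // The pointer is optional in Swift, so unwrap it before reading from it.
    guard let base = bytes, length > 0 else { return "" }

    // UnsafeBufferPointer pairs the pointer with its length, which makes it
    // a Collection that can be iterated (or subscripted) safely.
    let buffer = UnsafeBufferPointer(start: base, count: length)

    var units: [UInt16] = []
    units.reserveCapacity(length)
    for byte in buffer {
        let cid = UInt16(byte)               // same widening as `unichar cid = bytes[i]`
        units.append(toUnicode[cid] ?? cid)  // assumption: fall back to the code itself
    }
    return String(decoding: units, as: UTF16.self)
}

// With Core Graphics, the inputs would come from:
//   let bytes = CGPDFStringGetBytePtr(pdfString)   // UnsafePointer<UInt8>?
//   let length = CGPDFStringGetLength(pdfString)   // Int
// Here a byte array stands in for the PDF string.
let sample: [UInt8] = [0x48, 0x01, 0x21]
let text = sample.withUnsafeBufferPointer { buf in
    decode(bytes: buf.baseAddress, length: buf.count, toUnicode: [0x01: 0x69])
}
print(text) // prints "Hi!"
```

Note that this only covers single-byte character codes, like the original loop. Fonts with multi-byte codes need the CMap handling mentioned in the last comment.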
