My goal is to parse a TableOfContents element in a Google Document and write it to another document. I want to do this for every document in a folder. I went to the bother of converting each document to the type generated by DocsList just so I could use this method (which a document generated by DocumentApp does not have; I don't understand why, because otherwise the two 'documents' are similar when it comes to finding parts), only to find that what I get back is a SearchResult. How is this elusive construction used? I've tried converting it into a TableOfContents element (ele = searchResult.asTableOfContents()), which does not error out, but nothing I try lets me parse through its child elements to recover their text. Interestingly enough, if you get a TableOfContents element by iterating through the document's paragraphs, that does let you parse the TOC.

Would someone speak to this question? I would really appreciate a code snippet, because I'm getting nowhere and I have put some hours into this.

Mogsdad
Mona Everett
  • Your question isn't very clear, would you be able to edit it to clarify? Since it takes hours to get to where you are, could you also share YOUR code as a starting point? – Mogsdad Aug 20 '13 at 15:52

2 Answers

The asTableOfContents() method is only there to help the editor's autocomplete function. It has no run-time impact, and cannot be used to cast to a different type. (See ContainerElement documentation.)

To parse the table of contents, start by retrieving the element from the SearchResult. Below is an example that goes through the items in a document's table of contents to produce an array of item information.

Example Document

[Screenshot of the example document]

Parsing results

On a simple document with a few headings and a table of contents, here's what it produced:

[13-08-20 16:31:56:415 EDT] 
[
  {text=Heading 1.0, linkUrl=#heading=h.50tkhklducwk, indentFirstLine=18.0, indentStart=18.0},
  {text=Heading 1.1, linkUrl=#heading=h.ugj69zpoikat, indentFirstLine=36.0, indentStart=36.0},
  {text=Heading 1.2, linkUrl=#heading=h.xb0y0mu59rag, indentFirstLine=36.0, indentStart=36.0},
  {text=Heading 2.0, linkUrl=#heading=h.gebx44eft4kq, indentFirstLine=18.0, indentStart=18.0}
]
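The indentStart values carry the heading hierarchy: in the output above, top-level entries sit at 18 points and second-level entries at 36. Here is a minimal sketch of turning that into a level number. It assumes indentation grows by a fixed step per level (18 points matches this sample, but may not hold for every document or TOC style), and addTocLevels is a hypothetical helper, not part of DocumentApp.

```javascript
// Hypothetical helper: derives a 1-based nesting level for each TOC item
// from its indentStart, measured relative to the smallest indent in the list.
// Assumes a fixed indent step per level (18 pt matches the sample output).
function addTocLevels(items, step) {
  step = step || 18;
  if (items.length === 0) return [];

  var indents = items.map(function (item) {
    return Number(item.indentStart) || 0;   // a missing indent counts as 0
  });
  var base = Math.min.apply(null, indents);

  return items.map(function (item, i) {
    return {
      text: item.text,
      linkUrl: item.linkUrl,
      level: Math.round((indents[i] - base) / step) + 1
    };
  });
}
```

Applied to the four items logged above, this gives levels 1, 2, 2, 1. Measuring from the smallest indent rather than from zero means it should also cope with documents whose top-level entries are not indented at all.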

Code

function test_parseTOC() {
  var fileId = '--Doc-ID--';
  Logger.log( parseTOC( fileId ) );
}

function parseTOC( docId ) {
  var contents = [];
  var doc = DocumentApp.openById(docId);

  // Define the search parameters.
  var searchElement  = doc.getBody();
  var searchType = DocumentApp.ElementType.TABLE_OF_CONTENTS;

  // Search for TOC. Assume there's only one.
  var searchResult = searchElement.findElement(searchType);

  if (searchResult) {
    // TOC was found
    var toc = searchResult.getElement().asTableOfContents();

    // Parse all entries in TOC. The TOC contains child Paragraph elements,
    // and each of those has a child Text element. The attributes of both
    // the Paragraph and Text combine to make the TOC item functional.
    var numChildren = toc.getNumChildren();
    for (var i=0; i < numChildren; i++) {
      var itemInfo = {};
      var tocItem = toc.getChild(i).asParagraph();
      var tocItemText = tocItem.getChild(0).asText();

      // Set itemInfo attributes for this TOC item, first from Paragraph
      itemInfo.text = tocItem.getText();                // Displayed text
      itemInfo.indentStart = tocItem.getIndentStart();  // TOC Indentation
      itemInfo.indentFirstLine = tocItem.getIndentFirstLine();
      // ... then from child Text
      itemInfo.linkUrl = tocItemText.getLinkUrl();      // URL Link in document
      contents.push(itemInfo);
    }
  }

  // Return array of objects containing TOC info
  return contents;
}
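
The code above assumes the document holds a single table of contents. If there might be more than one, findElement() also accepts the previous SearchResult as a second argument and resumes the search from there. A rough sketch, where findAllElements is a hypothetical helper name and the container is expected to be a Body or other ContainerElement:

```javascript
// Sketch: collect every element of the given type, not just the first.
// In Apps Script, elementType would be
// DocumentApp.ElementType.TABLE_OF_CONTENTS and container a document Body.
function findAllElements(container, elementType) {
  var found = [];
  var result = container.findElement(elementType);

  while (result) {
    // As in parseTOC, the element must be pulled out of the SearchResult.
    found.push(result.getElement());
    // Passing the previous result resumes the search after that match.
    result = container.findElement(elementType, result);
  }
  return found;
}
```

Each returned element can then be passed through asTableOfContents() and walked the same way as in parseTOC.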

Bad news

The bad news is that you are limited in what you can do to a table of contents from a script. You cannot insert a TOC or add new items to an existing one.

See Issue 2502 in the issue tracker, and star it for updates.

If you can post code or explain your issue with DocsList vs DocumentApp, it could be looked at. The elements of a Google Document can only be manipulated via DocumentApp.
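
For the "every document in a folder" part of the question, here is one possible shape, offered as a sketch rather than a tested solution. It uses DriveApp (which replaced the since-retired DocsList service) only to list the files, and DocumentApp for everything else. The names copyFolderTocs, folderId, targetDocId and parseTocFn are assumptions made for illustration; parseTocFn is expected to behave like the parseTOC(docId) function above.

```javascript
// Sketch only: folderId and targetDocId are placeholders, and parseTocFn is
// assumed to return an array of {text, indentStart, ...} objects like parseTOC.
function copyFolderTocs(folderId, targetDocId, parseTocFn) {
  var targetBody = DocumentApp.openById(targetDocId).getBody();
  var files = DriveApp.getFolderById(folderId)
      .getFilesByType(MimeType.GOOGLE_DOCS);
  var copied = 0;

  while (files.hasNext()) {
    var file = files.next();
    if (file.getId() === targetDocId) continue;  // don't summarize the target itself

    // One heading per source document, followed by its TOC entries as text.
    targetBody.appendParagraph(file.getName())
        .setHeading(DocumentApp.ParagraphHeading.HEADING2);

    parseTocFn(file.getId()).forEach(function (item) {
      var para = targetBody.appendParagraph(item.text);
      if (item.indentStart != null) {
        para.setIndentStart(item.indentStart);   // keep the TOC indentation
      }
    });
    copied++;
  }
  return copied;  // number of source documents processed
}
```

A call would look like copyFolderTocs('--Folder-ID--', '--Target-Doc-ID--', parseTOC). The link URLs are deliberately not copied: they are heading fragments that only resolve inside the source document.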

Mogsdad
  • one should read the documentation more often... It's a chance for us that you do ! thanks for all this, +1 – Serge insas Aug 20 '13 at 21:13
  • This doesn't work for me anymore. In my case, the linkUrl fields are all 'null'. I updated the TOC as well. If I hover on the TOC entries, it does show me a link to the heading, so I don't know what's wrong. – Sujay Phadke May 14 '20 at 06:03
  • Ok I got it. It does work. But you need to insert the ToC as the second option (insert ToC > blue links). If you insert with page numbers, it doesn't work, even though it shows links when you hover on the ToC items. – Sujay Phadke May 16 '20 at 07:38

I modified the above code to re-create the TOC inside a table, keeping only the desired levels (e.g. h1 and h2). The only caveat is that the TOC must already be present in the document and up to date before running this.

function findToc(body, level = 2) {
  const indent = 18;
  let contents = [];

  const tocType = DocumentApp.ElementType.TABLE_OF_CONTENTS;
  const tocContainer = body.findElement(tocType);

  if (tocContainer) {
    // TOC was found
    const toc = tocContainer.getElement().asTableOfContents();
    const totalLines = toc.getNumChildren();

    for (let lineIndex = 0; lineIndex < totalLines; lineIndex++) {
      const tocItem = toc.getChild(lineIndex).asParagraph();
      const { INDENT_START } = tocItem.getAttributes();

      const isDesiredLevel = Number(INDENT_START) <= indent * (level - 1);

      if (isDesiredLevel) {
        contents.push(tocItem.copy());
      }
    }

  }

  return contents;
}

// Appends a one-cell table to the body and copies each TOC paragraph into it.
// The body is passed in so the document is opened only once, in parseTOC.
function addToTable(body, paragraphs) {
  const table = body.appendTable();
  const tr = table.insertTableRow(0);
  const td = tr.insertTableCell(0);

  paragraphs.forEach(paragraph => {
    td.appendParagraph(paragraph);
  });
}

function parseTOC(docId) {
  const body = DocumentApp.openById(docId).getBody();
  const contents = findToc(body);
  addToTable(body, contents);
}
techmsi