Get the first hyperlink and its text value

Question

I hope everyone is in good health and condition. Recently, I have been working on Google Docs hyperlinks using Apps Script and learning along the way. I was trying to get all hyperlinks and edit them, and for that I found an amazing piece of code in this post. I have read the code multiple times and now have a good understanding of how it works.
My confusion
My confusion is about the recursion in this code. I am familiar with the concept of recursive functions, but when I try to modify the code to get only the first hyperlink from the document, I cannot see how to achieve that without breaking the recursive function.
Here is the code that I am trying:

/**
 * Get an array of all LinkUrls in the document. The function is
 * recursive, and if no element is provided, it will default to
 * the active document's Body element.
 *
 * @param {Element} element The document element to operate on.
 *
 * @returns {Array}         Array of objects, viz.
 *                              {element,
 *                               startOffset,
 *                               endOffsetInclusive, 
 *                               url}
 */
function getAllLinks(element) {
  var links = [];
  element = element || DocumentApp.getActiveDocument().getBody();
  
  if (element.getType() === DocumentApp.ElementType.TEXT) {
    var textObj = element.editAsText();
    var text = element.getText();
    var inUrl = false;
    for (var ch=0; ch < text.length; ch++) {
      var url = textObj.getLinkUrl(ch);
      if (url != null) {
        if (!inUrl) {
          // We are now!
          inUrl = true;
          var curUrl = {};
          curUrl.element = element;
          curUrl.url = String( url ); // grab a copy
          curUrl.startOffset = ch;
        }
        else {
          curUrl.endOffsetInclusive = ch;
        }          
      }
      else {
        if (inUrl) {
          // Not any more, we're not.
          inUrl = false;
          links.push(curUrl);  // add to links
          curUrl = {};
        }
      }
    }
    if (inUrl) {
      // in case the link ends on the same char that the element does
      links.push(curUrl); 
    }
  }
  else {
    var numChildren = element.getNumChildren();
    for (var i=0; i<numChildren; i++) {
      links = links.concat(getAllLinks(element.getChild(i)));
    }
  }

  return links;
}

I tried adding

if (links.length > 0){
     return links;
}

but it does not stop the function: because it is recursive, it returns to its previous calls and continues running. Here is the test document, along with its script, that I am working on.
https://docs.google.com/document/d/1eRvnR2NCdsO94C5nqly4nRXCttNziGhwgR99jElcJ_I/edit?usp=sharing

I hope you understand what I am trying to convey. Thanks for taking a look at my post. Stay happy :D

Tanaike · Accepted Answer · 2021-01-06T06:49:56.137

I believe your goal is as follows.

You want to retrieve the 1st link and the text of link from the shared Document using Google Apps Script.
You want to stop the recursive loop when the 1st element is retrieved.

Modification points:

I tried adding

  if (links.length > 0){
       return links;
  }

but it does not stop the function: because it is recursive, it returns to its previous calls and continues running.

Unfortunately, I could not tell where in your script you put this check. In this case, I think the loops need to stop as soon as links has a value, and the link text also needs to be retrieved. So, how about modifying it as follows? I modified 3 parts of your script.

Modified script:

function getAllLinks(element) {
  var links = [];
  element = element || DocumentApp.getActiveDocument().getBody();
  
  if (element.getType() === DocumentApp.ElementType.TEXT) {
    var textObj = element.editAsText();
    var text = element.getText();
    var inUrl = false;
    for (var ch=0; ch < text.length; ch++) {

      if (links.length > 0) break; // <--- Added

      var url = textObj.getLinkUrl(ch);
      if (url != null) {
        if (!inUrl) {
          // We are now!
          inUrl = true;
          var curUrl = {};
          curUrl.element = element;
          curUrl.url = String( url ); // grab a copy
          curUrl.startOffset = ch;
          curUrl.endOffsetInclusive = ch; // so a one-character link still gets a valid end offset
        }
        else {
          curUrl.endOffsetInclusive = ch;
        }          
      }
      else {
        if (inUrl) {
          // Not any more, we're not.
          inUrl = false;

          curUrl.text = text.slice(curUrl.startOffset, curUrl.endOffsetInclusive + 1); // <--- Added

          links.push(curUrl);  // add to links
          curUrl = {};
        }
      }
    }
    if (inUrl) {
      // in case the link ends on the same char that the element does
      curUrl.text = text.slice(curUrl.startOffset); // the link runs to the end of the element
      links.push(curUrl); 
    }
  }
  else {
    var numChildren = element.getNumChildren();
    for (var i=0; i<numChildren; i++) {

      if (links.length > 0) { // <--- Added  or if (links.length > 0) break;
        return links;
      }

      links = links.concat(getAllLinks(element.getChild(i)));
    }
  }

  return links;
}

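In this case, I think that `if (links.length > 0) {return links;}` can be replaced with `if (links.length > 0) break;`. Both give the same result, because after `break` the function reaches `return links;` on its last line.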
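As a rough illustration of why the check is needed inside the loop over the children, here is a minimal sketch of a depth-first search that stops at the first match. It runs on plain objects rather than DocumentApp elements, and the names `findFirst`, `isMatch`, and `tree` are hypothetical, made up only for this example. Each call reports its result to its caller, and the caller stops visiting the remaining siblings as soon as a child returns something.

```javascript
// Minimal sketch: depth-first search that stops at the first match.
// Plain objects stand in for document elements (an assumption for illustration).
function findFirst(node, isMatch) {
  if (isMatch(node)) return node;
  const children = node.children || [];
  for (const child of children) {
    const found = findFirst(child, isMatch);
    // Without this check, the loop would keep visiting the remaining siblings.
    if (found !== null) return found;
  }
  return null; // nothing found in this subtree
}

const tree = {name: 'body', children: [
  {name: 'p1', children: [{name: 't1'}]},
  {name: 'p2', children: [
    {name: 't2', url: 'https://example.com/a'},
    {name: 't3', url: 'https://example.com/b'},
  ]},
  {name: 'p3', children: [{name: 't4', url: 'https://example.com/c'}]},
]};

// Record every node that gets inspected, to show where the search stops.
const visited = [];
const first = findFirst(tree, (n) => {
  visited.push(n.name);
  return Boolean(n.url);
});

console.log(first.name);           // t2
console.log(visited.join(' > '));  // body > p1 > t1 > p2 > t2
```

The nodes t3, p3 and t4 are never inspected, which is the same effect the added checks have in the modified script.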

Note:

By the way, when the Google Docs API is used, both the links and the text can also be retrieved by a simple script as follows. To use this, please enable the Google Docs API at Advanced Google services.

  function myFunction() {
    const doc = DocumentApp.getActiveDocument();
    const res = Docs.Documents.get(doc.getId()).body.content.reduce((ar, {paragraph}) => {
      if (paragraph && paragraph.elements) {
        paragraph.elements.forEach(({textRun}) => {
          if (textRun && textRun.textStyle && textRun.textStyle.link) {
            ar.push({text: textRun.content, url: textRun.textStyle.link.url});
          }
        });
      }
      return ar;
    }, []);
    console.log(res);  // You can retrieve the 1st link and its text with console.log(res[0]).
  }
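If only the first link is needed, the same response can be scanned with plain loops and an early return instead of collecting every link. The following is a sketch under some assumptions: `firstLinkFromDocsResponse` and `sampleDoc` are hypothetical names, and `sampleDoc` is a hand-written object that only mimics the `body.content[].paragraph.elements[].textRun` shape used above, so that the function can be tried outside Apps Script. Like the script above, it looks only at top-level paragraphs and not inside tables.

```javascript
// Sketch: return the first linked text run from a Docs API document resource,
// or null when there is none. Only top-level paragraphs are scanned.
function firstLinkFromDocsResponse(docResource) {
  for (const {paragraph} of docResource.body.content) {
    if (!paragraph || !paragraph.elements) continue;
    for (const {textRun} of paragraph.elements) {
      if (textRun && textRun.textStyle && textRun.textStyle.link) {
        // Returning here leaves both loops at once.
        return {text: textRun.content, url: textRun.textStyle.link.url};
      }
    }
  }
  return null;
}

// Hand-written sample that mimics the response shape (an assumption for illustration).
const sampleDoc = {body: {content: [
  {sectionBreak: {}},
  {paragraph: {elements: [{textRun: {content: 'plain text\n', textStyle: {}}}]}},
  {paragraph: {elements: [
    {textRun: {content: 'first', textStyle: {link: {url: 'https://example.com/1'}}}},
    {textRun: {content: 'second', textStyle: {link: {url: 'https://example.com/2'}}}},
  ]}},
]}};

const firstLink = firstLinkFromDocsResponse(sampleDoc);
console.log(firstLink); // { text: 'first', url: 'https://example.com/1' }

// In Apps Script, with the Google Docs API enabled at Advanced Google services:
// const doc = DocumentApp.getActiveDocument();
// console.log(firstLinkFromDocsResponse(Docs.Documents.get(doc.getId())));
```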

OK, thanks, the code is giving the desired output, but I still couldn't understand how you managed to break out of the recursive loop. Also, you mentioned that we may use `if (links.length > 0) break;` instead of `if (links.length > 0) {return links;}`. What is the difference between the two? Thanks :D — abdulsamad, Jan 06 '21 at 11:45
@abdulsamad Thank you for replying. I apologize for the inconvenience and my poor English skill. In your script, there are 2 loops. So I thought that these are required to be stopped when 1 value is retrieved. For example, when one of `if (links.length > 0) break;` is removed, the links are retrieved from one paragraph. In your sample Document, 3 links are retrieved. So I proposed above modification by using 2 `if (links.length > 0) break;`. — Tanaike, Jan 06 '21 at 12:12
@abdulsamad And, when `if (links.length > 0) {return links;}` is used, the loop is finished here. When `if (links.length > 0) break;` is used, `return links;` at the last line is used. So the same result can be obtained. I apologize for my poor English skill again. — Tanaike, Jan 06 '21 at 12:12
Ohh I understand it now, also I am currently understanding the code you tried with document API, it is much faster but also new to me but I am getting hold of it. Also I clearly understand your English and thanks for all clarification. Stay Happy always :) — abdulsamad, Jan 06 '21 at 13:13
One more question: where can I find the complete documentation for the Google Docs API? For example, I want to set a new URL for the selected hyperlink text. Is there a function in textRun to do that? If yes, where did you find it? Kindly share the source. — abdulsamad, Jan 06 '21 at 13:23
https://stackoverflow.com/questions/65610308/google-docs-api-complete-documentation-hyperlink-issue @Tanaike — abdulsamad, Jan 07 '21 at 10:09
