I'm trying to extract breadcrumbs (where a page has them) or generate them (where it does not) from URLs in a spreadsheet. For pages that have them, I'm just using IMPORTXML, with the XPath refined case by case:

=TEXTJOIN(" > ", 1,importxml(A66,"//*[@id='breadcrumbs']/div/div/ul",1))

Page Example

But I'm trying to see if there's a fast way to generate them for a given page where they are not available (like this one, but where the navigation path might not be visible in the link either). I was looking into this similar question. Is there a way to use it in Google Apps Script?

NightEye
Debora S.

1 Answer

That would be difficult, since you don't know specifically what to look for. Have you thought of a logic or algorithm that defines how the breadcrumb will be found? If not, this is nearly impossible. You could read about XmlService and try using it to navigate the site.
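One concrete rule that some sites do support is schema.org `BreadcrumbList` markup embedded as JSON-LD. The following is only a minimal sketch under that assumption; the function name `breadcrumbsFromJsonLd` is hypothetical, it takes the page HTML as a string, and it returns an empty string when no such markup is present (which will be the case for many pages).

```javascript
/**
 * Hypothetical helper: looks for a schema.org BreadcrumbList in the
 * JSON-LD <script> blocks of an HTML string and joins the item names.
 * Returns "" when nothing usable is found.
 */
function breadcrumbsFromJsonLd(html, separator) {
  var sep = separator || " > ";
  var scriptPattern =
    /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  var match;
  while ((match = scriptPattern.exec(html)) !== null) {
    var data;
    try {
      data = JSON.parse(match[1]);
    } catch (e) {
      continue; // Skip blocks that are not valid JSON.
    }
    // A block may hold one object or an array of objects.
    var nodes = [].concat(data);
    for (var i = 0; i < nodes.length; i++) {
      var node = nodes[i];
      if (!node) continue;
      // Some sites nest their entities under "@graph".
      if (Array.isArray(node["@graph"])) nodes = nodes.concat(node["@graph"]);
      if (node["@type"] === "BreadcrumbList" && Array.isArray(node.itemListElement)) {
        return node.itemListElement
          .slice()
          .sort(function (a, b) { return a.position - b.position; })
          .map(function (item) {
            // The name may sit on the ListItem or on its nested "item".
            return item.name || (item.item && item.item.name) || "";
          })
          .filter(function (name) { return name !== ""; })
          .join(sep);
      }
    }
  }
  return "";
}
```

In Apps Script the HTML string could come from `UrlFetchApp.fetch(url).getContentText()`; that call is left out of the sketch so the function stays a pure string-in, string-out helper.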

The problem is that websites don't follow a standard structure and vary from one another. Unless you are only looking at sites from a single domain (which is also a bit difficult), I don't think it would be easy or direct to create a general script for all sites.
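As an illustration of the simplest possible rule, here is a sketch of an Apps Script custom function that builds a trail from the URL path alone. It assumes the path segments mirror the site hierarchy, which is exactly what fails for URLs such as `viewArticle.jsp?page=...`; the function name `BREADCRUMBS_FROM_URL` and the leading "Home" crumb are hypothetical choices, not part of any library.

```javascript
/**
 * Hypothetical custom function: derives a breadcrumb trail from a URL path.
 * Assumes each path segment corresponds to one level of the site hierarchy.
 */
function BREADCRUMBS_FROM_URL(url, separator) {
  var sep = separator || " > ";
  // Drop scheme and host, then cut off the query string and fragment.
  var path = String(url)
    .replace(/^[a-z][a-z0-9+.-]*:\/\/[^\/?#]*/i, "")
    .split(/[?#]/)[0];
  var crumbs = path
    .split("/")
    .filter(function (segment) { return segment !== ""; })
    .map(function (segment) {
      var text = segment;
      try {
        text = decodeURIComponent(segment);
      } catch (e) {
        // Keep the raw segment if it is not valid percent-encoding.
      }
      // Strip a file extension and turn hyphens/underscores into spaces.
      text = text.replace(/\.[a-z0-9]{2,5}$/i, "").replace(/[-_]+/g, " ").trim();
      return text.charAt(0).toUpperCase() + text.slice(1);
    })
    .filter(function (text) { return text !== ""; });
  return ["Home"].concat(crumbs).join(sep);
}
```

Once saved in the spreadsheet's script project, it could be called from a cell as `=BREADCRUMBS_FROM_URL(A66)`. For a URL whose path carries no navigation information, the result is only as meaningful as the path itself.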

NightEye
  • I'm wondering more about URLs that don't have the path in the link, like this one: https://www.sss.gov.ph/sss/appmanager/viewArticle.jsp?page=sscommission, where the breadcrumb is not clear and you have to manually inspect the HTML elements to see which divs are open and so on – Debora S. Dec 15 '21 at 01:50