I am upgrading a Chrome extension to MV3, where the background page becomes a service worker. According to this question, a service worker apparently has no access to any DOM element. I need to port this function, which takes an HTML snippet and returns its text content:

    const htmlStripper = document.createElement("template")
    const striphtml = html => {
        htmlStripper.innerHTML = html
        return htmlStripper.content.textContent
    }

I tried using DocumentFragment as well, but it is not available in a service worker either. Unlike in the other question, I do not have any foreground HTML page, so message passing is not an option either.

What is a solution for this? Besides this specific problem (it would be nice to solve this one), is there a generic solution for all the kinds of HTML processing we could do if we had access to a document?


For my specific case, this solution, borrowed from a C# answer, is good enough:

    const striphtml = html => {
        return html.replace(/<.*?>/g, "").trim();
    }

Be warned that this is not perfect: it only removes tags and does not decode HTML entities such as `&amp;`.
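To make the caveat concrete, here is a minimal sketch of a slightly more careful regex-only stripper that runs without any DOM. The names `stripHtmlLoosely`, `decodeEntities` and `NAMED_ENTITIES` are hypothetical and not from the original post, and the entity table is deliberately tiny. It is still not a real HTML parser.

```javascript
// Hypothetical helpers, not from the original post. Regex-only, no DOM needed.
// Assumption: only these few named entities matter for the input at hand.
const NAMED_ENTITIES = new Map([
    ["amp", "&"], ["lt", "<"], ["gt", ">"],
    ["quot", '"'], ["apos", "'"], ["nbsp", "\u00a0"],
]);

// Decode numeric entities and the small named set above in a single pass,
// so "&amp;lt;" becomes "&lt;" and is not decoded twice.
const decodeEntities = text =>
    text.replace(/&(#x[0-9a-f]+|#\d+|[a-z]+);/gi, (match, body) => {
        if (body[0] === "#") {
            const isHex = body[1] === "x" || body[1] === "X";
            const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
            // Leave out-of-range references untouched instead of throwing.
            return code <= 0x10ffff ? String.fromCodePoint(code) : match;
        }
        return NAMED_ENTITIES.has(body) ? NAMED_ENTITIES.get(body) : match;
    });

const stripHtmlLoosely = html => {
    const withoutTags = html
        .replace(/<!--[\s\S]*?-->/g, "")                            // comments
        .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, "")  // script/style bodies
        .replace(/<[^>]+>/g, "");                                   // remaining tags
    return decodeEntities(withoutTags).trim();
};

console.log(stripHtmlLoosely('<p>a &amp; b</p><script>x<y</script>')); // "a & b"
```

Dropping `script` and `style` bodies is a design choice of this sketch; the `template` version in the question would keep their text in `textContent`.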

Luke Vo
  • Use a `+` instead of `*` in the regexp to avoid replacing `<>`. – Akxe Sep 09 '22 at 07:49
  • @Akxe Thanks for the suggestion. I applied the fix to my code, though in my case the input is unlikely to contain it (just parsing a normal HTML request) – Luke Vo Sep 09 '22 at 08:17
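
The difference between the two quantifiers discussed in the comments can be checked directly. The sketch below compares the pattern from the question, the `+` variant from the comment, and a negated character class; the last one is an assumption of this example and is not mentioned in the thread. `strip` is a hypothetical helper.

```javascript
const lazyStar = /<.*?>/g;   // pattern from the question
const lazyPlus = /<.+?>/g;   // variant suggested in the comment
const negated = /<[^>]+>/g;  // alternative, not from the thread

// Hypothetical helper: remove every match of `pattern` from `text`.
const strip = (pattern, text) => text.replace(pattern, "");

// A lone "<>" is removed by `*` but kept by the other two.
console.log(strip(lazyStar, "a <> b")); // "a  b"
console.log(strip(lazyPlus, "a <> b")); // "a <> b"
console.log(strip(negated, "a <> b"));  // "a <> b"

// When a real tag follows "<>", the lazy `+` runs on to the next ">"
// and removes the text in between as well.
console.log(strip(lazyPlus, "x <> y <b>z</b>")); // "x z"
console.log(strip(negated, "x <> y <b>z</b>"));  // "x <> y z"
```

The negated class also matches tags that contain a line break, which `.` does not without the `s` flag.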

1 Answer

Use an offscreen document.
For now, you can only test this in Chrome Canary.
First create a browser shortcut to Chrome Canary with the command-line option `--enable-features=ExtensionsOffscreenDocuments`

    // Background service worker. Top-level await is not available there,
    // so the calls are wrapped in an async function.
    (async () => {
        const myHtml = '<div>Ciao<span>amico</span>come <b id="foo">va?</b></div>';
        await chrome.offscreen.createDocument({
            url: 'osd.html',
            justification: 'ignored',
            reasons: ['dom_scraping']
        });
        const hd = await chrome.offscreen.hasDocument();
        if (hd) {
            // osd.js answers with {reply: <text content of myHtml>}
            const reply = await chrome.runtime.sendMessage({'html': myHtml});
            await chrome.offscreen.closeDocument();
            console.log(reply);
        }
    })();

    // manifest.json
    ...
    "permissions": [
        ...
        "offscreen"
    ]
    ...

    // osd.html
    <html><head><script src="osd.js"></script></head></html>

    // osd.js
    chrome.runtime.onMessage.addListener((msg, sender, sendResponse) => {
        // A <template> parses the markup without running scripts or loading images
        const htmlStripper = document.createElement("template");
        const striphtml = html => {
            htmlStripper.innerHTML = html;
            return htmlStripper.content.textContent;
        };
        sendResponse({'reply': striphtml(msg.html)});
    });

Robbi
  • Thanks. I cannot find any documentation on this. Is there any discussion thread for it? Any timeline for when it will go into production? This answer is great, but unfortunately we can't apply it to our product right now if it requires this special switch. – Luke Vo Sep 09 '22 at 09:30
  • The documentation is scarce and not definitive. Some code snippets are here: [LINK](https://chromium.googlesource.com/chromium/src.git/+/HEAD/chrome/test/data/extensions/api_test/offscreen/basic_document_management/) Discussion thread on the Google Group: [LINK](https://groups.google.com/a/chromium.org/g/chromium-extensions/c/PJPCn0_k5Pk/m/V7PBJgTLAQAJ) – Robbi Sep 09 '22 at 09:39
  • Thank you. It doesn't look like it (or even its documentation) will be available any time soon. This whole process of deprecating MV2 is absurd... – Luke Vo Sep 09 '22 at 09:51
  • It's a bit like the Davos agenda. "They" go on without looking anyone in the face – Robbi Sep 09 '22 at 10:07
  • I cannot set the reason `dom_scripting`. I'm getting this error `Uncaught (in promise) TypeError: Error in invocation of offscreen.createDocument(offscreen.CreateParameters parameters, function callback): Error at parameter 'parameters': Error at property 'reasons': Error at index 0: Value must be one of TESTING. ` – Zuhair Taha Oct 12 '22 at 08:45
  • It's a "work in progress" API... – Robbi Oct 12 '22 at 13:04