
I am writing a simple Azure Function app that should receive a string containing HTML markup, remove the HTML tags, and return the "sanitized" text.

The code would be really simple, something like:

module.exports = async function (context, req) {
    if (req.body) {
        context.res = {
            body: req.body.replace(... something)
        };
    }
};

As far as I can see on SO, using RegEx to do this is a big NO-GO, but the other solutions I can find are all based on the DOM (working on the document object, like adding a DIV with the req.body contents in it and getting the clean text from that).

But in my Azure function, the DOM is not available to me, since there is no browser executing the request.

So what are my options?

Jesper Lund Stocholm

1 Answer


For the benefit of others coming across this: as Carlos and Kryten mentioned, you could use one of the many npm modules available for sanitizing text.

As for adding these dependencies (see the docs), you have two options:

  1. When working locally, just npm install what you need; when you deploy with func, node_modules is included too. This helps with cold starts, since the package is run as is. The same applies to Docker-based deployment with the default Dockerfile.

  2. Use Kudu, as mentioned in the SO thread you've shared. I would not recommend this, though, since it's something you would have to repeat every time the dependencies change.

PramodValavala