I am trying to develop a web service that takes a URL, adds a Google search link to every word in the webpage returned from that URL, and then returns the modified HTML as the response.

Say the page contains:

<html>
  <body>
    <h1>the title</h1>
    <p>the long content</p>
  </body>
</html>

I'd like to return something like

<html>
  <body>
    <h1>
      <a href='https://www.google.com/?q=the'>the</a>
      <a href='https://www.google.com/?q=title'>title</a>
    </h1>
    <p>
      <a href='https://www.google.com/?q=the'>the</a>
      <a href='https://www.google.com/?q=long'>long</a>
      <a href='https://www.google.com/?q=content'>content</a>
    </p>
  </body>
</html>

I've seen a lot of questions asking about similar things. Most of them do something like this:

// For each matched element, wrap every run of word characters in a search link.
$('h3').html(function(i, v) {
  return v.replace(/(\s*)(\w+)(\s*)/g, '$1<a href="https://www.google.com/?q=$2">$2</a>$3');
});

However, I am trying to handle over 10,000 requests per minute (the original number in this question was 10,000 per second), and I feel that the snippet above may not be a good fit at that rate.

Currently I am trying to implement this web service with Node.js, and more specifically with express-mung.

My question has two parts:

1. What would be a more performant way to implement the add-a-tag-to-every-word logic?

2. (Optional) What stack adjustments do I need if answering the first question is not enough to solve this problem? (I am open to learning anything new.)
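
To make the first part concrete, here is a minimal sketch of one DOM-free direction: a single regex pass over the HTML string that copies tags through unchanged and wraps only the words found between them. The helper name `linkifyWords` is hypothetical, the list of skipped elements is my own assumption, and I have not measured whether this is actually faster than the jQuery approach.

```javascript
// Sketch only: wrap every word in text content with a Google search link,
// in one pass over the string and without building a DOM.
// `linkifyWords` is a hypothetical helper, not part of any library.

// Alternatives are tried left to right at each position:
//   1. a whole <script>, <style>, <title>, <textarea> or <a> element
//      (copied unchanged, so existing links are not nested)
//   2. an HTML comment (copied unchanged)
//   3. any other tag (copied unchanged)
//   4. a character reference such as &amp; (copied unchanged)
//   5. a word, captured in group 2 (wrapped in a link)
const TOKEN =
  /<(script|style|title|textarea|a)\b[\s\S]*?<\/\1\s*>|<!--[\s\S]*?-->|<[^>]*>|&#?\w+;|(\w+)/gi;

function linkifyWords(html) {
  return html.replace(TOKEN, (match, _skippedTag, word) =>
    word === undefined
      ? match // tag, comment, entity or skipped element: leave as is
      : `<a href="https://www.google.com/?q=${encodeURIComponent(word)}">${word}</a>`
  );
}

console.log(linkifyWords('<h1>the title</h1>'));
```

Known limitations of this sketch: `\w` only matches ASCII letters, digits and underscore, and an attribute value containing `>` would confuse the tag pattern, so a real HTML tokenizer may still be the safer choice.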

Thanks a lot for answering this question.

Lil E
  • Your rate requirement (10 000/sec) is really enormous. Implementing a Node.js service that retrieves that many web pages, munges their markup, and sends them to user browsers will require a large number of servers or a serverless (AWS Lambda-style) approach, and may require an elaborate caching scheme. Doing it with a web extension in users' browsers may be a better way to deliver this functionality. It's called "sticking your users with the power bill." – O. Jones Mar 15 '20 at 12:16
  • I guess I'll put more effort into caching and less into concurrency. Thanks O. Jones! – Lil E Mar 15 '20 at 12:58
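
The caching idea from the comments could start as small as the sketch below: an in-memory map from URL to the already-modified HTML, with a time-to-live so stale pages are eventually re-fetched. `createTtlCache` and its parameters are hypothetical names for illustration; a real deployment at this request rate would more likely need a shared cache and a size limit, which this sketch does not attempt.

```javascript
// Sketch only: a tiny in-memory cache keyed by URL with a time-to-live.
// `createTtlCache` is a hypothetical helper; `now` is injectable for testing.
function createTtlCache(ttlMs, now = Date.now) {
  const entries = new Map(); // url -> { value, expiresAt }

  return {
    // Return the cached value, or undefined if missing or expired.
    get(url) {
      const entry = entries.get(url);
      if (!entry) return undefined;
      if (entry.expiresAt <= now()) {
        entries.delete(url); // drop the stale entry
        return undefined;
      }
      return entry.value;
    },
    // Store a value that stays valid for ttlMs milliseconds.
    set(url, value) {
      entries.set(url, { value, expiresAt: now() + ttlMs });
    },
  };
}
```

A request handler would call `get(url)` first and only fetch and rewrite the page on a miss, then `set(url, modifiedHtml)` before responding.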
