Bounty

The bounty will go to the fastest solution, as demonstrated by jsPerf across the latest release versions of Firefox, Chrome and Internet Explorer at the time of testing, or, at my discretion, to the answer most useful in creating such a solution. Mwahahaha!

I'll be mostly satisfied with a solution that takes all of the offsets and an unprocessed <span> and adds the highlighting to that, so that parent.textContent = parent.textContent followed by running the solution on an updated list of offsets will re-highlight, but this has unfavourable time complexity so is not preferred.



I have an element containing nothing but text, which I would like to highlight. I also have an array of [startline, startcol, endline, endcol] which, knowing the lengths of each line from .textContent, I can normalise to [startoffset, endoffset]. How can I highlight between each pair of offsets?
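For reference, the normalisation step could look roughly like the sketch below. The name `toOffsets` is made up for illustration, and it assumes 0-based lines and columns and `\n` as the only line separator; adjust it if the stored values are 1-based.

```javascript
// Hypothetical helper (not part of any library): converts
// [startline, startcol, endline, endcol] into [startoffset, endoffset].
// Assumes 0-based lines and columns, and "\n" as the line separator.
function toOffsets(text, ranges) {
  // lineStarts[i] is the offset of the first character of line i.
  const lineStarts = [0];
  for (let i = 0; i < text.length; i += 1) {
    if (text[i] === "\n") {
      lineStarts.push(i + 1);
    }
  }
  return ranges.map(function (r) {
    return [lineStarts[r[0]] + r[1], lineStarts[r[2]] + r[3]];
  });
}

// Example: three lines, "ab", "cde" and "f".
const sampleText = "ab\ncde\nf";
const sampleOffsets = toOffsets(sampleText, [[0, 1, 1, 2], [1, 0, 2, 1]]);
```

In the real page, `text` would be the parent element's `.textContent`.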

This problem is harder than it seems because:

  • the content is not guaranteed to have no repeats (so no find / replace), and
  • highlighting must ultimately be performed on already highlighted text, sometimes intersecting with text that has already been highlighted, and
  • highlighting must be performed based on the index of the parent element's .textContent property.

Definitions

  • highlight: to place a subset of the text from an element's textContent in one or more <span class="highlighted"> without changing the parent element's textContent value, such that text that is highlighted n times is within n nested <span class="highlighted"> elements.
  • offset: a non-negative integer representing the number of characters before a certain point (which is between two characters).
  • character: an instance of whatever JavaScript gives you as the value at a given index of a .textContent string (including whitespace).
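To make the "highlighted n times" part of the definition concrete, here is a rough sketch that only computes how deeply each stretch of text is highlighted; it does not touch the DOM. The name `depthSegments` and the shape of its result are made up for illustration.

```javascript
// Hypothetical helper: returns [{from, to, depth}] where depth is the
// number of highlight ranges covering the text between from and to.
// Stretches that are not highlighted at all are omitted.
function depthSegments(starts, ends) {
  const events = [];
  starts.forEach(function (p) { events.push({ position: p, value: 1 }); });
  ends.forEach(function (p) { events.push({ position: p, value: -1 }); });
  // Sort by position; at equal positions, starts come before ends.
  events.sort(function (a, b) {
    return a.position - b.position || b.value - a.value;
  });

  const segments = [];
  let depth = 0;
  let previous = 0;
  events.forEach(function (e) {
    // Emit the stretch since the previous event if it is non-empty
    // and covered by at least one range.
    if (depth > 0 && e.position > previous) {
      segments.push({ from: previous, to: e.position, depth: depth });
    }
    depth += e.value;
    previous = e.position;
  });
  return segments;
}

// Ranges [0, 4) and [2, 6) overlap between offsets 2 and 4.
const exampleSegments = depthSegments([0, 2], [4, 6]);
```

Adjacent segments with the same depth are not merged here, so a zero-length range inside a longer one splits that longer range into two segments.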

MCVE

function highlight(parent, startoff, endoff) {
  // Erm...
  parent.textContent;
}

// Test cases

var starts = [
  5,  44, 0, 50, 6,  100, 99,  50, 51, 52
];
var ends = [
  20, 62, 4, 70, 10, 100, 101, 54, 53, 53
];
for (var i = 0; i < 10; i += 1) {
  highlight(document.getElementById("target"),
            starts[i], ends[i]);
}
#target {
  white-space: pre-wrap;
}
<span id="target">
'Twas brillig, and the slithy toves
  Did gyre and gimble in the wabe:
All mimsy were the borogoves,
  And the mome raths outgrabe.

"Beware the Jabberwock, my son!
  The jaws that bite, the claws that catch!
Beware the Jubjub bird, and shun
  The frumious Bandersnatch!"

He took his vorpal sword in hand:
  Long time the manxome foe he sought --
So rested he by the Tumtum tree,
  And stood awhile in thought.

And, as in uffish thought he stood,
  The Jabberwock, with eyes of flame,
Came whiffling through the tulgey wood,
  And burbled as it came!

One, two! One, two! And through and through
  The vorpal blade went snicker-snack!
He left it dead, and with its head
  He went galumphing back.

"And, has thou slain the Jabberwock?
  Come to my arms, my beamish boy!
O frabjous day! Callooh! Callay!'
  He chortled in his joy.

'Twas brillig, and the slithy toves
  Did gyre and gimble in the wabe;
All mimsy were the borogoves,
  And the mome raths outgrabe.
</span>
wizzwizz4
  • Is that a homework question? What is the use-case of this? – Munim Munna Apr 11 '18 at 11:40
  • @MunimMunna I'm creating a system which is similar to, but has a completely different use-case, platform etc. to, Microsoft Word's comment feature. I couldn't work out how to do this part of it. – wizzwizz4 Apr 11 '18 at 11:43
  • Please provide some useful explanation aside from bookish definitions. Number of character means what? including or excluding white-spaces/new-lines? Comma, period and other punctuation? How do you get the `textContent`, `starts` and `ends` array can help understand it better. – Munim Munna Apr 11 '18 at 11:53
  • @MunimMunna Loosely processed from a flat database containing these in no particular order (in the format of line / column); there isn't a pattern to them. – wizzwizz4 Apr 11 '18 at 17:04
  • @MunimMunna I've added the definition of "character". Is this clear enough? – wizzwizz4 Apr 11 '18 at 18:25
  • Please provide an example, this is unclear to me. I have already done some work on highlighting, something similar to `ctrl + f` in the browser with high recursive performance, but I can't tell if it suits your problem. Thanks – NVRM Apr 15 '18 at 21:18
  • @Cryptopat This isn't highlighting based on a search string. This is selection between a start point and an end point. Could you be a bit more specific than "this is unclear to me"? – wizzwizz4 Apr 16 '18 at 07:06
  • Seeing the response already given, this is still very mysterious! Why aren't the ranges in order? At least a screenshot would help. I understand that these numbers are positions in a string, but wtf 50,51,52? I am not a native English speaker, just like many here. Maybe that's the issue? – NVRM Apr 16 '18 at 16:00
  • @Cryptopat It's not because you're not a native English speaker. The reason they're out of order is because they are added by users, translated through a layer of markup into HTML attributes, then rendered on screen. The rendering on screen is the bit I couldn't do, because it's not possible to simplify. (Or so I thought...) By the way, you should probably read downwards (start / end pair, next start / end pair...) because it will make slightly more sense. – wizzwizz4 Apr 16 '18 at 16:07

1 Answer

Normalize the start/end positions first, so that the resulting ranges do not overlap.

  1. Merge the starting and ending positions into a single list, marking them with opposite values (say, 1 for a start and -1 for an end).
  2. Sort the list by position and then by marker value (the second-level sort order decides whether ranges that touch end-to-start are kept separate or merged).
  3. Walk through the sorted list, adding each position's marker value to a running sum; whenever the sum returns to 0, you have just found the end of a set of nested/intersecting sections.

This way you get the positions to highlight, without any nested or overlapping ranges.

To replace a text node with a mix of text nodes and HTML elements (like <span>), a DocumentFragment and .replaceChild() will help:

let starts = [
    5,  44, 0, 50, 6,  100, 99,  50, 51, 52
];
let ends = [
    20, 62, 4, 70, 10, 100, 101, 54, 53, 53
];

let positions = [];
let normalizedPositions = [];
starts.forEach(function(position) {
    positions.push({position, value: 1});
});
ends.forEach(function(position) {
    positions.push({position, value: -1});
});
positions = positions.sort(function(a, b) {
    return a.position - b.position || 
        b.value - a.value
});

var currentSection = {from: 0, counter: 0};

for (const position of positions) {
    if (!currentSection.counter) {
        if (position.value === -1) {
            throw `inconsistent boundaries: closing before opening ${position.position}`;
        }
        currentSection.from = position.position;  
    }
    currentSection.counter += position.value;

    if (!currentSection.counter) { 
        normalizedPositions.push({
            from: currentSection.from, 
            to: position.position
        });
    }
}
if (currentSection.counter) {
    throw "last section has not been closed properly";   
}


let parentNode = document.querySelector('p');
let textNodeToReplace = parentNode.childNodes[0];
let sourceText = textNodeToReplace.nodeValue;

let documentFragment = document.createDocumentFragment();
let withoutHighlightingStart = 0;

normalizedPositions.forEach(function (highlightRange) {
    if (highlightRange.from > withoutHighlightingStart) {
      let notHighlighted = createTextNode(sourceText.slice(withoutHighlightingStart, highlightRange.from));
      documentFragment.appendChild(notHighlighted);
    }
    let highlighted = createHighlighted(sourceText.slice(highlightRange.from, highlightRange.to));
    documentFragment.appendChild(highlighted);
    withoutHighlightingStart = highlightRange.to;
});
let lastNotHighlighted = createTextNode(sourceText.slice(withoutHighlightingStart));
documentFragment.appendChild(lastNotHighlighted);

parentNode.replaceChild(documentFragment, textNodeToReplace);

function createTextNode(str) {
   return document.createTextNode(str);
}

function createHighlighted(str) {
   let span = document.createElement('span');
   span.classList.add('highlight');
   span.appendChild(createTextNode(str));
   return span;
}
.highlight {
    background-color: yellow;
    color: darkblue;
}
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>
skyboyer
  • This is interesting. I know how it's doing it; you've explained that well. What I don't understand is _what_ it's doing. Is it finding the positions to insert the `<span>` and `</span>` tags? – wizzwizz4 Apr 10 '18 at 15:54
  • yes, it normalizes nested/overlapping ranges, so as a result you have just the positions you can wrap in `<span>` without any inconsistencies, in just `O(N)` operations – skyboyer Apr 10 '18 at 15:59
  • This is really clever! But how can these values actually be used to add the `<span>` tags into the DOM? – wizzwizz4 Apr 10 '18 at 16:10
  • I've extended the code block with the text replacement itself. In brief: I'm replacing (`.replaceChild()` to the rescue!) the original text node with a documentFragment that contains all the chunks, both highlighted and not. – skyboyer Apr 10 '18 at 18:21
  • This works. (Sorry for the understatement!) I won't accept it just yet, in case a better answer comes in when the bounty is set... but the bounty will probably go to your answer unless Jon Skeet comes along and learns JavaScript in order to beat it. – wizzwizz4 Apr 10 '18 at 18:45
  • I'm curious; why are you using a `for of` in the `normalizedPositions` loop but `.forEach` everywhere else? It doesn't seem that I can easily port that to an ES5 `for (var i = 0; i < positions.length; i += 1)`, so there must be a catch I haven't noticed. – wizzwizz4 Apr 15 '18 at 10:53
  • @wizzwizz4 it was not intentional. I just wrote different parts at different times (and I'm not used to ES6 things yet) – skyboyer Apr 15 '18 at 13:09
  • Ok. So it should work fine if I change it to `forEach`? I made it duplicate the entire thing repeatedly, but that's probably entirely my fault. – wizzwizz4 Apr 15 '18 at 13:10
  • @wizzwizz4 yes, it should – skyboyer Apr 15 '18 at 13:21