Keyword highlighting in html Part Two

This is Part Two of a post about making a simple HTML input control that highlights keywords as you type them. It’s a follow-on from a post I made recently that walked through underlining text with wavy squiggles, which glossed over the trickiest part – actually updating the entered text as it is typed.

Now that Part One has covered why you might want to do this, and what some better alternatives are if you’re looking for something more sophisticated, I’m just gonna go ahead and present ALL the code and then walk through it with explanations:

the html:

<div id="editor" contenteditable="true" spellcheck="false"></div>

the css:


[contenteditable] {
    border: 1px solid black;
    padding: 3px;
}

.hilight {
    border: 3px double magenta; 
    border-width: 0px 0px 3px 0px; 
    -moz-border-image: url(wavyunderline.png) 0 0 3 repeat; /* Firefox */
    -webkit-border-image: url(wavyunderline.png) 0 0 3 repeat; /* Safari 5 */
    -o-border-image: url(wavyunderline.png) 0 0 3 repeat; /* Opera */
    border-image: url(wavyunderline.png) 0 0 3 repeat;
}

the javascript (where the magic happens!):

<!-- dependencies -->
<script src="rangy-core.js"></script>
<script src="rangy-selectionsaverestore.js"></script>
<script src="jquery-1.10.2.js"></script>
<script src="keywordHighlighter.js"></script>
<script>
/* initialise everything here */
$(document).ready( function() {
	$("#editor").on("keyup", function(e) {
		if (keyTriggersUpdate(e)) {
			var regex = new RegExp("\\b(dog|dyslexic)\\b", "g");
			highlightKeywords($("#editor")[0], regex);
		}
	});
});
</script>

keywordHighlighter.js:


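/* check to see if key triggers update */
function keyTriggersUpdate(e) {
	// 8 = backspace, 32 = space, 46 = delete;
	// 48 to 90 covers the digit and letter keys (ignored if a modifier is held)
	return e.keyCode == 8 || e.keyCode == 32 || e.keyCode == 46
		|| (e.keyCode > 47 && e.keyCode < 91 &&
		    !e.ctrlKey && !e.shiftKey && !e.altKey);
}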

/* highlight keywords */
function highlightKeywords(el, kwRegex) {
	var sel = saveSelection(el);
	el.innerHTML = el.innerHTML.replace(/<span[\s\S]*?>([\s\S]*?)<\/span>/g,"$1");
	el.innerHTML = el.innerHTML.replace(kwRegex, "<span class='hilight'>$1</span>");
	restoreSelection(el, sel);
}	

/* save selection */
function saveSelection(containerElement) {

	var charsCounted = 0;
	var selection = rangy.getSelection();
	var range = selection.getRangeAt(0);
	var foundStart = false;
	var extents = { line:0, start:0, end:0 };
	
	var nodelist = [containerElement];
	
	while( nodelist.length > 0 ) {
		var node = nodelist.pop();	
		var lastChildNodeIndex = node.childNodes.length -1;
		for (var i = lastChildNodeIndex; i >= 0; --i) {
		    nodelist.push(node.childNodes[i]);
		}

 		if( node.nodeType == 1 && node.nodeName == "DIV" ) {
			extents.line += 1;
		}

		if (!foundStart && node == range.startContainer) {
			extents.start = charsCounted + range.startOffset;
			foundStart = true;
		}
		if (foundStart && node == range.endContainer) {
			extents.end = charsCounted + range.endOffset;
			console.log("extents: " + extents.start + " to " + extents.end);
			return extents;
		}

		if( node.nodeType == 3 ) { 
			// we're visiting a text node
			charsCounted += node.length;		
		}

	}// end while we have nodes to process
	
	return extents;
}

/* restore selection */
function restoreSelection(containerElement, savedSel) {

	var charsCounted = 0;
	var line = 0;
	var range = rangy.createRange();
	var foundStart = false;

	range.collapseToPoint(containerElement, 0);
	
	var nodelist = [containerElement];
	
	while(nodelist.length > 0) {
		var node = nodelist.pop();
		var lastChildNodeIndex = node.childNodes.length -1;
		for (var i = lastChildNodeIndex; i >= 0; --i) {
		    nodelist.push(node.childNodes[i]);
		}

 		if( node.nodeType==1 && node.nodeName == "DIV") {
			// we've encountered a newline, increment the current line count
			line += 1;
		}
		
		var endOfSpan = charsCounted;

		if( node.nodeType == 3 ) { 
			// we're visiting a text node
			endOfSpan += node.length;
		}
		
		if (!foundStart && 
			line == savedSel.line &&
			savedSel.start >= charsCounted && 
			savedSel.start <= endOfSpan) {
			range.setStart(node, savedSel.start - charsCounted);
			foundStart = true;
		}
		if (foundStart && 
			savedSel.end >= charsCounted && 
			savedSel.end <= endOfSpan) {
			range.setEnd(node, savedSel.end - charsCounted);
			break;
		}
			
		charsCounted = endOfSpan;

	}// end while we have nodes to process

	rangy.getSelection().setSingleRange(range);

}

Here’s a breakdown of what everything does and how it fits together:

The html is straightforward. The div has the contenteditable="true" attribute, which transforms an ordinary div into a style-capable input control. The other important feature is spellcheck="false", which stops the browser applying its own spelling underlines that would clash with the underlines our code inserts.

The css gives all contenteditable elements borders to make the user aware that the content is indeed editable. Furthermore we add some padding to allow space for the underlines to be displayed. The hilight class is applied to spans which our javascript automatically inserts around any keywords typed in the user-entered text.

The javascript dependencies for this little demo are:

jQuery (although it isn’t used extensively)
rangy-core.js
rangy-selectionsaverestore.js

jQuery is hardly used, and could easily be removed entirely, but since most of my work already has jQuery as a dependency I’ve left it in there.

Rangy is a handy library that makes selection-range and caret position manipulation easier to do. You can get Rangy from here https://code.google.com/p/rangy/. Selections within the DOM are sadly not very cross-browser standardized, so this is where a library is the way to go. Rangy was written by Tim Down who seems to know more than the average coder about cross browser selection nuances!

Now let’s look at each of the javascript methods in turn:

keyTriggersUpdate(e) is a simple method that we call from the keyup handler of our contenteditable div to detect whether the key should trigger an update. If the keyCode is 8 or 46 the user is pressing backspace or delete. If the keyCode is 32 they’ve typed a space. Otherwise, if the keyCode is between 48 and 90 and none of Ctrl, Shift or Alt is held, they’ve just typed a character of some kind that ought to trigger highlighting.
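
To see which keys get through the filter, you can feed the function plain objects standing in for keyup events. This is only a rough sketch: the fakeKey helper is made up for the demo and fills in just the fields that keyTriggersUpdate reads.

```javascript
// Same check as in keywordHighlighter.js, repeated so the snippet runs on its own.
function keyTriggersUpdate(e) {
	return e.keyCode == 8 || e.keyCode == 32 || e.keyCode == 46
		|| (e.keyCode > 47 && e.keyCode < 91 &&
		    !e.ctrlKey && !e.shiftKey && !e.altKey);
}

// Hypothetical helper: builds an object with only the fields the check reads.
function fakeKey(keyCode, modifiers) {
	return Object.assign(
		{ keyCode: keyCode, ctrlKey: false, shiftKey: false, altKey: false },
		modifiers || {});
}

var results = {
	backspace: keyTriggersUpdate(fakeKey(8)),
	space: keyTriggersUpdate(fakeKey(32)),
	del: keyTriggersUpdate(fakeKey(46)),
	letterA: keyTriggersUpdate(fakeKey(65)),
	ctrlA: keyTriggersUpdate(fakeKey(65, { ctrlKey: true })),
	leftArrow: keyTriggersUpdate(fakeKey(37)),
	enter: keyTriggersUpdate(fakeKey(13))
};
console.log(results);
```

Backspace, space, delete and a plain letter come back true; Ctrl+A, the left arrow and Enter come back false, so moving the caret around or using shortcuts won’t cause a re-highlight.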

highlightKeywords(el, kwRegex) does the bulk of the work. First we save the position of the caret (the text cursor). Then we remove all existing spans in the content editable using a regex replace:

el.innerHTML = el.innerHTML.replace(/<span[\s\S]*?>([\s\S]*?)<\/span>/g,"$1");

This basically says find every <span>…</span>, store the part between the tags in a capture group (ie. the part in brackets in the regex pattern), then put that bit back without the span tags by referring to the group as "$1" in the replacement.
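
The effect is easy to check on a plain string, outside the browser. Here’s a small sketch using a made-up sample string in place of el.innerHTML:

```javascript
// Sample markup as it might come back from el.innerHTML (made up for the demo).
var html = "the <span class='hilight'>dog</span> is <span class=\"hilight\">dyslexic</span>";

// The capture group ([\s\S]*?) grabs the text between the tags; "$1" puts it back.
var stripped = html.replace(/<span[\s\S]*?>([\s\S]*?)<\/span>/g, "$1");

console.log(stripped); // "the dog is dyslexic"
```

Bear in mind that this pattern strips every span in the element, not just the ones carrying the hilight class, and a single pass won’t fully unwrap spans nested inside other spans. That’s fine for this demo, where the only spans are the ones our own code inserts.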

The next line of code does the opposite. Basically it uses a passed in regex to find keywords, and wraps them with a span which applies the class “hilight”. By passing the regex into the method we can change what words get highlighted. The regex I’m passing in is:

"\\b(dyslexic|dog)\\b"

This will match every occurrence of dyslexic or dog which is bounded by a word boundary (that’s what the \b’s mean; the backslash is doubled because the pattern is written as a string literal). This prevents the code highlighting “dog” if it’s part of the word “doggy” for example.
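
If you’d rather keep the keywords in an array, the same pattern can be assembled on the fly. This is a sketch rather than part of the original code: buildKeywordRegex is a hypothetical helper, and the escaping step is only there in case a keyword contains characters that are special in a regex.

```javascript
// Hypothetical helper (not part of the original code): builds the keyword
// regex from an array, escaping any characters that are special in a regex.
function buildKeywordRegex(keywords) {
	var escaped = keywords.map(function (kw) {
		return kw.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
	});
	return new RegExp("\\b(" + escaped.join("|") + ")\\b", "g");
}

var regex = buildKeywordRegex(["dog", "dyslexic"]);

// Plain text standing in for the editor's content.
var text = "my doggy is not a dog, but a dyslexic dog";
var wrapped = text.replace(regex, "<span class='hilight'>$1</span>");

console.log(wrapped);
```

Running this wraps both occurrences of “dog” and the one “dyslexic”, while “doggy” is left alone because there is no word boundary between “dog” and “gy”.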

Finally the method restores the selection back to where it was with the call to restoreSelection(). Here’s a very quick explanation of how selection ranges work in HTML:

Consider the following contenteditable div:

Example div explaining selections

Suppose our caret is between the L and E in the word selections in the above div. You might expect the browser’s selection object to return 26 as the position of the caret:

<div>Example div explaining sel|ections</div>
0123456789012345678901234567890123

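However, the browser actually sees the div as its markup. Once “explaining” has been wrapped in a span, the caret position is reported relative to the text node that contains the caret, so the position returned is actually 4: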

<div>Example div <span>explaining</span> sel|ections</div>
012345678901 0123456789 012345678901

This is why the saveSelection() code is so convoluted. Essentially I am using a stack-based (non-recursive) depth-first traversal to calculate the caret position relative to the text in the containing div, instead of relative to its node in the DOM. I am performing a similar process in restoreSelection().
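
To make the traversal a bit more concrete, here is a cut-down sketch of the same idea that runs without a browser. The textNode, elementNode and absoluteOffset names are made up for the demo, and the plain objects only carry the fields the walk reads (nodeType, length and childNodes); it ignores the line counting and the separate start and end points that the real saveSelection() tracks.

```javascript
// Tiny stand-ins for DOM nodes so the sketch runs without a browser.
function textNode(text) {
	return { nodeType: 3, length: text.length, childNodes: [] };
}
function elementNode(name, children) {
	return { nodeType: 1, nodeName: name, childNodes: children };
}

// <div>Example div <span>explaining</span> selections</div>
var caretNode = textNode(" selections");
var div = elementNode("DIV", [
	textNode("Example div "),
	elementNode("SPAN", [textNode("explaining")]),
	caretNode
]);

// Walk the tree depth-first with an explicit stack, adding up the length of
// every text node that comes before the node holding the caret.
function absoluteOffset(container, targetNode, offsetInNode) {
	var charsCounted = 0;
	var nodelist = [container];
	while (nodelist.length > 0) {
		var node = nodelist.pop();
		// push children in reverse so they are popped in document order
		for (var i = node.childNodes.length - 1; i >= 0; --i) {
			nodelist.push(node.childNodes[i]);
		}
		if (node === targetNode) {
			return charsCounted + offsetInNode;
		}
		if (node.nodeType == 3) {
			charsCounted += node.length;
		}
	}
	return -1; // target not found inside the container
}

console.log(absoluteOffset(div, caretNode, 4)); // 26
```

The browser-style offset of 4 inside the last text node turns back into 26, the position you’d expect when counting characters across the whole div: 12 for “Example div ”, 10 for “explaining”, plus the 4.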

Well that’s about it for now. Here’s another look at the finished product with dyslexic and dog as the keywords:

dyslexic dog’s simple keyword highlighting text input.

Feel free to use the code in some creative and imaginative ways.

Jacob says:

February 4, 2015 at 00:01

Excellent post. Thank you, just what I needed to implement simple on the fly syntax highlighting without using a heavy library.

Matteo Cacciola says:

June 23, 2016 at 14:28

Congratulations for your work. Just a notice. I am testing the code with Firefox, and I am experiencing a bug, I suppose. I mean, after I erase a line by deleting char by char, the caret doesn’t go to the end of the previous line, but to the beginning of text. It seems that saveSelection function returns {start: 0, end: 0} without running the while block.
Do you know why? I am not able to find the reason, please need your help

the big dog says:

July 4, 2016 at 00:50

Hi Matteo, thanks for notifying me about the bug – I’d tried to trim the code down to the minimum for didactic purposes, but in so doing, broke it for multi-line contentEditables – It was failing to take empty lines into account when performing the stack-based save and restore methods. The solution is simply to track the line number of the selection as well as its position within the text. I’ve updated the code with a fix for Chrome, but it seems Chrome and Firefox handle linebreaks within ContentEditables differently – Chrome treats new lines as separate divs, whereas Firefox is inserting br elements. I will correct the code for cross-browser support shortly.

Giannis says:

January 13, 2018 at 02:57

More than three years after what you wrote, and your code remains not only useful, but one of the best solutions when searching on the topic

I have a question (maybe not as simple as it sounds): Could i (and how you’d propose i should do it) use your code for acronyms?

I need to pass an array of words to be searched, and another (with the same key) with the explanation of the acronym – for example, every use of the word MIT should bring MIT Any ideas?

January 13, 2018 at 03:45

(the editor hide the code, but point is to have MIT )
