Keyword highlighting in HTML: Part Two

This is Part Two of a post about making a simple HTML input control that highlights keywords as you type them. It follows on from a recent post that walked through underlining text with wavy squiggles but glossed over the trickiest part: actually updating the entered text as it is typed.

Now that Part One has covered why you might want to do this, and what some better alternatives are if you’re looking for something more sophisticated, I’m just gonna go ahead and present ALL the code and then walk through it with explanations:


the html:

<div id="editor" contenteditable="true" spellcheck="false"></div>

the css:


[contenteditable] {
    border: 1px solid black;
    padding: 3px;
}

.hilight {
    border: 3px double magenta; 
    border-width: 0px 0px 3px 0px; 
    -moz-border-image: url(wavyunderline.png) 0 0 3 repeat; /* Firefox */
    -webkit-border-image: url(wavyunderline.png) 0 0 3 repeat; /* Safari 5 */
    -o-border-image: url(wavyunderline.png) 0 0 3 repeat; /* Opera */
    border-image: url(wavyunderline.png) 0 0 3 repeat;
}

the javascript (where the magic happens!):

<!-- dependencies -->
<script src="rangy-core.js"></script>
<script src="rangy-selectionsaverestore.js"></script>
<script src="jquery-1.10.2.js"></script>
<script src="keywordHighlighter.js"></script>
<script>
/* initialise everything here */
$(document).ready( function() {
	$("#editor").on("keyup", function(e) {
		if(keyTriggersUpdate(e)){
			var regex = new RegExp("\\b(dog|dyslexic)\\b", "g");
			highlightKeywords($("#editor")[0], regex);
		}   
	});
});
</script>

keywordHighlighter.js:


/* check to see if key triggers update */
function keyTriggersUpdate(e) {
	return e.keyCode == 8 || e.keyCode == 32 || e.keyCode == 46
         || (e.keyCode > 47 && e.keyCode < 91 &&
	    !e.ctrlKey && !e.shiftKey && !e.altKey);
}

/* highlight keywords */
function highlightKeywords(el, kwRegex) {
	var sel = saveSelection(el);
	el.innerHTML = el.innerHTML.replace(/<span[\s\S]*?>([\s\S]*?)<\/span>/g,"$1");
	el.innerHTML = el.innerHTML.replace(kwRegex, "<span class='hilight'>$1</span>");
	restoreSelection(el, sel);
}	

/* save selection */
function saveSelection(containerElement) {

	var charsCounted = 0;
	var selection = rangy.getSelection();
	var range = selection.getRangeAt(0);
	var foundStart = false;
	var extents = { line:0, start:0, end:0 };
	
	var nodelist = [containerElement];
	
	while( nodelist.length > 0 ) {
		var node = nodelist.pop();	
		var lastChildNodeIndex = node.childNodes.length -1;
		for (var i = lastChildNodeIndex; i >= 0; --i) {
		    nodelist.push(node.childNodes[i]);
		}

 		if( node.nodeType == 1 && node.nodeName == "DIV" ) {
			extents.line += 1;
		}

		if (!foundStart && node == range.startContainer) {
			extents.start = charsCounted + range.startOffset;
			foundStart = true;
		}
		if (foundStart && node == range.endContainer) {
			extents.end = charsCounted + range.endOffset;
			console.log("extents: " + extents.start + " to " + extents.end);
			return extents;
		}

		if( node.nodeType == 3 ) { 
			// we're visiting a text node
			charsCounted += node.length;		
		}

	}// end while we have nodes to process
	
	return extents;
}

/* restore selection */
function restoreSelection(containerElement, savedSel) {

	var charsCounted = 0;
	var line = 0;
	var range = rangy.createRange();
	var foundStart = false;

	range.collapseToPoint(containerElement, 0);
	
	var nodelist = [containerElement];
	
	while(nodelist.length > 0) {
		var node = nodelist.pop();
		var lastChildNodeIndex = node.childNodes.length -1;
		for (var i = lastChildNodeIndex; i >= 0; --i) {
		    nodelist.push(node.childNodes[i]);
		}

 		if( node.nodeType==1 && node.nodeName == "DIV") {
			// we've encountered a newline, increment the current line count
			line += 1;
		}
		
		var endOfSpan = charsCounted;

		if( node.nodeType == 3 ) { 
			// we're visiting a text node
			endOfSpan += node.length;
		}
		
		if (!foundStart && 
			line == savedSel.line &&
			savedSel.start >= charsCounted && 
			savedSel.start <= endOfSpan) {
			range.setStart(node, savedSel.start - charsCounted);
			foundStart = true;
		}
		if (foundStart && 
			savedSel.end >= charsCounted && 
			savedSel.end <= endOfSpan) {
			range.setEnd(node, savedSel.end - charsCounted);
			break;
		}
			
		charsCounted = endOfSpan;

	}// end while we have nodes to process

	rangy.getSelection().setSingleRange(range);

}


Here’s a breakdown of what everything does and how it fits together:

The html is straightforward. The div has the contenteditable="true" attribute, which turns an ordinary div into an input control whose content can be styled. The other important feature is spellcheck="false", which stops the browser drawing its own spelling underlines that would clash with the ones our code inserts.

The css gives all contenteditable elements a border so the user can see that the content is editable, plus some padding to leave room for the underlines. The .hilight class is applied to the spans that our javascript automatically wraps around any keywords in the user-entered text.

The javascript dependencies for this little demo are:

  • jQuery
  • rangy-core.js
  • rangy-selectionsaverestore.js

jQuery is hardly used, and could easily be removed entirely, but since most of my work already has jQuery as a dependency I’ve left it in there.
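For anyone who does want to drop jQuery, here is a rough sketch of the same wiring using plain DOM calls. The attachHighlighter name is made up for this example, and the two functions from keywordHighlighter.js are passed in as arguments purely so the snippet stands on its own.

```javascript
// Hypothetical helper: wires the keyup handler without jQuery.
// keyTriggersUpdate and highlightKeywords are expected to be the functions
// from keywordHighlighter.js; they are parameters here so the sketch is
// self-contained.
function attachHighlighter(editor, keyTriggersUpdate, highlightKeywords) {
	var regex = new RegExp("\\b(dog|dyslexic)\\b", "g");
	editor.addEventListener("keyup", function (e) {
		if (keyTriggersUpdate(e)) {
			highlightKeywords(editor, regex);
		}
	});
}

// In the browser, once the DOM is ready:
// document.addEventListener("DOMContentLoaded", function () {
// 	attachHighlighter(document.getElementById("editor"),
// 		keyTriggersUpdate, highlightKeywords);
// });
```

The regex is now built once rather than on every keyup, which is safe here because String.replace() always starts a global regex from the beginning of the string.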

Rangy is a handy library that makes selection-range and caret-position manipulation easier. You can get Rangy from https://code.google.com/p/rangy/. Selections within the DOM are sadly not well standardized across browsers, so this is a case where a library is the way to go. Rangy was written by Tim Down, who seems to know more than the average coder about cross-browser selection nuances!

Now let’s look at each of the javascript methods in turn:

keyTriggersUpdate(e) is a simple method that we call from the keyup handler of our contenteditable div to decide whether the key should trigger an update. If the keyCode is 8 or 46 the user has pressed backspace or delete. If the keyCode is 32 they’ve typed a space. Otherwise, a keyCode from 48 to 90 (the range that covers the digit and letter keys) with no Ctrl, Shift or Alt held means they’ve typed a character that ought to trigger highlighting.
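A quick way to check which keys pass the filter is to call it with hand-made event objects. The function body below is copied from keywordHighlighter.js; the fakeKey helper is hypothetical and only exists for this sketch.

```javascript
function keyTriggersUpdate(e) {
	return e.keyCode == 8 || e.keyCode == 32 || e.keyCode == 46
		|| (e.keyCode > 47 && e.keyCode < 91 &&
		!e.ctrlKey && !e.shiftKey && !e.altKey);
}

// Hypothetical helper: builds a minimal stand-in for a keyup event.
function fakeKey(keyCode, modifiers) {
	modifiers = modifiers || {};
	return {
		keyCode: keyCode,
		ctrlKey: !!modifiers.ctrl,
		shiftKey: !!modifiers.shift,
		altKey: !!modifiers.alt
	};
}

console.log(keyTriggersUpdate(fakeKey(68)));                 // true  (D)
console.log(keyTriggersUpdate(fakeKey(32)));                 // true  (space)
console.log(keyTriggersUpdate(fakeKey(37)));                 // false (left arrow)
console.log(keyTriggersUpdate(fakeKey(86, { ctrl: true }))); // false (Ctrl+V)
```

One consequence worth knowing: because Ctrl+V is filtered out, pasted text is not highlighted until the next qualifying key is released.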

highlightKeywords(el, kwRegex) does the bulk of the work. First we save the position of the caret (the text cursor). Then we remove all existing spans in the contenteditable div using a regex replace:

el.innerHTML = el.innerHTML.replace(/<span[\s\S]*?>([\s\S]*?)<\/span>/g,"$1");

This basically says: find every <span>…</span>, capture the part between the tags as a group (the part in brackets in the regex pattern, referred to as “$1” in the replacement), then put that bit back without the span tags.
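To see the span-stripping step in isolation, here is a small sketch that runs the same regex over a plain string instead of el.innerHTML. The stripSpans name and the sample markup are made up for illustration.

```javascript
// Hypothetical helper wrapping the same regex used in highlightKeywords().
function stripSpans(html) {
	// $1 is the first capture group: whatever sat between <span ...> and </span>
	return html.replace(/<span[\s\S]*?>([\s\S]*?)<\/span>/g, "$1");
}

var before = "the <span class='hilight'>dog</span> barked";
var after = stripSpans(before);
console.log(after); // the dog barked
```

Because both parts of the pattern are lazy, each opening tag is paired with the nearest closing tag. That suits the flat, un-nested spans this code inserts, but it would leave tags behind if spans were ever nested.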

The next line of highlightKeywords() does the opposite. It uses the passed-in regex to find keywords and wraps each one in a span with the class “hilight”. By passing the regex into the method we can change which words get highlighted. The regex I’m passing in is:

"\\b(dog|dyslexic)\\b"

This will match every occurrence of dog or dyslexic that sits between word boundaries (that’s what the \\b’s mean; the backslash is doubled only because the pattern is written as a string). This prevents the code highlighting “dog” if it’s part of the word “doggy”, for example.
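Here is a small sketch of the word-boundary behaviour, again on a plain string. The wrapKeywords name is hypothetical; the regex and the replacement string are the ones used in the code above.

```javascript
var kwRegex = new RegExp("\\b(dog|dyslexic)\\b", "g");

// The same pattern written as a regex literal needs only single backslashes.
var kwLiteral = /\b(dog|dyslexic)\b/g;

// Hypothetical helper: wraps each keyword in a span, as highlightKeywords() does.
function wrapKeywords(text) {
	return text.replace(kwRegex, "<span class='hilight'>$1</span>");
}

console.log(wrapKeywords("a dog and a doggy"));
// a <span class='hilight'>dog</span> and a doggy
```

Only the standalone “dog” is wrapped; the one inside “doggy” is followed by another word character, so there is no boundary there and the match fails.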

Finally, highlightKeywords() restores the selection to where it was with the call to restoreSelection(). Here’s a very quick explanation of how selection ranges work in HTML:

Consider the following contenteditable div:

Example div explaining selections

Suppose our caret is between the L and E in the word selections in the above div. You might expect the browser’s selection object to return 26 as the position of the caret:


<div>Example div explaining sel|ections</div>
     0123456789012345678901234567890123

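However, the browser reports the caret relative to the DOM node that contains it, not relative to the div’s text. With “explaining” wrapped in a span, the caret sits in the text node after the span, so the position returned is actually 4: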


<div>Example div <span>explaining</span> sel|ections</div>
     012345678901      0123456789       012345678901

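This is why the saveSelection() code is so convoluted. Essentially it uses a stack-based (non-recursive) depth-first traversal to calculate the caret position relative to the text of the containing div, instead of relative to its own node in the DOM. Along the way it counts the DIV elements it passes, since the code treats each DIV as a new line. restoreSelection() walks the nodes the same way to turn the saved position back into a node and an offset within that node.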
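The traversal is easier to follow with the Rangy and line-counting details stripped away. The sketch below is a simplified illustration, not the code above: it uses plain objects as stand-ins for DOM nodes so it can run outside a browser, and the names textNode, elementNode, textOffset and locate are all made up for the example.

```javascript
// Minimal stand-ins for DOM nodes so the sketch runs outside a browser.
function textNode(text) {
	return { nodeType: 3, length: text.length, childNodes: [] };
}
function elementNode(name, children) {
	return { nodeType: 1, nodeName: name, childNodes: children };
}

// Save direction: (node, offset within node) -> offset within root's text.
function textOffset(root, caretNode, caretOffset) {
	var charsCounted = 0;
	var stack = [root];
	while (stack.length > 0) {
		var node = stack.pop();
		// push children in reverse so the first child is popped first
		for (var i = node.childNodes.length - 1; i >= 0; --i) {
			stack.push(node.childNodes[i]);
		}
		if (node === caretNode) {
			return charsCounted + caretOffset;
		}
		if (node.nodeType == 3) {
			charsCounted += node.length;
		}
	}
	return -1; // caret node is not inside root
}

// Restore direction: offset within root's text -> (text node, offset within it).
function locate(root, offset) {
	var charsCounted = 0;
	var stack = [root];
	while (stack.length > 0) {
		var node = stack.pop();
		for (var i = node.childNodes.length - 1; i >= 0; --i) {
			stack.push(node.childNodes[i]);
		}
		if (node.nodeType == 3) {
			if (offset <= charsCounted + node.length) {
				return { node: node, offset: offset - charsCounted };
			}
			charsCounted += node.length;
		}
	}
	return null; // offset is past the end of the text
}

// The example div from above: "Example div <span>explaining</span> selections"
var tail = textNode(" selections");
var editor = elementNode("DIV", [
	textNode("Example div "),
	elementNode("SPAN", [textNode("explaining")]),
	tail
]);

console.log(textOffset(editor, tail, 4)); // 26
console.log(locate(editor, 26).offset);   // 4
```

The two functions are inverses for this example, which is the property highlightKeywords() relies on: the text offset survives the innerHTML rewrite even though the nodes themselves are replaced.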

Well that’s about it for now. Here’s another look at the finished product with dyslexic and dog as the keywords:

dyslexic dog’s simple keyword highlighting text input.

Feel free to use the code in some creative and imaginative ways.