Getting the Height of a Character

Posted Wed Feb 20 @ 07:26:10 PM PDT 2013

My next project idea is to implement an algorithm that builds word clouds. There are a bunch of implementations out there, from the (seemingly) simple to the absolutely crazy amazing. This seems like a cool project for a few reasons:

Back to the point of this post: how do you find a bounding box for a character displayed on an HTML5 canvas element? You have to know that before you can pack words together in a word cloud.

It might seem easy at first. The canvas context has a method named measureText(). It returns a metrics object whose width property is the width of the string (or character) as drawn with the current font. But the genius behind that method didn't think things all the way through. The object has a width property but no height property. So you're on your own for that.
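To make the width-only behavior concrete, here is a minimal sketch of wrapping measureText(). The helper name textWidth is made up for illustration, and ctx is assumed to be a 2D canvas context (or anything that behaves like one).

```javascript
// Sketch: measure the width of a string with the canvas 2D context.
// textWidth is a hypothetical helper name, not part of the canvas API.
function textWidth(ctx, text, font) {
  ctx.save();                                 // keep the caller's font untouched
  ctx.font = font;                            // e.g. "72px 'Arial'"
  const width = ctx.measureText(text).width;  // width is the property used here
  ctx.restore();
  return width;
}

// In a browser:
// const ctx = document.createElement('canvas').getContext('2d');
// console.log(textWidth(ctx, 'H', "72px 'Arial'"));
```

The save()/restore() pair is only there so that measuring does not change the font the caller was drawing with.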

Getting an upper bound for the height, and the location of the font's baseline

You can use HTML to get an upper bound on the height of a character, and the vertical offset of the baseline. Create a span element (which I will call my_span) with its style attribute set to the exact font you want to measure (in this example, "72px 'Arial'"). Once the span is in the page, calling my_span.getBoundingClientRect() gets you an object with a height property. That is our upper bound on the height of a character in that font. We will also need my_span.getBoundingClientRect().top to calculate the baseline...

To get the baseline of the font, create a div (which I will call my_div) and make it a sibling of the span you created earlier. Set its style to vertical-align: baseline, which also requires display: inline-block, because vertical-align has no effect on a block-level div. To calculate how far the baseline is from the top of the font's bounding box, just compute my_div.getBoundingClientRect().top - my_span.getBoundingClientRect().top.

This snippet of HTML does the trick:
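A rough sketch of the same measurement, built from JavaScript rather than static markup, might look like the following. The function name measureFontBox, the sample text "Hg", the wrapper element, and the zero-height div are all assumptions made for illustration. The zero height is there so that the top of my_div lands on the baseline itself.

```javascript
// Sketch only: measureFontBox is a hypothetical name.
// doc defaults to the page's document; a stand-in can be passed for testing.
function measureFontBox(font, doc = document) {
  const wrapper = doc.createElement('div');
  const my_span = doc.createElement('span');
  const my_div = doc.createElement('div');

  my_span.style.font = font;               // e.g. "72px 'Arial'"
  my_span.textContent = 'Hg';              // sample text (assumption)

  // An empty, zero-height inline-block sits on the baseline of its line.
  my_div.style.display = 'inline-block';
  my_div.style.verticalAlign = 'baseline';
  my_div.style.width = '1px';
  my_div.style.height = '0px';

  wrapper.appendChild(my_span);
  wrapper.appendChild(my_div);             // sibling of the span
  doc.body.appendChild(wrapper);           // must be in the page to be laid out

  const spanRect = my_span.getBoundingClientRect();
  const divRect = my_div.getBoundingClientRect();
  const result = {
    height: spanRect.height,               // upper bound on character height
    baseline: divRect.top - spanRect.top,  // box top to baseline, in pixels
  };

  doc.body.removeChild(wrapper);           // clean up the temporary elements
  return result;
}

// In a browser:
// console.log(measureFontBox("72px 'Arial'"));
```

The bottom of the box is then height - baseline pixels below the baseline.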

Closing in on the character

At this point, we know the maximum height of the character, and how far the baseline is from the top and bottom of the bounding box. But we want a tighter bound. For example, an "o" is shorter than an "H", but the height we calculated earlier doesn't reflect that.

To get a tighter bound, we have to draw the character on a canvas element, loop through all the pixels on the canvas, and find the first and last rows that contain a colored pixel. The function below does the job:
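A minimal sketch of that pixel scan is below. It assumes the canvas starts out fully transparent, so "colored" means any pixel with a non-zero alpha value. The names findVerticalExtent and measureGlyphRows are hypothetical. The scan is kept separate from the canvas calls so it can be run on any RGBA buffer.

```javascript
// Sketch: find the first and last rows of an RGBA buffer that contain a
// non-transparent pixel. data is laid out like ImageData.data, with
// 4 bytes (R, G, B, A) per pixel, row by row.
function findVerticalExtent(data, width, height) {
  let first = -1;
  let last = -1;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const alpha = data[(y * width + x) * 4 + 3];
      if (alpha !== 0) {
        if (first === -1) first = y;  // first colored row seen so far
        last = y;                     // keeps moving down as rows are found
        break;                        // one colored pixel is enough for this row
      }
    }
  }
  // null means nothing was drawn (for example, a space character)
  return first === -1 ? null : { first, last, height: last - first + 1 };
}

// Browser-side wrapper: draw the character on a cleared canvas and scan it.
// baselineY is where the alphabetic baseline is placed on the canvas.
function measureGlyphRows(canvas, ch, font, baselineY) {
  const ctx = canvas.getContext('2d');
  ctx.clearRect(0, 0, canvas.width, canvas.height);
  ctx.font = font;
  ctx.textBaseline = 'alphabetic';
  ctx.fillText(ch, 0, baselineY);
  const image = ctx.getImageData(0, 0, canvas.width, canvas.height);
  return findVerticalExtent(image.data, image.width, image.height);
}
```

The canvas has to be at least as tall as the upper bound measured earlier, and baselineY has to leave room above it for the ascent, otherwise the glyph is clipped and the scan reports a height that is too small.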

Finally, we get the height of a character. Putting it all together, we can draw bounding boxes around characters:
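As a sketch of that last step, the function below draws a character and outlines it. The name drawCharBox and its parameters are hypothetical. rows is assumed to hold the first and last colored pixel rows, found by scanning a canvas where the same character was drawn at the same baselineY. The box width comes from measureText(), so it is the advance width of the character rather than the width of its ink.

```javascript
// Sketch: draw a character and outline its bounding box.
//   x, baselineY : where the text is drawn (alphabetic baseline)
//   rows         : { first, last } canvas rows that contain the glyph
function drawCharBox(ctx, ch, font, x, baselineY, rows) {
  ctx.font = font;
  ctx.textBaseline = 'alphabetic';
  ctx.fillText(ch, x, baselineY);

  const box = {
    x: x,
    y: rows.first,                        // top edge of the glyph
    width: ctx.measureText(ch).width,     // advance width, not ink width
    height: rows.last - rows.first + 1,   // inclusive row count
  };
  ctx.strokeRect(box.x, box.y, box.width, box.height);
  return box;
}
```

For a string such as "Hg", one way to use this is to call it once per character and advance x by each returned box.width.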

