Which glyph pairs most frequently have an entry in a font’s kerning table?
I surveyed a corpus of respected and popular professional fonts. Here are the results:
Q: When kerning glyphs, should the glyphs ever overlap?
It certainly happens in well-known professional fonts. Here’s an example I came across while working on TypeFacet Autokern:
In Helvetica LT Std Roman from Adobe, slashes overlap. Error or by design?
I was curious what typical side bearing values are, so I audited a large corpus of popular fonts from prestigious foundries. Here are the results:
Total fonts scanned: 3,851
Average Left Side Bearing: 0.037 em
Average Left Side Bearing / Ascender Ratio: 0.045
Average Right Side Bearing: 0.017 em
Average Right Side Bearing / Ascender Ratio: 0.021
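For readers who want to reproduce this kind of measurement, here is a minimal sketch of the arithmetic. It is not the audit script itself. It assumes the left side bearing is the left edge of the glyph's bounding box, the right side bearing is the advance width minus the right edge, and both are averaged before being divided by the em size or the ascender. The GlyphMetrics class and the sample numbers are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class GlyphMetrics:
    """Hypothetical container for the per-glyph values the sketch needs."""
    advance_width: int  # horizontal advance, in font units
    x_min: int          # left edge of the outline's bounding box
    x_max: int          # right edge of the outline's bounding box


def side_bearings(glyph):
    """Return (left, right) side bearings in font units."""
    return glyph.x_min, glyph.advance_width - glyph.x_max


def average_side_bearings(glyphs, units_per_em, ascender):
    """Average the bearings, then normalize by em size and by ascender."""
    lsbs, rsbs = zip(*(side_bearings(g) for g in glyphs))
    avg_lsb, avg_rsb = mean(lsbs), mean(rsbs)
    return {
        "lsb_em": avg_lsb / units_per_em,
        "rsb_em": avg_rsb / units_per_em,
        "lsb_per_ascender": avg_lsb / ascender,
        "rsb_per_ascender": avg_rsb / ascender,
    }


# Made-up metrics for two glyphs in a 1000-unit em with an 800-unit ascender.
glyphs = [GlyphMetrics(600, 50, 560), GlyphMetrics(500, 30, 480)]
stats = average_side_bearings(glyphs, units_per_em=1000, ascender=800)
```

Averaging per font and then across fonts would give different numbers than pooling every glyph, so the choice of aggregation matters when comparing against the figures above.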
Should font contour directions be clockwise or counter-clockwise?
The FontForge documentation states:
“All paths must be drawn in a consistent direction. Clockwise for external paths, anti-clockwise for internal paths.”
They state that this is the TrueType/OpenType convention, and that the PostScript convention is the opposite.
A separate section of the documentation continues:
“TECHNICAL AND CONFUSING: the exact behavior of rasterizers varies. Early PostScript rasterizers used a “non-zero winding number rule” while more recent ones use an “even-odd” rule. TrueType uses the “non-zero” rule. The description given above is for the “non-zero” rule. The “even-odd” rule would fill the “o” correctly no matter which way the paths were drawn (though there would probably be subtle problems with hinting).
Filling using the even-odd rules that a line is drawn from the current pixel to infinity (in any direction) and the number of contour crossings is counted. If this number is even the pixel is not filled. If the number is odd the pixel is filled. In the non-zero winding number rule the same line is drawn, contour crossings in a clockwise direction add 1 to the crossing count, counter-clockwise contours subtract 1. If the result is 0 the pixel is not filled, any other result will fill it.”
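The two rules quoted above can be sketched in a few lines. This is a simplified illustration, assuming contours are plain polygons (lists of (x, y) points) rather than curves, and casting the ray toward +x. The function names are my own. Which direction counts as +1 depends on the ray and axis orientation, but only zero versus non-zero matters for the result.

```python
def crossings(point, contours):
    """Return (crossing_count, winding_number) for a ray cast toward +x."""
    px, py = point
    count = 0
    winding = 0
    for contour in contours:
        n = len(contour)
        for i in range(n):
            (x1, y1), (x2, y2) = contour[i], contour[(i + 1) % n]
            # Half-open test: skip edges that do not straddle the line y = py.
            if (y1 <= py) == (y2 <= py):
                continue
            # x coordinate where this edge meets the horizontal line y = py.
            x_hit = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if x_hit > px:
                count += 1
                # Upward and downward crossings get opposite signs.
                winding += 1 if y2 > y1 else -1
    return count, winding


def filled_even_odd(point, contours):
    return crossings(point, contours)[0] % 2 == 1


def filled_non_zero(point, contours):
    return crossings(point, contours)[1] != 0


# A square "o" in y-up coordinates: clockwise outer contour, and two
# versions of the inner contour (hypothetical test shapes).
outer_cw = [(0, 0), (0, 10), (10, 10), (10, 0)]
inner_ccw = [(3, 3), (7, 3), (7, 7), (3, 7)]
inner_cw = [(3, 3), (3, 7), (7, 7), (7, 3)]

hole = (5, 5)  # a point inside the counter of the "o"
ring = (1, 5)  # a point inside the stroke of the "o"
```

With opposite directions both rules leave the hole empty. With both contours clockwise, the even-odd rule still leaves the hole empty while the non-zero rule fills it, which is the behavior the quoted passage describes.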
I conducted an audit of a large corpus of fonts to see how this played out in practice. Here are the results:
Total fonts scanned: 62,192
.otf: L contour is clockwise: 200
.otf: L contour is not clockwise: 18,337
.otf: O has 0 clockwise contours and 2 counter-clockwise contours: 259
.otf: O has 1 clockwise contours and 1 counter-clockwise contours: 17,584
.otf: O has 2 clockwise contours and 0 counter-clockwise contours: 23
.ttf: L contour is clockwise: 24,021
.ttf: L contour is not clockwise: 3,268
.ttf: O has 0 clockwise contours and 2 counter-clockwise contours: 290
.ttf: O has 1 clockwise contours and 1 counter-clockwise contours: 25,691
.ttf: O has 2 clockwise contours and 0 counter-clockwise contours: 618
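One common way to classify a contour's direction is the sign of its area under the shoelace formula. The sketch below is not necessarily how the audit did it. It assumes y-up font coordinates and treats the contour as a polygon through its points, which is only an approximation for curved outlines. The function names are hypothetical.

```python
def signed_area(contour):
    """Shoelace formula; positive means counter-clockwise when y points up."""
    total = 0.0
    n = len(contour)
    for i in range(n):
        x1, y1 = contour[i]
        x2, y2 = contour[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return total / 2.0


def is_clockwise(contour):
    return signed_area(contour) < 0


def direction_tally(contours):
    """Return (clockwise_count, counter_clockwise_count) for a glyph."""
    clockwise = sum(1 for c in contours if is_clockwise(c))
    return clockwise, len(contours) - clockwise


# A square stand-in for "O": clockwise outside, counter-clockwise inside.
o_outer = [(0, 0), (0, 10), (10, 10), (10, 0)]
o_inner = [(3, 3), (7, 3), (7, 7), (3, 7)]
tally = direction_tally([o_outer, o_inner])
```

A degenerate contour with zero area falls into the "not clockwise" bucket here, which is one reason a tally like the one above can contain a few surprising entries.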
Q: When designing a font, which glyphs should one design first?
My first thought was: which glyphs are used most frequently?
I found lists like this but they aren’t precise enough.
You could easily analyze a large corpus - say, all Wikipedias - and do a frequency count on glyph usage. But I’m skeptical - any text corpus would be biased to the point of being useless.
So I decided instead to survey a large corpus of fonts and see which glyphs are most frequently implemented by their designers.
I wrote a set of Python scripts to survey most of the fonts I have on my laptop. I’ve open-sourced the scripts; they are available here.
The corpus of 19,826 fonts surveyed is necessarily a motley lot. I didn’t filter or weight by prestige, quality, etc. Moreover, no effort was made to group or de-duplicate fonts by typeface. Each font file (e.g. Helvetica Medium Bold) in a given type family (e.g. Helvetica) was counted separately. Therefore families with more weights or styles carry disproportionate weight in the results.
There are two sets of results.
First, on the left side of the results page is a list of the first 10,000 Unicode glyphs (e.g. ‘A’) in order of code point (e.g. 0x41), on the assumption that nearly all “high usage” glyphs have low code points. More common glyphs are darker. To see exact frequency statistics, you can mouse over any glyph.
Second, on the right side of the results page is a list of the 1,000 most commonly implemented glyphs. It’s a bit rough on the eyes, but could easily be parsed by anyone with a mind to do so.
The next step might be to write a script that, given a .UFO, .TTF or .OTF file, suggests which glyphs might be most useful to add.
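A minimal sketch of how such a script could work, assuming each font has already been reduced to the set of code points it implements (reading those sets out of real font files is left out here). The function names and the tiny corpus are hypothetical. As in the survey, every font file counts once, with no grouping by family.

```python
from collections import Counter


def coverage_counts(fonts):
    """Count, for each code point, how many font files implement it."""
    counts = Counter()
    for codepoints in fonts.values():
        counts.update(set(codepoints))  # each file counts at most once
    return counts


def suggest_missing(font_codepoints, counts, limit=10):
    """List the most widely implemented code points this font lacks."""
    have = set(font_codepoints)
    # Most common first; ties are broken by lower code point.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [cp for cp, _ in ranked if cp not in have][:limit]


# Made-up corpus: file name -> implemented code points.
corpus = {
    "A.otf": {0x41, 0x42, 0x43},
    "B.ttf": {0x41, 0x42},
    "C.ttf": {0x41, 0xE9},
}
counts = coverage_counts(corpus)
suggestions = suggest_missing({0x41}, counts, limit=2)
```

Ranking by raw file count inherits the bias noted earlier: families with many weights and styles pull the suggestions toward whatever those families happen to cover.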