UK TeX FAQ -- question 41

How does hyphenation work in TeX?

Everyone knows what hyphenation is: we see it in most books we read and, if we're alert, occasionally spot ridiculous mis-hyphenation (at one time, British newspapers were a fertile source).

Hyphenation styles are culturally determined, and the same language may be hyphenated differently in different countries: for example, British and American styles of hyphenation of English are very different. As a result, a typesetting system that is not restricted to a single language at a single locale needs to be able to change its hyphenation rules from time to time.

TeX uses a pretty good system for hyphenation (originally designed by Frank Liang): while it is capable of missing "sensible" hyphenation points, it seldom selects grossly wrong ones. The algorithm matches candidates for hyphenation against a set of "hyphenation patterns". The candidates for hyphenation must be sequences of letters (or other single characters that TeX may be persuaded to think of as letters); constructs such as TeX's \accent primitive interrupt hyphenation.
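
To make the pattern matching concrete, here is a small sketch. In Liang's scheme each pattern is a string of letters with digits between them; every pattern that matches part of the word contributes its digits, the largest digit wins at each position between letters, and an odd result permits a break while an even one forbids it. The patterns quoted in the comments are the ones usually cited for this word in descriptions of the US English set; treat them as an illustrative subset, not a listing of whatever your own installation has loaded.

```latex
% Illustrative sketch: how patterns combine for the word "hyphenation".
%
%   matching patterns (assumed subset of the US English set):
%     hy3ph  he2n  hena4  hen5at  1na  n2at  1tio  2io  o2n
%
%   largest digit at each inter-letter position:
%     h0y3p0h0e2n5a4t2i0o2n
%
%   odd digit = break allowed, even digit = break forbidden,
%   so the permitted breaks are:  hy-phen-ation
%
\documentclass{article}
\begin{document}
% \showhyphens typesets nothing useful on the page; it writes the
% hyphenation points TeX actually finds to the terminal and log file.
\showhyphens{hyphenation}
\end{document}
```

Running this and looking in the log file is a quick way to check what the patterns loaded on your system really do with a given word; the result depends on which language is current when \showhyphens is called.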

Sets of hyphenation patterns are usually derived from analysis of a list of valid hyphenations (the process of derivation, using a tool called patgen, is not ordinarily a participatory sport).

The patterns for the languages a TeX system is going to deal with may only be loaded when the system's format files are built, which ordinarily happens when the system is installed. To change the set of languages, the formats must be rebuilt: in effect, a partial reinstallation.

TeX provides two "user-level" commands for control of hyphenation: \language (which selects a hyphenation style), and \hyphenation (which gives explicit instructions to the hyphenation engine, overriding the effect of the patterns).
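
As a minimal sketch of the two commands in use (the words chosen are arbitrary examples, not taken from any particular document): \hyphenation takes a space-separated list of words with every permitted break marked by a hyphen, and \language is an integer parameter whose current value can simply be printed.

```latex
\documentclass{article}

% Explicit exceptions: for these words TeX uses exactly the marked
% break points, whatever the patterns would have said.  The list
% applies to the language that is current when it is read.
\hyphenation{data-base man-u-script}

\begin{document}

% Report the number of the hyphenation style currently selected.
% Which number means which language depends on how the format was built.
\typeout{Current language number: \the\language}

% Write the break points for the listed words to the log, so the
% effect of the exceptions can be checked.
\showhyphens{database manuscript}

\end{document}
```

A word given without any hyphens in a \hyphenation list is never broken at all, which is occasionally useful for proper names.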

The ordinary LaTeX user need not worry about \language, since it is very thoroughly managed by the babel package; use of \hyphenation is discussed in the answer on "hyphenation failure".
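
For illustration, a minimal babel sketch follows. It assumes that patterns for both languages named were built into the format in use; if they were not, babel issues a warning and hyphenation for the missing language will not follow its own patterns. The choice of british and ngerman is only an example.

```latex
\documentclass{article}
% The last language listed becomes the document's main language.
\usepackage[ngerman,british]{babel}

\begin{document}

This paragraph is hyphenated according to the British patterns.

% A short phrase in another language: babel switches \language
% for the argument only.
The German word \foreignlanguage{ngerman}{Silbentrennung} means
hyphenation.

% A longer passage: switch the language for everything that follows.
\selectlanguage{ngerman}
Dieser Absatz wird nach den deutschen Trennmustern getrennt.

\end{document}
```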