Each glyph in the Voynich script has a different distribution. Some occur in a particular position, such as the start, middle, or end of words, or adjacent to specific glyphs, such as [q] before [o]. Some glyphs may appear adjacent to many others, some only a few. We can think of glyphs as having a more or less diverse distribution based on how many glyphs they occur next to.
The distribution of glyphs in relation to other glyphs also differs according to the direction of that relationship. The glyphs which come before [ch] are not the same as those which come after it. There may be both a different set of glyphs and a different number of glyphs which come before and after.
The chart below shows the number of different glyphs which come before and after the 22 most common glyphs in the text. For each glyph, adjacent glyphs are counted from most to least frequent until they cover at least 95% of its distribution, so rare adjacent glyphs are not included. Also, I have taken [ch, sh, ckh, cth, cfh, cph] to each be single glyphs, and counted a word break (or space) as a glyph to indicate the position at the start or end of a word.
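As a rough illustration of the counting method just described, here is a minimal Python sketch. It is not the script used to produce the chart. It assumes an EVA-style transcription supplied as a list of words, and the names `tokenize`, `neighbour_counts`, `diversity`, and the `_` marker for a word break are my own choices for the example. The 95% cutoff is applied by adding up neighbours from most to least frequent, which is one reasonable reading of the rule above.

```python
from collections import Counter, defaultdict

# Multi-character glyphs treated as single units, longest first so that
# "ckh" is matched before "ch". (Assumes an EVA-style transcription.)
MULTI = ("ckh", "cth", "cfh", "cph", "ch", "sh")
SPACE = "_"  # hypothetical marker: a word break counted as a glyph


def tokenize(word):
    """Split a transcribed word into glyphs, keeping ch, sh, etc. whole."""
    glyphs = []
    i = 0
    while i < len(word):
        for m in MULTI:
            if word.startswith(m, i):
                glyphs.append(m)
                i += len(m)
                break
        else:
            glyphs.append(word[i])
            i += 1
    return glyphs


def neighbour_counts(words):
    """Count which glyphs come before and after each glyph."""
    before = defaultdict(Counter)
    after = defaultdict(Counter)
    for word in words:
        seq = [SPACE] + tokenize(word) + [SPACE]
        for left, right in zip(seq, seq[1:]):
            after[left][right] += 1
            before[right][left] += 1
    return before, after


def diversity(counter, coverage=0.95):
    """Number of distinct neighbours needed to reach the coverage share."""
    total = sum(counter.values())
    if total == 0:
        return 0
    running = 0
    for n, (_, count) in enumerate(counter.most_common(), start=1):
        running += count
        if running >= coverage * total:
            return n
    return len(counter)
```

Given a word list `words`, calling `before, after = neighbour_counts(words)` and then `diversity(before["o"])` and `diversity(after["o"])` would give the two figures plotted for [o], under the assumptions above.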
Eighteen of the 22 glyphs have either the same or nearly the same level of diversity before and after. The highest is [o], which can take any of 10 glyphs before and 11 glyphs after. The lowest are [n] and [q], which both take one before and one after.
Four of the 22 glyphs in the above chart should be looked at in more detail.
[y] and [a]
The glyphs [y] and [a] both show a distribution with significantly lower diversity of glyphs after them than before. Although they show significant overlap in the glyphs which come before, they have no overlap in the glyphs which come after.
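One way to put a number on the overlap between two sets of adjacent glyphs is the Jaccard index: the size of the intersection divided by the size of the union. This is only a sketch of one possible measure, not necessarily the comparison made here, and the function name `jaccard` and the placeholder sets are invented for the example.

```python
def jaccard(a, b):
    """Share of neighbours common to both sets (0.0 = none, 1.0 = identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


# Placeholder neighbour sets, not real Voynich data.
after_first = {"g1", "g2"}
after_second = {"g3", "g4"}
overlap = jaccard(after_first, after_second)  # disjoint sets give 0.0
```

A value of 0.0 corresponds to "no overlap", as described for the glyphs after [y] and [a], while values closer to 1.0 indicate largely shared neighbours.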
These two glyphs differ so much from the others that they raise the question of why. I’ve previously discussed the relationship between [y] and [a], and the possibility that [y] is sometimes deleted.
[m]
Although the difference in the diversity of glyphs before and after [m] is small in absolute terms, it is relatively large. The glyph [m] has three different glyphs which can come before it, but only one after (in fact not a glyph, but a space).
The strict word-final position of [m] is probably mostly to blame, though [q] and [n], which are strongly tied to the start and end of words respectively, have the same number of glyphs appearing on both sides. It could be that, as suggested by a number of researchers, [m] is a word-final variant of another glyph.
[l]
The most interesting result from measuring the diversity of adjacent glyphs is that [l] behaves unlike every other glyph. While only three glyphs come before [l], there are eight glyphs which can come after. It is the second most diverse glyph with regard to what glyphs can follow it.
This result is reminiscent of those from first-last combinations where [l] didn’t fit well into the proposed division between ‘strong’ and ‘weak’ glyphs. There is clearly something about the way [l] interacts with other glyphs which makes it unlike others.
The reason for this difference is unknown and deserves further investigation. I’ll likely follow up this post with a more in-depth look at [l], including breaking down its distribution into Currier A and B.