After writing the article in which I proposed that [y] and [a] are equivalent in some way, I realized that the outcome posed a difficult question. The reason I first considered that the characters [y] and [a] could be related (though I did not state it in that article) is that they both occur in some of the same contexts as [o]. In some places [o] can be swapped for [y], in others for [a], but nowhere for both equally. Each character alone matches part of the distribution of [o] but not the whole, and together they match an even greater part of that distribution. But, curiously, still not the whole.
If we accept that [y] and [a] are equivalent we end up with a combined character which comes near to matching [o] in the way it is used in the text, but falls short. Why is this so, and how can we explain it? This is the question which bothered me and which I seek to answer in this article.
The character [o] occurs in practically any position in a word: beginning, end, or middle. It has few restrictions on which characters it can occur before or after, though it does not occur before [g] or [n]. By contrast, [y/a] occurs at the beginning and end of words in either of its two forms, but in the middle of words only as [a], in positions before [l, r, m, n, i]. In other middle positions, where [o] freely occurs, [a] does not occur and [y] does so only sporadically. It is this part of the distribution of [o] which is thus unaccounted for by [y/a] and which I would like to explain.
Although it could be argued that there is no reason why [y/a] should match [o] in distribution and so there is nothing to explain, I believe it is far more productive to assume that there was regularity in whatever process made the text of the Voynich manuscript. So if we observe that [y/a] is similar to [o] in some ways we are right to question why it is not in all ways. The alternative is to assume that a fairly regular text was made by an irregular process, which is logically worse. The lack of [y/a] in these middle positions may teach us something about the Voynich text and the underlying language.
To solve the problem of the missing [y] we need some way of switching a character between the two environments: one where both [y/a] and [o] occur (the final or initial position) and one where only [o] occurs (the middle position). Such an alternation will let us compare the environments in which [y/a] does and does not occur and see what differs.
The most obvious choice is to use the characters [dy], as they are a very common ending to words in the text. Many instances of [dy] occur after the character [o], and by removing those characters we can change the environment of the [o] from middle to final. When we do so, we find that the resulting words ending in [o] all occur, although they are less common than the words ending in [dy]. Thus, taking a series of words and their frequencies: [okeody] 37, [okeeody] 16, [oteody] 39, and [oteeody] 11; but also [okeo] 14, [okeeo] 15, [oteo] 13, and [oteeo] 12. None of the resulting words is overly common, but taken together they represent a valid pattern. The environment of a middle [o] before [dy] is similar to that of a final [o] once [dy] is removed.
As expected, however, few examples of [dy] after [y/a] exist. Taking the same series of words as above, but with [y] in the place of [o], we get the following: [okeydy] 0, [okeeydy] 0, [oteydy] 1, and [oteeydy] 0. The striking part comes when we remove [dy] and count the resulting words: [okey] 64, [okeey] 177, [otey] 57, and [oteey] 140. Alternating between middle and final environments for [y] gives the expected outcome, namely that [y/a] does not occur in one but does in the other. Although only four words are given as examples, the same rule holds for words ending with and without [dy] throughout the Voynich text.
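The alternation between middle and final environments can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `counts` table holds nothing more than the frequencies quoted in this article, and the names `counts`, `SUFFIX`, and `compare` are my own. A real check would load word counts from a full transliteration of the text.

```python
# Frequencies quoted in this article (EVA transcription).
# Assumption: any word missing from the table is treated as count 0.
counts = {
    "okeody": 37, "okeeody": 16, "oteody": 39, "oteeody": 11,
    "okeo": 14, "okeeo": 15, "oteo": 13, "oteeo": 12,
    "okeydy": 0, "okeeydy": 0, "oteydy": 1, "oteeydy": 0,
    "okey": 64, "okeey": 177, "otey": 57, "oteey": 140,
}

SUFFIX = "dy"  # the ending removed to move a character from middle to final


def compare(stems, char, table):
    """For each stem, return (middle word, count, final word, count).

    The 'middle' word places `char` before [dy]; the 'final' word is the
    same string with [dy] removed, so that `char` ends the word.
    """
    rows = []
    for stem in stems:
        final = stem + char
        middle = final + SUFFIX
        rows.append((middle, table.get(middle, 0), final, table.get(final, 0)))
    return rows


stems = ["oke", "okee", "ote", "otee"]
for char in ("o", "y"):
    for middle, n_middle, final, n_final in compare(stems, char, counts):
        print(f"{middle:8} {n_middle:4}   {final:6} {n_final:4}")
```

Printing the two series side by side makes the asymmetry easy to see: the [o] words are attested in both environments, while the [y] words are attested almost only in the final one.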
So here we’ve been able to find an environment where a middle [o] occurs but where matching words with a middle [y] do not. Yet when we transform that environment to produce a final [o] we not only find valid words, but also ones which match words with a final [y]. Now we must ask ourselves what differs between the two environments that can help explain the lack of [y].
The obvious difference is that, in the text as a whole, the most common character before [dy] is not [o], but [e]. Again, using the same example words as above, we get the following counts: [okedy] 118, [okeedy] 105, [otedy] 155, and [oteedy] 100. It is tempting to think that [e] is somehow linked to [y/a], being maybe a third reflex of that character in the middle of words. But the many occurrences of [e] before [y] make this highly unlikely, as the character does not double up elsewhere: there are thousands of [ey] at the ends of words, but very few [ay, ya, yy, aa] at all.
But what happens when we transform the environment of the [e] to a final position by removing the [dy]? Well, this: [oke] 1, [okee] 0, [ote] 1, and [otee] 0. Indeed, final [e] occurs hardly anywhere in the whole text, the only exceptions being [she] and [shee]. It would seem that [e] can no more be final than [y] can be middle. As in the earlier article, where we saw that [y] and [a] do not occur in the same environments, we seem to have here another case of complementary distribution and the same suggestion of a link: [e] in the middle of words without a following [o] must have some kind of relationship with [y/a].
(Before anybody thinks I have based this on four words alone, the same figures repeat for any word undergoing the same transformation. Wherever we find an [e] not followed by [y/a, o], removing the rest of the word and adding [y] always results in a valid word. Yet removing the rest of the word and leaving [e] as the final letter mostly results in invalid words. This is true whether there is a single [e] or double [ee].)
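The general form of this test can be sketched as follows. It is a minimal sketch, not the procedure I actually ran: the function name `truncation_test`, the regular expression, and the tiny `vocab` sample are assumptions made for illustration, and the sample contains only counts quoted in this article.

```python
import re
from collections import Counter

# A run of [e] that is followed by some character other than [e, y, a, o].
# A word-final [e] does not match, because the lookahead needs a character.
E_BEFORE_OTHER = re.compile(r"e+(?=[^eyao])")


def truncation_test(vocab):
    """Cut each word after a medial [e] run and look up both outcomes.

    Returns tuples of (word, stem, count of stem + [y], count of bare stem),
    where the stem is the word truncated directly after the [e] run.
    """
    rows = []
    for word in vocab:
        for match in E_BEFORE_OTHER.finditer(word):
            stem = word[:match.end()]
            rows.append((word, stem, vocab.get(stem + "y", 0), vocab.get(stem, 0)))
    return rows


# Sample built only from counts quoted in this article; zero counts are omitted.
vocab = Counter({
    "okedy": 118, "okeedy": 105, "otedy": 155, "oteedy": 100,
    "okey": 64, "okeey": 177, "otey": 57, "oteey": 140,
    "oke": 1, "ote": 1,
})

for word, stem, with_y, bare in truncation_test(vocab):
    print(f"{word:8} -> {stem + 'y':6} {with_y:4} | {stem:5} {bare:4}")
```

Run over a full word list, the same loop would show for each word whether the truncated form is better attested with a final [y] or with a bare final [e].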
Thus we come to the conclusion that [ey] is a normal and regular sequence which occurs at the end of words, but that, for some reason, when the environment is changed to the middle of a word the [y] is deleted, leaving [e] alone. The only exceptions are where the environment produces an [a] instead. So just as we consider [cheo] to be the same string as occurs in the words [cheor], [cheol], [cheody], and [cheoky], so [chey] must give rise not only to [chear] and [cheal], but also [chedy] and [cheky].
With this we have found our ‘missing’ [y]: it is deleted in certain environments. This [y]-deletion, along with the equivalence of [y] and [a], makes [y/a] an almost exact counterpart for [o], occurring in all the same environments. Although seemingly complex with three different expressions, the rules governing the character [y] are very regular:
1. [y] at the end of words, and at the beginning if not in the context for [a].
2. [a] before [l, r, m, n, i].
3. [Ø] when after [e] and before any character not giving rise to [a].
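The three rules above can be written out as a small function. This is a sketch of the rules as stated, with hypothetical names (`realize_y`, `attach`, `A_CONTEXT`). The final fallback branch covers a case the rules do not address (a middle [y] that is neither after [e] nor before an [a] context), so returning [y] there is my assumption.

```python
A_CONTEXT = set("lrmni")  # characters before which the [a] form appears


def realize_y(before, after):
    """Surface form of an underlying [y] between `before` and `after`."""
    if not after:
        return "y"   # rule 1: end of word
    if after[0] in A_CONTEXT:
        return "a"   # rule 2: before [l, r, m, n, i]
    if before.endswith("e"):
        return ""    # rule 3: deleted after [e]
    return "y"       # rule 1 at the beginning; otherwise an assumed fallback


def attach(stem, rest):
    """Join a stem ending in underlying [y] to whatever follows it."""
    base = stem[:-1]  # the stem without its final underlying [y]
    return base + realize_y(base, rest) + rest


for rest in ("", "r", "l", "dy", "ky"):
    print(attach("chey", rest))
```

With the stem [chey], the loop reproduces the forms discussed earlier: [chey], [chear], [cheal], [chedy], and [cheky].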
If true, the implications for the structure of words are significant. The knowledge that [o] and [y/a/Ø] have, in the end, the same distribution lets us put them together in the same class of character: they are not the same, nor do they have the same value, but they occur in the same positions. Further, most words in the Voynich text contain at least one instance of [o] or [y/a/Ø]. The majority of those which do not are short, often single characters (indeed, it seems that the longer a word is, the more occurrences of [o, y/a/Ø] it will have). These two characters are thus essential for typical words, and give us a potential opening into a new way of understanding the structure of Voynich words.
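Counting members of this class in a word might be sketched like this. The pattern is my own reading of the rules, not a tested method: it counts every [o], [y], and [a], plus every [e] directly followed by a character that is neither [e, o, y, a] nor one of the [a] contexts [l, r, m, n, i], on the assumption that a deleted [y] sits after such an [e]. The names `SLOT` and `class_count` are hypothetical, and the sample words are simply ones mentioned in this article, far too few to show any trend.

```python
import re

# [o], [y], [a], or an [e] marking the assumed position of a deleted [y].
# A word-final [e] is not counted, because the lookahead needs a character.
SLOT = re.compile(r"[oya]|e(?=[^eoyalrmni])")


def class_count(word):
    """Number of [o] or [y/a/Ø] positions in a word, under the assumption above."""
    return len(SLOT.findall(word))


for word in ("she", "chey", "chedy", "chear", "okeody", "okeedy"):
    print(f"{word:7} length {len(word)}  class members {class_count(word)}")
```

Applied to a full word list, grouping the results by word length would be one way to check whether longer words really do contain more members of the class.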
I hope in a future article to explore word structure using this class of characters as a starting point.