The Existence of [y] Deletion

After writing the article in which I proposed that [y] and [a] are equivalent in some way, I realized that the outcome poses a difficult question. The reason I first considered that the characters [y] and [a] could be related, though I did not state it in that article, is that they both occur in some of the same contexts as [o]. In some places [o] can be swapped for [y], in others for [a], but nowhere for either [y] or [a] equally. Each glyph alone matches part of the distribution of [o] but not the whole. Together they match an even greater part of that distribution but still not the whole.

If we accept that [y] and [a] are equivalent then we end up with a combined glyph which nearly matches the distribution of [o] but falls short. Why is this so, and how can we explain it? This is the question which I seek to answer in this article.

The glyph [o] occurs in any position in a word: start, end, or middle. It has no restrictions on which glyphs it can occur before or after, though some are rare. Yet [y/a] occurs at the start and end of words in one or the other of its two forms, but in the middle of words only as [a], in positions before [l, r, m, n, i]. In other middle positions [a] does not occur and [y] occurs only sporadically. It is this part of the distribution of [o] which is thus unaccounted for by [y/a] and which I would like to explain.

It could be argued that there is no reason why [y/a] should match [o] in distribution and so there is nothing to explain. I believe it is far more productive to assume that there was regularity in whatever process made the text of the Voynich manuscript. If we observe that [y/a] is similar to [o] in some ways we are right to question why it is not in all ways. The lack of [y/a] in these middle positions may teach us something about the Voynich text and the underlying language.

To solve the problem of the missing [y] we must have some way of alternating the occurrence of the two environments: one where [y/a, o] occur, and the other where only [o] occurs. Alternation between the middle and start/end positions will let us compare the environments in which [y/a] does and does not occur and see what differs.

The most obvious choice is to use the very common suffix [dy]. Many instances of [dy] occur after the glyph [o], and by removing the suffix we can change the environment of the [o] from middle to end. When we do so we find that the resulting words ending in [o] all occur: [okeody] has 37 tokens, [okeeody] 16, [oteody] 39, and [oteeody] 11; while [okeo] has 14 tokens, [okeeo] 15, [oteo] 13, and [oteeo] 12. Other examples of the same pattern can be found to illustrate the same idea: [dy] adds to an existing word, changing the previous final glyph into a middle glyph.

As expected, relatively few examples of [dy] after [y/a] exist. Taking the same series of words as above but with [y] in the place of [o], we get the following: [okeydy] 0, [okeeydy] 0, [oteydy] 1, and [oteeydy] 0. Yet when [dy] is removed the resulting words are all common: [okey] 64, [okeey] 177, [otey] 57, and [oteey] 140. Alternating between middle and final environments for [y] gives the expected outcome, namely that [y/a] does not occur in one but does in the other. Although the examples given are only four words, the pattern is repeated throughout the text.

So here we’ve been able to find an environment where a middle [o] occurs but [y] does not, while the removal of the suffix produces valid words ending in both [o] and [y]. Now we must ask ourselves what differs between the two environments that can help explain the lack of [y] in the middle position.

The obvious difference is that the most common glyph before [dy] throughout the text is not [o] but [e]. Again, using the same example words as above, we get the following counts: [okedy] 118, [okeedy] 105, [otedy] 155, and [oteedy] 100. But what happens when we transform the environment of the [e] to a final position by removing the [dy]? Well, this: [oke] 1, [okee] 0, [ote] 1, and [otee] 0. Indeed, final [e] occurs hardly anywhere in the whole text, the only exceptions being [she] and [shee]. It would seem that [e] can no more be final than [y] can be middle.
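The alternation used in the three series above can be sketched in a few lines of Python. This is only an illustration: the counts are the ones quoted in this article rather than a full transcription, and the function names `alternate` and `totals_by_final_glyph` are my own labels, not part of any existing tool.

```python
from collections import Counter

# Token counts copied from the three series quoted above (not a full corpus).
COUNTS = Counter({
    "okeody": 37, "okeeody": 16, "oteody": 39, "oteeody": 11,
    "okeo": 14, "okeeo": 15, "oteo": 13, "oteeo": 12,
    "okeydy": 0, "okeeydy": 0, "oteydy": 1, "oteeydy": 0,
    "okey": 64, "okeey": 177, "otey": 57, "oteey": 140,
    "okedy": 118, "okeedy": 105, "otedy": 155, "oteedy": 100,
    "oke": 1, "okee": 0, "ote": 1, "otee": 0,
})


def alternate(counts, suffix="dy"):
    """Pair every word ending in `suffix` with the word left when it is removed.

    Returns rows of (stem, tokens of stem, suffixed word, tokens of suffixed word).
    """
    rows = []
    for word, n in counts.items():
        if word.endswith(suffix) and len(word) > len(suffix):
            stem = word[: -len(suffix)]
            rows.append((stem, counts.get(stem, 0), word, n))
    return rows


def totals_by_final_glyph(rows):
    """Sum tokens per stem-final glyph as {glyph: (final_total, middle_total)}."""
    totals = {}
    for stem, final_n, _word, middle_n in rows:
        final_sum, middle_sum = totals.get(stem[-1], (0, 0))
        totals[stem[-1]] = (final_sum + final_n, middle_sum + middle_n)
    return totals


if __name__ == "__main__":
    summary = totals_by_final_glyph(alternate(COUNTS))
    for glyph, (final_n, middle_n) in summary.items():
        print(f"[{glyph}] final: {final_n:4d}   before [dy]: {middle_n:4d}")
```

On these twelve pairs the totals come to 54 final against 103 before [dy] for [o], 438 against 1 for [y], and 2 against 478 for [e].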

As in the earlier article when we saw that [y] and [a] do not occur in the same environments, we seem to have here another case of complementary distribution and the same suggestion of a link: [e] in the middle of words without a following [o] must have some kind of relationship with [y/a]. It is tempting to think that [e] is somehow linked to [y/a], being maybe a third reflex of that glyph in the middle of words. But the many occurrences of [e] before [y] make this highly unlikely as the glyph does not double up elsewhere: there are thousands of [ey] at the end of words, but very few [ay, ya, yy, aa] at all.

(Before anybody thinks I have based this on four words alone, the same figures repeat for any word undergoing the same transformation. Wherever we find an [e] not followed by [y/a] or [o], removing the rest of the word and adding [y] always results in a valid word. Yet removing the rest of the word and leaving [e] as the final letter mostly results in invalid words. This is true whether there is a single [e] or double [ee].)
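The transformation described in this note could be run over a whole word list along the following lines. It is a rough sketch under stated assumptions: the sample counts are only those quoted in this article, the function name `truncate_after_e` is hypothetical, and the code reports token counts rather than deciding what counts as a valid word.

```python
import re

# Sample counts copied from this article; a real test needs a full transcription.
COUNTS = {
    "okedy": 118, "okeedy": 105, "otedy": 155, "oteedy": 100,
    "okeody": 37, "oteydy": 1,
    "okey": 64, "okeey": 177, "otey": 57, "oteey": 140,
    "oke": 1, "okee": 0, "ote": 1, "otee": 0,
}

# A whole run of [e] whose next glyph is not [y], [a] or [o].
# Listing "e" in the lookahead stops the match from ending inside a run.
E_RUN = re.compile(r"e+(?![eyao])")


def truncate_after_e(counts):
    """For each qualifying [e] run, cut the word there and look up both endings.

    Returns rows of (word, stem, tokens of stem + "y", tokens of bare stem).
    """
    rows = []
    for word in counts:
        for match in E_RUN.finditer(word):
            if match.end() == len(word):
                continue  # the run is already word-final: nothing to remove
            stem = word[: match.end()]
            rows.append(
                (word, stem, counts.get(stem + "y", 0), counts.get(stem, 0))
            )
    return rows


if __name__ == "__main__":
    for word, stem, with_y, bare in truncate_after_e(COUNTS):
        print(f"{word:8s} {stem + 'y':7s} {with_y:4d}   {stem:6s} {bare:4d}")
```

Words such as [okeody] and [oteydy] are skipped because their [e] is followed by [o] or [y]; only the four [e] + [dy] words from the sample are cut and compared.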

It appears that [ey] is a normal and regular sequence which occurs at the end of words but, for some reason, when the environment is changed to the middle of a word the [y] is deleted, leaving [e] alone. The only exceptions are where the environment produces an [a] instead. So just as we consider [cheo] to be the same string as occurs in the words [cheor], [cheol], [cheody], and [cheoky], so [chey] must give rise not only to [chear] and [cheal], but also [chedy] and [cheky].

With this we have found our ‘missing’ [y]: it is deleted in certain environments. This [y] deletion, along with the equivalence of [y] and [a], makes an almost exact counterpart for [o], occurring in all the same environments. Although seemingly complex with three different expressions, the rules governing this compound glyph are very regular:

1. [y] at the end of words, and at the beginning if not in the context for [a].

2. [a] before [l, r, m, n, i].

3. [Ø] when after [e] and before any character not giving rise to [a].
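The three rules can be written out as a small Python function to check them against the [chey] examples. This is a sketch only: the capital `Y` marking the slot for the compound glyph and the name `realize` are notations of my own, and the fallback for a middle position that none of the rules covers is an assumption rather than a fourth rule.

```python
# Glyphs before which the compound glyph surfaces as [a] (rule 2).
A_CONTEXT = set("lrmni")


def realize(template):
    """Spell out a word template in which "Y" marks the compound glyph [y/a/Ø].

    "Y" is a hypothetical placeholder; all other characters are copied as-is.
    """
    out = []
    for i, glyph in enumerate(template):
        if glyph != "Y":
            out.append(glyph)
            continue
        prev = template[i - 1] if i > 0 else None
        nxt = template[i + 1] if i + 1 < len(template) else None
        if nxt is not None and nxt in A_CONTEXT:
            out.append("a")   # rule 2: [a] before [l, r, m, n, i]
        elif nxt is None or prev is None:
            out.append("y")   # rule 1: [y] at the end or start of a word
        elif prev == "e":
            pass              # rule 3: deleted after [e]
        else:
            out.append("y")   # not covered by the rules; assumed to stay [y]
    return "".join(out)


if __name__ == "__main__":
    for template in ["cheY", "cheYr", "cheYl", "cheYdy", "cheYky"]:
        print(template, "->", realize(template))
```

Applied to the templates `cheY`, `cheYr`, `cheYl`, `cheYdy`, and `cheYky`, the function gives [chey], [chear], [cheal], [chedy], and [cheky], the same set of words listed earlier.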

If true, the implications are significant for the structure of words. The knowledge that [o] and [y/a/Ø] finally have the same distribution lets us put them together in the same class of glyph: they are not the same, nor do they have the same value, but they occur in the same positions. Further, most words in the Voynich text contain at least one instance of [o] or [y/a/Ø]; most of those which do not are short, often single glyphs (indeed, it seems that the longer a word is the more occurrences of [o] and [y/a/Ø] it will have). These two glyphs are thus essential for typical words, and give us a potential opening into a new way of understanding the structure of Voynich words.

I hope in a future article to explore word structure using this class of glyphs as a starting point.

2 thoughts on “The Existence of [y] Deletion”

1. Emma: rather than concluding that word terminal ‘y’ can extremely often be replaced by a valid word end block, isn’t it just as straightforward to conclude that any valid word end block can be replaced by ‘y’? That is, that word-terminal ‘y’ seems to function as a shorthand token marking the truncation of a longer word? (‘Truncatio’).

Word-initial ‘y’ seems to have a quite different function, in particular when it pairs with gallows characters. But that’s another story entirely.

2. EMSmith says:

Hi Nick, the problem is with the persistence of [o]. Although your solution may, in truth, be right, it does not answer the question of why [y/a] and [o] are similar in distribution but not in the middle of words. While we could simply say that their similarity is an illusion, or that they are similar but not wholly so, the problem of the article is an attempt to make them as similar as can be. I begin with the assumption of regularity and follow that through to some kind of theory about how the script (and/or language) works.

It may not be right but it does give us a workable theory which can be taken forward for further research.

