The Equivalence of [a] and [y]

The Voynich script constitutes just one part of the overall puzzle which surrounds the text of the Voynich manuscript, along with the contents of the text and the way in which it is encoded. There is no clear relationship between the Voynich script and any other known writing system. Despite the similarity of some glyphs to letters in other scripts there is no accepted proposal linking the Voynich script with any other. All existing knowledge about the script therefore comes from evidence internal to the manuscript itself.

The script consists of an unknown number of glyphs, the total depending on how they are counted. Typical counts range in the twenties, but a higher total is possible if rare glyphs are included, and a lower one if combinations and modifications are not counted as separate glyphs. For example, the glyphs [cfh, ckh, cph, cth] appear to be combinations of the glyphs [f, k, p, t] with the glyph [ch]. Whether these glyphs should be counted separately or only their constituents is debatable, and thus the total number of glyphs in the Voynich script is variable.

As part of this debate I want to argue that two of the commonly distinguished glyphs, namely [a] and [y], are in fact related and may simply be graphical variants with the same value. Although I have not seen a case for this equivalence argued elsewhere, I am willing to accept that I may not be the first to propose or argue the fact.

Complementary Distribution

The glyphs [a] and [y] are both common in the Voynich text, each occurring thousands of times and in all parts of the manuscript. They are part of the core script. However, they occur in clearly different environments: they take different positions within words and occur next to different glyphs. Their distribution within words is particularly easy to notice by casual observation, but simple tabulation of that distribution is still revealing.

To do this I generated a list of the most common words in the Voynich text using a widely available transliteration. Each word on the list occurred at least 10 times within the text, and so cannot be considered a writing or reading mistake nor unusual in the underlying language. The list contained 508 entries in all.
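As a rough sketch of how such a list could be built, the snippet below counts word tokens and keeps those at or above a threshold. It assumes the transliteration has already been reduced to whitespace-separated words. The function name `common_words` and the toy `sample` string are illustrative only and are not taken from any real transliteration file.

```python
from collections import Counter


def common_words(text, min_count=10):
    """Return {word: count} for every word seen at least min_count times.

    Assumes words are separated by whitespace (an assumption about the input).
    """
    counts = Counter(text.split())
    return {word: n for word, n in counts.items() if n >= min_count}


# Toy stand-in for a transliteration; these counts are NOT manuscript data.
sample = "daiin " * 12 + "chedy " * 10 + "oky " * 3
print(common_words(sample))
```

A real transliteration would first need its own word separators and annotations handled before this step.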

Using the list I generated the table below showing whether [a] or [y] occurred at the start, end, or in the middle of common words. Thus the table gives the typical (though not exhaustive) distribution of the glyphs within words.

Glyph   Start   Middle   End
[a]     Yes     Yes      No
[y]     Yes     No       Yes

From the table above it is clear that while both [a] and [y] occur at the start of words, only [a] occurs in the middle and only [y] at the end. Even in the whole text this distribution holds well, with less than 3% of instances of [y] occurring in the middle of a word, and less than 1% of instances of [a] occurring at the end. The distributions of [a] and [y] overlap significantly only at the start of words and not typically elsewhere.
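The positional tabulation described above might be sketched as follows. The function `positions` and the short word list are illustrative. Skipping single-glyph words (which have no distinct start or end) is an assumption of this sketch, not something stated in the method.

```python
def positions(words, glyph):
    """Report whether a one-character glyph is seen at the start,
    in the middle, or at the end of any word in the list."""
    seen = {"start": False, "middle": False, "end": False}
    for word in words:
        if len(word) < 2:
            continue  # assumption: one-glyph words are left out of the table
        if word[0] == glyph:
            seen["start"] = True
        if word[-1] == glyph:
            seen["end"] = True
        if glyph in word[1:-1]:
            seen["middle"] = True
    return seen


# A handful of words quoted in this post, used only as a small example.
words = ["daiin", "aiin", "oky", "ykaiin", "chedy", "ytaiin"]
print("a", positions(words, "a"))
print("y", positions(words, "y"))
```

Run over the full list of common words, the same function would give one row of the table per glyph.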

Yet a closer look at the start of words shows that the overlap between [a] and [y] is apparent and not real. The glyphs following [a] and [y] in words fall into two separate groups, with no glyphs in both groups. Below is a table showing what glyphs come after [a] or [y] as the second glyph in a word.

Glyph   After [a]   After [y]
[ch]    No          Yes
[d]     No          Yes
[i]     Yes         No
[k]     No          Yes
[l]     Yes         No
[m]     Yes         No
[p]     No          Yes
[r]     Yes         No
[sh]    No          Yes
[t]     No          Yes

Even at the start of words [a] and [y] do not occur in the same environments. Each occurs only before certain glyphs and not before others. Add this to the already established lack of overlap in the middle and at the end of words, and we can see that the distributions of [a] and [y] do not overlap at all in typical words. They are in complementary distribution: nowhere can either glyph occur in place of the other, and the choice of [a] or [y] is determined by the environment.
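The check for complementary distribution at the start of words can be sketched by collecting the set of second glyphs after each initial glyph and testing whether the two sets share any member. The tokenizer below treats [ch] and [sh] as single glyphs, which is a simplification that ignores other compound glyphs. The word list is a set of illustrative forms chosen to mirror the table, not a frequency sample.

```python
import re

# Simplified glyph splitter: [ch] and [sh] count as one glyph each,
# every other character is its own glyph (an assumption of this sketch).
GLYPH = re.compile(r"ch|sh|.")


def second_glyphs(words, first):
    """Collect the glyphs found in second position after a given first glyph."""
    found = set()
    for word in words:
        glyphs = GLYPH.findall(word)
        if len(glyphs) >= 2 and glyphs[0] == first:
            found.add(glyphs[1])
    return found


# Illustrative word forms only; not counts from the manuscript.
words = ["aiin", "al", "am", "ar",
         "ykaiin", "ytaiin", "ychedy", "yshedy", "ydaiin", "ypchedy"]

after_a = second_glyphs(words, "a")
after_y = second_glyphs(words, "y")
print(sorted(after_a), sorted(after_y))
print("complementary at word start:", after_a.isdisjoint(after_y))
```

An empty intersection is what the table shows: no glyph appears in both columns.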

The exact relationship between [a] and [y] in the Voynich script, whether it is a graphical difference, a sound change, or something else, is unknown. Not enough is currently known about the script or the way in which it works to make a firm argument. However, we can examine the graphical aspect of the difference between [a] and [y] to make a tentative argument.

Different Strokes

The glyphs [a] and [y] are naturally easy to differentiate, and all transcriptions I know have classed them as separate glyphs. Even so, they bear some graphical similarity. Both consist of two strokes, the first in both cases being a semicircular stroke open to the right, roughly the same as the glyph [e]. The second strokes of both glyphs lie directly to the right of the first, but differ in shape. For [a] the second stroke is a short stroke beginning from the mean line and running down and right toward the baseline. It is identical to the glyph [i]. For [y] the second stroke begins from the mean line, runs down and right toward the baseline, but then curves leftward and continues for a significant length below the baseline.

The most obvious difference between [a] and [y] is the downward curved reach of the second stroke of [y]; the two glyphs are otherwise rather similar. The likeness supports the idea that the glyphs are related, though this is only impressionistic.

A more significant relationship lies between the shape of the second stroke and that of the following glyphs. The glyph [a] occurs at the beginning of a word only before the glyphs [i, l, m, r]. Along with [n], these glyphs make up the bulk of all those which follow [a] anywhere in the text. That [a] is followed only by a limited set of glyphs was noted by Prescott Currier, who also noted that those glyphs included a stroke identical to [i], which is the second stroke of [a]. There is thus good reason to believe that the choice of [a] rather than [y] is conditioned by the presence of an [i] stroke in the glyph directly following.

This seems to be good evidence that [a] is a graphical variant of [y]: it occurs in a very specific environment conditioned on a graphical basis. However, it is not possible to say that the conditioning of [a] is simply graphical and otherwise without meaning. There could be some other feature, itself linked to the [i] stroke, which is the real explanation for the observed relationship. We can only state that there is a relationship while remaining agnostic about its features or meaning.

In the Wild

If the glyphs [a] and [y] are related to each other there should be evidence for this within the text. The hypothesis is that the glyphs are conditioned by different environments, so by controlling for the environment within the text we should be able to control the appearance of either [a] or [y]. Two kinds of evidence will be given: the lack of [a] as a standalone glyph and the possibility of [y] becoming [a] with the addition of a suffix.

Glyphs in the Voynich script sometimes stand alone within the text, clearly set apart from other glyphs. In such circumstances we would expect [y] to appear and not [a], because there is no following glyph to condition the use of [a]. Within the body text a single glyph can appear alone as a word in itself. The glyph [y] appears alone in such circumstances about 150 times, whereas [a] does so no more than three times.

Another similar environment is the so-called “key-like sequences”, where a series of individual glyphs is written in a row or column. The meaning of these sequences is unknown, but they focus on the glyphs alone rather than as part of a word. There are four such sequences considered to be original to the manuscript: 49v shows multiple [y] but no [a]; the repeating sequence on 57v contains [y] but not [a]; 66r shows multiple [y] but no [a]; and 76r contains neither [y] nor [a].

Although the evidence is not strong, the lack of [a] as a standalone glyph outside of the conditioning environment where it is usually found reinforces the idea that it is a variant of [y].

The second way of controlling the environment within which the two glyphs appear is by the use of an affix. The Voynich language is well known for the apparent ‘modularity’ of words, with many longer words seemingly built from affixes. A useful fact for the study of [a] and [y] is that while [y] is a common word ending, [a] is often the first glyph of a suffix. This gives us an opportunity for testing the relationship.

We would expect a word ending in [y] to be found with some fixed frequency relative to the same root carrying a suffix with an initial [a]. This is because, according to the hypothesis, [y] and [a] are the same glyph in different environments: rather than being part of the suffix, [a] appears because the final [y] is transformed by the following glyphs of the suffix.

To make this clearer, let us formulate a test based on the most common suffix beginning with [a]: [–aiin]. If [a] and [y] are the same glyph then a word such as [oky] is in fact the root of [okaiin], rather than the two words sharing the common root [ok–]. The suffix is thus [–iin] and not [–aiin], with the first glyph [i] of the suffix causing the final [y] of [oky] to transform into [a].
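The proposed transformation can be written out as a pair of small string rules. This is only a sketch of the hypothesis as stated here, with hypothetical function names `y_counterpart` and `attach_iin`. It treats words as plain strings of single-character glyphs, which is enough for [a], [y] and [i].

```python
def y_counterpart(word, suffix="aiin"):
    """Map a word ending in [-aiin] to its proposed counterpart ending in [-y].

    Returns None when the word does not end in the given suffix.
    """
    if not word.endswith(suffix):
        return None
    return word[: -len(suffix)] + "y"


def attach_iin(word):
    """Attach [-iin] under the hypothesis: a final [y] is written as [a]
    when the following glyph contains an [i] stroke."""
    if word.endswith("y"):
        return word[:-1] + "a" + "iin"
    return word + "iin"


print(y_counterpart("okaiin"))  # oky
print(attach_iin("oky"))        # okaiin
print(attach_iin("y"))          # aiin
```

The two functions are inverses of each other for words ending in [y], which is the point of the test: each [–aiin] word should have a [–y] partner.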

Below is a table of the twenty most common words ending in [–aiin] and their counterparts ending in [–y], showing the number of tokens for each.

[–aiin] word   Count   [–y] word   Count
[daiin]        858     [dy]        269
[aiin]         465     [y]         148
[qokaiin]      262     [qoky]      147
[okaiin]       206     [oky]       102
[otaiin]       153     [oty]       114
[saiin]        123     [sy]        33
[qotaiin]      79      [qoty]      87
[kaiin]        65      [ky]        25
[raiin]        61      [ry]        13
[odaiin]       56      [ody]       45
[olaiin]       48      [oly]       55
[lkaiin]       46      [lky]       17
[chaiin]       45      [chy]       152
[ykaiin]       44      [yky]       17
[ytaiin]       43      [yty]       24
[taiin]        42      [ty]        16
[chodaiin]     42      [chody]     92
[qodaiin]      40      [qody]      17
[oraiin]       33      [ory]       17
[chedaiin]     31      [chedy]     500

Although the above table does not present a clear and unambiguous relationship between the two sets of words, it does allow for the possibility of a relationship. In all cases the most common words ending in [–aiin] have counterparts ending in [–y] which are themselves common (that is, with more than ten occurrences). Likewise, in 15 of the 20 cases the word ending in [–aiin] is more common than that ending in [–y], and within a fairly narrow range.
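The comparison in the table can be checked with a few lines of arithmetic. The counts below are copied from the table above. The variable names and the choice to summarize by the ratio of the [–aiin] count to the [–y] count are assumptions of this sketch.

```python
# (count of the [-aiin] word, count of its [-y] counterpart), from the table.
pairs = {
    "daiin": (858, 269), "aiin": (465, 148), "qokaiin": (262, 147),
    "okaiin": (206, 102), "otaiin": (153, 114), "saiin": (123, 33),
    "qotaiin": (79, 87), "kaiin": (65, 25), "raiin": (61, 13),
    "odaiin": (56, 45), "olaiin": (48, 55), "lkaiin": (46, 17),
    "chaiin": (45, 152), "ykaiin": (44, 17), "ytaiin": (43, 24),
    "taiin": (42, 16), "chodaiin": (42, 92), "qodaiin": (40, 17),
    "oraiin": (33, 17), "chedaiin": (31, 500),
}

# Words where the [-aiin] form outnumbers the [-y] form.
aiin_ahead = [w for w, (n_aiin, n_y) in pairs.items() if n_aiin > n_y]
print(len(aiin_ahead), "of", len(pairs))  # 15 of 20

# Ratio of [-aiin] to [-y] counts for those words, as a measure of spread.
ratios = {w: pairs[w][0] / pairs[w][1] for w in aiin_ahead}
print(round(min(ratios.values()), 2), round(max(ratios.values()), 2))

# Every [-y] counterpart is itself common (more than ten tokens).
print(all(n_y > 10 for _, n_y in pairs.values()))
```

The five exceptions are [qotaiin], [olaiin], [chaiin], [chodaiin] and [chedaiin], where the [–y] form is the more frequent one.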

In comparison, the counterpart words with no ending at all (as though [ok] were the root of [oky] and [okaiin]) are mostly rare or non-existent. Indeed, in the case of [y] and [aiin] there is no possible word without an ending. These two words have sometimes been considered curious because they seem to be endings without a root, but the equivalence of [a] and [y] shows that they have a single glyph root which transforms with the addition of the [–iin] suffix. The redefining of that suffix as [–iin] also brings into line the less common series of words which end with [–oiin]. Rather than being a different suffix, it is the same suffix attached to a word ending in a different glyph. Words ending [–o] are not as common as those ending [–y] but occur nonetheless.

Conclusion & Implications

The equivalence of [a] and [y] in the Voynich script seems possible or even likely. The two glyphs have been shown to have a complementary distribution, with their appearance dependent on the following character. A graphical link has been highlighted between all the glyphs which follow [a] but not [y], a link which [a] itself also shares. Two sets of words where [y] is transformed into [a] by the addition of a suffix have been shown to be possibly related.

Although the evidence and argument set out here are not exhaustive, they are suggestive. More work is needed on those words which might show a transformation between [y] and [a], as the treatment here is only preliminary. Hopefully the outcome will strengthen the evidence for this equivalence. Undoubtedly there will be objections to the hypothesis which I have not dealt with here and which may yet prove it wrong.

Should the equivalence be accepted there are a number of implications that it could have upon research into the Voynich script and language:

1. Any theory which proposes radically different values for [a] and [y] cannot be correct. While the two glyphs need not have the same value, they must have values which are conditionally linked.

2. Any theory which proposes non-interaction between characters and their context cannot be correct. The Voynich text is not a string of single glyphs, but must include a plausible system of interaction.

3. Glyphs may have more than one shape depending on their context.

4. The strokes which form glyphs are important in some way. At the very least the seeming identity of the [i] stroke in different glyphs is not simply appearance but must be an actual fact of the script.

Implications 1 and 2 are the most significant and may prove damaging to a number of other theories. Implications 3 and 4 have long been proposed but receive confirmation from this hypothesis.


19 thoughts on “The Equivalence of [a] and [y]”

  1. I want to note here that Jacques Guy and Robert Firth may have proposed this equivalence at some time in the 90s. Even though I cannot find a full exposition of their thoughts, mention of this idea can be found in a few places of the Voynich Mailing List archive.


  2. Emma, I just came across this post of yours, and now I think it may seem like I’m stealing your ideas.
    To me, the Voynichese script looks like a very limited set of glyphs, of which many (not all) have an “ornate” and a “normal” form. This, together with “a” generally not turning up at word-final position, led me to believe that was just an with a flourish. I’ll link to this post and mention that you were the first to suggest that and may just have the same sound value.


  3. Hmm, something went very wrong in that post, it took my EVA notations as some sort of programming code and made links of it. So I meant that “y” is an “a” with a flourish 🙂


  4. Hmm. Your point about the complementary distribution of [a] and [y] is persuasive, and the idea of [a] being selected by a following line-letter ties in nicely with Brian Cham’s curve-line hypothesis.

    This leads me to a follow-on question. Cham suggests that [l] is the line-letter counterpart to [y], which is a curve letter. Let ( represent an [e] stroke, \ an [i] stroke, and J a downward tail. Then Cham suggests that [y] is (J, while [l] is \J.

    So here’s my question. As you’ve demonstrated that a following \ seems to transform (J [y] into (\ [a], do you know if anything similar can be observed about [l]? One might expect that before \, \J [l] might transform into \\ [ii]; do you know if the data back this up?

    If not, I’ll run some frequency counts and report back!


    • The idea that [a] and [y] are related comes from a questioning of their distribution within words and not from the curve-line hypothesis. The two characters don’t seem to readily occur in the same environments, which is what led me to the conclusion. That the environments happen to coincide with a difference in graphical stroke is intriguing, but I don’t believe in the curve-line hypothesis.


      • The idea that [a] and [y] are related comes from a questioning of their distribution within words

        I’m aware of that, of course. All the same, it’s interesting that the environments that seem to select [a] instead of [y] are precisely those where an [i]-stroke follows. This would tend to corroborate the curve-line hypothesis, if rather weakly.

        Regardless of the CL hypothesis, though, if one tail can turn into an [i]-stroke, perhaps another one can too, which is why I plan to investigate complementary distribution of [l] and [ii]. It’s not so much about the CL hypothesis as about seeing if there’s anything generalizable in your findings here.

        The CL hypothesis looks somewhat persuasive to me, but of course I haven’t done the level of research that you have. What do you see as problematic for the curve-line hypothesis, if you don’t mind my asking? (I haven’t seen it addressed on your blog, unless I missed something.)


        • The main problem is that it doesn’t describe word structure all that well. It has numerous exceptions, some of which are substantial. The character [o], for example, doesn’t behave like a curve and the author is forced to ‘patch’ the hypothesis to dismiss this. Yet the patch is inadequate and still fails to explain why [o] behaves how it does. Given that [o] is the most common character, this is a problem. There’s also a significant problem with [l].

          It seems to me that the [i] or [e] stroke within some characters may be important. But I think it is an organizing principle of the script rather than a key organizing principle of the text.


          • Thanks for the explanation! I will certainly have another look at the behavior of [o] in light of that.

            Meanwhile, I’ll post back here when I look at [ii] and [l] in more detail. There seems to be *something* to the idea, but I’m not sure how much yet.


  5. There is a key to cipher the Voynich manuscript.
    The key to the cipher manuscript placed in the manuscript. It is placed throughout the text. Part of the key hints is placed on the sheet 14. With her help was able to translate a few dozen words that are completely relevant to the theme sections.
    The Voynich manuscript is not written with letters. It is written in signs. Characters replace the letters of the alphabet one of the ancient language. Moreover, in the text there are 2 levels of encryption. I figured out the key by which the first section could read the following words: hemp, wearing hemp; food, food (sheet 20 at the numbering on the Internet); to clean (gut), knowledge, perhaps the desire, to drink, sweet beverage (nectar), maturation (maturity), to consider, to believe (sheet 107); to drink; six; flourishing; increasing; intense; peas; sweet drink, nectar, etc. Is just the short words, 2-3 sign. To translate words with more than 2-3 characters requires knowledge of this ancient language. The fact that some symbols represent two letters. In the end, the word consisting of three characters can fit up to six letters. Three letters are superfluous. In the end, you need six characters to define the semantic word of three letters. Of course, without knowledge of this language make it very difficult even with a dictionary.
    Much attention in the manuscript is paid to the health of women for the purpose of giving birth to healthy offspring.
    And most important. In the manuscript there is information about “the Holy Grail”.
    I am ready to share information, but only with those who are seriously interested in deciphering the Voynich manuscript.

