The Voynich script constitutes just one part of the overall puzzle which surrounds the text of the Voynich manuscript, along with the contents of the text and the way in which it is encoded. There is no clear relationship between the Voynich script and any other known writing system. Despite the similarity of some glyphs to letters in other scripts there is no accepted proposal linking the Voynich script with any other. All existing knowledge about the script therefore comes from evidence internal to the manuscript itself.
The script consists of an unknown number of glyphs, the total depending on how they are counted. Typical counts range in the twenties, but a higher total is possible if rare glyphs are included, and a lower one if combinations and modifications are not counted as separate glyphs. For example, the glyphs [cfh, ckh, cph, cth] appear to be combinations of the glyphs [f, k, p, t] with the glyph [ch]. Whether these glyphs should be counted separately or only their constituents is debatable, and thus the total number of glyphs in the Voynich script is variable.
As part of this debate I want to argue that two of the commonly distinguished glyphs, namely [a] and [y], are in fact related and may simply be graphical variants with the same value. Although I have not seen a case for this equivalence argued elsewhere, I accept that I may not be the first to propose it.
The glyphs [a] and [y] are both common in the Voynich text, each occurring thousands of times and in all parts of the manuscript. They are part of the core script. However, they occur in clearly different environments: they take different positions within words and occur next to different glyphs. Their distribution within words is particularly easy to notice by casual observation, but simple tabulation of that distribution is still revealing.
To tabulate the distribution I generated a list of the most common words in the Voynich text using a widely available transliteration. Each word on the list occurred at least 10 times within the text, and so cannot be considered a writing or reading mistake, nor unusual in the underlying language. The list contained 508 entries in all.
Using the list I generated the table below showing whether [a] or [y] occurred at the start, end, or in the middle of common words. Thus the table gives the typical (though not exhaustive) distribution of the glyphs within words.
| Glyph | Start | Middle | End |
|-------|-------|--------|-----|
| [a]   | Yes   | Yes    | No  |
| [y]   | Yes   | No     | Yes |
From the table above it is clear that while both [a] and [y] occur at the start of words, only [a] occurs in the middle and only [y] at the end. Even in the whole text this distribution holds well, with fewer than 3% of instances of [y] occurring in the middle of a word and fewer than 1% of instances of [a] occurring at the end. The distributions of [a] and [y] overlap significantly only at the start of words and not typically elsewhere.
Yet a closer look at the start of words shows that the overlap between [a] and [y] is only apparent. The glyphs following [a] and [y] in words fall into two separate groups, with no glyph belonging to both. Below is a table showing which glyphs come after [a] or [y] as the second glyph in a word.
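The tabulation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the transliteration is assumed to be EVA-style (so that [ch], [sh] and the combined gallows glyphs are written with several letters), the function names are hypothetical, the separate "alone" category for single-glyph words is my own addition, and the sample is a handful of words quoted elsewhere in this article rather than the full list of 508.

```python
import re
from collections import Counter

# Assumption: EVA-style transliteration. Multi-letter glyphs are listed
# first so that they are matched before single letters.
GLYPH_RE = re.compile(r"cth|ckh|cph|cfh|ch|sh|[a-z]")

def split_glyphs(word):
    """Split a transliterated word into glyphs: 'chedy' -> ['ch', 'e', 'd', 'y']."""
    return GLYPH_RE.findall(word)

def position_counts(words, glyph):
    """Count where `glyph` falls within each word: start, middle, end or alone."""
    counts = Counter()
    for word in words:
        glyphs = split_glyphs(word)
        last = len(glyphs) - 1
        for i, g in enumerate(glyphs):
            if g != glyph:
                continue
            if last == 0:
                counts["alone"] += 1   # single-glyph word
            elif i == 0:
                counts["start"] += 1
            elif i == last:
                counts["end"] += 1
            else:
                counts["middle"] += 1
    return counts

# Illustrative sample only: words taken from the tables in this article.
sample = ["daiin", "dy", "aiin", "qokaiin", "qoky",
          "okaiin", "oky", "ykaiin", "yky", "chedy"]

for glyph in ("a", "y"):
    print(glyph, dict(position_counts(sample, glyph)))
# a {'middle': 4, 'start': 1}
# y {'end': 5, 'start': 2}
```

Even on this tiny sample [a] never ends a word and [y] never sits in the middle of one. Run over a full word list, the same counts would give the Yes/No cells of the table.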
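| Glyph | After [a] | After [y] |
|-------|-----------|-----------|
| [ch]  | No        | Yes       |
| [d]   | No        | Yes       |
| [i]   | Yes       | No        |
| [k]   | No        | Yes       |
| [l]   | Yes       | No        |
| [m]   | Yes       | No        |
| [p]   | No        | Yes       |
| [r]   | Yes       | No        |
| [sh]  | No        | Yes       |
| [t]   | No        | Yes       |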
Even at the start of words [a] and [y] do not occur in the same environments. Each one occurs only before certain glyphs and not before others. Add this to the already established lack of overlap in the middle and at the end of words, and we can see that the distributions of [a] and [y] do not overlap at all in typical words. They are in complementary distribution: nowhere can both glyphs occur, and the choice of [a] or [y] is determined by the environment.
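The complementarity claim is easy to check mechanically. The sketch below assumes an EVA-style transliteration and uses a hypothetical helper, `second_glyphs`, to collect whatever follows a word-initial [a] or [y]. The two sets `after_a` and `after_y` are copied from the table above, while the word sample is again just a few words quoted in this article.

```python
import re

# Assumption: EVA-style transliteration, multi-letter glyphs matched first.
GLYPH_RE = re.compile(r"cth|ckh|cph|cfh|ch|sh|[a-z]")

def second_glyphs(words, initials=("a", "y")):
    """For each initial glyph, collect the set of glyphs found in second position."""
    followers = {g: set() for g in initials}
    for word in words:
        glyphs = GLYPH_RE.findall(word)
        if len(glyphs) >= 2 and glyphs[0] in followers:
            followers[glyphs[0]].add(glyphs[1])
    return followers

# Illustrative sample only: words taken from this article.
sample = ["aiin", "ykaiin", "ytaiin", "yky", "yty"]
observed = second_glyphs(sample)

# Follower sets as given in the table above.
after_a = {"i", "l", "m", "r"}
after_y = {"ch", "d", "k", "p", "sh", "t"}

# Complementary distribution at word start means the two sets share nothing.
print(after_a.isdisjoint(after_y))   # True
print(observed["a"] <= after_a, observed["y"] <= after_y)   # True True
```

An empty intersection is what "complementary distribution" amounts to here: given the second glyph of a word, the choice between [a] and [y] as its first glyph is already fixed.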
The exact relationship between [a] and [y] in the Voynich script, whether it is a graphical difference, a sound change, or something else, is unknown. Not enough is currently known about the script or the way in which it works to make a firm argument. However, we can examine the graphical aspect of the difference between [a] and [y] to make a tentative argument.
The glyphs [a] and [y] are naturally easy to differentiate, and all transcriptions I know have classed them as separate glyphs. Even so, they bear some graphical similarity. Both consist of two strokes. In both cases the first is a semicircular stroke open to the right, roughly the same as the glyph [e]. The second strokes of both glyphs lie directly to the right of the first, but differ in shape. For [a] the second stroke is a short stroke beginning from the mean line and running down and right toward the baseline. It is identical to the glyph [i]. For [y] the second stroke begins from the mean line, runs down and right toward the baseline, but then curves leftward and continues for a significant length below the baseline.
The most obvious difference between [a] and [y] is the downward curved reach of the second stroke of [y]; otherwise the two glyphs are rather similar. This likeness supports the idea that the glyphs are related, though the judgment is only impressionistic.
A more significant relationship lies between the shape of the second stroke and that of the following glyphs. The glyph [a] occurs at the beginning of a word only before the glyphs [i, l, m, r]. Along with [n], these glyphs make up the bulk of all those which follow [a] anywhere in the text. That [a] is followed only by a limited set of glyphs was noted by Prescott Currier, who also noted that those glyphs included a stroke identical to [i], which is the second stroke of [a]. There is thus good reason to believe that the choice of [a] rather than [y] is conditioned by the presence of an [i] stroke in the glyph directly following.
This seems to be good evidence that [a] is a graphical variant of [y]: it occurs in a very specific environment conditioned on a graphical basis. However, it is not possible to say that the conditioning of [a] is simply graphical and otherwise without meaning. The [i] stroke could be linked to some other underlying feature, and that feature, rather than the stroke itself, could be what explains the observed relationship. We can only state that there is a relationship while remaining agnostic about its features or meaning.
In the Wild
If the glyphs [a] and [y] are related to each other there should be evidence for this within the text. The hypothesis is that the glyphs are conditioned by different environments, so by controlling for the environment within the text we should be able to control the appearance of either [a] or [y]. Two kinds of evidence will be given: the lack of [a] as a standalone glyph and the possibility of [y] becoming [a] with the addition of a suffix.
Glyphs in the Voynich script sometimes stand alone within the text, clearly set apart from other glyphs. In such circumstances we would expect [y] to appear and not [a], because there is no following glyph to condition the use of [a]. Within the body text a single glyph can appear alone as a word in itself. The glyph [y] appears alone in such circumstances about 150 times, whereas [a] does so no more than three times.
Another similar environment is the so-called “key-like sequences”, where a series of individual glyphs is written in a row or column. The meaning of these sequences is unknown, but they present the glyphs individually rather than as parts of words. There are four such sequences considered to be original to the manuscript: 49v shows multiple [y] but no [a]; the repeating sequence on 57v contains [y] but not [a]; 66r shows multiple [y] but no [a]; and 76r contains neither [y] nor [a].
Although the evidence is not strong, the lack of [a] as a standalone glyph outside of the conditioning environment where it is usually found reinforces the idea that it is a variant of [y].
The second way of controlling the environment within which the two glyphs appear is by the use of an affix. The Voynich language is well known for the apparent ‘modularity’ of words, with many longer words seemingly built from affixes. A useful fact for the study of [a] and [y] is that while [y] is a common word ending, [a] is often the first glyph of a suffix. This gives us an opportunity for testing the relationship.
We would expect a word ending in [y] to occur with some fixed frequency relative to the same word with its final [y] replaced by a suffix beginning with [a]. This is because, according to the hypothesis, [y] and [a] are the same glyph in different environments: rather than being part of the suffix, [a] appears because the final [y] is transformed by the following glyphs of the suffix.
To make this clearer, let us formulate a test based on the most common suffix beginning with [a]: [–aiin]. If [a] and [y] are the same glyph then a word such as [oky] is in fact the root of [okaiin], rather than the two words sharing the common root [ok–]. The suffix is thus [–iin] and not [–aiin], with the first glyph [i] of the suffix causing the final [y] of [oky] to transform into [a].
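The proposed rule can be written out as a small function. This is a sketch of the hypothesis only, not a claim about how the script actually works. The name `attach_suffix` is hypothetical, and the conditioning set is an assumption taken from the glyphs said above to follow [a] ([i, l, m, r] plus [n]).

```python
# Assumption: the glyphs listed in this article as following [a], treated
# here as the full set of glyphs that condition [a] rather than [y].
I_STROKE_GLYPHS = ("i", "l", "m", "r", "n")

def attach_suffix(word, suffix):
    """Join word and suffix, turning a final [y] into [a] before an i-stroke glyph."""
    if word.endswith("y") and suffix.startswith(I_STROKE_GLYPHS):
        return word[:-1] + "a" + suffix
    # Any other final glyph is left untouched.
    return word + suffix

print(attach_suffix("oky", "iin"))   # okaiin
print(attach_suffix("y", "iin"))     # aiin
print(attach_suffix("dy", "iin"))    # daiin
```

On this reading the suffix is always [–iin], and the [a] of [okaiin] belongs to the word to which the suffix is attached.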
Below is a table of the twenty most common words ending in [–aiin] and their counterparts ending in [–y], showing the number of tokens for each.
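| Word ending [–aiin] | Count | Word ending [–y] | Count |
|---------------------|-------|------------------|-------|
| [daiin]    | 858 | [dy]    | 269 |
| [aiin]     | 465 | [y]     | 148 |
| [qokaiin]  | 262 | [qoky]  | 147 |
| [okaiin]   | 206 | [oky]   | 102 |
| [otaiin]   | 153 | [oty]   | 114 |
| [saiin]    | 123 | [sy]    | 33  |
| [qotaiin]  | 79  | [qoty]  | 87  |
| [kaiin]    | 65  | [ky]    | 25  |
| [raiin]    | 61  | [ry]    | 13  |
| [odaiin]   | 56  | [ody]   | 45  |
| [olaiin]   | 48  | [oly]   | 55  |
| [lkaiin]   | 46  | [lky]   | 17  |
| [chaiin]   | 45  | [chy]   | 152 |
| [ykaiin]   | 44  | [yky]   | 17  |
| [ytaiin]   | 43  | [yty]   | 24  |
| [taiin]    | 42  | [ty]    | 16  |
| [chodaiin] | 42  | [chody] | 92  |
| [qodaiin]  | 40  | [qody]  | 17  |
| [oraiin]   | 33  | [ory]   | 17  |
| [chedaiin] | 31  | [chedy] | 500 |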
Although the above table does not present a clear and unambiguous relationship between the two sets of words, it does allow for the possibility of a relationship. In all cases the most common words ending in [–aiin] have counterparts ending in [–y] which are themselves common (that is, with more than ten occurrences). Likewise, in 15 of the 20 cases the word ending in [–aiin] is more common than that ending in [–y], and within a fairly narrow range (by a factor of roughly 1.2 to 4.7).
In comparison, the counterpart words with no ending at all (as though [ok] were the root of [oky] and [okaiin]) are mostly rare or non-existent. Indeed, in the case of [y] and [aiin] there is no possible word without an ending. These two words have sometimes been considered curious because they seem to be endings without a root, but the equivalence of [a] and [y] shows that they have a single-glyph root which transforms with the addition of the [–iin] suffix. Redefining the suffix as [–iin] also brings into line the less common series of words which end with [–oiin]. Rather than being a different suffix, [–oiin] is the same suffix attached to a word ending in a different glyph. Words ending in [–o] are not as common as those ending in [–y] but occur nonetheless.
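The figures quoted here can be rechecked directly from the table. The snippet below simply re-enters the counts given above (no new data) and tallies the comparisons; the variable names are my own.

```python
# Counts copied from the table above:
# ([-aiin] word, tokens, [-y] word, tokens).
PAIRS = [
    ("daiin", 858, "dy", 269), ("aiin", 465, "y", 148),
    ("qokaiin", 262, "qoky", 147), ("okaiin", 206, "oky", 102),
    ("otaiin", 153, "oty", 114), ("saiin", 123, "sy", 33),
    ("qotaiin", 79, "qoty", 87), ("kaiin", 65, "ky", 25),
    ("raiin", 61, "ry", 13), ("odaiin", 56, "ody", 45),
    ("olaiin", 48, "oly", 55), ("lkaiin", 46, "lky", 17),
    ("chaiin", 45, "chy", 152), ("ykaiin", 44, "yky", 17),
    ("ytaiin", 43, "yty", 24), ("taiin", 42, "ty", 16),
    ("chodaiin", 42, "chody", 92), ("qodaiin", 40, "qody", 17),
    ("oraiin", 33, "ory", 17), ("chedaiin", 31, "chedy", 500),
]

# Each [-y] word is the [-aiin] word with 'aiin' swapped for 'y'.
consistent = all(a[:-4] + "y" == y for a, _, y, _ in PAIRS)

# Pairs where the [-aiin] form is the more frequent of the two.
more_common = [a for a, n_a, _, n_y in PAIRS if n_a > n_y]
exceptions = [a for a, n_a, _, n_y in PAIRS if n_a <= n_y]

print(consistent)                           # True
print(len(more_common), "of", len(PAIRS))   # 15 of 20
print(exceptions)
# ['qotaiin', 'olaiin', 'chaiin', 'chodaiin', 'chedaiin']
print(min(n_y for _, _, _, n_y in PAIRS))   # 13
```

The five exceptions are listed so they can be examined separately. Three of them begin with [ch], which may or may not be significant.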
Conclusion & Implications
The equivalence of [a] and [y] in the Voynich script seems possible or even likely. The two glyphs have been shown to have a complementary distribution, and their appearance has been shown to depend on the following character. A graphical link has been highlighted between all the glyphs which follow [a] but not [y], a link which [a] itself also shares. Two sets of words, in which [y] would be transformed into [a] by the addition of a suffix, have been shown to be possibly related.
Although the evidence and argument set out here are not exhaustive, they are suggestive. More work is needed on those words which might show a transformation between [y] and [a], as the treatment here is only preliminary. Hopefully the outcome will strengthen the evidence for this equivalence. Undoubtedly there will be objections to the hypothesis which I have not dealt with here and which may yet prove it wrong.
Should the equivalence be accepted there are a number of implications that it could have upon research into the Voynich script and language:
1. Any theory which proposes radically different values for [a] and [y] cannot be correct. While the two glyphs need not have the same value, they must have values which are conditionally linked.
2. Any theory which proposes non-interaction between characters and their context cannot be correct. The Voynich text is not a string of independent glyphs, and any account of it must include a plausible system of interaction.
3. Glyphs may have more than one shape depending on their context.
4. The strokes which form glyphs are important in some way. At the very least the seeming identity of the [i] stroke in different glyphs is not simply appearance but must be an actual fact of the script.
Implications 1 and 2 are the most significant and may prove damaging to a number of other theories. Implications 3 and 4 have long been proposed but receive confirmation from this hypothesis.