In my last post I spoke about the Transformation Theory, and how I believe that the shape of words may be influenced by their surroundings. I speculated that, in light of First–Last Combinations, certain glyphs may be used to break up unwanted combinations. My example was that in a phrase such as [dar kedy], [d k] are both ‘strong’ glyphs and a ‘weak’ glyph is inserted between them. One such glyph could be [o], which indeed comes at the beginning of many words.
In this post I would like to look further into this suggestion. We already know that certain word–end glyphs prefer to match up with certain word–start glyphs, but these statistics are very general. Though we can say that [r o] is more common than [r t], we can’t be sure that these two facts are related through transformation. It could simply be that the phrase [or oraiin] is really common and [or tchedy] isn’t. We would need to compare them to [or raiin] and [or otchedy] to get a better insight.
I took six pairs of words. Each pair consists of a word starting with a strong glyph (two with [t], two with [k], one with [r], and one with [l]) and the same word with an [o] added to the start. (I must be clear that by ‘same word’ I simply mean the same string of characters and I make no claim as to actual relatedness here.) So, for example, one pair was [tchey] and [otchey].
I then went through the manuscript and recorded the word which came before each instance of such words, dismissing those which had no word before them and those at the beginning of a line (we know that word statistics are different here). I noted the last glyph of the word which came before, and counted it as strong if it was [n, r, s], and also counted [d] as strong for following words starting with [k, t].
So, for instance, [ar, otain, cheos] would always be strong, and [otey, sho, qokal] always weak. A word such as [qoked] would be strong for the [t, k] word pairs, but weak for the [l, r] pairs.
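The counting procedure above can be sketched in a few lines of Python. This is only a minimal illustration, not the script behind the figures in this post. It assumes the text is held as a list of lines, each a list of words in a transcription where the last character of a word is its last glyph. The names `is_strong_ending` and `strong_share`, and the toy lines in the usage note, are hypothetical.

```python
# Glyphs that always count as strong at the end of the preceding word.
ALWAYS_STRONG = {"n", "r", "s"}
# [d] counts as strong only when the word pair is built on [k] or [t].
D_STRONG_BEFORE = {"k", "t"}


def is_strong_ending(prev_word, pair_glyph):
    """Return True if prev_word ends in a glyph counted as strong.

    pair_glyph is the strong glyph the word pair is built on, e.g. "t"
    for both [tchey] and [otchey].
    """
    last = prev_word[-1]  # assumption: one character per final glyph
    if last in ALWAYS_STRONG:
        return True
    return last == "d" and pair_glyph in D_STRONG_BEFORE


def strong_share(lines, target, pair_glyph):
    """Count tokens of target that follow a word, and how many follow a strong ending.

    lines is a list of manuscript lines, each a list of words.
    Line-initial tokens have no preceding word on the line and are skipped.
    Returns (strong_count, total_count).
    """
    strong = 0
    total = 0
    for words in lines:
        # zip pairs each word with the one before it, so index 0 is never a target
        for prev, word in zip(words, words[1:]):
            if word == target:
                total += 1
                if is_strong_ending(prev, pair_glyph):
                    strong += 1
    return strong, total
```

For a pair such as [tchey] and [otchey], one would call `strong_share(lines, "tchey", "t")` and `strong_share(lines, "otchey", "t")` and compare the two ratios. Passing the pair's glyph explicitly keeps the [d] rule the same for both members of a pair.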
Here are the results:
[Table: for each word, Tokens After Strong and After Strong %age]
The strong percentages for the words not beginning with [o] range from 4% to 24%, and for the words beginning with [o] from 33% to 67%. These are quite wide spreads, but the two ranges do not overlap. Also, in each pair the word with [o] has at least double the strong percentage of the one without [o] at the start.
Some of the words occur relatively few times, which may make the statistics unreliable in places. But this is an unavoidable problem, as the total number of tokens of any word is beyond our control. Despite this, the pattern is consistent. Of course, running such statistics for more word pairs would provide greater evidence.
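One way to gauge how far a percentage based on few tokens can be trusted is to put an interval around it. The sketch below uses the Wilson score interval for a proportion. That is my choice for illustration, not something used to produce the figures in this post, and the function name `wilson_interval` and the counts in the usage note are hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion.

    successes: tokens of a word that follow a strong ending
    n:         all counted tokens of that word
    z:         normal quantile; 1.96 gives roughly a 95% interval
    Returns (lower, upper) as fractions between 0 and 1.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    z2 = z * z
    denom = 1 + z2 / n
    centre = (p + z2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z2 / (4 * n * n)) / denom
    return centre - half_width, centre + half_width
```

With made-up counts, `wilson_interval(5, 10)` and `wilson_interval(50, 100)` describe the same 50% share, but the first interval is much wider. If the intervals for the two members of a pair do not overlap, the difference is less likely to be an accident of small numbers.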
I hope that it is, however, enough for us to consider the hypothesis that in some instances a word–start [o] is used to break up a strong–strong sequence. Many instances of words beginning [o] obviously don’t do this but that needn’t worry us. A word such as [okedy] can be a word in its own right as well as a version of [kedy].
This raises the question of which of the two versions is the original. There are two pieces of evidence, though they contradict one another. The first is that words beginning with [o] are more common as labels than in the text, and it is in labels that we ought to find the lowest level of influence from surrounding words. The other is that the first glyph of Voynich words carries less information than in a natural language, suggesting that it is less integral to the word.
No doubt the ideas of this post will be controversial, so thoughts are very welcome.