In my last post I spoke about my Transformation Theory, and how I believe that the shape of words may be influenced by their surroundings. I speculated that, in light of First–Last Combinations, certain characters may be used to break up unwanted combinations. My example was that, in a phrase such as [dar kedy], [d k] are both ‘strong’ characters and a ‘weak’ character is inserted between them. One such character could be [o], which indeed comes at the beginning of many words.
In this post I would like to look further into this suggestion. We already know that certain word–end characters prefer to match up with certain word–start characters, but these statistics are very general. Though we can say that [r o] is more common than [r t], we can’t be sure that these two facts are related by a transformation. It could simply be that the phrase [or oraiin] is really common and [or tchedy] isn’t. We need to compare them to [or raiin] and [or otchedy] to get a better insight.
I took six pairs of words, one beginning with a strong character (in this case two [t], two [k], one [r], and one [l]) and the same word with an [o] added to the beginning. (I must be clear that by ‘same word’ I simply mean the same string of characters and I make no claim as to actual relatedness here.) So, for example, one pair was [tchey] and [otchey].
I then went through the manuscript and recorded the word which comes before each instance of these words, dismissing instances which had no word before them and those at the beginning of a line (we know that word statistics are different here). I noted the last character of the preceding word and counted it as strong if it was [n, r, s]. I also counted [d] as strong when the following word came from one of the [k, t] pairs.
So, for instance, [ar, otain, cheos] would always be strong, and [otey, sho, qokal] always weak. A word such as [qoked] would be strong for the [t, k] word pairs, but weak for the [l, r] pairs.
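The counting rule above can be sketched in a few lines of Python. This is only an illustration of the rule as I have described it, not the script behind the table: the function names `is_strong` and `count_strong` are made up for this sketch, and it assumes a transcription in which each line is already split into words and the last letter of a word is its last character (as in an EVA-style transcription).

```python
# Last characters that always count as strong.
ALWAYS_STRONG = {"n", "r", "s"}
# [d] counts as strong only when the pair is built on one of these characters.
D_STRONG_FOR = {"k", "t"}


def is_strong(prev_word, pair_start):
    """Return True if prev_word ends in a strong character.

    pair_start is the strong character the word pair is built on,
    e.g. "t" for both [tchey] and [otchey].
    """
    if not prev_word:
        return False
    last = prev_word[-1]
    if last in ALWAYS_STRONG:
        return True
    return last == "d" and pair_start in D_STRONG_FOR


def count_strong(lines, target, pair_start):
    """Count instances of target and how many follow a strong ending.

    lines is a list of lines, each a list of words. Pairing each word
    with its predecessor on the same line automatically skips
    line-initial instances, which have no word before them.
    """
    total = strong = 0
    for words in lines:
        for prev, word in zip(words, words[1:]):
            if word == target:
                total += 1
                if is_strong(prev, pair_start):
                    strong += 1
    return total, strong


# The [qoked] example from above: strong for a [t] pair, weak for an [l] pair.
print(is_strong("qoked", "t"), is_strong("qoked", "l"))  # True False
```

The strong percentage for a word is then simply `100 * strong / total`, provided `total` is not zero.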
Here are the results:
| Word | Total | Strong No. | Strong % |
The strong percentages for words not beginning [o] range from 4% to 24%, for the words beginning [o] from 33% to 67%. These are quite wide spreads, but the two ranges do not overlap. Also, in each pair the word with [o] has at least double the strong percentage of the one without [o] at the start.
Some of the words occur only a few times, which may make the statistics unreliable in places. This is unavoidable, as I cannot control how often any given word occurs in the manuscript. Despite this, the pattern is consistent. Of course, running such statistics for more word pairs would provide greater evidence.
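One way to see how much a small count weakens a percentage is to put an interval around it. The sketch below uses the Wilson score interval, which is a standard choice for proportions from small samples. I did not use it for the figures in this post, and the function name `wilson_interval` and the default of roughly 95% confidence (`z=1.96`) are my own choices for the illustration.

```python
import math


def wilson_interval(strong, total, z=1.96):
    """Wilson score interval for the proportion strong / total.

    Returns (low, high) as fractions between 0 and 1.
    z=1.96 corresponds to roughly 95% confidence.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = strong / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half
```

For a word that occurs only a handful of times the interval will be wide, so two words in a pair could have quite different percentages and still have overlapping intervals. That is the sense in which more word pairs, or more frequent words, would give stronger evidence.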
I hope that it is, however, enough for us to consider the hypothesis that in some instances a word–start [o] is used to break up a strong–strong sequence. Many instances of words beginning [o] obviously don’t do this but that needn’t worry us. A word such as [okedy] can be a word in its own right as well as a version of [kedy].
Of course, this raises the question of which of the two versions is original. There are two pieces of evidence, but they contradict one another. The first is that words beginning [o] are more common as labels than in the text, and it is in labels that we ought to find the least influence from surrounding words. The other is that the first character of Voynich words contains less information than in a natural language, suggesting that it is less integral to the word.
No doubt the ideas of this post will be controversial, so thoughts are very welcome.