Initial [o] Transformation

In my last post I spoke about my Transformation Theory, and how I believe that the shape of words may be influenced by their surroundings. I speculated that, in light of First–Last Combinations, certain characters may be used to break up unwanted combinations. My example was that, in a phrase such as [dar kedy], [d k] are both ‘strong’ characters and a ‘weak’ character is inserted between them. One such character could be [o], which indeed comes at the beginning of many words.

In this post I would like to look further into this suggestion. We already know that certain word–end characters prefer to match up with certain word–start characters, but these statistics are very general. Though we can say that [r o] is more common than [r t], we can’t be sure that these two facts are related by a transformation. It could simply be that the phrase [or oraiin] is really common and [or tchedy] isn’t. We need to compare them to [or raiin] and [or otchedy] to get a better insight.

I took six pairs of words, each made up of a word beginning with a strong character (in this case two [t], two [k], one [r], and one [l]) and the same word with an [o] added to the beginning. (I must be clear that by ‘same word’ I simply mean the same string of characters and I make no claim as to actual relatedness here.) So, for example, one pair was [tchey] and [otchey].

I then went through the manuscript and recorded the word which comes before each instance of these words, dismissing those which had no word before them and those at the beginning of a line (we know that word statistics are different here). I noted the last character of the preceding word and counted it as strong if it was [n, r, s]. I also counted [d] as strong for the word pairs beginning [k, t].

So, for instance, [ar, otain, cheos] would always be strong, and [otey, sho, qokal] always weak. A word such as [qoked] would be strong for the [t, k] word pairs, but weak for the [l, r] pairs.
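The counting rule above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used for the post: the function names, the choice to treat each transliterated letter as one character, and the toy input lines are all assumptions made for the example.

```python
# Word-final characters that always count as strong.
STRONG_ENDINGS = {"n", "r", "s"}
# [d] counts as strong only for the pairs built on [k] or [t].
D_SENSITIVE_INITIALS = {"k", "t"}

def is_strong_context(previous_word, pair_initial):
    """True if the last character of previous_word counts as strong for a
    pair whose strong initial character is pair_initial (t, k, l or r)."""
    last = previous_word[-1]
    if last in STRONG_ENDINGS:
        return True
    return last == "d" and pair_initial in D_SENSITIVE_INITIALS

def count_contexts(lines, target, pair_initial):
    """Return (total, strong) for target over lines, where each line is a
    list of words. Line-initial instances are skipped because they have
    no preceding word on the same line."""
    total = strong = 0
    for words in lines:
        for prev, cur in zip(words, words[1:]):
            if cur == target:
                total += 1
                strong += is_strong_context(prev, pair_initial)
    return total, strong

# Invented toy lines (not manuscript text), just to show the call.
toy_lines = [
    ["dar", "otchey", "qokal", "otchey"],
    ["otchey", "ar", "tchey"],
]
print(count_contexts(toy_lines, "otchey", "t"))
```

In the toy input, the line-initial [otchey] is ignored, so only the two instances in the first line are counted.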

Here are the results:

Word      Total   Strong no.   Strong %
tchey        19            2       10.5
otchey       27            9       33.3
tchedy       10            2       20.0
otchedy      30           13       43.3
kchey        17            4       23.5
okchey       26           14       53.8
kchedy       20            2       10.0
okchedy      23           12       52.2
lol          35            3        8.6
olol         15           10       66.7
raiin        73            3        4.1
oraiin       32           18       56.3

The strong percentages for words not beginning [o] range from 4% to 24%, for the words beginning [o] from 33% to 67%. These are quite wide spreads, but the two ranges do not overlap. Also, in each pair the word with [o] has at least double the strong percentage of the one without [o] at the start.
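For readers who want to check these claims, here is a minimal sketch that recomputes the percentages from the totals and strong counts in the results table. The variable names are my own, and the pairing simply assumes the rows alternate between the word without [o] and the word with [o], as they do in the table.

```python
# (word, total, strong count), copied from the results table.
RESULTS = [
    ("tchey", 19, 2), ("otchey", 27, 9),
    ("tchedy", 10, 2), ("otchedy", 30, 13),
    ("kchey", 17, 4), ("okchey", 26, 14),
    ("kchedy", 20, 2), ("okchedy", 23, 12),
    ("lol", 35, 3), ("olol", 15, 10),
    ("raiin", 73, 3), ("oraiin", 32, 18),
]

def strong_pct(total, strong):
    """Percentage of instances preceded by a strong character."""
    return 100 * strong / total

# Rows alternate: word without [o], then the same word with [o].
pairs = list(zip(RESULTS[0::2], RESULTS[1::2]))
plain = [strong_pct(t, s) for (_, t, s), _ in pairs]
with_o = [strong_pct(t, s) for _, (_, t, s) in pairs]

print(f"without [o]: {min(plain):.1f}% to {max(plain):.1f}%")
print(f"with [o]:    {min(with_o):.1f}% to {max(with_o):.1f}%")

# The two claims made in the text.
ranges_disjoint = max(plain) < min(with_o)
at_least_double = all(o >= 2 * p for p, o in zip(plain, with_o))
print(ranges_disjoint, at_least_double)
```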

Some of the words occur only a few times, which may make the statistics unreliable in places. This is unavoidable, as we cannot control how often any word occurs in the manuscript. Despite this, the pattern is consistent across all six pairs. Of course, running such statistics for more word pairs would provide greater evidence.
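One way to gauge how much the small counts matter is a one-sided Fisher's exact test on each pair. This was not part of the analysis in this post; the sketch below is an optional check, written with only the standard library, and the function name and argument order are my own.

```python
from math import comb

def fisher_one_sided(strong_o, total_o, strong_plain, total_plain):
    """Probability of seeing at least strong_o strong contexts on the [o]
    word if the strong contexts were spread at random over both words
    (hypergeometric upper tail)."""
    n_all = total_o + total_plain        # all instances of both words
    k_strong = strong_o + strong_plain   # all strong contexts
    upper = min(k_strong, total_o)
    tail = sum(
        comb(k_strong, x) * comb(n_all - k_strong, total_o - x)
        for x in range(strong_o, upper + 1)
    )
    return tail / comb(n_all, total_o)

# Counts for the [raiin] / [oraiin] pair, taken from the results table.
p_raiin = fisher_one_sided(18, 32, 3, 73)
print(p_raiin)
```

A small value means the difference within a pair is unlikely to be down to chance alone. It says nothing about why the difference exists.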

I hope that it is, however, enough for us to consider the hypothesis that in some instances a word–start [o] is used to break up a strong–strong sequence. Many instances of words beginning [o] obviously don’t do this, but that needn’t worry us. A word such as [okedy] can be a word in its own right as well as a version of [kedy].

Of course, this raises the question of which of the two versions is original. There are two pieces of evidence, but they contradict one another. The first is that words beginning [o] are more common as labels than in the text, and it is in labels that we ought to find the lowest level of influence from surrounding words (obviously). The other is that the first character of Voynich words contains less information than in a natural language, suggesting that it is less integral to the word.

No doubt the ideas of this post will be controversial, so thoughts are very welcome.


2 thoughts on “Initial [o] Transformation”

  1. Hello Emma, thank you for this interesting discussion! The subject of labelese is fascinating! Labels can be expected not to be influenced by the surrounding text, yet they might not always be “unmodified” words. In particular, the frequent o- prefix has often been considered as a possible article. As a parallel, the names of the mansions of the moon (which are derived from Arabic) incorporate the al- definite article, with the result that most of them start with that prefix also in Latin lists (D’Imperio provided a couple of versions). The idea has also been discussed by Stephen Bax.

    I guess that one of the many tricky aspects might be that different kinds of variation might happen at the same time (or on the same word at different times):
    * agglutination (like the article example)
    * accidental / arbitrary scribal variation (which was also discussed by Stephen, and seems to happen frequently in medieval Latin manuscripts)
    * morphological variation
    * euphonic “adaptation” (of which you discussed examples in your previous post)

    Labels should be exempt at least from the last two, which in principle makes them considerably more “stable” than the rest of the text. If o- is an article, what kinds of euphonic adaptation can be applied to it in non-label text? Can an article just be dropped if it doesn’t sound “well”? I guess that what happens in known languages might provide an idea of the most likely options…


  2. One of the interesting things about ‘labelese’ is that I can’t find a really good definition of it: neither exactly which words it applies to nor a thorough description of its properties. It simply seems to be a name for something vaguely understood and agreed to exist.

    I think the idea that [o] could be representative of an article is interesting. However, if that hypothesis derives from astronomical names then there’s no reason for it to apply to the rest of the text. The Arabic articles would be simply accepted as part of the word and not understood as articles.


