Linefirst Words

In the 1970s Prescott Currier introduced a number of key observations regarding the Voynich text, one of which was the “line as a functional unit” (LAAFU for short). The idea is that the text responds to its physical presentation (the actual writing of characters in the lines of a book) by changing its characteristics. Several LAAFU effects have already been noted by researchers, in particular the position of <m> as the last letter of line-end words, and the grouping of one-legged gallows <f, p> on the first line of a paragraph (known as Neal lines).

I wish here to explore a further aspect of LAAFU: linefirst words, particularly the first letter of the first word of a line. For my study I have taken the section of text known as “Stars” or “Recipes”, which stretches from 103r to 116r, the last full page of the manuscript. I chose this section because it is a fairly homogeneous stretch of writing which shows strong tendencies for particular linefirst characters, as will shortly be seen.

I used the Takahashi transcription, and discarded a few words which had unreadable initial letters. The whole sample thus had 9,546 tokens (that is, individual instances of words), of which 933 were linefirst (9.8% of all tokens). Straight away it was possible to see that the first letter of linefirst words had different statistics from the text as a whole. The graph below shows what percentage of linefirst words begin with a given letter, compared to all words.

[Figure: percentage of tokens beginning with each letter, linefirst words compared with all words]

As you can see, the most common initial letters in linefirst words are often uncommon as initial letters in the text as a whole. Of the six most common initial letters for linefirst words, only <o> is also common in the text as a whole. The others <y, d, p, s, t> are more—typically much more—common linefirst than in the whole of the text. The rest of this article will be looking at these six letters in more detail to see what further patterns can be found.
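For readers who want to reproduce this kind of comparison, here is a minimal sketch of how the two percentages might be computed. It assumes the transcription has already been reduced to one string per manuscript line with words separated by spaces, which is not how the Takahashi file is distributed. The function name `initial_letter_shares` and the toy lines are illustrative only, not real data.

```python
from collections import Counter

def initial_letter_shares(lines):
    """Return {letter: (pct of linefirst tokens, pct of all tokens)}.

    Each element of `lines` is assumed to be one manuscript line with
    words separated by whitespace.
    """
    linefirst = Counter()
    overall = Counter()
    for line in lines:
        words = line.split()
        if not words:
            continue
        linefirst[words[0][0]] += 1      # first letter of the first word
        for word in words:
            overall[word[0]] += 1        # first letter of every word
    n_first = sum(linefirst.values())
    n_all = sum(overall.values())
    return {
        letter: (100 * linefirst[letter] / n_first,
                 100 * overall[letter] / n_all)
        for letter in overall
    }

# Hypothetical toy lines, not taken from the transcription.
toy_lines = [
    "ychedy qokeedy otedy",
    "dshedy okaiin chedy",
    "tedy qokain shedy",
]
shares = initial_letter_shares(toy_lines)
for letter, (pct_first, pct_all) in sorted(shares.items()):
    print(f"<{letter}> linefirst {pct_first:.1f}%  all {pct_all:.1f}%")
```

Note that this counts single characters, so <ch> and <sh> are filed under <c> and <s>; a fuller version would need to decide how to treat such glyph pairs.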

<s> — Although one of the strongest linefirst letters (57.5% of all tokens beginning with <s> are linefirst), the kinds of words which come at the beginning of lines are mostly the same as those found in the whole text. There does not seem to be a further pattern except its heavy initial occurrence.

<t> — For multiple reasons this is the most curious linefirst letter. It occurs 210 times at the beginning of a word in the text as a whole, but 82 of these are linefirst, or 39%. Yet <k> appears 275 times at the beginning of a word in the whole text, but only 21 times linefirst, which at 7.6% linefirst is fairly near to the 9.8% of the text as a whole. So the two common gallows letters do not obey the same pattern.

Further, there is an interesting pattern in the gallows letters that follow <o> in words beginning with that letter. The letter <o> is the most common wordfirst letter in the text as a whole at 23.8% of all words, but falls to about half that in linefirst position, at only 12.8%. Words beginning <ok> and <ot> are roughly equally common in the text as a whole, with 665 tokens and 712 tokens respectively. But linefirst <ok> has 32 tokens (4.8% of all <ok> words) and <ot> has 17 tokens (2.4% of all <ot> words). The ratio for <ok> is what we would expect: 9.8% of all tokens are linefirst, and words beginning with <o> are about half as common in that position as in the whole text. Yet linefirst <ot> is half as common again. The number of tokens is small, but the difference may be significant.
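One way to judge whether the gap between <ok> and <ot> could be chance is a two-proportion z-test on the counts given above. This is a rough sketch rather than part of the original analysis: it treats every token as an independent trial, which is a strong assumption for running text, and the function name `two_proportion_z` is my own.

```python
import math

def two_proportion_z(k1, n1, k2, n2):
    """Pooled two-proportion z-test.

    k1 of n1 and k2 of n2 are the linefirst counts and total counts for
    the two word groups. Returns (z, two-sided p-value) using the normal
    approximation.
    """
    p1 = k1 / n1
    p2 = k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Counts from the text: 32 of 665 <ok> words and 17 of 712 <ot> words
# are linefirst.
z, p_value = two_proportion_z(32, 665, 17, 712)
print(f"z = {z:.2f}, two-sided p = {p_value:.4f}")
```

Because words in a line are not independent draws, the p-value should be read as a guide only.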

<p> — This is the strongest linefirst letter (68.9% of all its word-initial tokens are linefirst), but this may result from the existence of Grove words. The Stars section has many paragraphs and so many Grove words at their beginnings, which means that the statistics for <p> are likely to be somewhat skewed.

However, as with <ot>, words beginning <op> show an interesting pattern, and an even stronger one. There are no linefirst words beginning <op>, yet there are 133 in the text as a whole. An interaction between Neal lines and linefirst effects could be the cause of Grove words, but this is unlikely.

<y> — This is the most common linefirst letter with 163 tokens, though almost the same number occur in the rest of the text. Looking in more detail, words beginning <ych> account for 43% of its linefirst tokens, and <ysh> for another 20%. These words are also strongly linefirst, with 88% of <ych> and 97% of <ysh> tokens being in that position.

Considering that <y> is the most common letter at the beginning of linefirst words and <o> is the most common for the text as a whole, these two letters once again appear to be in the same class (or could even be vowels). Yet <o> is still common linefirst, so not too much can be made of this. Also, although <a> is likely in some way equivalent to <y> there does not seem to be a strong relationship between the linefirst occurrence of <y> and those of <a> in the rest of the text.

<d> — This letter shows a similar, though weaker, pattern to <y>, with 150 out of 418 tokens being linefirst. Although the proportions of linefirst <d> tokens that begin <dch> (21%) and <dsh> (16%) are much lower than the equivalents for <y>, these words are likewise strongly linefirst: 80% of words beginning <dch> and 96% of those beginning <dsh> are linefirst.
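Two different percentages are quoted for each of these prefixes, and they are easy to confuse: the share of a letter's linefirst tokens that carry the prefix, and the share of the prefix's tokens that are linefirst. The sketch below computes both. It again assumes one space-separated string per manuscript line, and `prefix_stats` and the toy lines are hypothetical.

```python
def prefix_stats(lines, letter, prefix):
    """Return (share_of_letter, share_linefirst) as percentages.

    share_of_letter: of linefirst words starting with `letter`, the
        percentage that start with `prefix`.
    share_linefirst: of all words starting with `prefix`, the percentage
        that are linefirst.
    Either value is None when its denominator is zero.
    """
    first_with_letter = 0
    first_with_prefix = 0
    all_with_prefix = 0
    for line in lines:
        for position, word in enumerate(line.split()):
            is_first = position == 0
            if word.startswith(prefix):
                all_with_prefix += 1
                if is_first:
                    first_with_prefix += 1
            if is_first and word.startswith(letter):
                first_with_letter += 1
    share_of_letter = (100 * first_with_prefix / first_with_letter
                       if first_with_letter else None)
    share_linefirst = (100 * first_with_prefix / all_with_prefix
                       if all_with_prefix else None)
    return share_of_letter, share_linefirst

# Hypothetical toy lines, not taken from the transcription.
toy_lines = [
    "ychedy okaiin ychey",
    "ysheey qokedy",
    "yteedy chedy",
    "ychol daiin",
]
share_of_y, share_first = prefix_stats(toy_lines, "y", "ych")
print(share_of_y, share_first)
```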


There are clear patterns to which letters begin linefirst words, with a strong difference between linefirst words and the text as a whole. This alone strengthens Currier’s LAAFU idea and adds another aspect to it. A cursory inspection of the whole Voynich manuscript suggests that this linefirst effect is found throughout Currier B pages, though it may be much weaker or non-existent in Currier A. Further research is needed on this.

The linefirst effects seem to fall into two parts: one for <t, p> and another for <y, d>. The letters <t, p> could be words that might otherwise begin <ot, op> in the rest of the text. If there is a process which is stripping the initial <o> from linefirst words, then this would also explain the much lower occurrence of the letter in that position.

The letters <y, d> seem to begin words that might otherwise begin <ch, sh> in the rest of the text. Once again, there could be a process which adds <y, d> to words beginning <ch, sh> when they are in a linefirst position. Unlike <t, p>, which are both gallows letters, <y, d> have not up to now been suspected of being strongly related.
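Both suggested processes can be probed in the same way: undo the supposed change on each linefirst word and see whether the resulting word is attested away from the start of a line. The sketch below is one possible check, not something carried out for this article. The helper names are hypothetical and the input is assumed to be one space-separated string per manuscript line.

```python
from collections import Counter

def transformation_support(lines, undo):
    """Count linefirst words whose 'undone' form occurs elsewhere.

    `undo` maps a linefirst word to its hypothesised original form, or
    returns None if the word is not a candidate for the process.
    Returns (candidates, matched).
    """
    elsewhere = Counter()
    firsts = []
    for line in lines:
        words = line.split()
        if not words:
            continue
        firsts.append(words[0])
        elsewhere.update(words[1:])      # every non-linefirst token
    candidates = matched = 0
    for word in firsts:
        base = undo(word)
        if base is None:
            continue
        candidates += 1
        if base in elsewhere:
            matched += 1
    return candidates, matched

def undo_o_stripping(word):
    # Hypothesis: linefirst <t...>, <p...> were <ot...>, <op...>.
    return "o" + word if word[0] in "tp" else None

def undo_yd_adding(word):
    # Hypothesis: linefirst <ych, ysh, dch, dsh> were <ch...>, <sh...>.
    if word[0] in "yd" and word[1:].startswith(("ch", "sh")):
        return word[1:]
    return None

# Hypothetical toy lines, not taken from the transcription.
toy_lines = [
    "tedy okaiin otedy",
    "ychedy chedy daiin",
    "pchedy qokain",
    "dshol shol otedy",
]
o_result = transformation_support(toy_lines, undo_o_stripping)
yd_result = transformation_support(toy_lines, undo_yd_adding)
print(o_result, yd_result)
```

A high match rate would only be suggestive. It would need comparing against a baseline, such as the match rate when an arbitrary letter is added or removed, before it could count as support for either process.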

Future work needs to look at the most common wordfirst letters in the whole of the text and try to characterize what might be happening to make them less common linefirst. Although <o, ch, sh> may be subject to some kind of transformation which lowers their occurrence, <q, l, a> have been neither touched upon nor explained here.

