Linestart Words

In the 1970s Prescott Currier introduced a number of key observations regarding the Voynich text, one of which he called the “line as a functional unit”. This idea was that patterns in the text respond to its physical exposition—the actual writing of glyphs in the lines of a manuscript. Several line patterns have already been noted by researchers: the position of [m] as the last glyph in line end words, and also the grouping of one–legged gallows [f, p] on the first line of a paragraph (known as Tiltman lines).

I wish here to explore a further line pattern: linestart words, specifically the first glyph of the first word of a line. I have taken for my study the section of text known as “Stars” or “Recipes” which fills most of Quire 20 (103r to 116r). I chose this section because it is a fairly homogeneous stretch of writing which shows strong tendencies for particular linestart glyphs, as will shortly be seen.

I used the Takahashi transcription, and discarded a few words which had unreadable initial glyphs. The whole sample thus had 9,546 tokens (individual instances of words), of which 933 were linestart (9.8% of all tokens).
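As a rough illustration of this counting step (not the script actually used for the study), the sketch below tallies all tokens and linestart tokens. It assumes each transcribed line is a string with words separated by periods, and that an unreadable glyph is marked with `*` or `?`; both conventions, the function names, and the sample lines are assumptions made for the example.

```python
def split_line(line):
    """Split one transcribed line into words, assuming '.' separates words."""
    return [w for w in line.strip().split(".") if w]


def count_tokens(lines, unreadable="*?"):
    """Return (total_tokens, linestart_tokens).

    Words whose first character is an unreadable-glyph marker are skipped,
    since their initial glyph cannot be classified.
    """
    total = 0
    linestart = 0
    for line in lines:
        for position, word in enumerate(split_line(line)):
            if word[0] in unreadable:
                continue  # discard: initial glyph is unreadable
            total += 1
            if position == 0:
                linestart += 1
    return total, linestart


# Toy lines for illustration only; they are not quoted from the manuscript.
sample = ["ychedy.qokeey.okain", "*aiin.chedy", "sain.shedy.otedy.dal"]
total, linestart = count_tokens(sample)
print(total, linestart, f"{100 * linestart / total:.1f}%")
```

Note that when the first word of a line is discarded, the second word is not promoted to linestart; that is a design choice of this sketch, not something stated in the study.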

It is immediately possible to see that the statistics for the first glyph of linestart words are different from those for the text as a whole. The graph below shows what percentage of linestart words begin with a given glyph, compared to all words.

[Figure: percentage of tokens beginning with each glyph, linestart words compared with all words]

The most common initial glyphs of linestart words are often uncommon as initial glyphs in the text as a whole. Of the six most common initial glyphs for linestart words, only [o] is also common in the text as a whole. The others [y, d, p, s, t] are more—typically much more—common linestart than in the whole of the text. The rest of this article will be looking at these six glyphs in more detail to understand their individual patterns.
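The comparison behind such a graph can be sketched as follows. This is a minimal example under stated assumptions: lines are already split into lists of words, and [ch] and [sh] are treated as single initial glyphs even though they are written with two characters in the transcription. The function names and the toy words are hypothetical.

```python
from collections import Counter

DIGRAPHS = ("ch", "sh")  # assumption: counted as single glyphs


def initial_glyph(word):
    """Return the first glyph of a word, treating 'ch' and 'sh' as one glyph."""
    for digraph in DIGRAPHS:
        if word.startswith(digraph):
            return digraph
    return word[0]


def initial_glyph_shares(lines):
    """Map each initial glyph to (percent of linestart words, percent of all words)."""
    all_counts = Counter()
    start_counts = Counter()
    for words in lines:
        for position, word in enumerate(words):
            glyph = initial_glyph(word)
            all_counts[glyph] += 1
            if position == 0:
                start_counts[glyph] += 1
    n_all = sum(all_counts.values())
    n_start = sum(start_counts.values())
    return {
        glyph: (100 * start_counts[glyph] / n_start, 100 * count / n_all)
        for glyph, count in all_counts.items()
    }


# Toy lines for illustration only; they are not quoted from the manuscript.
toy_lines = [
    ["ychedy", "qokeey", "okain"],
    ["sain", "shedy", "otedy", "dal"],
    ["dchedy", "okal"],
    ["okeey", "chedy"],
]
shares = initial_glyph_shares(toy_lines)
for glyph, (start_pct, all_pct) in sorted(shares.items()):
    print(f"[{glyph}] linestart {start_pct:.1f}%  all {all_pct:.1f}%")
```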

[s] — Although one of the strongest linestart glyphs (57.5% of all tokens beginning with [s] are linestart), the kinds of words which come at the beginning of lines are mostly the same as those found in the whole text. There does not seem to be a further pattern except its heavy initial occurrence.

[t] — For multiple reasons this is the most curious linestart glyph. It occurs 210 times at the beginning of a word in the text as a whole, but 82 of these are linestart (39%). Yet [k] appears 275 times at the beginning of a word in the whole text, but only 21 times linestart, which at 7.6% linestart is near to the 9.8% of the text as a whole. So the two common gallows glyphs do not obey the same pattern.
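The percentages quoted for [t] and [k] can be checked directly from the counts given above. The short sketch below only redoes that arithmetic; the helper name is made up for the example.

```python
def linestart_rate(linestart_count, total_count):
    """Percentage of a glyph's word-initial tokens that fall at linestart."""
    return 100 * linestart_count / total_count


# Counts as reported in the text: (linestart tokens, all word-initial tokens).
counts = {"t": (82, 210), "k": (21, 275)}
baseline = linestart_rate(933, 9546)  # linestart share of the whole sample

for glyph, (linestart, total) in counts.items():
    rate = linestart_rate(linestart, total)
    print(f"[{glyph}] {rate:.1f}% linestart, baseline {baseline:.1f}%")
```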

There is an interesting connection between the gallows glyphs and words beginning with [o]. The glyph [o] is the most common wordstart glyph in the text as a whole at 23.8% of all words, but its share roughly halves in linestart position, to only 12.8%. Words beginning [ok] and [ot] are roughly equal in the text as a whole, with 665 tokens and 712 tokens respectively. But linestart [ok] has 32 tokens (4.8%) and [ot] has 17 tokens (2.4%). The rate for [ok] is what we would expect: 9.8% of all tokens are linestart, and words beginning with [o] are about half as common in that position, which gives a rate of roughly 5%. Yet linestart [ot] is half as common again. The number of tokens involved is small, but it may hint at an underlying pattern.
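The expectation for [ok] and [ot] can be worked through with the figures quoted above. The sketch assumes that words beginning with [o] are reduced at linestart by the same factor whatever glyph follows; that is a simplification made for the example, and the function name is hypothetical.

```python
baseline = 933 / 9546        # share of all tokens that are linestart
o_ratio = 12.8 / 23.8        # share of [o] at linestart relative to the whole text
expected_rate = baseline * o_ratio  # expected linestart rate for any [o] word


def expected_linestart(total_tokens):
    """Expected number of linestart tokens for an [o]-initial prefix."""
    return expected_rate * total_tokens


# Counts as reported in the text: (observed linestart tokens, all tokens).
observed = {"ok": (32, 665), "ot": (17, 712)}
for prefix, (linestart, total) in observed.items():
    print(
        f"[{prefix}] observed {linestart}, "
        f"expected about {expected_linestart(total):.0f}"
    )
```

On these figures the observed count for [ok] sits close to its expected value, while the count for [ot] is roughly half of its expected value.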

[p] — This is the strongest linestart glyph (68.9% of all its word–initial tokens), but may result from Grove words. The Stars section has many paragraphs and so many Grove words at their beginnings. This inevitably means that the statistics for [p] may be somewhat skewed.

However, as with [ot], words beginning [op] exhibit an interesting pattern, though an even stronger one. No linestart words begin with [op], yet there are 133 such words in the text as a whole. The potential interaction between Tiltman lines and linestart effects could be the cause of Grove words, but this is unlikely.

[y] — This is the most common linestart glyph with 163 tokens, though almost the same number occur in the rest of the text. But looking in more detail, words beginning [ych] account for 43% of its linestart tokens, and [ysh] for another 20%. These are also strongly linestart, with 88% of [ych] and 97% of [ysh] being in that position.

Considering that [y] is the most common glyph at the beginning of linestart words and [o] is the most common for the text as a whole, these two glyphs once again appear to be in the same class (or could even be vowels). Also, although [a] is likely in some way equivalent to [y], there does not seem to be a strong relationship between the linestart occurrence of [y] and that of [a] in the rest of the text.

[d] — This glyph shows a similar, though weaker, pattern to [y], with 150 out of 418 tokens being linestart. Although linestart [dch] (21%) and [dsh] (16%) make up a much smaller part of the total than their counterparts do for [y], they are likewise strongly linestart. Eighty percent of words beginning [dch] and 96% of [dsh] are linestart.
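Two different percentages are quoted for prefixes such as [ych] or [dsh]: the share of a glyph's linestart tokens that begin with the prefix, and the share of all tokens with that prefix that are linestart. A minimal sketch of both, assuming lines are lists of words and using a hypothetical function name and toy words:

```python
def prefix_stats(lines, glyph, prefix):
    """Return (share, rate) as percentages.

    share: of the linestart tokens beginning with `glyph`,
           how many begin with `prefix`
    rate:  of all tokens beginning with `prefix`, how many are linestart
    `prefix` is expected to start with `glyph`.
    """
    start_glyph = start_prefix = all_prefix = 0
    for words in lines:
        for position, word in enumerate(words):
            if word.startswith(prefix):
                all_prefix += 1
                if position == 0:
                    start_prefix += 1
            if position == 0 and word.startswith(glyph):
                start_glyph += 1
    share = 100 * start_prefix / start_glyph if start_glyph else 0.0
    rate = 100 * start_prefix / all_prefix if all_prefix else 0.0
    return share, rate


# Toy lines for illustration only; they are not quoted from the manuscript.
toy_lines = [
    ["ychedy", "okain"],
    ["ykeey", "ychey"],
    ["ychol", "dal"],
    ["dshedy", "chedy"],
]
share, rate = prefix_stats(toy_lines, "y", "ych")
print(f"[ych] is {share:.0f}% of linestart [y]; {rate:.0f}% of [ych] is linestart")
```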


There are clear patterns to which glyphs begin linestart words, with a strong difference between linestart and the text as a whole. This alone strengthens Currier’s line pattern observations and adds another aspect. A cursory inspection of the whole Voynich manuscript suggests that this linestart pattern is found throughout Currier B pages, though it may be much weaker or non–existent in Currier A. Further research is needed on this.

The linestart patterns seem to fall into two parts: one for [t, p] and another for [y, d]. The glyphs [t, p] could result from words that might otherwise begin [ot, op] in the rest of the text. If there is a process which is stripping the initial [o] from linestart words, then this would also explain the much lower occurrence of the glyph in that position.

The glyphs [y, d] seem to begin words that might otherwise begin [ch, sh] in the rest of the text. Once again, there could be a process which adds [y, d] to words beginning [ch, sh] when they are in a linestart position. Unlike [t, p], which are both gallows glyphs, [y, d] have not up to now been suspected of being strongly related.
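One way such a hypothesis might be probed is to reverse the two supposed changes on linestart words and see how often the result is a word attested elsewhere in a line. The sketch below is only an illustration of that idea, not a method used in this study; the function names, the exact prefixes checked, and the toy words are all assumptions.

```python
def undo_linestart(word):
    """Reverse the two hypothesised changes; return None if neither applies."""
    if word.startswith(("t", "p")):
        return "o" + word      # restore a possibly stripped initial [o]
    if word.startswith(("ych", "ysh", "dch", "dsh")):
        return word[1:]        # remove a possibly added [y] or [d]
    return None


def attested_fraction(lines):
    """Fraction of reversible linestart words whose reversed form
    also occurs in a non-linestart position somewhere in the sample."""
    elsewhere = {word for words in lines for word in words[1:]}
    candidates = [undo_linestart(words[0]) for words in lines if words]
    candidates = [c for c in candidates if c is not None]
    if not candidates:
        return 0.0
    return sum(c in elsewhere for c in candidates) / len(candidates)


# Toy lines for illustration only; they are not quoted from the manuscript.
toy_lines = [
    ["tedy", "okain", "chedy"],
    ["ychedy", "otedy"],
    ["pchor", "dal"],
    ["okal", "shedy"],
]
print(f"{attested_fraction(toy_lines):.2f}")
```

A high fraction on real data would be consistent with the hypothesis but would not prove it, since common word shapes may match by chance.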

Future work needs to look at the most common wordstart glyphs in the whole of the text and try to understand what might be happening to make them less common at the start of a line. Although [o, ch, sh] may be subject to some kinds of transformation which lower their occurrence, [q, l, a] have not been fully discussed or explained here.


4 thoughts on “Linestart Words”

  1. Hi Emma,
    just wanted to say how valuable I find this post, and your work on the linefirst phenomena in general. I’ve visited this page many times (as well as your posts on transformation theory and linefirst transformation) and found them very useful in my attempts to understand the Voynich text’s behavior. You present the information in a way that is very clear and understandable even for non-linguists like myself, and it is much appreciated.


  2. Extremely interesting. A question, though: could the linestart phenomenon simply be due to words being split across lines (as was somewhat common in old manuscripts)? For example, if you take a “weird” linestart word and combine it with the last word on the previous line, does it tend to make a common word?


    • The structure of Voynich words is quite rigid and words at the start (and end) of lines are often unusual. Combining them only makes for even more unusual words.

