# Line Position Mapping

The last two posts on this site have been very similar. Not only have the investigations sought to prove similar things using similar methods, but the result of both was failure. Neither managed to prove the hypothesis I sought to test.

Yet I’m going to do it again. Many times. Even though I know that hanging on to disproven ideas makes me a crank.

Let me explain.

Line Patterns

Prescott Currier was the first to mention that the distribution of glyphs in a line is not flat. Some glyphs seem more or less likely to appear in words at the beginning or end of a line. In Currier’s words, “the occurrence of certain symbols is governed by the position of a ‘word’ in a line.”

It is highly unusual that a text responds to a page in this way. Our experience with writing and reading tells us that we compose a text independent of the page upon which it might be presented. An acrostic is the main exception which comes to mind, with its message hidden in the first or last letters of a line. Poetry might be considered a further example, with layout often governed by meter and rhyme.

Currier’s description of the phenomenon was that, somehow, the “line is a functional entity.” He conceded that he didn’t know what the function might be or what process would create the lines as we see them. Sadly the records of Currier’s work are quite sparse given his influence, so his further thinking on this matter is unknown.

Improved access to imagery, multiple transcriptions, and the ubiquity of computers have allowed us to explore more thoroughly how glyphs are distributed in a line. We know that the distribution of glyphs is not flat in any major part of the text, that many glyphs demonstrate this kind of distribution, and that the phenomenon exists for the second position in a line, not only the first and last. The picture is more complex but still essentially the same as that painted by Currier.

The only point I wish to make is that Currier’s idea of the line as a “functional entity” is unproven. It could be correct, and given time and evidence it may be proven so, but the description is too strong for the evidence we have. The observations are correct and the judgement that the distributions are unusual is also correct. But what the distributions mean and how they were created is unknown. The words “functional entity” describe an hypothesis, not the observations.

I will refer to them by the blander, but safer and neutral, “line patterns”.

Describing the patterns

For anybody who hasn’t studied the text of the Voynich manuscript in any depth, the foregoing discussion may seem to float above the ground. No examples from the text were offered to illustrate line patterns. A few would help to show what Currier, and subsequent researchers, have seen.

• The “Stars” section of the manuscript, which fills Quire 20, has almost 1,000 words beginning with the glyph [a]. Only 3 of them occur at the start of a line, though we might expect around 100.
• Part of the Herbal section has just over 300 words ending with the glyph [m]. Over half of them occur at the end of a line.
• Words beginning with [ch] or [sh] are around 70% more frequent than expected in the second position of a line.
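The "expected" figures in these examples rest on a simple proportion: if words were placed without regard to line position, the number of matching words in a position should follow that position's share of all words. A minimal Python sketch of the arithmetic follows. The function names are hypothetical, and the totals of 10,000 words with 1,000 at the start of a line are illustrative assumptions chosen to reproduce the "around 100" of the first example. They are not counts taken from a transcription.

```python
def expected_at_position(total_matching, words_at_position, total_words):
    """Expected number of matching words in a line position if placement were flat."""
    return total_matching * words_at_position / total_words


def observed_over_expected(observed, total_matching, words_at_position, total_words):
    """Ratio of observed to expected; 1.0 would mean no line pattern."""
    expected = expected_at_position(total_matching, words_at_position, total_words)
    return observed / expected


# Illustrative figures only (assumed, not measured): about 1,000 words with
# initial [a], and one word in ten standing at the start of a line.
expected = expected_at_position(1000, 1000, 10000)
ratio = observed_over_expected(3, 1000, 1000, 10000)
print(expected, ratio)
```

A ratio well below 1 marks avoidance of the position and a ratio well above 1 marks preference for it. The "70% more frequent" of the third example would correspond to a ratio of around 1.7.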

These kinds of patterns, sometimes of the same strength but often weaker, can be found for many glyphs. Some glyphs may be more frequent at the start and end of a line. Others may be more or less frequent depending on where they appear in a word: words with a final [l] are less frequent at the end of a line, but words containing [l] occur at the same frequency in that line position. Finally, the line patterns may differ by scribe and section.

In my two previous posts I’ve tried to show line patterns by providing the number of occurrences for each word, glyph, or feature, in different line positions. I have numbered the line positions using positive and negative numbers: 1 to 5 describe distance from the start of a line (1 being closest) and -5 to -1 describe distance from the end of a line (-1 being the closest). Although I’ve seen no line patterns which extend beyond the second position from the start or end of a line, the further out positions are provided to demonstrate what the normal distribution looks like.
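The numbering scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the `depth` parameter are hypothetical, the input is assumed to be one transcribed line per string with words separated by spaces, and the toy lines are invented arrangements of EVA-style words rather than lines from the manuscript.

```python
from collections import Counter


def position_counts(lines, predicate, depth=5):
    """Count words matching `predicate` at positions 1..depth and -depth..-1."""
    counts = Counter()
    for line in lines:
        words = line.split()
        n = len(words)
        for i, word in enumerate(words):
            if not predicate(word):
                continue
            if i < depth:        # within `depth` words of the line start
                counts[i + 1] += 1
            if n - i <= depth:   # within `depth` words of the line end
                counts[i - n] += 1
    return counts


# Invented toy lines, not real transcription data.
toy_lines = [
    "shedy qokeedy daiin chedy okaiin dam",
    "okaiin chedy shedy daiin qokeedy",
    "daiin shedy dam",
]
counts = position_counts(toy_lines, lambda w: w.endswith("m"))
print(sorted(counts.items()))
```

In the three-word toy line the final word is counted both as position 3 and as position -1, because the sketch records every position a word can be said to occupy.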

(I realise that raw counts of how many times a word occurs may be misleading in some circumstances. For many lines, positions 5 and -5 may overlap. For a single-word line, 1 and -1 are the same position. And the positions furthest from the start or end may show lower counts for a word’s occurrence simply because, in shorter lines, no word exists in that position. The counts for each position should be weighted or provided as a ratio of total words in that position.)
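One way to apply that correction is to divide the matches in each position by the number of words which actually exist in that position. The sketch below does this under the same assumptions as before: hypothetical function name, space-separated words, and invented toy lines rather than real transcription data.

```python
from collections import Counter


def position_rates(lines, predicate, depth=5):
    """Share of words in each line position that match `predicate`."""
    hits = Counter()
    totals = Counter()
    for line in lines:
        words = line.split()
        n = len(words)
        for i, word in enumerate(words):
            positions = []
            if i < depth:
                positions.append(i + 1)   # distance from the line start
            if n - i <= depth:
                positions.append(i - n)   # distance from the line end
            for pos in positions:
                totals[pos] += 1          # a word exists in this position
                if predicate(word):
                    hits[pos] += 1
    return {pos: hits[pos] / totals[pos] for pos in sorted(totals)}


# Invented toy lines, not real transcription data.
toy_lines = [
    "shedy qokeedy daiin chedy okaiin dam",
    "okaiin chedy shedy daiin qokeedy",
    "daiin shedy dam",
]
rates = position_rates(toy_lines, lambda w: w.endswith("m"))
print(rates)
```

Because each rate has its own denominator, a position which only exists in longer lines is no longer penalised for the lines too short to reach it. The overlap between positive and negative positions in short lines remains and would need a separate decision, such as excluding lines below a chosen length.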

Is the lack of flatness a problem?

Nothing in nature is random. Everywhere there are processes which create the patterns we see around us. When we read a text in English we do not expect to see as many occurrences of the word “Neorxnawang” (a real word!) as the words “and” or “the”. Nor as many occurrences of the letter “x” as of “e” or “t”.

Yet the distribution of glyphs in a line of Voynich text is unusual. A process must have created it, as we can observe it, and the text is too long for the statistics we have to be missampled or skewed by chance. But there are no processes in our experience which easily explain the distribution that we see. So how do we reliably learn about this unknown process?

The bulk of the Voynich text, away from the start and end of lines, was also created by a process. That process is likewise unknown, though we have a larger sample from which to learn. The larger amount might also tempt us to assume that the middle positions of the line are ‘normal’ for Voynich and the line starts and ends ‘abnormal’. This assumption is useful to move forward with research, though it is important not to hold onto the assumption beyond its usefulness.

So we can rephrase the question of whether the lack of flatness is a problem and form a new question suitable for research: how did the process which created the starts and ends of lines differ from the process which created the rest of the text? The research then is looking for a relative difference, not an absolute. We can make observations which set out the statistical differences, and that task has been done in part by several researchers, but this does not speak to the difference in process.

To get closer to the difference in process we must know what one process (which created the bulk of the text) would have created instead. What would the starts and ends of lines look like were they created with the same process as the rest of the text? How does the output of one process translate into the other? If we say that some sections lack words with initial [a] at the start of a line, what do they have instead?
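A first, crude way to ask "what do they have instead?" is to compare the share of each initial glyph among line-start words with its share among all other words. The sketch below is only that: the function names are hypothetical, treating [ch] and [sh] as single glyphs while reading every other glyph as one character is a simplifying assumption about the transcription, and the toy lines are invented.

```python
from collections import Counter


def initial_glyph(word):
    """First glyph of a word, treating [ch] and [sh] as single glyphs (assumption)."""
    for digraph in ("ch", "sh"):
        if word.startswith(digraph):
            return digraph
    return word[0]


def initial_shares(words):
    """Share of words beginning with each glyph."""
    counts = Counter(initial_glyph(w) for w in words)
    total = sum(counts.values())
    return {glyph: c / total for glyph, c in counts.items()}


def share_shift(lines):
    """Line-start share minus elsewhere share, per initial glyph."""
    first, rest = [], []
    for line in lines:
        words = line.split()
        if words:
            first.append(words[0])
            rest.extend(words[1:])
    start, elsewhere = initial_shares(first), initial_shares(rest)
    glyphs = set(start) | set(elsewhere)
    return {g: start.get(g, 0.0) - elsewhere.get(g, 0.0) for g in glyphs}


# Invented toy lines, not real transcription data.
toy_lines = [
    "shedy aiin daiin chedy",
    "dain aiin okaiin ar",
    "sain chedy aiin daiin",
]
shift = share_shift(toy_lines)
for glyph, delta in sorted(shift.items(), key=lambda kv: kv[1]):
    print(f"{glyph:>2} {delta:+.3f}")
```

Negative values mark glyphs which are scarce at the start of a line and positive values mark the glyphs which fill the gap. A surplus is only a candidate for what stands "instead" of a deficit. The comparison says nothing by itself about whether one glyph maps to another, and it lumps line ends in with the middle of the line.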

Here I must open-handedly admit that the results of such investigations may be totally negative if the starts and ends of lines are fundamentally different. They may have different content, be unconnected with the rest of the lines, or even be empty padding. It is impossible to know. But without looking we have no chance of finding a positive answer, if it exists.

The perfect opportunity

This research proposal is what I want to call ‘line position mapping’. The name is not intended to excite or arouse curiosity. It only states what the investigations will seek to do and their place of focus in the text. The meaning of line position should be obvious from the discussion above. But the choice of the word ‘mapping’ is deliberate.

Another discovery of Prescott Currier was that sections of the text are statistically distinct. He referred to them as different languages or dialects, and labelled them A and B. This division is well-known to all but the most casual researcher of the Voynich manuscript.

Several researchers have sought to reconcile the two languages by finding correspondences between them. Language A may have many occurrences of certain glyphs or words which are much less frequent in language B. But language B may have its own most frequent glyphs and words. So researchers have suggested, in various ways, how something in A matches or maps to another thing in B.

The problem they face is that language A and language B are used by different scribes and in different sections. We don’t know how much of the language difference is dependent on the scribe, on the content, or on some other variable. We cannot hold all the variables involved fixed enough to provide solid results. Any suggestion of a correspondence may be due to other reasons.

Line patterns and line position mapping provide an opportunity which sidesteps many of these problems. As line patterns occur within a line, we know that all the words of that line were written by the same scribe on the same topic. The difference – whatever it might be – must be due to the process of composition (or encoding) and not the individual or the topic.

Any result of these investigations, however partial, is much more solid than what might be gained from attempting to map language A to language B.

What will we gain?

It is usually impossible to know exactly what knowledge a piece of research will deliver before we have the results. Even if we have a well-formed question to answer positively or negatively, the impact of the results on future research could be unforeseen.

I hope that two things will result from success in line position mapping. The first is that we should be able to make the text more regular. Words and phrases which may previously have lain hidden across a linebreak, due to the influence of a line pattern, will suddenly become clear. The text may be more amenable to research on the relationships between words and on phrase structure.

The second is that we will be able to draw relationships between glyphs and bigrams. In my previous post I sought to understand if bench gallows could be “unpacked” by looking at the statistics of words at the start of lines, where bench gallows don’t occur. The result was negative and I found nothing. But it was a possibility.

Another real possibility is explaining the nature of [m], which adheres to a strong line pattern. Or maybe elucidating the difference between [ch] and [sh], which share similar, but not exactly the same, line patterns. There is the potential to learn more about any glyph which has a line pattern. Even understanding just one line pattern will teach us something we didn’t know before.

Which is why I’ve decided to become a crank when it comes to line patterns.

Many ideas in this post have been stated before elsewhere. I thought I would bring them together to make a complete argument for the importance of this area of research. I encourage other researchers to look into line patterns and undertake their own investigations into line position mapping.