Weak Strings and [l, r]

Following a discussion with René Zandbergen on the similarity between [l] and [r], I thought it best to share some statistics I’ve had hanging around for a while. I made them when looking into another hypothesis, which I really should share as well some day. The statistics relate to words which are ‘weak strings’ followed by [l, r].

(Weak strings are my name for the very common combination of a bench [ch, sh], followed by zero to two [e], and ending with [o, y/a].)

The statistics show every possible combination of weak strings with an additional ending of [r] or [l], along with the number of tokens for each. Here are the two tables:

Words ending [r]:

chor 220 shor 97
cheor 100 sheor 51
cheeor 14 sheeor 9
char 72 shar 34
chear 51 shear 21
cheear 1 sheear 2

Words ending [l]:

chol 397 shol 187
cheol 173 sheol 114
cheeol 9 sheeol 14
chal 48 shal 15
cheal 30 sheal 21
cheeal 2 sheeal 1

The statistics are interesting because they show that the two sets of words share a common pattern. There are three possible factors by which to alter the weak string: replace [ch] with [sh], vary the number of [e] from none to two, and change [o] for [y] (here [y] is expressed as [a], as is to be expected).

A change in any one of these factors produces the same kind of frequency change, regardless of whether the word ends [r] or [l]. So, in all cases the [o] form of the word is more common than the [y] form. Likewise, in all but two cases the [ch] form is more common than the [sh] form (both exceptions are minor). Lastly, increasing the number of [e] makes the word less common, with one exception.
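The three claims above can be checked mechanically against the counts in the tables. What follows is only a minimal Python sketch: the token counts are copied from the tables, while the helper names `word` and `exceptions` are illustrative assumptions and not part of the original statistics.

```python
# Token counts copied from the two tables above.
COUNTS = {
    "chor": 220, "shor": 97, "chol": 397, "shol": 187,
    "cheor": 100, "sheor": 51, "cheol": 173, "sheol": 114,
    "cheeor": 14, "sheeor": 9, "cheeol": 9, "sheeol": 14,
    "char": 72, "shar": 34, "chal": 48, "shal": 15,
    "chear": 51, "shear": 21, "cheal": 30, "sheal": 21,
    "cheear": 1, "sheear": 2, "cheeal": 2, "sheeal": 1,
}

def word(bench, n_e, vowel, ending):
    """Build a word such as 'cheol' from its four parts (illustrative helper)."""
    return bench + "e" * n_e + vowel + ending

def exceptions():
    """Return the word pairs that break each of the three frequency claims."""
    o_vs_a, ch_vs_sh, fewer_e = [], [], []
    for ending in "rl":
        for n_e in range(3):
            # Claim 1: the [o] form is more common than the [y] (written [a]) form.
            for bench in ("ch", "sh"):
                o = word(bench, n_e, "o", ending)
                a = word(bench, n_e, "a", ending)
                if COUNTS[o] <= COUNTS[a]:
                    o_vs_a.append((o, a))
            # Claim 2: the [ch] form is more common than the [sh] form.
            for vowel in "oa":
                ch = word("ch", n_e, vowel, ending)
                sh = word("sh", n_e, vowel, ending)
                if COUNTS[ch] <= COUNTS[sh]:
                    ch_vs_sh.append((ch, sh))
        # Claim 3: each extra [e] makes the word less common.
        for bench in ("ch", "sh"):
            for vowel in "oa":
                for n_e in range(2):
                    less_e = word(bench, n_e, vowel, ending)
                    more_e = word(bench, n_e + 1, vowel, ending)
                    if COUNTS[less_e] <= COUNTS[more_e]:
                        fewer_e.append((less_e, more_e))
    return o_vs_a, ch_vs_sh, fewer_e

o_vs_a, ch_vs_sh, fewer_e = exceptions()
print(o_vs_a)    # pairs where the [o] form is not more common
print(ch_vs_sh)  # pairs where the [ch] form is not more common
print(fewer_e)   # pairs where adding an [e] does not lower the count
```

Run against the tables, the first list comes back empty, the second holds the two [ch]/[sh] exceptions, and the third holds the single [e] exception.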

The similarity in frequency patterns suggests some underlying cause. One possibility is that [r, l] are suffixes to a shared set of words which already has that pattern. I investigated this hypothesis (in fact, it was the reason I made the statistics in the first place), but the link appears to be weak.

Here are the frequencies for the same twelve words without any suffix:

cho 69 sho 130
cheo 65 sheo 47
cheeo 17 sheeo 8
chy 155 shy 104
chey 344 shey 283
cheey 174 sheey 144

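The differences should be clear: the [y] forms are more common than the [o] forms (with [sho] the one exception); one [sh] form is significantly more common than the [ch] form; and increasing the number of [e] does not consistently make the word less common (among the [y] forms the word with a single [e] is the most common).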

We must therefore find another explanation for the similarity of the frequency patterns in [r] and [l]. It may be that [r] and [l] have some underlying link, such as sound, which means they behave in similar ways. They do, generally, have a similar distribution in words, which reinforces this suggestion.

One final observation is that words with [o] are more common ending [l] than [r], and the opposite is true for words with [y]. This can be seen in the statistics above, but also in the text as a whole, as shown in the table below:

Ending  All   With [o]  % of All  With [y]  % of All
[l]     5885  3590      61        2061      35
[r]     5595  2256      40        2646      47

(Percentages do not sum to 100%, because some instances of word final [r, l] may be preceded by other characters, particularly [i] in the case of [r].)
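As a quick check on the table, the percentages and the unexplained remainder can be recomputed from the raw counts. This is a small Python sketch; the counts come from the table, while the "other" category is simply whatever is left over once the [o] and [y] counts are subtracted, which is an assumption about how the totals were compiled.

```python
# Counts from the table above: all word-final tokens, and those with [o] or [y].
TABLE = {
    "l": {"all": 5885, "o": 3590, "y": 2061},
    "r": {"all": 5595, "o": 2256, "y": 2646},
}

def shares(row):
    """Return rounded percentage shares for [o], [y] and everything else."""
    total = row["all"]
    other = total - row["o"] - row["y"]  # tokens preceded by some other character
    parts = {"o": row["o"], "y": row["y"], "other": other}
    return {name: round(100 * count / total) for name, count in parts.items()}

for ending, row in TABLE.items():
    print(ending, shares(row))
```

The remainder is much larger for [r] than for [l], which fits the note about [i] coming before [r].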

If the change between [y, o] can cause a shift between [r, l], then maybe they are closely linked? All thoughts are welcome.

Observation on Double Dealers

Stolfi put the characters [d, l, r, s] together in a loose grouping he called ‘dealers’. The characters do not all act alike, nor do they look alike. They have some similarities, however, including the ability to occur both at the beginning and at the end of words.

In this post I want to mention briefly an observation on these dealers characters. Unlike gallows, which almost never occur next to one another, the dealers do so. And the patterns by which they do are interesting.

Below is a table for all dealers bigrams in the Voynich text:

1st \ 2nd l r d s
l 28 40 452 162
r 18 2 43 6
d 82 14 23 21
s 5 4 32 6

(The rows show the first character in a pair and the columns the second character.)

Note that the two most common bigrams [ld, ls] begin with [l], and the third most common [dl] ends with an [l]. Of course, bigrams with [d] are also common, so we should not read too much into these numbers.

However, because these bigrams may occur anywhere in the text we cannot be sure they are not split over syllable boundaries. This is an important consideration if we wish to judge which combinations are valid and which are not. Consider that the English word ‘weightlifting’ does not show that the combination /tl/ is valid: the /t/ is the end of one syllable while the /l/ is the beginning of another.

We can dodge this problem by only counting those double dealers which occur at the beginning and ends of words. In this way we can be assured that the bigram is unlikely to be split over a syllable boundary.

Below is a table for double dealers at the beginning of words:

1st \ 2nd l r d s
l 12 5 44 9
r 1 0 0 0
d 18 2 6 7
s 1 0 1 2

As we can see, many combinations simply don’t occur, and none beginning [r, s] can be considered valid. We again see that [d, l] are the characters with more frequent combinations, though the numbers overall are very, very low. Even the most common, [ld], occurs just 44 times, and more than half of these are at the end of lines, a position which means they may be atypical.

The next table is for double dealers at the end of words:

1st \ 2nd l r d s
l 6 17 34 106
r 12 2 5 2
d 29 9 1 11
s 3 3 12 2

These are somewhat better results. The most common bigram [ls] occurs enough that it is seen a few times (though just a few!) in all sections of the manuscript. Even so, it is still a word which occurs markedly at the end of lines.

This last table also shows that [l] is clearly the most common combining character, far more than [d, r, s]. Again, I want to stress that these numbers are low, and the bigrams can only be marginal to the text as a whole. But when we consider that [l] also combines well with other characters [k, t, ch, sh]—at least as the first character of the pair—we can see that it is exceptional in some way.
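One way to put a number on "most common combining character" is to count, for each dealer, how many word-final bigram tokens contain it in either position. The Python sketch below does this for the word-final table above; the measure itself (and the name `participation`) is an assumption chosen for illustration, not the method used to draw the conclusion in the text.

```python
CHARS = "lrds"
# Word-final double dealers: rows are the first character, columns the second.
FINAL = {
    "l": [6, 17, 34, 106],
    "r": [12, 2, 5, 2],
    "d": [29, 9, 1, 11],
    "s": [3, 3, 12, 2],
}

def participation(table):
    """Count, for each character, the bigram tokens that contain it at least once."""
    totals = {}
    for i, ch in enumerate(CHARS):
        as_first = sum(table[ch])
        as_second = sum(table[row][i] for row in CHARS)
        doubled = table[ch][i]  # e.g. [ll], which appears in both sums above
        totals[ch] = as_first + as_second - doubled
    return totals

print(participation(FINAL))
```

On this measure [l] leads, and the relatively high figure for [s] rests almost entirely on the single bigram [ls].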

Naturally, my thoughts turn to considering what sound might be able to combine in this way, especially in a word structure which is often quite rigid. There is one, but that will have to wait for another day.

Word Position in Quire 20

Marco Ponzi and I have been discussing for the last couple of weeks a curious observation regarding word position in Quire 20. Neither of us can think of a good explanation for what we can see, so I think it is worth simply publishing the observation and letting others comment.

It is well known that line patterns exist which cause words with certain characters—beginning with [d, y, s]—to appear at the start of lines. Such patterns are particularly strong in Quire 20. What Marco and I have discovered is that not only does a pattern exist regarding the second position in a line, but that it works in conjunction with the patterns of the first position.

Firstly, let’s go over what we know about first word positions. Using statistics provided by Marco (as all stats in this post are!), below are the words (with over twenty occurrences) most bound to line first position in Quire 20. Given that there are nine or so words in each line, we should expect each word to occur in the first position roughly 11% of the time*, but many occur there at more than twice that rate:

[sain] 81%
[saiin] 66%
[sar] 65%
[dair] 41%
[dain] 40%
[daiin] 39%
[y] 29%
[dar] 27%

These are the only words on the list with above average occurrences; the next lowest, [qol], has an occurrence of 11%. The eight words above all fit into the known preference for [d, y, s] at the beginning of lines. Only one other word beginning with one of these three characters and having twenty or more occurrences, [dal], appears on the list.

Now let’s look at the second position in a line. We should expect, given that the line is about nine words long and the first position often taken by an exceptional word, for the other words to appear in each of the other positions about 12.5% of the time—that is, one in eight (again, see note at the bottom).

Below are the words which appear in the second position at more than twice the rate expected, and have over twenty occurrences:

[shey] 37%
[sheol] 32%
[ain] 30%
[shol] 29%
[cheey] 29%
[chey] 27%
[cheo] 27%
[sheedy] 26%
[lkeedy] 26%
[cheol] 25%

The numbers are less drastic than for first position, but still clear. The key point is, of course, the first character. Just as we saw that three characters dominated the most common words in first position, here 8 of the 10 words begin [ch, sh].

I don’t believe that this would happen by chance. The only reason I can think of is that, as discussed in the post about first-last combinations, the first character of second words is influenced by the last character of the foregoing word. Certainly [ch, sh] are ‘weak’ (as is [a]), and all but one of the common first position words end in ‘strong’ characters [r, n].

Yet is this a good explanation? Many combinations of the common first and second words exist in the manuscript, which is reassuring. But I’m doubtful this is the cause of the observation rather than a fact which happens alongside. As mentioned at the beginning, I’m at a loss.

*It’s really a bit higher, as all short lines bring the average line length down while still having identifiable first, second, third positions. That is, there are more first positions than second positions, more second than third, more third than fourth, and so on, regardless of how long the average line is.
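The point of the note can be illustrated with a small Python sketch. The line lengths below are hypothetical, chosen only to show the effect, and the function name `position_shares` is likewise an assumption.

```python
from collections import Counter

def position_shares(line_lengths):
    """Share of all word tokens that sit in each line position (1-based)."""
    slots = Counter()
    for length in line_lengths:
        for pos in range(1, length + 1):
            slots[pos] += 1
    total = sum(line_lengths)
    return {pos: slots[pos] / total for pos in sorted(slots)}

# Hypothetical line lengths: mostly nine words, plus a couple of short lines.
lengths = [9, 9, 9, 9, 3, 5]
shares = position_shares(lengths)
for pos, share in shares.items():
    print(pos, round(share, 3))
```

Every line has a first word, but only the long lines have a ninth, so the first position holds a larger share of all tokens than the naive one-in-nine estimate.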

First–Last Combinations

The purpose of this post is to look at whether the beginnings and ends of adjacent words show any patterns in frequency. For this post I will call these first–last combinations.

Let’s have an example: when a word ends [y], is the next word more or less likely to begin with [q] or [o]? If we have a word such as [sheey], is the next word more likely to be [qokeedy] or [okeedy]?

We should expect the number of occurrences to be related to the number of times [y] ends a word and [q] or [o] begins a word, assuming that the two are not linked. Any substantially lower or higher frequencies would suggest an underlying process which needs to be investigated.

A few months ago Marco Ponzi very kindly created a batch of statistics on which this post is based. His statistics counted 1) the number of times a given character ends a word, 2) the number of times a given character begins a word, 3) the number of times we should expect a specific first–last combination, 4) the number of times that first–last combination actually occurs, and 5) the ratio of observed to expected.

This last number is important, and it should be stressed it is not the frequency but the deviation from expected frequency. So 1 = expected frequency, 2 = twice expected frequency, and 0.5 = half expected frequency.

The results show that some combinations are more common, others less common, than should be expected. For example, in Marco’s statistics [r] was the last character in a word 4181 times, and [d] was the first character 2334 times. We should expect an [r d] combination—so a phrase like [or daiin]—about 400 times. But such a combination occurs only 205 times.
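To make the calculation concrete, here is a minimal Python sketch of how such ratios could be computed. It assumes the expected count is (times the character ends a word) × (times the character begins the next word) ÷ (number of adjacent word pairs); the exact counting conventions behind Marco’s statistics, such as how line breaks are handled, are not stated here, so treat this as an illustration only. The function name and the toy word list are hypothetical.

```python
from collections import Counter

def first_last_ratios(words):
    """Observed/expected ratio for each (last char, first char) pair of adjacent words."""
    pairs = [(a[-1], b[0]) for a, b in zip(words, words[1:])]
    n = len(pairs)
    last = Counter(p[0] for p in pairs)    # how often each character ends a word
    first = Counter(p[1] for p in pairs)   # how often each character begins the next word
    observed = Counter(pairs)
    return {
        pair: observed[pair] / (last[pair[0]] * first[pair[1]] / n)
        for pair in observed
    }

# Toy sequence of words, not real manuscript text.
toy = ["or", "daiin", "chey", "dar", "shey"]
ratios = first_last_ratios(toy)
print(ratios[("r", "d")])  # 1.0: exactly as often as expected
print(ratios[("r", "s")])  # 2.0: twice as often as expected
```

With the real figures quoted above, 205 observed [r d] combinations against about 400 expected gives a ratio of roughly 0.51.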

The ‘missing’ 200 occurrences are something which will need to be explained. But before we attempt to do that let’s see what other patterns there are. Below is a table with first characters in rows and last characters in columns. Each square is a first–last combination with the frequencies (as a decimal, where expected = 1).

I’ve coloured each square, with yellow, orange, and red as less common than expected and shades of green as more common. Purple is a broad band in the middle where the frequency is about what was expected.

First/Last y o n r s l d
d 1.1645 1.5816 0.8271 0.5136 0.2822 1.3901 0.4543
k 1.3298 2.2183 0.1725 0.3153 0.2143 1.7832 0.2491
t 1.7464 1.3459 0.2889 0.2779 0.289 1.0369 0.112
r 1.5375 4.4127 0.146 0.2482 0.1793 0.8222 0.6947
l 1.5871 3.0925 0.1944 0.2566 0.4022 0.8925 1.5952
c 0.7529 0.805 1.3639 1.1217 0.9212 1.1129 0.7526
s 0.7814 0.6528 1.1895 1.1856 0.848 1.2135 0.8451
y 0.7962 0.8072 1.3505 1.2406 1.5655 0.6891 1.5775
o 0.8669 0.4835 1.2872 1.203 1.1843 0.8744 0.99
a 0.1794 0.7535 0.828 2.6476 4.386 0.5634 1.457
q 1.682 0.6726 0.6201 0.4356 0.3172 0.66 1.7517

Strong and Weak Groups

The most striking pattern which emerges is the presence of two main groups, which I’ve named strong and weak (though the names have no meaning). The strong group contains [k, t, r, n, s] and the weak group contains [y, o, a, ch, sh]*.

For characters in these groups the frequency of first–last combinations is very simple. Any combination where the two characters are from the same group is likely to be less frequent, and any combination from different groups is likely to be more frequent than expected.

More specifically, strong first characters are highly correlated with weak end characters and against strong end characters. Weak first characters are moderately correlated against weak end characters, and range from ambivalent to highly correlated with strong end characters.

The division is not perfect, as not all characters fall into these two groups, and not all first–last combinations obey the rules. The combination [n a] is maybe the worst of all these, deviating only a little less than [o a].
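The rule can be tested cell by cell against the table. The Python sketch below makes several assumptions that should be stated plainly: only the rows and columns belonging to the two named groups are checked, the rows [c] and [s] are treated as standing for [ch] and [sh], and the endings [l, d] and the rows [d, l, q] are left out because they are handled as special cases.

```python
COLS = ["y", "o", "n", "r", "s", "l", "d"]  # last character of the first word
RATIOS = {  # first character of the second word -> ratios from the table above
    "k": [1.3298, 2.2183, 0.1725, 0.3153, 0.2143, 1.7832, 0.2491],
    "t": [1.7464, 1.3459, 0.2889, 0.2779, 0.289, 1.0369, 0.112],
    "r": [1.5375, 4.4127, 0.146, 0.2482, 0.1793, 0.8222, 0.6947],
    "c": [0.7529, 0.805, 1.3639, 1.1217, 0.9212, 1.1129, 0.7526],
    "s": [0.7814, 0.6528, 1.1895, 1.1856, 0.848, 1.2135, 0.8451],
    "y": [0.7962, 0.8072, 1.3505, 1.2406, 1.5655, 0.6891, 1.5775],
    "o": [0.8669, 0.4835, 1.2872, 1.203, 1.1843, 0.8744, 0.99],
    "a": [0.1794, 0.7535, 0.828, 2.6476, 4.386, 0.5634, 1.457],
}
# Assumed group membership, used for this check only.
FIRST_GROUP = {"k": "strong", "t": "strong", "r": "strong",
               "c": "weak", "s": "weak", "y": "weak", "o": "weak", "a": "weak"}
LAST_GROUP = {"y": "weak", "o": "weak", "n": "strong", "r": "strong", "s": "strong"}

def rule_breakers():
    """List (last, first, ratio) cells that go against the same/different group rule."""
    broken = []
    for first, row in RATIOS.items():
        for last, ratio in zip(COLS, row):
            if last not in LAST_GROUP:
                continue  # [l] and [d] endings are left unclassified
            same_group = FIRST_GROUP[first] == LAST_GROUP[last]
            # Same group should be below 1; different groups should be above 1.
            if same_group != (ratio < 1):
                broken.append((last, first, ratio))
    return broken

print(rule_breakers())
```

Under these assumptions only three of the forty cells checked go against the rule: [s c], [s s] and [n a], with [n a] the furthest from where it should be.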

Exceptions and Outliers

The character [q] is most like the strong group, except that while [y] is very commonly the end of the preceding word, [o] has a much lower frequency. It is noteworthy that, as [q] is almost always at the beginning of a word and before [o], something like two thirds of its occurrences are in the string [y qo].

While [d] is definitely a strong character at the beginning of a word, the situation at the end of a word is mixed. It is still most similar to a strong character, with words ending [d] being infrequent before words beginning [k, t, r, d] and moderately frequent before most weak characters, though not [ch]. However, it is common before words beginning [l, q]. The numbers involved are low, but it is unexplained.

The character [l] is the most wayward. Like [d] it acts as a strong character at the beginning of a word. Yet at the end of a word it is wholly unclassifiable. Most beginning characters appear moderately after a word ending [l], but [y, a, q] appear significantly less and [d, k] significantly more. It simply does not fit into the paradigm.

How Does the Process Work?

As we have looked at the fact that some characters occur more often with others in first–last combinations, one question has consciously been dodged. How do some combinations become more common and others less common?

We are once more faced with a twofold explanation, the same as when we examined Grove words and the position of [m], and with all line patterns generally. Do words move around or are they altered? Is a combination like [sheey qokeedy] made by bringing words together or by shaping an existing combination to make them fit?

The answer is impossible to know for sure until we understand the underlying language. But the idea of altering a word fits with what we already know or suspect about how line patterns work. It is also more readily believable as a linguistic process, for such a thing does occur in a number of languages.

Yet this leads us to a second question: which character conditions which in first–last combinations? If a word begins [k], does that cause the preceding word to add a [y] to the end? Or does a word ending [r] cause the next word to adopt an [o] at the beginning?

The answer may be a bit of both. A first–last combination such as [r k] would be avoided quite heavily, but there are words ending [ry] and words beginning [ok], either of which would solve the problem.

Yet [n] is almost always word final, and the only common character it can be regularly substituted for is [r]. As both [n] and [r] are in the strong group their substitution would make no difference. It is more likely, in such a case, that weak group characters are added to the beginning of following words.

What Does it Mean?

The strong and weak groups fall into line with some things we already knew about the Voynich script, though much is new.

I have already proposed that [y, a, o] make up a group of vowels, and so this is likely to be the core driver behind the weak group. The membership of [ch, sh] is a new observation, and the relationship between those characters and vowels is unexplored.

One key link between these weak group characters is in the first syllable of multisyllable words. If the second syllable of a word contains [t, k], then the preceding syllable normally only contains characters from this weak group (and [e], which is not analyzed here), with the sole exception of the syllable [qo].

The strong group is likely to be consonants, which is something we might have already guessed. But here we have a potential phonological process to explore and explain with reference to that fact.

Though [l] is most like characters in the strong group, there is clearly something odd about its value. It is really the most intriguing of all characters in the Voynich script, which is not obvious at first glance.

*The statistics don’t differentiate [ch, sh] but rather [c, s]. The majority will be [ch, sh], but there may be some interference from words beginning [s, ckh, cth, cph, cfh]. It will be interesting to learn if the differentiated statistics bear out all that I’ve said in this post.

Speculation on [lkl]

I’ll keep this short because I don’t really like speculation. But this is too good to let pass. And besides, I’ve taken a New Year’s Resolution to share more of the thinking and research I do on the Voynich, even if it’s a bit outlandish.

Okay, the word [lkl] is special. When I’ve sorted out words into syllables there are some which can’t be classified because they have nothing which passes for a syllable nucleus. Most are single characters, the rest are two character combinations (all the common ones either contain [ch, sh] or [l]).

The exception is [lkl]. It is the longest word without vowels with five or more tokens. It occurs nine times, which marks it a perfectly valid word and not a possible error. It also doesn’t look—unlike many ‘unattached’ characters—as though we’ve misread a space. Only 15 tokens even include this sequence of characters, meaning that the word [lkl] accounts for the majority of them, and that it is not normal for the Voynich language.

So, what can it be? My speculation is that it’s a set word, or more likely an abbreviation. It is a string of three characters with a reading which is not wholly linguistic. The reader would be expected to know what the characters stood for rather than to read them. Thus there are no vowels, and our main clue is that it must begin and end with the same consonant.

If I can really go out on a limb and be wild with my guess, I wonder if [lkl] could be a way of rendering the Latin abbreviation SCS: sanctus. This would mean that the other six tokens containing [lkl] would be something like attempts to use the abbreviation with different cases or number.

A few instances of [lkl] are followed by words which could be saint names. Most interesting is the one at the bottom of f107r. The following three words all contain one–leg gallows which are anomalous away from the first line of a paragraph. The complete phrase is: [lkl lfchal pchdy pal]. If [lkl] is SCS, then the second word also begins and ends with /s/!

I’ll leave the guessing game there. I’m almost certainly wrong, but it’s a curious thought.

[i] and [e] and Syllables

The following post is something of a hybrid. I want to make a few new points but also reiterate some points I have made in the past. I am sorry if it reads a bit disjointed, but I promise something interesting lurks within.

It is well–known that [i] and [e] occur in many words and often in sequences: that is, more than one in a row. This is a characteristic they share with each other. None of the other characters in the script regularly occur in this way: [e] and [i] are found in repeating sequences over four thousand times each while [oo], the next highest, is found fewer than a hundred times.

One curious aspect of [e] and [i], however, is that they occur less often in words together than might otherwise be expected. Let us have some statistics to show what I mean.

I took the one thousand most common words, which account for over 26,000 tokens, or nearly 70% of the text. They include all words with five or more tokens, so we can be assured that they are not reading or writing errors. I marked them according to the presence of [e] and [i] (and also the number of syllables).

I found that [e] occurred in 39% of all words while [i] occurred in 12% of all words. Were the occurrence of [e] and [i] to be independent we would expect that in a thousand words about 45 to 50 would contain both. However, only 8 words contained both [e] and [i]: [sheaiin], [chedaiin], [chedain], [chekaiin], [cheodaiin], [shedaiin], [sheodaiin], [oteodaiin].
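The expected figure is just the product of the two proportions. The Python sketch below reproduces it and, as a rough extra check that is not part of the original statistics, estimates how likely 8 or fewer such words would be if each of the thousand words were an independent trial. That independence assumption is a simplification made only for this illustration.

```python
from math import comb

N_WORDS = 1000
P_E, P_I = 0.39, 0.12  # shares of words containing [e] and [i]

p_both = P_E * P_I                 # chance of both, if independent
expected_both = N_WORDS * p_both   # expected number of words with both

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))

print(round(expected_both, 1))        # 46.8
print(binom_cdf(8, N_WORDS, p_both))  # chance of 8 or fewer under independence
```

The second figure is vanishingly small, which supports the view that the shortfall is not a chance effect.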

It should be immediately apparent that in seven of these cases the [e] immediately follows a [ch, sh] as part of a syllable string which commonly occurs at the beginning of words. In my discussion of high level word structure I mentioned that these syllables were constrained in what they could contain, compared with the syllables which followed them and could contain anything. I also mentioned that they became even more constrained in three syllable words.

It turns out that the number of syllables appears to be relevant to the occurrence of [e] and [i].

In the thousand most common words the distribution is: one syllable 32%, two syllables 54%, three syllables 12%, with the balance made up of words which are unclassifiable. The distribution of words with [e] is broadly similar, being a little higher in three syllable words and a little lower in one syllable words. But the distribution of [i] is much different: [oteodaiin], given above as one of the eight words which contain both [e] and [i], is the only three syllable word in the sample which contains [i].

So here we have an interesting conjunction: 1) words containing both [e] and [i] are less common than they should be, 2) most of those which do occur begin with [ch, sh] which is much less common in three syllable words, and 3) [i] itself almost never occurs in three syllable words. (In case you are wondering if the non–occurrence of [i] in three syllable words might explain why there are so few words with both [e] and [i], we should still expect about 25–30 two syllable words with both. There are only 6.)

The answer is that the appearances of [e] and [i] are somehow linked and are not independent events. This is not to say that they are variants but that they are often mutually exclusive.

As the number of syllables also works to exclude [i] we might wonder if this is related. Because [i] mostly occurs at the end of a word, and the number of syllables can affect certain linguistic characteristics of a word (such as prosody), [i] could be a marker for those characteristics. Were [e] the marker for a similar or related prosodic process, or a sound thus affected, the occurrence of both in a word would then naturally be less likely.

One last thing to mention is this: of the one thousand most common words 12% contain [i]; of one thousand words with a single token the same figure is over 19%. That is, common words are less likely to contain [i] than rare words. Conversely, [e] occurs less in rare words than in common words. Words with both [e] and [i] are still considerably less common than would be expected otherwise.

What is [m]?

The character [m] is a key part of the LAAFU puzzle. It consistently appears at the end of a word (95% of all occurrences) and regularly as the last character of a line (67% of all occurrences). Because it has a restricted distribution conditioned by the line of text, it is a LAAFU feature.

Though we do not know the ultimate cause of [m] distribution (and therefore of LAAFU) we can still investigate the character as a problem. What could make a character appear so often at the end of a line and less often away from that position?

Our starting point should be, as always, that any process which can generate a text the length of the Voynich manuscript, and with the apparent structure of words and lines, must work by a set of regular rules. The appearance of [m] at the end of lines is not simply random but for a reason inherent in the creation of the text. The best way to discover the reason (or at least how the creation process worked) is to make the text more homogenous. That is, how do we get the end of the lines to look like the rest of the lines?

Some time ago I discussed Grove words, which present a similar problem. It was clear that the semantic content of the text could not or would not cause words to take specific characters. There’s also no reason here why words with a particular meaning would sit at the end of a line.

Likewise, it is hard to believe that the word order would be so free as to let such words be moved to the end of a line. Indeed, lines ending [m] are most common in Quire 20 where the lines are longest and would demand the greatest freedom of word order.

As with Grove words, and as with other LAAFU effects, it is easiest to imagine that transformations are made to words already in a given position. So, for example, Grove words and line-first words have characters added to the beginning of an existing word. The same process could be adding [m] to the end of line-final words which are valid without that character.

Yet in some words [m] is preceded by [i], an unlikely word ending which would be the outcome were the [m] removed. It is thus more reasonable to propose that the final character of a word is transformed into [m] when that word occurs in certain environments, one of which is the end of a line.

So we are left with the question: if another character is transformed into [m], which character is that? Even if we are only accepting this transformation as a working hypothesis, we should seek to identify the best fit character.

Below are some considerations.

What Comes Last

Because [m] occurs mostly at the end of words, it can only be replacing a character which also appears at the end of words. The character doesn’t have to only appear in that position, however, as the word–final occurrence could be a condition of what causes [m].

The most common word–final characters are, in descending order (with percentages): [y] 40%, [n] 16%, [l] 16%, [r] 14%, [s] and [o] with 3% each, and [d] with 2%. Given that [m] itself makes up 3% of all word–final characters, and that it is unlikely to affect a majority of all instances of a character word–finally as it occurs mostly at the end of a line, we can guess that [y, n, l, r] are the most likely candidates.

What Comes Next Last

We mentioned above that sometimes the character [i] comes before [m], which it does in about 70 tokens. It is one of just a few characters which come before [m], the percentages of which are: [a] 71%, [o] 18%, and [i] 6%. All other characters total less than 5%.

Given that [y] doesn’t occur much after any of these three characters we can rule it out as a candidate. The percentages for [n, l, r] are as follows:

Before [n]: [a] 2%, [o] <1%, [i] 97%

Before [l]: [a] 31%, [o] 56%, [i] <1%

Before [r]: [a] 45%, [o] 39%, [i] 10%

We can see two things instantly: [n] is a bad candidate because it occurs almost exclusively after [i], and neither [l] nor [r] is a perfect fit. From these statistics it would seem [r] is the best fit, but not convincingly near.
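One simple way to compare the candidates is to add up how far each one’s percentages lie from those of [m]. The Python sketch below uses the sum of absolute differences, which is an assumed measure chosen for illustration; the entries given as "<1%" are entered as 0.5, again an assumption.

```python
# Percentage of tokens in which each character is preceded by [a], [o], [i].
# "<1%" in the text is entered as 0.5, an assumption made only for this sketch.
BEFORE = {
    "m": {"a": 71, "o": 18, "i": 6},
    "n": {"a": 2, "o": 0.5, "i": 97},
    "l": {"a": 31, "o": 56, "i": 0.5},
    "r": {"a": 45, "o": 39, "i": 10},
}

def distance(x, y):
    """Sum of absolute differences between two percentage profiles."""
    return sum(abs(x[k] - y[k]) for k in x)

scores = {c: distance(BEFORE["m"], BEFORE[c]) for c in "nlr"}
best = min(scores, key=scores.get)
print(scores)  # smaller means closer to the profile of [m]
print(best)
```

On this measure [r] comes out closest and [n] by far the furthest, though a gap of some fifty percentage points is still a long way from a match.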

Where Does It Come?

Earlier we said that two thirds of all [m] occur at the end of lines, a total of about 780 tokens. Any character which is transformed into [m] word and line finally should show a distinct drop in occurrences in that context.

The curious truth is that all three of [n, l, r] show lower occurrences word–finally at the end of a line: [n] is 2.5 percentage points lower, [l] is 4.5 points lower, and [r] is 6.5 points lower. The character [r], with the greatest drop, would seem once again to be the best candidate.

However, [m] is more than 13 percentage points higher at the end of the line, twice the amount by which [r] drops. Indeed, it is about equal to the total drop of all three characters (13.5 points).

The character [y], which we have already dismissed as a potential candidate, does not change in frequency at the end of the line.

What Does It Look Like?

Although we cannot be sure that the strokes from which Voynich characters are composed are meaningful, we have seen some indication of that. The conditioning environment for [y] becoming [a] is the following character containing a short stroke like [i]. Likewise, we have seen patterns in the occurrences of gallows and their composition. It is reasonable to consider whether [m] bears any similarity to our three candidates.

All three characters, [n, l, r], contain the same short [i] stroke which [m] contains. Also, all three have an additional stroke emerging rightward from the character, much as [m] does. However, [n] has the rightward stroke emerging from the bottom of the [i], whereas [l, r, m] have it emerging from the top.

The rightward stroke of [m] follows a more similar course to [r] than [l]. Whereas the rightward stroke of [l] quickly turns down and leftward, crossing the [i], in [r] and [m] the rightward stroke continues right before turning up and leftward. For [r] the stroke continues leftward, while in [m] it dives down and rightward through its earlier path.

Once again, the best match for [m] in graphical terms is [r].

What Words Do We End Up With?

Our goal, mentioned at the start of this post, is to attempt to ‘restore’ the text to its state before [m] was present. We meet success if the text becomes more regular and thus one step nearer to its creation process.

If we take [r] as the best fit for [m], and change all instances of [m] to [r], what words do we end up with? And do they resemble existing words for [r]?

The short answer is: somewhat.

It is certainly true that a common word ending with [r] tends to be paralleled by a common word ending with [m]. So, for example, [dar] accounts for nearly 6% of all words ending with [r], and [dam] for about 9% of all words ending with [m].

Not perfect, but for most words where [r/m] is preceded by [a] the percentage for [m] is higher, while for those preceded by [o/i] it is lower. Also, there are a number of instances of words ending [ram] which are barely paralleled by words ending [rar].
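The comparison described above can be sketched in Python as follows. The word counts are hypothetical and stand in for a real transcription, and the function names are assumptions; the sketch only shows the shape of the calculation, pairing each word ending [m] with the word obtained by changing that [m] to [r].

```python
from collections import Counter

def ending_shares(counts, ending):
    """Share of each word among all tokens with the given final character."""
    subset = {w: c for w, c in counts.items() if w.endswith(ending)}
    total = sum(subset.values())
    return {w: c / total for w, c in subset.items()}

def compare_m_with_r(counts):
    """Pair each [m] word with its [r] counterpart and return both shares."""
    m_shares = ending_shares(counts, "m")
    r_shares = ending_shares(counts, "r")
    return {
        w: (share, r_shares.get(w[:-1] + "r", 0.0))
        for w, share in m_shares.items()
    }

# Hypothetical token counts, for illustration only.
toy = Counter({"dam": 9, "otam": 1, "dar": 6, "otar": 4, "or": 10})
result = compare_m_with_r(toy)
for word, (m_share, r_share) in result.items():
    print(word, m_share, r_share)
```

A word whose [m] share is far above or below the share of its [r] counterpart, or which has no counterpart at all, marks a place where the substitution fits poorly.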

Conclusion

The character [m] may well be a word and line final variant of another character. If this is so then the best character fit is [r]. However, the fit is not perfect.

The underlying reason for the occurrence of [m] is unknown, but the hypothesis that [m] is a variant of [r] gives us a point from which we can explore further. One specific question is why [r] still occurs word and line finally, and why only a portion of that character is transformed into [m]. Research into the specific environments of these two characters at the end of lines may reveal differences which take us further toward the ultimate cause.