Following a discussion with René Zandbergen on the similarity between [l] and [r], I thought it best to share some statistics I’ve had hanging around for a while. I made them when looking into another hypothesis, which I really should share as well some day. The statistics relate to words which are ‘weak strings’ followed by [l, r].
(Weak strings are my name for the very common combination of a bench [ch, sh], followed by zero to two [e], and ending with [o, y/a].)
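The definition above can be written down as a pattern. This is only a minimal sketch, assuming an EVA-style transcription in which each word is a plain lowercase string; the names `WEAK_STRING`, `WEAK_PLUS_RL` and `weak_forms` are made up for illustration and are not from any existing tool.

```python
import re
from itertools import product

# Weak string: a bench (ch or sh), then zero to two e, then o or y.
# Assumption: a is accepted as the surface form of y, per the "y/a" above.
WEAK_STRING = re.compile(r"^(ch|sh)(e{0,2})([oya])$")

# Weak string followed by a final r or l (y appears as a in this position).
WEAK_PLUS_RL = re.compile(r"^(ch|sh)(e{0,2})([oa])([rl])$")


def weak_forms(vowels="oy", suffixes=("",)):
    """List every weak-string form for the given final vowels and suffixes."""
    return [
        bench + "e" * n_e + vowel + suffix
        for vowel, suffix, n_e, bench in product(
            vowels, suffixes, range(3), ("ch", "sh")
        )
    ]


bare_forms = weak_forms()                    # the twelve suffixless forms
rl_forms = weak_forms("oa", ("r", "l"))      # the twenty-four forms ending r or l
```

Counting tokens would then be a matter of matching each word of a transcription against `WEAK_PLUS_RL` and tallying the hits, for example with `collections.Counter`.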
The statistics show the various possible combinations of weak strings with an additional ending of [r] or [l], along with the number of tokens for each. Here are the two tables:
chor | 220 | shor | 97
cheor | 100 | sheor | 51
cheeor | 14 | sheeor | 9
char | 72 | shar | 34
chear | 51 | shear | 21
cheear | 1 | sheear | 2

chol | 397 | shol | 187
cheol | 173 | sheol | 114
cheeol | 9 | sheeol | 14
chal | 48 | shal | 15
cheal | 30 | sheal | 21
cheeal | 2 | sheeal | 1
The statistics are interesting because they show that the two sets of words share a common pattern. There are three possible factors by which to alter the weak string: replace [ch] with [sh], vary the number of [e] from none to two, and change [o] for [y] (here [y] is expressed as [a], as is to be expected).
Changes in each of these factors produce the same kind of frequency change, regardless of whether the word ends in [r] or [l]. So, in all cases the [o] form of the word is more common than the [y] form. Likewise, in all but two cases the [ch] form is more common than the [sh] form (both exceptions are minor). Lastly, increasing the number of [e] makes the word less common, with one exception.
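The three trends can be checked mechanically against the token counts quoted in the tables. The sketch below simply copies those counts into a dictionary and lists every pair that breaks a trend; the helper names `word` and `exceptions` are hypothetical.

```python
# Token counts copied from the tables in this post.
counts = {
    "chor": 220, "shor": 97, "chol": 397, "shol": 187,
    "cheor": 100, "sheor": 51, "cheol": 173, "sheol": 114,
    "cheeor": 14, "sheeor": 9, "cheeol": 9, "sheeol": 14,
    "char": 72, "shar": 34, "chal": 48, "shal": 15,
    "chear": 51, "shear": 21, "cheal": 30, "sheal": 21,
    "cheear": 1, "sheear": 2, "cheeal": 2, "sheeal": 1,
}


def word(bench, n_e, vowel, final):
    """Build a word such as 'cheol' from its four parts."""
    return bench + "e" * n_e + vowel + final


def exceptions():
    """Return the pairs that break each of the three trends."""
    o_vs_a, ch_vs_sh, more_e = [], [], []
    for final in "rl":
        for n_e in range(3):
            # Trend 1: the [o] form outnumbers the [y]/[a] form.
            for bench in ("ch", "sh"):
                o, a = word(bench, n_e, "o", final), word(bench, n_e, "a", final)
                if counts[o] <= counts[a]:
                    o_vs_a.append((o, a))
            # Trend 2: the [ch] form outnumbers the [sh] form.
            for vowel in "oa":
                ch, sh = word("ch", n_e, vowel, final), word("sh", n_e, vowel, final)
                if counts[ch] <= counts[sh]:
                    ch_vs_sh.append((ch, sh))
        # Trend 3: each extra [e] lowers the count.
        for bench in ("ch", "sh"):
            for vowel in "oa":
                for n_e in range(2):
                    fewer = word(bench, n_e, vowel, final)
                    more = word(bench, n_e + 1, vowel, final)
                    if counts[more] >= counts[fewer]:
                        more_e.append((fewer, more))
    return o_vs_a, ch_vs_sh, more_e


o_vs_a, ch_vs_sh, more_e = exceptions()
```

On these counts the first list comes back empty, the second holds the pairs [cheear]/[sheear] and [cheeol]/[sheeol], and the third holds only [shal]/[sheal].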
The similarity in frequency patterns suggests some underlying cause. One possibility is that [r, l] are suffixes to a shared set of words which already has that pattern. I investigated this hypothesis (in fact, it was the reason I made the statistics in the first place), but the link appears to be weak.
Here are the frequencies for the same twelve words without any suffix:
cho | 69 | sho | 130 |
cheo | 65 | sheo | 47 |
cheeo | 17 | sheeo | 8 |
chy | 155 | shy | 104 |
chey | 344 | shey | 283 |
cheey | 174 | sheey | 144 |
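The differences should be clear: the [y] forms are generally more common than the [o] forms (only [sho] outnumbers its [y] counterpart); one [sh] form, [sho], is significantly more common than its [ch] form; and increasing the number of [e] does not consistently make the word less common ([chey] and [shey] are the most frequent forms of all).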
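One way to see how far the suffixless words depart from the [r, l] pattern is to take the ratio of each [y] form to its [o] counterpart. This is a small sketch using only the counts quoted above; `bare`, `y_to_o_ratio` and `ratios` are hypothetical names.

```python
# Token counts for the suffixless forms, copied from the table above.
bare = {
    "cho": 69, "sho": 130, "cheo": 65, "sheo": 47, "cheeo": 17, "sheeo": 8,
    "chy": 155, "shy": 104, "chey": 344, "shey": 283, "cheey": 174, "sheey": 144,
}


def y_to_o_ratio(bench, n_e):
    """Tokens of the [y] form divided by tokens of the matching [o] form."""
    stem = bench + "e" * n_e
    return bare[stem + "y"] / bare[stem + "o"]


# One ratio per stem (ch, che, chee, sh, she, shee), rounded for display.
ratios = {
    bench + "e" * n_e: round(y_to_o_ratio(bench, n_e), 2)
    for bench in ("ch", "sh")
    for n_e in range(3)
}

# Stems where the [o] form is the more common one.
o_dominant = [stem for stem, ratio in ratios.items() if ratio < 1]
```

A ratio above 1 means the [y] form is the more common one, which is the reverse of what the words ending in [r, l] show. Only the [sh] stem falls below 1 here.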
We must therefore find another explanation for the similarity of the frequency patterns in [r] and [l]. It may be that [r] and [l] have some underlying link, such as sound, which means they behave in similar ways. They do, generally, have a similar distribution in words, which reinforces this suggestion.
One final observation is that words with [o] are more common ending [l] than [r], and the opposite is true for words with [y]. This can be seen in the statistics above, but also in the text as a whole, as shown in the table below:
Words | All | With [o] | % | With [y] | %
Ending [l] | 5885 | 3590 | 61 | 2061 | 35
Ending [r] | 5595 | 2256 | 40 | 2646 | 47
(Percentages do not sum to 100%, because some instances of word final [r, l] may be preceded by other characters, particularly [i] in the case of [r].)
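The percentages, and the remainder left over for other preceding characters, follow directly from the counts in the table. A minimal sketch, with `totals` and `shares` as hypothetical names:

```python
# Counts copied from the table above: all words with a given final
# character, and how many of those have [o] or [y] before it.
totals = {
    "l": {"all": 5885, "o": 3590, "y": 2061},
    "r": {"all": 5595, "o": 2256, "y": 2646},
}


def shares(final):
    """Return rounded percentages for [o], [y] and everything else."""
    row = totals[final]
    pct_o = 100 * row["o"] / row["all"]
    pct_y = 100 * row["y"] / row["all"]
    pct_other = 100 - pct_o - pct_y
    return round(pct_o), round(pct_y), round(pct_other)


l_shares = shares("l")
r_shares = shares("r")
```

The third value in each result is the share of words where final [l] or [r] is preceded by something other than [o] or [y], which comes out larger for [r] than for [l].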
If the change between [y, o] can cause a shift between [r, l], then maybe they are closely linked? All thoughts are welcome.
For the tables in your post the statistics differ for Currier A and B.
In Currier A the words [chol], [shol], [sho] and [chy] are frequently used. An increasing number of [e] does make a word less common for Currier A.
Currier A
chol 280 shol 118
cheol 71 sheol 37
cheeol 3 sheeol 5
chal 15 shal 1
cheal 7 sheal 2
cheeal 1 sheeal —
In Currier A multiple [sh] forms are more common than the [ch] forms.
cho 48 sho 106
cheo 12 sheo 15
cheeo — sheeo 1
chy 104 shy 65
chey 78 shey 60
cheey 34 sheey 39
In Currier B everything changes. The words [chol], [shol], [sho] and [chy] are less frequent than in Currier A. Instead, the words [chey], [shey], [cheey] and [sheey] are frequently used. Because of the usage of [chey] and [shey] in Currier B, an increasing number of [e] does not make a word less common. [sh] forms are not more common than the [ch] forms in Currier B.
Currier B
chol 89 shol 55
cheol 88 sheol 64
cheeol 6 sheeol 9
chal 27 shal 13
cheal 22 sheal 12
cheeal 1 sheeal 1
cho 12 sho 13
cheo 33 sheo 20
cheeo 16 sheeo 7
chy 31 shy 32
chey 238 shey 193
cheey 122 sheey 92
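The contrast between the two sets of counts can be summarised by asking, for each bench and ending, whether the count falls strictly as [e] is added. The sketch below uses the Currier A and B figures listed above and assumes that a dash in the tables means zero tokens; `currier_a`, `currier_b` and `falling_series` are hypothetical names.

```python
# Counts copied from the Currier A and B lists above.
# Assumption: a dash in the lists is treated as 0 tokens.
currier_a = {
    "chol": 280, "shol": 118, "cheol": 71, "sheol": 37, "cheeol": 3, "sheeol": 5,
    "chal": 15, "shal": 1, "cheal": 7, "sheal": 2, "cheeal": 1, "sheeal": 0,
    "cho": 48, "sho": 106, "cheo": 12, "sheo": 15, "cheeo": 0, "sheeo": 1,
    "chy": 104, "shy": 65, "chey": 78, "shey": 60, "cheey": 34, "sheey": 39,
}
currier_b = {
    "chol": 89, "shol": 55, "cheol": 88, "sheol": 64, "cheeol": 6, "sheeol": 9,
    "chal": 27, "shal": 13, "cheal": 22, "sheal": 12, "cheeal": 1, "sheeal": 1,
    "cho": 12, "sho": 13, "cheo": 33, "sheo": 20, "cheeo": 16, "sheeo": 7,
    "chy": 31, "shy": 32, "chey": 238, "shey": 193, "cheey": 122, "sheey": 92,
}


def falling_series(counts):
    """List the bench/ending series whose counts strictly fall as [e] is added."""
    result = []
    for ending in ("ol", "al", "o", "y"):
        for bench in ("ch", "sh"):
            series = [counts[bench + "e" * n_e + ending] for n_e in range(3)]
            if series[0] > series[1] > series[2]:
                result.append(bench + "-" + ending)
    return result


falling_a = falling_series(currier_a)
falling_b = falling_series(currier_b)
```

On these figures seven of the eight series fall in Currier A, against three of the eight in Currier B.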
Thanks for that, Torsten. I’ll redo the stats and look at my ideas again.