Weak Strings and [l, r]

Following a discussion with René Zandbergen on the similarity between [l] and [r], I thought it best to share some statistics I’ve had hanging around for a while. I made them when looking into another hypothesis, which I really should share as well some day. The statistics relate to words which are ‘weak strings’ followed by [l, r].

(Weak strings are my name for the very common combination of a bench [ch, sh], followed by zero to two [e], and ending with [o, y/a].)
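For anyone who wants to pull these words out of a transcription themselves, the definition above can be sketched as a pair of regular expressions. This is only a rough illustration: it assumes words are written in the same bracket notation used in this post (so a bench is the two letters "ch" or "sh"), and the names WEAK, WEAK_SUFFIXED and classify are made up for the example.

```python
import re

# Weak string: bench [ch, sh], zero to two [e], then [o] or [y].
WEAK = re.compile(r"(ch|sh)(e{0,2})(o|y)")
# With a final [r, l] the [y] is written [a], so [a] is accepted here instead.
WEAK_SUFFIXED = re.compile(r"(ch|sh)(e{0,2})(o|a)(r|l)")

def classify(word):
    """Return (bench, e_count, vowel, suffix), or None if the word has another shape."""
    m = WEAK_SUFFIXED.fullmatch(word)
    if m:
        bench, es, vowel, suffix = m.groups()
        return bench, len(es), vowel, suffix
    m = WEAK.fullmatch(word)
    if m:
        bench, es, vowel = m.groups()
        return bench, len(es), vowel, ""
    return None

print(classify("cheol"))  # a suffixed weak string
print(classify("sheey"))  # a bare weak string
print(classify("daiin"))  # not a weak string
```

Counting tokens is then a matter of running classify over every word of the text and tallying the results that are not None.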

The statistics show every possible combination of a weak string with an additional ending of [r] or [l], together with the number of tokens for each. Here are the two tables:

chor    220   shor    97      chol    397   shol    187
cheor   100   sheor   51      cheol   173   sheol   114
cheeor   14   sheeor   9      cheeol    9   sheeol   14
char     72   shar    34      chal     48   shal     15
chear    51   shear   21      cheal    30   sheal    21
cheear    1   sheear   2      cheeal    2   sheeal    1

The statistics are interesting because they show that the two sets of words share a common pattern. There are three possible ways to alter the weak string: replace [ch] with [sh], vary the number of [e] from zero to two, and swap [o] for [y] (here [y] is expressed as [a], as is to be expected).

A change in any one of these factors produces the same kind of frequency change, regardless of whether the word ends in [r] or [l]. In all cases the [o] form of the word is more common than the [y] form. Likewise, in all but two cases the [ch] form is more common than the [sh] form (both exceptions are minor). Lastly, increasing the number of [e] makes the word less common, with one exception.
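The three claims can be checked mechanically against the table. The sketch below copies the token counts from above and lists every pair of words that breaks each pattern; the helper name word and the variable names are just for illustration.

```python
# Token counts copied from the table above.
counts = {
    "chor": 220, "shor": 97, "chol": 397, "shol": 187,
    "cheor": 100, "sheor": 51, "cheol": 173, "sheol": 114,
    "cheeor": 14, "sheeor": 9, "cheeol": 9, "sheeol": 14,
    "char": 72, "shar": 34, "chal": 48, "shal": 15,
    "chear": 51, "shear": 21, "cheal": 30, "sheal": 21,
    "cheear": 1, "sheear": 2, "cheeal": 2, "sheeal": 1,
}

def word(bench, e, vowel, suffix):
    """Build a word from a bench, a number of [e], a vowel and a suffix."""
    return bench + "e" * e + vowel + suffix

benches, vowels, suffixes = ("ch", "sh"), ("o", "a"), ("r", "l")

# Pairs where the [o] form is NOT more common than the [a] form.
o_exceptions = [
    (word(b, e, "o", s), word(b, e, "a", s))
    for b in benches for e in range(3) for s in suffixes
    if counts[word(b, e, "o", s)] <= counts[word(b, e, "a", s)]
]

# Pairs where the [ch] form is NOT more common than the [sh] form.
ch_exceptions = [
    (word("ch", e, v, s), word("sh", e, v, s))
    for e in range(3) for v in vowels for s in suffixes
    if counts[word("ch", e, v, s)] <= counts[word("sh", e, v, s)]
]

# Steps where adding one [e] does NOT make the word less common.
e_exceptions = [
    (word(b, e, v, s), word(b, e + 1, v, s))
    for b in benches for v in vowels for s in suffixes for e in range(2)
    if counts[word(b, e, v, s)] <= counts[word(b, e + 1, v, s)]
]

print(o_exceptions)   # []
print(ch_exceptions)  # [('cheeol', 'sheeol'), ('cheear', 'sheear')]
print(e_exceptions)   # [('shal', 'sheal')]
```

The exceptions it finds are the two minor [ch]/[sh] reversals among the rare two-[e] words, and the single step from [shal] to [sheal].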

The similarity in frequency patterns suggests some underlying cause. One possibility is that [r, l] are suffixes to a shared set of words which already has that pattern. I investigated this hypothesis (in fact, it was the reason I made the statistics in the first place), but the link appears to be weak.

Here are the frequencies for the same twelve words without any suffix:

cho 69 sho 130
cheo 65 sheo 47
cheeo 17 sheeo 8
chy 155 shy 104
chey 344 shey 283
cheey 174 sheey 144

The differences should be clear: the [y] forms are more common than the [o] forms, with the single exception of [sho] against [shy]; one [sh] form, [sho], is significantly more common than its [ch] counterpart; and increasing the number of [e] does not consistently make the word less common.

We must therefore find another explanation for the similarity of the frequency patterns in [r] and [l]. It may be that [r] and [l] have some underlying link, such as sound, which means they behave in similar ways. They do, generally, have a similar distribution in words, which reinforces this suggestion.

One final observation is that words with [o] are more common ending [l] than [r], and the opposite is true for words with [y]. This can be seen in the statistics above, but also in the text as a whole, as shown in the table below:

             All    With [o]   %    With [y]   %
Ending [l]   5885   3590       61   2061       35
Ending [r]   5595   2256       40   2646       47

(Percentages do not sum to 100%, because some instances of word final [r, l] may be preceded by other characters, particularly [i] in the case of [r].)
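As a quick check of the arithmetic, the percentages and the unexplained remainder can be recomputed from the raw counts in the table. The dictionary layout and the function names share and remainder are only illustrative.

```python
# Counts copied from the table above: all words with the ending,
# and those where the ending is preceded by [o] or by [y].
totals = {
    "l": {"all": 5885, "o": 3590, "y": 2061},
    "r": {"all": 5595, "o": 2256, "y": 2646},
}

def share(suffix, vowel):
    """Percentage of words with this ending that have the given vowel before it."""
    row = totals[suffix]
    return round(100 * row[vowel] / row["all"])

def remainder(suffix):
    """Words with this ending preceded by something other than [o] or [y]."""
    row = totals[suffix]
    return row["all"] - row["o"] - row["y"]

for suffix in ("l", "r"):
    print(suffix, share(suffix, "o"), share(suffix, "y"), remainder(suffix))
# l 61 35 234
# r 40 47 693
```

The remainder is noticeably larger for [r] than for [l], which fits the note about [i] before [r].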

If the change between [y, o] can cause a shift between [r, l], then maybe they are closely linked? All thoughts are welcome.


2 thoughts on “Weak Strings and [l, r]”

  1. For the tables in your post the statistics differ for Currier A and B.
    In Currier A the words [chol], [shol], [sho] and [chy] are frequently used. An increasing number of [e] does make a word less common for Currier A.

    Currier A
    chol 280 shol 118
    cheol 71 sheol 37
    cheeol 3 sheeol 5
    chal 15 shal 1
    cheal 7 sheal 2
    cheeal 1 sheeal —

    In Currier A multiple [sh] forms are more common than the [ch] forms.
    cho 48 sho 106
    cheo 12 sheo 15
    cheeo — sheeo 1
    chy 104 shy 65
    chey 78 shey 60
    cheey 34 sheey 39

    In Currier B everything changes. The words [chol], [shol], [sho] and [chy] are less frequent than in Currier A. Instead the words [chey], [shey], [cheey] and [sheey] are frequently used. Because of the usage of [chey] and [shey] in Currier B an increasing number of [e] does not make a word less common. [sh] forms are not more common than the [ch] forms in Currier B.

    Currier B
    chol 89 shol 55
    cheol 88 sheol 64
    cheeol 6 sheeol 9
    chal 27 shal 13
    cheal 22 sheal 12
    cheeal 1 sheeal 1

    cho 12 sho 13
    cheo 33 sheo 20
    cheeo 16 sheeo 7
    chy 31 shy 32
    chey 238 shey 193
    cheey 122 sheey 92

