The character [e] is one of only two characters in the Voynich script that are regularly found repeated. Both [ee] and [eee] occur often enough to be considered valid sequences. As all three typically occur in the same position within words, they can be thought of generically as ‘[e] sequences’ filling a specific structural slot.
Although we can guess from their similar appearance and position that [e] sequences form a valid ‘family’ within the wider script, why one length is chosen over another is unknown. It may be that they encode different sounds or meanings. It may also be that they are conditioned by their environment. They have also sometimes been linked to benches [ch, sh], which have a related appearance and often neighbour [e] sequences in Voynich words.
I don’t expect in this post to answer these questions, but I want to set out some statistics and thoughts which seem pertinent.
Jorge Stolfi, in his page outlining his Grammar for Voynichese Words, provides some interesting tables for [e] sequences. In words which don’t contain a gallows the following counts for the different lengths are found:
We can clearly see that [ee] is the most common, followed by [eee] at about half as common, then [e], which is only roughly a third as common as [ee]. However, these numbers don’t include those [e] sequences following a bench character, which have quite different counts (let [B] stand for any bench, [ch, sh]):
The counts are generally much higher, but the ratios are totally different. The most common is [e], [ee] is only a quarter as common, and [eee] less than 1% as common.
The difference in the ratios is also found in [e] sequences which follow gallows (let [G] stand for any non-bench gallows, [t, k, p, f]):
Although neither of these two sets of token counts matches those above for the environment without gallows, their general direction is the same. Those [e] sequences without a bench before them have high [ee] and significant [eee]. Those [e] sequences following a bench have low [ee] and insignificant [eee].
I performed some follow-up tests, breaking the figures down with specific gallows characters and with following characters such as [y, o, d]. In each case the same pattern emerged: all other things being equal, [e] sequences are shorter after [ch, sh].
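For anyone wanting to run a similar breakdown, here is a minimal Python sketch of how such counts could be gathered. It is not the method used for the figures above. It assumes the input is a list of words in EVA transcription, it only looks at the glyph immediately before each [e] sequence, and it treats [cph, cfh] as bench gallows alongside [cth, ckh], which is my assumption. The function names and the sample words are made up purely for illustration.

```python
import re
from collections import Counter

# An optional preceding glyph followed by a run of e.
# Bench gallows are listed first so "cth" is not read as a plain "t".
# Treating cph/cfh as bench gallows is an assumption of this sketch.
E_RUN = re.compile(r"(cth|ckh|cph|cfh|ch|sh|[tkpf])?(e+)")


def classify_environment(prefix):
    """Name the environment for the glyph found just before an e run."""
    if prefix is None:
        return "other"
    if prefix in ("ch", "sh"):
        return "bench"
    if prefix.startswith("c"):
        return "bench gallows"
    return "gallows"


def e_sequence_counts(words):
    """Count e-run lengths (1, 2, 3, ...) per preceding environment."""
    counts = {}
    for word in words:
        for match in E_RUN.finditer(word):
            env = classify_environment(match.group(1))
            counts.setdefault(env, Counter())[len(match.group(2))] += 1
    return counts


if __name__ == "__main__":
    # Toy input only; these are not real corpus counts.
    sample = ["chey", "cheey", "shey", "okeey", "keeey", "cthey", "oeey"]
    for env, lengths in sorted(e_sequence_counts(sample).items()):
        print(env, dict(sorted(lengths.items())))
```

Extending the pattern with a trailing group for [y, o, d] would give the breakdown by following character as well.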
The same is observed after bench gallows [cth, ckh], suggesting that they create the same environment as [ch, sh]. This chimes with earlier observations on bench gallows and their possible relationship with [ch, sh], namely that they aren’t followed by [ch, sh].
Lastly, I want to mention an old observation by Currier, that [p, f] are never followed immediately by [e] sequences. They are, however, followed by [ch, sh], which can then be followed by [e] sequences. The [e] sequences in these cases adhere to the same pattern as for gallows as a whole: [ee] is much less common than [e].
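Currier’s observation is simple to check against any transcription. The sketch below, again assuming a list of EVA words, lists any word where [p] or [f] is directly followed by [e], and counts the lengths of [e] sequences which come after [p, f] plus a bench. The function names and sample words are hypothetical, and the last sample word is an invented counterexample included only to show what a hit looks like.

```python
import re
from collections import Counter

# p or f directly followed by e (the case Currier says does not occur).
PF_THEN_E = re.compile(r"[pf]e")
# p or f, then a bench, then a run of e.
PF_BENCH_E = re.compile(r"[pf](?:ch|sh)(e+)")


def words_with_pf_before_e(words):
    """Return the words in which p or f is immediately followed by e."""
    return [word for word in words if PF_THEN_E.search(word)]


def pf_bench_e_lengths(words):
    """Count e-run lengths found after p/f followed by ch or sh."""
    lengths = Counter()
    for word in words:
        for match in PF_BENCH_E.finditer(word):
            lengths[len(match.group(1))] += 1
    return lengths


if __name__ == "__main__":
    # Toy input only; "qopeey" is an invented counterexample.
    sample = ["opchey", "pchedy", "fcheey", "opshey", "qopeey"]
    print(words_with_pf_before_e(sample))
    print(dict(sorted(pf_bench_e_lengths(sample).items())))
```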
I don’t know how to explain all the above observations. There is likely much more to add before we have the full picture. For example, most words won’t begin or end with [e], but what can we say about the few which do? What observations can we make when a word contains two separate [e] sequences?
Bench characters are clearly an important environment for [e] sequences. Why they seem to govern the length of [e] sequences is unknown, but worth taking up as an idea for future investigations.