The fairly strict structure of Voynich words invites the researcher to try to reduce it to a set of simpler rules. Tiltman, Roe, and Stolfi have all attempted this, with differing results. I too have tried it, looking at both low-level and high-level word structure.
However, I think my attempts can be improved by adding easier ways to think about common sequences of characters. The one I want to talk about below I will call the Weak String due to the identity of its constituents* as members of the ‘weak’ group.
If we combine any of the characters [ch, sh, e, ee, o, y] into strings, only a subset of the possible strings would be valid and common in the Voynich text. That subset is small enough that we can list it in full: [o, y, eo, ey, eeo, eey, cho, chy, cheo, chey, cheeo, cheey, sho, shy, sheo, shey, sheeo, sheey]. That’s 18 combinations in all, most of which occur as standalone words as well as within longer words.
The pattern is simple to describe, if a little long: every combination must contain exactly one of [o, y] (including [a], the variant of [y]); one of [e, ee] is optional, and one of [ch, sh] is optional; any occurrence of [e, ee] must come before [o, y]; and any occurrence of [ch, sh] must come before [o, y], and before [e, ee] if present.
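The rule above can be checked mechanically. What follows is only a minimal sketch in Python, assuming the characters are written as plain strings in the transcription used here; the names PREFIX, MIDDLE, FINAL and weak_strings are my own labels for illustration and not established terms. It builds every string that the rule allows, so the result can be compared with the list given earlier.

```python
from itertools import product

# Three slots, in the order the rule requires. The empty string
# stands for an optional slot that is left out.
PREFIX = ["", "ch", "sh"]  # optional: one of [ch, sh]
MIDDLE = ["", "e", "ee"]   # optional: one of [e, ee]
FINAL = ["o", "y"]         # mandatory: exactly one of [o, y]

def weak_strings(finals=FINAL):
    """Return every string permitted by the rule, slot by slot."""
    return ["".join(parts) for parts in product(PREFIX, MIDDLE, finals)]

WEAK_STRINGS = weak_strings()
print(len(WEAK_STRINGS))
print(WEAK_STRINGS)
```

Three choices for the first slot, three for the second and two for the last give 3 × 3 × 2 = 18 strings, which is where the count of 18 comes from. Passing finals=["o", "y", "a"] would also admit [a] as the variant of [y], at the cost of a longer list.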
One noticeable feature of the structure of Voynich words is that all the variations within a pattern tend to occur. So if we state that [t] can be followed by the Weak String, we should expect to find all 18 combinations listed above in this context. This largely holds, though [tcheeo, tsheeo, tsheey] occur only twice each.
We can also state, as a further example, that the Weak String occurs before [k]. Examples of all but [sheeok–] exist, though [cheeok–] only has three tokens.
Indeed, these statements can be further abstracted by putting [t, k] into a single type and saying that ‘the Weak String can occur before or after a gallows character’. There may well be restrictions on the Weak String occurring both before and after a gallows in the same word, but such things are the subject of further research.
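Statements of this kind are easy to test against a word list. The following is a rough sketch in Python, not the method used to obtain the counts above: it assumes the words are available as plain strings in the same transcription, and the function names and the tiny sample list are hypothetical, made up only to show the mechanics. It tallies which Weak String variants appear directly after or directly before a given gallows, and reports which of the 18 are missing.

```python
import re
from collections import Counter
from itertools import product

# The Weak String as a regular expression: optional [ch, sh],
# optional [e, ee], then a mandatory [o, y].
WEAK = r"(?:ch|sh)?(?:ee|e)?[oy]"

# The full set of 18 variants, used to report which are unattested.
ALL_VARIANTS = {
    "".join(p) for p in product(["", "ch", "sh"], ["", "e", "ee"], ["o", "y"])
}

GALLOWS = ("t", "k")  # only the two gallows named in the text

def weak_after(gallows, words):
    """Tally Weak String variants found directly after a gallows."""
    pat = re.compile(re.escape(gallows) + "(" + WEAK + ")")
    return Counter(m.group(1) for w in words for m in pat.finditer(w))

def weak_before(gallows, words):
    """Tally Weak String variants found directly before a gallows."""
    pat = re.compile("(" + WEAK + ")" + re.escape(gallows))
    return Counter(m.group(1) for w in words for m in pat.finditer(w))

def missing(tally):
    """List the variants that never appear in a tally."""
    return sorted(ALL_VARIANTS - set(tally))

# Hypothetical toy input, NOT real counts from the manuscript.
sample = ["tcheeo", "tsheey", "tchy", "cheeok", "ty"]
for g in GALLOWS:
    print(g, "after:", dict(weak_after(g, sample)))
    print(g, "before:", dict(weak_before(g, sample)))
print(missing(weak_after("t", sample)))
```

The sketch looks only at the characters adjacent to the gallows and ignores the rest of the word, so it says nothing about whether the Weak String may stand on both sides of a gallows in the same word.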
What the Weak String actually stands for, or why it occurs, is hard to understand. It may simply be an abstraction of a natural pattern, with no inherent meaning. It could fulfil a specific function within a syllable. What we can say is that where it is permissible for some members of the string to occur, it is usually permissible for all the variants to occur.
*I did originally give this string a different name, but renamed it after the ‘weak’ group of characters found in first–last combinations.