Several years ago I wrote a few thoughts on Grove words. One of the points I made is that some Grove words are likely to be valid. I specifically made this argument:

Yet 31% of words with the gallows removed have no occurrences, which presses for an explanation. There may be two reasons for this. One is that such words are genuinely unique. They properly do not begin with a gallows character but as their only occurrence is at the beginning of a paragraph their form as a Grove Word is the only one known.

The other reason is that some Grove Words properly begin with a gallows glyph. Whatever process adds a gallows to the beginning of Grove Words does not happen if that word already begins with one. It may be that removing the gallows from such a word does result in a more common word, even if both are still valid.

Reviewing my words, the second paragraph calls an even simpler explanation to mind: over 2,000 words in the Voynich text begin with [k] or [t]; none begin with two gallows.
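Both checks mentioned so far, stripping the initial gallows to see whether the remainder occurs elsewhere and looking for words that open with two adjacent gallows, are simple to express in code. The sketch below is only an illustration: the function names are my own, the toy paragraphs are invented strings rather than real transcription data, and it assumes an EVA-style transcription in which each of [k, t, f, p] is a single character.

```python
from collections import Counter

# EVA gallows glyphs, following the definition of Grove words used in this post.
GALLOWS = set("ktfp")


def strip_initial_gallows(word):
    """Return the word without its first glyph if that glyph is a gallows."""
    return word[1:] if word[:1] in GALLOWS else word


def unattested_share(paragraphs):
    """Share of gallows-initial, paragraph-first words whose stripped form
    never occurs anywhere in the text."""
    counts = Counter(word for para in paragraphs for word in para)
    grove = [para[0] for para in paragraphs if para and para[0][:1] in GALLOWS]
    if not grove:
        return 0.0
    missing = [w for w in grove if counts[strip_initial_gallows(w)] == 0]
    return len(missing) / len(grove)


def double_gallows_initial(paragraphs):
    """All word tokens that begin with two adjacent gallows glyphs."""
    return [
        word
        for para in paragraphs
        for word in para
        if len(word) >= 2 and word[0] in GALLOWS and word[1] in GALLOWS
    ]


# Invented toy data: each paragraph is a list of words. NOT real transcription.
toy = [
    ["pchedy", "daiin", "chedy"],
    ["tolkain", "chedy", "okaiin"],
    ["kaiin", "aiin", "daiin"],
]

print(unattested_share(toy))        # share of stripped forms with no occurrence
print(double_gallows_initial(toy))  # words opening with two gallows, if any
```

On a real transcription the paragraph and word splitting would need more care, since the result depends on where word spaces are drawn.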

If true, we must ask how the reader knew which gallows were valid and which were added. In our current state of knowledge we can identify some invalid words through their structure. Is there an easier way which we’ve missed?

18 August 2021, Edit: I realise that plenty of Grove words do contain two gallows, just not adjacently. Nor are there any words which start [pr] or [pl], which we would expect to occur a few times. There are, however, 64 words which begin [pol] and 17 which begin [por], of which 52 and 15, respectively, are paragraph initial. Paragraph initial words which begin [pok], [pot], and [pyk] also exist, though in tiny numbers.
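Counts of this kind (how many words begin with a given sequence, and how many of those are paragraph initial) can be reproduced with a few lines of code. This is a minimal sketch under my own assumptions: `prefix_counts` is a hypothetical helper, it counts word tokens rather than distinct word types, and the toy paragraphs are invented rather than taken from a transcription.

```python
def prefix_counts(paragraphs, prefix):
    """Return (total, paragraph_initial) counts of word tokens
    that start with the given prefix."""
    total = 0
    initial = 0
    for para in paragraphs:
        for position, word in enumerate(para):
            if word.startswith(prefix):
                total += 1
                if position == 0:  # first word of the paragraph
                    initial += 1
    return total, initial


# Invented toy data: each paragraph is a list of words. NOT real transcription.
toy = [
    ["polaiin", "daiin", "poly"],
    ["poraiin", "chol", "daiin"],
    ["daiin", "polchedy"],
]

for prefix in ("pol", "por", "pr", "pl"):
    total, initial = prefix_counts(toy, prefix)
    print(f"[{prefix}] total={total} paragraph-initial={initial}")
```

Running the same function over [pk], [pt], [pok], [pot], and [pyk] would show at a glance which sequences are absent and which only appear at the start of a paragraph.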

Thinking back to the division between ‘strong’ and ‘weak’ glyphs, it almost feels as though the initial gallows, even though it is an addition, won’t sit adjacent to other ‘strong’ glyphs, so other glyphs need to be inserted between them. I’m jumping several steps ahead, I know, but it would suggest that the initial gallows in Grove words is not purely decorative but retains some of its original features, which is why [pk], [pt], [pr], and [pl] are unacceptable.

(Though it’s worth noting that [pd] is acceptable – and in any position, not only initially. In fact, [pd] is as common as [td] and [kd] combined.)

15 thoughts on "Some Grove words are valid"

  1. Grove words might not be words at all: they might be nulls: they might be a numbering scheme: they might have a different structure to other words: some of their initial gallows might simply mean “Item”: there may be a whole set of behaviours that overlap that all look like gallows, despite being different: and so on.

    In short, I’m more than a bit suspicious of arguments that start with an either-or choice. With Voynichese, most lists start with ten options (and then extend). 🙂


    • I think that, given we might see similar behaviour in line patterns (adding glyphs to the start of words), the simplest explanation is that this is the same process.

      Neither case is proven yet, but I feel we would need stronger evidence against than otherwise at this time.


    • I’ve always used the definition: a) first word of a paragraph with b) an “apparently” out of place initial gallows. But it seems to me that the b) part is so hard to agree on. What exactly is the structure which is wrong?


  2. Hi Rene and Emma,
    I recently asked myself that same question. I searched for the earliest reference by Grove himself and the best I could find is an April 1998 mailing-list message where he wrote:
    “One statistic that appears far too often in the manuscript for normal ‘alphabetic’ usage is the use of a gallows ‘t’ or ‘k’ commencing paragraphs.”

    It seems strange that ‘f’ and ‘p’ were originally excluded, but they were soon added. In August 1998 Stolfi wrote:
    “My working hypotheses are that, in general, (1) and are embellished versions of and ; and (2) a gallows letter that starts a paragraph—whether , , , or —is not part of the text.”
    In a later message, Grove agreed with this idea.

    Apparently, Stolfi later adopted a different meaning and classified as Grove words those that are rejected by his word-grammar and “look ‘normal’ except for an extra initial gallows”: this appears to disregard the paragraph-start position.

    In my opinion, the most useful definition of the term is what Emma used in “A Few Thoughts on Grove Words”:
    “Grove Words are defined as words which appear at the beginning of a paragraph and begin with a gallows glyph: [k, t, f, p].”

    This definition is fairly consistent with the original statements by Grove and Stolfi; it also has the advantage of being easy to verify, so it should be possible to agree about which words belong to this class.
    As Emma pointed out, the alternative of taking subtler details of word-structure into account is likely to make things tricky. Also, it seems that Grove’s initial observation was not based on anything so sophisticated.


  3. Copy-paste messed up Stolfi’s 1998 quote. I hope this fixes it:

    “My working hypotheses are that, in general, (1) ‘p’ and ‘f’ are embellished versions of ‘t’ and ‘k’; and (2) a gallows letter that starts a paragraph—whether ‘p’, ‘f’, ‘t’, or ‘k’—is not part of the text.”


    • I think that note is trying to separate out “special” words from Grove words, which is interesting. Though his definition seems a little confused here [edited a little for clarity]:

      “We define a word to be “special” if it is a line-initial word of paragraph or starred-paragraph text (in any transcription), has at most one “*”, has at least four letters, and either:
      1) is the first word of the paragraph and starts with any gallows, or
      2) contains a “p” or “f” gallows, or
      3) contains two or more gallows.

      Note that this definition does not include any Grove word that begins with a “t” or “k” gallows, is not paragraph-initial, and contains no other gallows. Such a word may be an ordinary word that just happened to resemble a Grove word and happened to fall in line-initial position by accident.”

      In point 1 he says “starts with any gallows” and then notes that it doesn’t include those beginning with [t] or [k]. Am I missing something here?


      • Hi Emma,
        if I understand correctly, point 1 is about paragraph-initial words that start with any gallows (this is my preferred definition of Grove words).
        The note is about, for instance, “taor” at the beginning of the last line of f1v: it is line-initial but not paragraph-initial, so 1 does not apply.
        All Stolfi’s special words must be line-initial: “ksheo” and “kchoy”, words 2 and 3 in line 1 of f4v, are also excluded.


  4. Grove words give me a headache 🙂 They are so often referred to, and yet the definition seems elusive, so thanks for helping to clear it up! But I have another issue. From what I can see (if I haven’t made a mistake in my counting), 38% of VMS words that *do not* start with a gallows, can produce a valid VMS word if you remove the first glyph. And, 46% of all VMS words make a valid word if you remove the first glyph. This is obviously a peculiar feature of all VMS words, not just the Grove subset. (As an aside, for English, I think that only about 8% of words make a valid word if you remove the first letter … right?) So, I am trying to understand why Grove words are special 🙂


    • Hi Julian, I wrote a reply to your comment yesterday then deleted it. I thought about trying to define Grove words really well (tighter and clearer than the definition offered to Rene), but I realised it wasn’t easy.

      Now that I’ve had a chance to sleep on my reply, I can only say that Grove words do, indeed, need a better definition. I guess that I came into researching the Voynich manuscript when it was already an established term, and maybe took them for being better established than they are.


      • Hi Emma, thanks very much. It reminds me of some years ago on the VMS list, when someone suggested making a list of all the features of the text that everyone agreed on. Several were suggested, and rapidly shot down, and in the end, after a lot of back and forth, it was agreed that nobody could agree on anything to put in that list:-)

  5. Just as if this wasn’t complicated enough 🙂 the majority of word types appear only once in the text, and some of these only by virtue of where we draw the limit with word spaces.
    So even the definition of ‘valid word’ is quite shaky.


    • Yes, I agree, and I think I’m sometimes too confident or enthusiastic. I do think definitions of Grove words and ‘valid’ words are possible, but need a more thorough consideration.


  6. Page-initial weirdo gallows have always struck me as analogous to ornate capitals: while paragraph-initial gallows continue to strike me as analogous to the kind of fancy [Item] shapes you often find in fifteenth-century manuscripts. Similarly, to my eyes line-final EVA -m / -am resembles the kind of line-final “double-line” hyphen you often see in the mid-fifteenth century (and which appears in many incunabula).

    So personally, I’m perfectly comfortable with the suggestion that all these three could be Voynichese non-textual page mechanisms. However, Voynichese statistics very rarely look past any one of these mechanisms (let alone all three): so I can’t help but wonder if the stats we tend to work from should be flattened and normalised first. Just a thought.


