The last post on Grove Words showed some evidence that the initial gallows glyph is, in most cases, simply added to the beginning of an existing word. We also saw in an earlier post on Linestart Words that the statistics for the initial glyphs of linestart words differ from those of words which do not come at the beginning of a line. With this in mind we might propose that at least part of the difference in those statistics is caused by Grove Words, and that by removing the initial gallows the statistics for linestart words will become more like those of the main text.
Such an experiment is easy to carry out, even if only in a rough form. As all the Grove Words in the Stars section (Quire 20) have been identified, they can be replaced with their gallows–less equivalents. So, for example, [pdal, tshedy] are replaced with [dal, shedy]. Even though we believe that some Grove Words properly begin with a gallows and thus should not have that character removed, we can remove them all for the sake of speed.
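As a rough illustration of the replacement step, here is a minimal Python sketch. It assumes an EVA-style transcription in which the four gallows are written [k, t, p, f], and it assumes the linestart words arrive as (word, is_grove) pairs; the names `strip_initial_gallows` and `degallows` and the sample words other than [pdal, tshedy] are hypothetical, chosen only for this example.

```python
# Assumption: EVA-style transcription, gallows written as single letters.
GALLOWS = ("k", "t", "p", "f")


def strip_initial_gallows(word):
    """Drop the first glyph of a word if it is a gallows."""
    # Leave one-glyph words alone so we never produce an empty word.
    if len(word) > 1 and word[0] in GALLOWS:
        return word[1:]
    return word


def degallows(linestart_words):
    """Strip the gallows only from words flagged as Grove Words.

    linestart_words: iterable of (word, is_grove) pairs (hypothetical format).
    """
    return [
        strip_initial_gallows(word) if is_grove else word
        for word, is_grove in linestart_words
    ]


# Illustrative input: two Grove Words from the text and one ordinary word.
sample = [("pdal", True), ("tshedy", True), ("daiin", False)]
print(degallows(sample))
```

Like the experiment itself, this strips every flagged word indiscriminately, including any that may properly begin with a gallows.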
The graph below shows the frequency (by percentage of total) of initial glyphs of words from 1) linestart positions (red), 2) linestart positions with the gallows removed from Grove Words (yellow), and 3) all non-linestart words in the Stars section (blue).
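The percentages behind such a graph could be computed along the following lines. This is only a sketch under assumptions: it treats [ch, sh] as single glyphs and every other letter as one glyph, which is a simplification of EVA, and the function names and sample words are hypothetical.

```python
from collections import Counter

# Assumption: these two-letter sequences count as one glyph each.
DIGRAPHS = ("ch", "sh")


def initial_glyph(word):
    """Return the first glyph of a word, treating [ch, sh] as single glyphs."""
    for digraph in DIGRAPHS:
        if word.startswith(digraph):
            return digraph
    return word[:1]


def initial_glyph_percentages(words):
    """Map each initial glyph to its share (in percent) of the given words."""
    counts = Counter(initial_glyph(w) for w in words if w)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {glyph: 100.0 * n / total for glyph, n in counts.items()}


# Illustrative word list, not real counts from the manuscript.
words = ["shedy", "dal", "daiin", "chol"]
print(initial_glyph_percentages(words))
```

Running the same function over the three word lists (linestart, linestart with gallows stripped, and non-linestart) would give the three series to compare.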
We can immediately see that some glyphs are now less common initially and others more common. Naturally all four gallows are less common. The glyphs [f, p] have gone down substantially and are now around the same as for non-linestart words (there are now no linestart words beginning with [f], but that is about the expected number).
For [k] the number has also gone down substantially, but it is now much below the rate for non-linestart words. Indeed, its word–initial rate was roughly the same whether linestart or not, even with Grove Words. We might expect that its occurrence on Grove Words is limited to cases where it would normally appear. That some Grove Words properly begin with a gallows glyph anyway is something that was proposed in the former post and seems to be true here.
For [t] it is hard to judge what is right. Even though the frequencies are now very similar, the situation is more complex. A fair proportion of its word–initial occurrences are linestart, but maybe a third or more of these are not Grove Words. Thus the numbers are still fairly high, and above those for non-linestart words. However, as with [k], we might expect some of its occurrences on Grove Words to be normal, in which case they should not have been removed. Restoring them would place its rate even higher than it is.
The glyphs which have increased are more of a mixed bag. Some of those which were lower at linestart are now much more typical. The main gainers seem to be [o, ch, sh], each of which comes near to, though still falls short of, its rate in non-linestart positions. The glyph [a] has also gained somewhat but is still far below the rate for the main body of the text.
Lastly, the rate for [e], though low, has gone up hugely. The character is rare initially throughout the manuscript and this higher rate may have been caused by removing the initial gallows character of a word where the second glyph is [e]. Once again it seems that a number of [k, t] gallows on Grove Words should not have been removed.
Removing the gallows characters from Grove Words makes the initial glyph statistics for linestart words more like those for the rest of the text. It seems like a good step toward bringing the two parts of the text into agreement, but with some reservations. The characters [f, p] might properly be removed in full, but at least some of [k, t] should not be removed. This is good support for the idea that some Grove Words do properly begin with a gallows.
Also, it is clear that Grove Words alone do not account for the difference in word–initial glyph statistics in the Stars section.