The last post on Grove Words showed some evidence that the initial gallows character is, in most cases, simply added to the beginning of an existing word. We also saw in an earlier post on Linefirst Words that the statistics for the initial characters of linefirst words differed from those of words which did not come at the beginning of a line. With this in mind we might propose that at least part of the difference in those statistics is caused by Grove Words, and that by removing the initial gallows the statistics for linefirst words will become more like those of the main text.
Such an experiment is easy to carry out, even if only in a rough form. As all the Grove Words in the Stars section (Quire 20) have been identified, they can be replaced with their gallows–less equivalents. So, for example, <pdal, tshedy> are replaced with <dal, shedy>. Even though we believe that some Grove Words properly begin with a gallows and thus should not have that character removed, we can remove them all for the sake of speed.
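The replacement step can be sketched in a few lines of Python. This is only a rough illustration, not the script used for this post: the function names are made up here, the words are assumed to be in EVA transcription, and the list of Grove Words is assumed to have been identified by hand beforehand.

```python
# The four plain gallows characters in EVA transcription.
GALLOWS = frozenset("ktfp")


def strip_initial_gallows(word: str) -> str:
    """Drop the first character of a word if it is a gallows.

    Assumption: a word consisting of a single gallows is left alone,
    so that no empty words are created.
    """
    if len(word) > 1 and word[0] in GALLOWS:
        return word[1:]
    return word


def degrove(linefirst_words, grove_words):
    """Replace every identified Grove Word with its gallows-less form.

    `grove_words` is the hand-made list of Grove Words (an assumption
    of this sketch); all other linefirst words are left untouched.
    """
    grove = set(grove_words)
    return [strip_initial_gallows(w) if w in grove else w
            for w in linefirst_words]
```

Because only words on the Grove Words list are changed, a linefirst word that begins with a gallows but was not identified as a Grove Word keeps its gallows.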
The graph below shows the frequency (by percentage of total) of initial characters of words from 1) linefirst positions (red), 2) linefirst positions with the gallows removed from Grove Words (yellow), and 3) all words not linefirst in the Stars section (blue).
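For anyone wishing to reproduce the three series, the percentages can be computed roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical, the words are in EVA transcription, and <ch, sh> are counted as single initial characters, as they are in the discussion here.

```python
from collections import Counter

# EVA sequences treated as one character when they begin a word
# (an assumption matching how <ch, sh> are discussed in this post).
DIGRAPHS = ("ch", "sh")


def initial_glyph(word: str) -> str:
    """Return the word-initial character, treating <ch, sh> as units."""
    for digraph in DIGRAPHS:
        if word.startswith(digraph):
            return digraph
    return word[0]


def initial_percentages(words):
    """Map each initial character to its share (in percent) of all words."""
    counts = Counter(initial_glyph(w) for w in words if w)
    total = sum(counts.values())
    return {glyph: 100.0 * n / total for glyph, n in counts.items()}
```

Running `initial_percentages` once on each of the three word lists (linefirst, linefirst with gallows removed, not linefirst) gives the three sets of bars to compare.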
We can immediately see that some characters are now less common initially and others more common. Naturally all four gallows are less common. The characters <f, p> have gone down substantially and are now around the same as for not linefirst words (there are now no linefirst words beginning with <f>, but that is about the expected number).
For <k> the number has also gone down substantially, but it is now much below the rate for not linefirst words. Indeed, its word–initial rate was roughly the same whether linefirst or not, even with Grove Words. We might expect that its occurrence on Grove Words is limited to cases where it would normally appear. That some Grove Words properly begin with a gallows character anyway is something that was proposed in the former post, and seems to be true here.
For <t> it is hard to judge what is right. Even though the frequencies are now very similar, the situation is more complex. A fair proportion of its word–initial occurrences are linefirst, but maybe a third or more of these are not Grove Words. Thus the numbers are still fairly high, and above those for not linefirst. However, as with <k>, we might expect some of its occurrences on Grove Words to be normal, in which case they should not have been removed. That would place its rate even higher than it is.
The characters which have increased are more of a mixed bag. Some of those which were lower linefirst are now much more typical. The main gainers seem to be <o, ch, sh>, each of which comes near to, though still falls short of, the rate for not linefirst words. The character <a> has also gained somewhat but is still far below the rate for the main body of the text.
Lastly, the rate for <e>, though low, has gone up hugely. The character is rare initially throughout the manuscript, and this higher rate may have been caused by removing the initial gallows character of a word where the second character is <e>. Once again it seems that a number of <k, t> gallows on Grove Words should not have been removed.
Removing the gallows characters from Grove Words makes the initial character statistics for linefirst words more like those for the rest of the text. It seems like a good step toward bringing the two parts of the text into agreement, but with some reservations. The characters <f, p> might properly be removed in full, but at least some of <k, t> should not be removed. This is good support for the idea that some Grove Words do properly begin with a gallows.
Also, it is clear that Grove Words alone do not account for the difference in word–initial character statistics in the Stars section.