In an article by BBM (Bar-Hillel, Bar-Natan and McKay) in the last issue of CHANCE, two different kinds of claims against WRR are raised. The first kind attacks the lists of appellations and dates. These claims are dealt with in great detail, and refuted, in documents posted on www.torahcodes.co.il, to which the reader is referred. It is shown there that in every single case their claims that the rules were broken are unjustified. On the other hand, it is demonstrated there that the "success" of the "cooked" list in War and Peace was produced entirely by breaking the rules.
Given the limited space, we concentrate in this letter on the second kind of claim: allegedly beneficial choices.
BBM define a "fortunate" choice as one which improves the ranking
in the permutation race, and an "unfortunate" choice as one which hurts
the ranking.
Their Data:
The main claim of BBM in the section "Choices, Choices" is: "Wonder
of wonders, however, it turns out that almost always (though not quite
always) the allegedly blind choices paid off: Just about anything that
could have been done differently from how it was actually done would have
been detrimental to the list’s ranking in the race. In particular, all
the choices listed in the present section were fortunate for WRR. Had any
of them been different, the ranking of the lists in the permutation race
would have gone down" (p. 18). We shall check the validity of this statement,
first in the light of their own data.
Such data are found in a report by McKay dated the 3rd of April '97,
describing 20 variations on the experiments of WRR. For 19 variations he
calculated the raw values of P1 and P2 as well as their ranks in a permutation
race (for one variation he calculated these for P1 only). His calculations
were done for both List1 and List2. In Table 1 we classify his
results for the ranking according to whether the original choice was
"fortunate" (its rank was the smallest), "unfortunate" (an alternative gave
a smaller rank), or neutral.
Kind of choice | "fortunate" | "unfortunate" | neutral |
List1 rank of P1 | 9 | 10 | 1 |
List1 rank of P2 | 13 | 6 | 0 |
List2 rank of P1 | 9 | 10 | 1 |
List2 rank of P2 | 14 | 4 | 1 |
Total | 45 | 30 | 3 |
TABLE 1
Thus, out of 78 choices, 45 were "fortunate" and 30 were "unfortunate".
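The totals can be checked directly from the rows of Table 1. The following is a minimal sketch; the names `table1` and `tally` are introduced here only for illustration, and the counts are copied from the table as printed.

```python
# Counts copied from Table 1; each row is (fortunate, unfortunate, neutral).
table1 = {
    "List1 rank of P1": (9, 10, 1),
    "List1 rank of P2": (13, 6, 0),
    "List2 rank of P1": (9, 10, 1),
    "List2 rank of P2": (14, 4, 1),
}


def tally(rows):
    """Sum each of the three columns over all rows."""
    fortunate = sum(r[0] for r in rows.values())
    unfortunate = sum(r[1] for r in rows.values())
    neutral = sum(r[2] for r in rows.values())
    return fortunate, unfortunate, neutral


fortunate, unfortunate, neutral = tally(table1)
total = fortunate + unfortunate + neutral
share = fortunate / total

# Prints: 45 of 78 choices fortunate (57.7%)
print(f"{fortunate} of {total} choices fortunate ({share:.1%})")
```

The share is a plain proportion of the tabulated entries; it makes no allowance for the fact that the four rows describe the same variations.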
Puzzling Logic:
We cannot understand how these data can be reconciled with BBM's claim
that "almost always (though not quite always)" WRR's choices were "fortunate".
On page 18, there are two other statements:
1. "---use of combination of date forms (and also using both forms
of the 15th and 16th of the month) is superior to any single date form."
2. "Moreover, the triplet of date forms used by WRR is superior to
any of the other 14 choices".
However, this is simply not so. The rank of P2 for List1, using the "triplet
of date forms used by WRR" is 36 out of 1,000,000 permutations. But using
only the single date form b'alef Tishrey, the rank becomes smaller:
8 out of 1,000,000 permutations. Using the pair of date forms: b'alef Tishrey
and alef b'Tishrey, the rank becomes even smaller: 1 out of 1,000,000.
[The calculations were done with the same programs and seed as in the original
experiment].
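For readers unfamiliar with the term, a rank in a permutation race can be illustrated with a minimal sketch. It assumes that the race compares a statistic computed for the original pairing with the same statistic computed for randomly re-paired data, and that a smaller value means a stronger effect. The function `permutation_rank` and the toy statistic `displacement` are hypothetical names introduced only for illustration; they are not the programs used in the original experiment, and they do not compute P1 or P2.

```python
import random


def permutation_rank(stat, n_items, n_perms, seed=0):
    """Rank of the original pairing among n_perms pairings (1 = best).

    stat(perm) returns a number; smaller is taken to mean a stronger effect.
    The original (identity) pairing counts as one of the n_perms competitors.
    """
    rng = random.Random(seed)          # fixed seed makes the race repeatable
    identity = list(range(n_items))
    observed = stat(identity)
    rank = 1
    for _ in range(n_perms - 1):
        perm = identity[:]
        rng.shuffle(perm)              # random re-pairing
        if stat(perm) < observed:      # this competitor beats the original
            rank += 1
    return rank


def displacement(perm):
    """Toy statistic (assumption): total distance of items from their places."""
    return sum(abs(i - p) for i, p in enumerate(perm))


print(permutation_rank(displacement, n_items=8, n_perms=1000))  # 1
```

With this toy statistic the original pairing can never be beaten, so its rank is 1. For a real statistic the rank depends on the data, on the number of permutations and on the seed, which is why the same programs and seed matter when two ranks are compared.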
Until now, we have examined some of BBM's results as they stand, setting
aside the question of whether they are at all relevant. But, at this stage,
the reader should be aware of two simple facts:
a. The measuring parameters, and the forms of writing the dates, were
established even before the first experiment (List1), and were published
before the researchers were asked to perform the second experiment (List2).
The assessment of the choices, therefore, must be made only with regard
to the first experiment.
b. In the first experiment (as well as in the second), the only indicators
of success were the raw values of the measures P1 and P2.
The only criterion for judging whether a choice was "fortunate", i.e. improved
the result, is the raw values of the measures P1 and P2. Any
analysis of the choices that aims to uncover a possible process of optimization
must be done in terms of these raw values. Therefore, it is extremely strange
that BBM's analysis was performed according to an irrelevant test: the
permutation test, which was suggested two years after the publication
of all the choices.
The Basic Logical Error:
BBM understand the lack of logic behind their claim and they write:
“Some might claim that it is not “fair” that the choices were tested
with respect to their effect on the permutation race rank, because this
statistic had not yet been developed when the choices were made.” – This
is an understatement.
Mathematician Robert J. Aumann has already criticized such an analysis
by Maya Bar-Hillel: “For this to make sense, clearly the statistic to be
calculated in connection with each choice should be the one with which
WRR were working at the time that the choice in question was made. Here
is problem no. 1 with Maya's tests: she does NOT do this.
The statistic Maya uses are the rank order out of ten million random
permutations. But the entire test – dates, spelling, appellations, date
forms, everything, was fixed before Diaconis suggested the permutations.
Using the permutations here is an inadmissible
anachronism---" (an excerpt from a letter by Robert J. Aumann dated 17
Jan 97. In this letter, an excellent analysis of 13 choices checked by
M. Bar-Hillel is given. The reader is urged to read this analysis in full,
as well as Document 4; both are posted on the same web site).
Examining the choices by the raw values shows that, for instance, the choice
of three forms of dates out of the four listed by BBM (p.19) was "unfortunate"
for both lists in comparison with taking all four forms.
However, the logic of BBM is flawed on a more fundamental level.
As a basis for their analysis, BBM state: "It is possible to set up
a null hypothesis of blind choice, according to which the proportion of
fortuitous choices is expected to be no higher than 50%."
This null hypothesis contains a clear logical error: it is true ONLY
if one assumes in advance that the research hypothesis of WRR (the existence
of the ELS phenomenon) is not correct. It ignores the fact that in certain
cases WRR could expect beforehand that the results would be improved by
their choice. For example, it should be no surprise that the results
are improved by taking the correct dates rather than wrong ones, IF the research
hypothesis of WRR is correct. Of course, BBM rule out such a possibility,
but it is a logical error to assume this (the absence of the ELS phenomenon)
in their analysis. This mistake immediately invalidates a large part of
their analysis.
Similar arguments also apply to a more technical point. On p.18, BBM
list alternative suggestions to the proximity measure used in WRR. All
but one of these suggestions use the first power of Euclidean distance
(instead of the second power used by WRR). Therefore they have a "disfocusing"
effect, and we can well expect in advance that this change will weaken
the results. (One of these suggestions - using just the Euclidean distance
between the two nearest letters in the pair - is even worse, because it
totally disregards the geometry of the ELS's meetings. To be sure, such
a "test" cannot but fail; no wonder it fails.)
When all the above-mentioned logical errors are corrected, an entirely
different picture emerges: a fair balance between "fortunate" and "unfortunate"
choices. For the details, the reader is referred to Document 4.
Doron Witztum, Eliyahu Rips.