Report on the Textual Criticism Challenge 1991

This announcement reports the results of attempts at the “Textual Criticism Challenge 1991” posted by Peter Robinson to various network bulletin boards in July 1991[1]. The challenge, reproduced in part below, was to re-create by statistical or numerical means alone the table of relationships for some 44 manuscripts of the Old Norse narrative “Svipdagsmal” established by Robinson on the basis of external evidence and traditional stemmatic methods. In particular, we report the remarkable results obtained by Robert J. O’Hara, an evolutionary biologist at the University of Wisconsin-Madison. O’Hara used a technique known as cladistic analysis, developed over the last thirty years by evolutionary biologists for the reconstruction of the evolutionary history of organisms from the study of their shared characteristics. Using cladistic analysis, specifically the computer program PAUP (Phylogenetic Analysis Using Parsimony, Swofford 1991), O’Hara was able to reproduce all the major manuscript groups hypothesized by Robinson. In all cases, the relationships between individual manuscripts suggested by cladistic analysis agreed with those known from external evidence. Most previous attempts at computer-assisted analysis of manuscript relations have used statistical clustering techniques. These methods have not been outstandingly effective. The success of cladistic analysis, based on a quite different intellectual model, may have considerable implications for scholars concerned with the exploration of large manuscript traditions.

The Textual Criticism Challenge 1991

“A textual critic engaged upon his business is not at all like Newton investigating the motion of the planets; he is much more like a dog hunting for fleas. If a dog hunted for fleas on mathematical principles, basing his researches on statistics of area and population, he would never catch a flea except by accident.” — A. E. Housman

Housman (and others) believed that statistics and mathematics have no place in the study of textual traditions, such as those of Biblical, Classical, or Medieval texts. A scholar’s only weapons when trying to determine how an author’s single long-lost original descended into hundreds (even thousands) of surviving copies are a trained mind and intuition.

The Challenge: Prove Housman Wrong.

The Old Norse narrative sequence “Svipdagsmal”, comprising two poems, “Grougaldr” and “Fjolsvinnsmal”, together about 1500 words long, survives in 47 manuscripts known to me. These manuscripts were written in Iceland, Denmark, and Sweden between 1650 and 1830. Because of this late date, much is known about how these manuscripts are related. From this evidence and from database analysis of a complete computer collation I have made a table of relationships of the manuscripts, showing how they are divided into groups and how these groups and the individual manuscripts within them are descended one from another.

The challenge is this: to construct by Housman’s “mathematical principles” alone, and not using any external evidence, a table of relationships of the manuscripts (a “stemma”) like that I have already made. Only the raw data of manuscript agreements and disagreements in individual readings generated directly from the computer collation may be used. As far as I know, while attempts at exploring manuscript traditions have been made using statistical analysis of small samples of data, this will be the first time all the data for a complete manuscript tradition has been so analyzed. It will also be the first time results of such analysis can be so thoroughly checked against external evidence. In approximately ascending order of difficulty, a successful attempt would:

(1) Divide the manuscripts into groups reflecting the most consistent patterns of agreements and disagreements within the manuscripts. These groups might constitute “genetic groups”: that is, manuscripts presumably related by direct copying one from another or from a common parent manuscript.

(2) Identify just what readings in what manuscripts are characteristic of the groups identified in (1) above.

(3) Show the groups identified in (1) which are themselves descended from other groups and identify the groups they descend from; show the individual manuscripts within the groups descended from other manuscripts and identify the manuscripts they descend from.

(4) Identify particular groups and manuscripts which contain readings which have not descended to them by direct copying from their parent manuscript but by deliberate importation from an alien group (“contamination”). Identify just what readings in what manuscripts seem to have spread by contamination as well as by direct copying: compare (2) above.

(5) Identify just what readings in what manuscripts appear distributed at random: that is, readings which have spread by virtue of the common descent of all these manuscripts from a single parent manuscript, or readings independently conceived by different scribes.

The Data

I have computer files of every agreement and disagreement on every reading of 44 of the 47 manuscripts (the other three are not important), generated directly from my computer collation of these manuscripts in my doctoral work (see my articles in Literary and Linguistic Computing 4 (1989), 99-105, 174-81). This data is available in two ASCII files, one containing all the data for “Grougaldr”, the other for “Fjolsvinnsmal”. These files are available in two formats. In format A, each line begins with the variant number, followed by the numbers of the manuscripts which have this variant, all separated by single spaces. Thus the line “6 1 2 7” indicates that variant 6 occurs only in manuscript numbers 1, 2, and 7. In format B, each line again begins with the variant number, followed by a space and then a sequence of 0s and 1s, one for each of the 44 manuscripts. A “1” indicates the reading is in the manuscript corresponding to that column of the table, a “0” indicates it is not. Thus the line “6 11000010000000000000000000000000000000000000” indicates that variant 6 occurs only in manuscript numbers 1, 2, and 7. The two files have about 3500 lines between them. I alone have the key to the variant and manuscript numbers. A closing date of 1 December 1991 was set for the challenge.
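
The two formats carry the same information, and a few lines of code suffice to move between them. The following Python sketch is illustrative only: the function names are invented here, and it assumes that in format B a single space separates the variant number from the string of 44 digits, as the description above states.

```python
N_MANUSCRIPTS = 44  # the challenge files cover 44 manuscripts


def parse_format_a(line):
    """Format A: variant number, then the numbers of the manuscripts having it."""
    numbers = [int(token) for token in line.split()]
    return numbers[0], set(numbers[1:])


def parse_format_b(line, n=N_MANUSCRIPTS):
    """Format B: variant number, a space, then one 0/1 digit per manuscript."""
    variant, bits = line.split()
    if len(bits) != n or set(bits) - {"0", "1"}:
        raise ValueError(f"expected {n} digits of 0/1, got {bits!r}")
    # Column i (counting from 1) corresponds to manuscript number i.
    return int(variant), {i + 1 for i, bit in enumerate(bits) if bit == "1"}


def to_format_b(variant, manuscripts, n=N_MANUSCRIPTS):
    """Write a variant and its set of manuscripts as a format B line."""
    bits = "".join("1" if i in manuscripts else "0" for i in range(1, n + 1))
    return f"{variant} {bits}"


line_b = to_format_b(*parse_format_a("6 1 2 7"))
print(line_b)
print(parse_format_b(line_b))
```

Either parser yields a variant number and the set of manuscripts attesting it, which is all the analyses discussed below require.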

Attempts at the Challenge

Nine scholars requested the challenge data outlined above. Three submitted entries. Two of these attempts used varieties of statistical clustering techniques. One of these, performed by Daniel Apollon of the University of Bergen using his own multivariate analysis program Analytica, achieved a partial separation of the manuscripts into groups corresponding with those constructed by Robinson. However, while Apollon’s results were impressive in their consistency with the table of manuscript relations established by Robinson, they did not define precisely which manuscript, or group of manuscripts, might be descended from which. Thus, although Analytica managed to cluster manuscripts known to be directly related close to one another, in most cases such manuscripts were clustered within larger groups. One could not, from the output of Analytica alone, have distinguished the manuscripts of a clustered group which were actually closely related from those which merely contained many similar readings but were not in fact closely related.

The third attempt was that of O’Hara, using the cladistics program PAUP. In five minutes, using a Macintosh II computer, PAUP achieved the following:

(1) It placed directly adjacent to one another (usually as descendants from the same node) sixteen manuscripts known from external evidence to be directly related to one another.

(2) It successfully defined the seven manuscript groups deduced by Robinson within the tradition.

(3) It successfully defined two of these groups as subgroups of another, larger group.

(4) It suggested, accurately, that the two largest groups were each descendants of single manuscripts, and that a third group also descended from one of these two manuscripts.

(5) It provided lists of just what variants were introduced at what point in the tradition. These agreed reasonably closely with Robinson’s own lists of the variants, derived by database analysis of the collation output, characteristic of particular groups of manuscripts.

Some of the results achieved by the cladistic analysis showed relationships it had taken Robinson weeks, or months, to discover using other means. Had Robinson had this analysis at the beginning of his work with these manuscripts he could have devoted more time to exploring fine detail of relationships within the established groups. Fuller discussion of these results, with figures, is available from us at the addresses at the bottom of this document. We have since tested PAUP on the collation output of some one hundred and seventy manuscripts across eight different traditions. In each case, PAUP’s cladistic analysis has produced results consistent with known relations among the manuscripts (largely reproducing, for example, Manly and Rickert’s analysis (1940) of the manuscripts of Chaucer’s “Wife of Bath’s Prologue”). It has also, most interestingly, pointed in several cases to manuscript relations otherwise unsuspected but which further, traditional, analysis suggested might be well-founded.

Why did cladistic analysis perform so much better than the methods of statistical analysis better known to manuscript scholars (see the articles of Griffith (1968) and Pierce (1988); but cf. Platnick and Cameron 1977, Hoenigswald and Wiener 1987, Lee 1989)? Cladistic (or phylogenetic) techniques are fundamentally different, in concept and practice, from statistical clustering techniques such as those employed by Analytica. Statistical clustering uses various mathematical means to derive “measures of distance” from all the data concerning agreements between manuscripts. It pays no attention to the type of agreement: especially, it does not attempt to discriminate agreement in “inherited” or “ancestral” readings from agreement in “introduced” readings, typically errors. This appears to be the source of the relative failure of Analytica, referred to above: manuscripts actually genealogically distinct looked similar to it because they happened to retain a large number of ancestral readings.
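
The effect can be seen in a toy case. The Python sketch below uses four hypothetical manuscripts and invented readings (it is not the Svipdagsmal data, and a simple count of disagreements stands in for whatever measure a real clustering program would use). Suppose A and B share two readings introduced in a common exemplar, while B and D each carry five further readings of their own; A and C have introduced almost nothing.

```python
from itertools import combinations

# Invented data: one column per variant, "1" = manuscript has the variant.
# Columns 1-2: shared by A and B. Columns 3-7: B only. Columns 8-12: D only.
READINGS = {
    "A": "11" + "00000" + "00000",
    "B": "11" + "11111" + "00000",
    "C": "00" + "00000" + "00000",
    "D": "00" + "00000" + "11111",
}


def disagreements(x, y):
    """Count the variants on which two manuscripts differ."""
    return sum(a != b for a, b in zip(x, y))


distance = {
    frozenset(pair): disagreements(READINGS[pair[0]], READINGS[pair[1]])
    for pair in combinations(READINGS, 2)
}

for pair, d in sorted(distance.items(), key=lambda item: item[1]):
    print("-".join(sorted(pair)), d)

closest = min(distance, key=distance.get)
print("closest pair:", sorted(closest))
```

On these figures A lies closest to C (two disagreements) rather than to B (five), simply because A and C have both stayed near the ancestral text. A method that groups by overall distance would therefore pair them.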

Cladistic analysis, in contrast, is an explicitly historical approach that aims at reconstructing sequences of events (O’Hara 1988, Sober 1988), and fundamental to the cladistic approach is the identification of ancestral readings and their elimination at every point. Thus: cladistic analysis hypothesizes a tree of descent for the manuscripts. It then “measures” the tree by spreading all the data about manuscript agreements across the tree: the shortest possible tree will be the one involving the fewest variant changes. When thus measuring each hypothetical tree, cladistics identifies just what variants are “inherited” at each node and then rules those out of consideration as it evaluates the tree. This elimination of “ancestral variants” brings cladistics very close to the traditional stemmatic practice (e.g. Maas 1958, West 1973) of insisting that only “errors”, or readings introduced below the archetype, may define sub-groups of manuscripts. In fact, cladistics actually elaborates this elimination of ancestral readings further than does traditional stemmatics. Whereas stemmatics only concerns itself with distinguishing readings in the presumed single archetype from all other introduced readings (usually defined as errors), cladistics seeks to identify not just the readings ancestral at the “top” of the tree but those ancestral at every node within the tree. This has a remarkable and most powerful consequence. Because inherited variants are eliminated at every node, wherever they lie in the tree, one does not need to specify beforehand just what variants are ancestral for the whole tree. The tree is unrooted: whichever way it is oriented, the ancestral variants are discounted. Therefore, cladistic analysis offers a way around the paradox of recension identified by Talbot Donaldson (1970): that one cannot create a stemma until one knows what readings are archetypal, but one cannot determine what readings are archetypal until one has a stemma.
One can use cladistic analysis to create an unrooted tree, deferring judgement on just what readings are ancestral to the whole tree. Then, one can decide which of the branches of the tree lies closest to the archetype and root the whole tree at this branch.
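
The “measuring” of a tree described above can be sketched in a few lines of Python. The sketch uses Fitch's algorithm, a standard way of counting the minimum number of changes a two-state variant requires on a given binary tree; we do not claim it reproduces PAUP's internal procedure. The four manuscripts and their readings are invented, and trees are written as nested pairs of manuscript names.

```python
# Invented data: one column per variant, "1" = manuscript has the variant.
MATRIX = {
    "A": "11" + "00000" + "00000",
    "B": "11" + "11111" + "00000",
    "C": "00" + "00000" + "00000",
    "D": "00" + "00000" + "11111",
}


def fitch(tree, states):
    """Return (possible states at this node, changes needed below it)."""
    if isinstance(tree, str):  # a leaf: a single manuscript
        return {states[tree]}, 0
    left_set, left_steps = fitch(tree[0], states)
    right_set, right_steps = fitch(tree[1], states)
    common = left_set & right_set
    if common:  # the two branches can agree: no new change
        return common, left_steps + right_steps
    return left_set | right_set, left_steps + right_steps + 1


def tree_length(tree, matrix=MATRIX):
    """Total number of variant changes the tree requires."""
    n_variants = len(next(iter(matrix.values())))
    return sum(
        fitch(tree, {ms: row[i] for ms, row in matrix.items()})[1]
        for i in range(n_variants)
    )


# The three possible groupings of four manuscripts.
CANDIDATES = {
    "AB|CD": (("A", "B"), ("C", "D")),
    "AC|BD": (("A", "C"), ("B", "D")),
    "AD|BC": (("A", "D"), ("B", "C")),
}
lengths = {name: tree_length(tree) for name, tree in CANDIDATES.items()}
print(lengths)

# The same grouping AB|CD, oriented from two other starting points.
REROOTED = [("A", ("B", ("C", "D"))), ((("A", "B"), "C"), "D")]
print([tree_length(tree) for tree in REROOTED])
```

The tree grouping A with B is the shortest, at 12 changes against 14 for the alternatives, and its length is the same however it is oriented. Nothing in the calculation required a prior decision about which readings are archetypal.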

A further reason for the success of cladistics is that it works explicitly on the tree model. It assumes that a varied group of objects (whether of manuscripts or of species) is the result of a sequence of branching descents over time. Cladistics simply finds the shortest (or “most parsimonious”) tree of descent which explains the agreements and disagreements within this group. The overall similarity or dissimilarity of the objects under study, so important in statistical clustering, is unimportant in cladistics. Like species, manuscripts may appear alike but be genealogically quite distinct because of their disagreement on just a few key readings: cladistics recognizes this explicitly. There are many types of manuscript analysis (particularly, studies of dialectal, paleographic, or other scribal phenomena) for which measures of similarity are appropriate. Such measures may also be appropriate in those cases where contamination between manuscripts has so obscured relationships by descent as to make it impossible to determine genealogical affiliation. But such cases apart (and these may be rather rarer than are supposed by some critics, e.g. Kane 1960) we have every reason to think that manuscripts descend from one another just as do species. Therefore, a tool which seeks to reconstruct the stages of descent is appropriate: cladistic analysis is such a tool.

The cladistic analysis of the Svipdagsmal manuscripts was not without fault. Its greatest difficulties lay in the areas of contamination and coincident variation. Cladistics effectively ignores these: it assumes that instances of horizontal transmission will be outnumbered by instances of vertical transmission. This is broadly true of the mass of variants in manuscript traditions too, hence PAUP’s general success with the Svipdagsmal material. But there are subgroups of variants in subgroups of manuscripts highly susceptible to horizontal transmission. Thus, there are a large number of variants found as marginalia in several groups of Svipdagsmal manuscripts which appear to have been borrowed from the text of distinct other groups. Failure to recognize this led to some deformation of the stemma. Thus, one group of manuscripts which had been heavily contaminated by readings from another group was incorrectly placed too close to that group. There were similar problems with coincident variation, involving a series of readings found in four manuscripts: this coincident variation led PAUP to place these four manuscripts closer to one another than was warranted.

Evolutionary biologists have been developing cladistics programs for some twenty years now, and have equipped them with sophisticated procedures for refining their analysis. Variants (“characters”, in cladistic terminology) may be weighted; they may be declared as irreversible, or as necessarily occurring in set sequences. The analysis of the Svipdagsmal material used none of these facilities, and it is likely that the results could have been improved yet further had they been used. There is much to be learnt about the use of cladistic techniques with manuscript traditions.
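
How weighting might counter contamination can be suggested with a toy calculation in Python. Everything in it is hypothetical: the four manuscripts, the variants, and the weight of 0.5 are invented for illustration, and the step count is a shortcut valid only for trees of four manuscripts. It shows the general idea of weighted tree length, not PAUP's own options.

```python
TREES = {
    "AB|CD": (("A", "B"), ("C", "D")),
    "AC|BD": (("A", "C"), ("B", "D")),
    "AD|BC": (("A", "D"), ("B", "C")),
}

# Invented data: (manuscripts having the variant, kind of variant).
# Three readings in the text link A with B; four marginal readings,
# imagined as borrowed by contamination, link A with C.
VARIANTS = [({"A", "B"}, "text")] * 3 + [({"A", "C"}, "marginal")] * 4


def steps(tree, manuscripts):
    """Changes one two-state variant needs on a four-manuscript tree."""
    if len(manuscripts) in (0, 4):
        return 0  # no variation at all
    if len(manuscripts) in (1, 3):
        return 1  # one manuscript stands alone
    (x, y), (z, w) = tree
    # A two-against-two variant costs 1 if it matches the tree, else 2.
    return 1 if manuscripts in ({x, y}, {z, w}) else 2


def weighted_length(tree, weights):
    return sum(weights[kind] * steps(tree, mss) for mss, kind in VARIANTS)


def best_tree(weights):
    return min(TREES, key=lambda name: weighted_length(TREES[name], weights))


equal = {"text": 1.0, "marginal": 1.0}
downweighted = {"text": 1.0, "marginal": 0.5}  # arbitrary illustrative weight
print(best_tree(equal))
print(best_tree(downweighted))
```

With equal weights the borrowed marginalia pull A towards C (length 10 against 11); once the marginal readings count for half, the grouping of A with B is preferred (7 against 8). What weights would be defensible for a real tradition is exactly the kind of question that remains to be explored.
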
PAUP, the program we have been using, is a very powerful and flexible instrument: considerable experiment is necessary to determine appropriate ways of using it (or any of the other cladistics programs that are available) in different circumstances. On 1st June we met in Chicago with David Swofford, PAUP’s developer. We discussed the special difficulties of analysis of manuscript traditions, especially those arising from contamination. We agreed to work together to optimize PAUP for use in stemmatics. Robinson has developed an interface between the collation program Collate and PAUP: this reads apparatus output by Collate and formatted in one of the styles to be recommended in the next draft of the Text Encoding Initiative, and translates it into the standard NEXUS form recognized by several cladistics programs including PAUP. A user manual, introducing PAUP for manuscript scholars, is a desideratum.

The success of cladistic analysis with the Svipdagsmal material offers hope that it may now be possible to reconstruct the history of large and complex manuscript traditions which have hitherto defied explanation. This has consequences for textual scholars, for students of language, and for historians of culture. For textual scholars, knowledge of the evolution of a text through its tradition will change how that text is edited. For students of language, knowledge of just what manuscripts are related to one another will facilitate the study of changing linguistic forms across the tradition. For historians of culture, the reception of the text may be read in what is written into it as it evolves.

The above is (in part) a summary of a paper written by us discussing cladistic techniques and their application to the Svipdagsmal material. This paper was presented to the ALLC/ACH conference in Oxford in April and will be published in “Research in Humanities Computing ’92”, edited by Nancy Ide and Susan Hockey (OUP, Oxford) under the title “Cladistic Analysis of an Old Norse Manuscript Tradition”. Copies of this paper are available from either of us. A version of this paper was also presented at the Medieval Academy of America conference in Kalamazoo in May. Robinson will be giving an outline of the results of cladistic analysis of the collation of 44 manuscripts of Chaucer’s “Wife of Bath’s Prologue” at the New Chaucer Society conference in Seattle in August.

Brief Bibliography

Donaldson, E. Talbot (1970), “The Psychology of Editors”, in Speaking of Chaucer (London), 102-118.

Griffith, J. G. (1968), “A Taxonomic Study of the Manuscript Tradition of Juvenal”, Museum Helveticum 25, 101-138.

Hoenigswald, H. M., and Wiener, L. F. (eds.) (1987), Biological Metaphor and Cladistic Classification: An Interdisciplinary Perspective (Philadelphia).

Kane, G. (1960), Piers Plowman: The A Version (London).

Lee, A. (1989), “Numerical Taxonomy Revisited: John Griffith, Cladistic Analysis and St. Augustine’s Quaestiones in Heptateuchum”, Studia Patristica XX.

Maas, P. (1958), Textual Criticism (B. Flower, trans.) (Oxford).

Manly, J. M., and Rickert, E. (1940), The Text of the Canterbury Tales (Chicago).

O’Hara, R. J. (1988), “Homage to Clio, or, Toward an Historical Philosophy for Evolutionary Biology”, Systematic Zoology 37, 142-155.

Pierce, R. H. (1988), “Multivariate Numerical Techniques Applied to the Study of Manuscript Traditions”, in B. Fidjestol et al. (eds.), Tekst Kritisk Teori og Praksis (Oslo), 24-45.

Platnick, N. I., and Cameron, H. D. (1977), “Cladistic Methods in Textual, Linguistic, and Phylogenetic Analysis”, Systematic Zoology 26, 380-385.

Robinson, P. M. W. (1989), “The Collation and Textual Criticism of Icelandic Manuscripts”, Literary and Linguistic Computing 4, 99-105, 174-181.

Robinson, P. M. W. (1992), Collate: A Program for Interactive Collation of Large Textual Traditions, Version 1.1, Computer program distributed by the Computers and Manuscripts Project, Oxford University Computing Services, Oxford.

Sober, E. (1988), Reconstructing the Past: Parsimony, Evolution, and Inference (Cambridge, Mass.).

Swofford, D. L. (1991), PAUP: Phylogenetic Analysis Using Parsimony, Macintosh Version 3.0r, Computer program distributed by the Illinois Natural History Survey, Champaign, Illinois.

West, M. L. (1973), Textual Criticism and Editorial Technique Applicable to Greek and Latin Texts (Stuttgart).

[1] This fascinating study has appeared recently on several e-lists, and we are delighted to reproduce it for the BMCR readership in a slightly different form. We are grateful to Peter Robinson for his assistance.

Peter M. W. Robinson

Robert J. O'Hara