What Is Perseus? What Is It Not? Comments on the BMCR Review of Perseus

Those of us who are involved in developing the Perseus database read the review of Perseus 1.0 published by BMCR with great interest.

This response sets out to provide some additional background on what was done and thereby to advance criticism and analysis of the database as it stands. Even as these words are being composed, we are hard at work constructing the new version of the database that will be shown at the New Orleans APA/AIA convention. Work on Perseus 2.0 will continue until the summer of 1993 and the next version should be available in late 1993 or early 1994. The more comments and criticism we receive now, the more able we will be to make Perseus 2.0 as useful as possible.

First, none of the reviewers discussed the “Object Keyword Search” for art objects. Although the collection of objects in Perseus 1.0 is relatively small, students and even scholars can quickly find many representations of material culture (e.g., musical instruments, generic scenes). Moreover, this kind of keyword lookup will become progressively more important as the database grows. Reactions to its design and usefulness now could make a big difference for Perseus 2.0, which will, for example, contain c. 1,500 vases, rather than 150.

Second, Perseus 1.0 is perhaps most innovative in its handling of texts. Perseus is a first-generation project in classical art and archaeology, but a second-generation project on the text side. Many of us who worked on Perseus had become interested in computer applications to classics by developing text retrieval tools for the Thesaurus Linguae Graecae. Perseus 1.0 thus contains a relatively small number of texts, but scholars and students alike can do things with these texts that they cannot do in any other system. In a traditional text retrieval system, to find examples of φέρω, one must enter a group of strings (e.g., φερ-, οισ-, ενηνοχ- etc.). In Perseus 1.0, however, one can type in φέρω and Perseus 1.0 will automatically retrieve οἴσω, ἐνήνοχα, etc. This is possible because we spent much of the last eight years developing an elaborate system that can analyze Greek words. We use this system to analyze every lower-case word in every author in Perseus and store the results in a database. Classicists can effectively search these electronic texts for εἰμί or ἵημι or for any other morphologically complex form. Such searches are, however, qualitatively different from those which have been generally available. This feature alone gives the serious as well as the novice student of Greek a significant new tool.
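The difference between string matching and a lemmatized search can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Perseus implementation: the table of analyses is typed in by hand rather than produced by a morphological analyzer, the passage labels are invented, and the names `ANALYSES`, `CORPUS`, `build_lemma_index` and `search_lemma` are hypothetical.

```python
import unicodedata
from collections import defaultdict


def nfc(text):
    """Normalize Greek so that differently encoded accents compare equal."""
    return unicodedata.normalize("NFC", text)


# Hypothetical, hand-typed analyses: inflected form -> dictionary entries.
# A real analyzer would generate this table automatically for every word.
ANALYSES = {
    "φέρω": ["φέρω"],
    "οἴσω": ["φέρω"],
    "ἐνήνοχα": ["φέρω"],
    "ἐστί": ["εἰμί"],
    "ἦν": ["εἰμί"],
}

# Toy corpus with invented passage labels (not real citations).
CORPUS = [
    ("passage 1", ["οἴσω", "ἦν"]),
    ("passage 2", ["ἐνήνοχα"]),
    ("passage 3", ["ἐστί"]),
]


def build_lemma_index(corpus, analyses):
    """Analyze every word once and store the results, keyed by lemma."""
    table = {nfc(form): [nfc(lemma) for lemma in lemmas]
             for form, lemmas in analyses.items()}
    index = defaultdict(list)
    for citation, tokens in corpus:
        for token in tokens:
            for lemma in table.get(nfc(token), []):
                index[lemma].append((citation, nfc(token)))
    return index


def search_lemma(index, lemma):
    """Return (citation, form) pairs for every stored form of the lemma."""
    return index.get(nfc(lemma), [])


INDEX = build_lemma_index(CORPUS, ANALYSES)

for citation, form in search_lemma(INDEX, "φέρω"):
    print(citation, form)
```

The design point is that the analysis is done once, when the index is built, so a query for the dictionary form needs no knowledge of stems such as οισ- or ενηνοχ-.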

The morphological analysis feature still has some problems. The morphological analyzer, for example, contains a number of features to determine the dialect of the form, but, for simplicity’s sake, we chose to follow the traditional dialectology such as one finds in Smyth with occasionally unnerving consequences. We had, therefore, waited with anticipation to see what weaknesses our reviewers would find or what suggestions they would make. We were surprised to learn that a problem in setting up SMK GreekKeys prevented Richard Hamilton from learning how to type in Greek accents and that this, compounded with a lack of emphasis in our documentation, had prevented him from even noticing the new searching capabilities. As a result he concluded that Perseus 1.0 could only serve as a research tool for “scholars not able to read Greek, a prospect that turns my heart to stone.” We understood this physiological sensation very well as we read these words.

It is easy to see, in retrospect, how such a problem could have arisen. The “Greek Word Search” does not occur in the “Philological Tools” section of the documentation, but under “Word Searches,” and the picture illustrating a Greek word search shows the blank search screen without showing the results of an exemplary search—a printed search screen that automatically retrieved ἤνεγχ’ (Aesch. Lib. 992) and οἴσεσθαι (PV 638) with φέρω would have caught the attention of anyone who has toiled with existing search programs.

Users of Perseus who are especially interested in the Greek language should therefore spend some time not only with the Greek word search but also with the English-Greek word list described on pp. 76-7 of the documentation. Not many classicists can, for example, quickly name a half-dozen Greek words used for “wallet.” Anyone seriously exploring a concept or semantic field in Greek stands to benefit from this feature of Perseus. We also suggest that users explore the English translation as a new tool to augment searching. If one is interested in temples in Thucydides, for example, searching the English translation will locate τὸ Λεωκόρειον καλούμενον at 1.20.2, τὸ Ἥραιον at 1.24.7, τοῦ Ἐλευσινίου at 2.17.1, τὸ Ἀπολλώνιον at 2.91.1 etc., where ἱερόν has been left out of the Greek. Even a translation, when converted into an electronic form, can in such a fashion help the most knowledgeable Hellenist.
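The idea of searching the translation to reach the Greek can be sketched as follows. This is a minimal illustration and makes several assumptions: the Thucydides citations and Greek phrases are the ones quoted above, but the English renderings are invented stand-ins (not the translation distributed with Perseus), the two "hypothetical" rows are controls made up for the example, and `ALIGNED` and `find_via_translation` are hypothetical names.

```python
import re
import unicodedata


def nfc(text):
    """Normalize Greek so that differently encoded accents compare equal."""
    return unicodedata.normalize("NFC", text)


# (citation, Greek phrase, English rendering). The English column is an
# invented placeholder; the last two rows are made-up controls.
ALIGNED = [
    ("Thuc. 1.20.2", "τὸ Λεωκόρειον καλούμενον", "the temple called the Leocorium"),
    ("Thuc. 1.24.7", "τὸ Ἥραιον", "the temple of Hera"),
    ("Thuc. 2.17.1", "τοῦ Ἐλευσινίου", "the temple of Eleusinian Demeter"),
    ("Thuc. 2.91.1", "τὸ Ἀπολλώνιον", "the temple of Apollo"),
    ("hypothetical 1", "τὸ ἱερόν", "the temple"),
    ("hypothetical 2", "ἡ πόλις", "the city"),
]


def find_via_translation(aligned, english_word, exclude_greek=None):
    """Find passages whose English contains english_word (singular or plural).

    If exclude_greek is given, drop passages whose Greek already contains
    that word, leaving only those a Greek string search would have missed.
    """
    pattern = re.compile(r"\b" + re.escape(english_word) + r"s?\b", re.IGNORECASE)
    hits = []
    for citation, greek, english in aligned:
        if not pattern.search(english):
            continue
        if exclude_greek is not None and nfc(exclude_greek) in nfc(greek):
            continue
        hits.append((citation, greek))
    return hits


for citation, greek in find_via_translation(ALIGNED, "temple", exclude_greek="ἱερόν"):
    print(citation, greek)
```

With `exclude_greek` set, the result is exactly the kind of list described above: passages a reader would call temples but that a search for ἱερόν in the Greek would not return.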

James O’Donnell is right when he observes that the ambitious goals of Perseus 1.0 are its greatest defect. If we had constrained our goals, Perseus 1.0 would be much easier to evaluate. A computerized database can be enormous, but even a smaller dataset can allow us to do things that we could not otherwise do. Perseus 1.0 simply does not, for example, contain enough material to support encyclopedic research in art history, but this does not mean that this version has nothing to offer advanced users. The potential of Perseus should not obscure its smaller, but nonetheless tangible, present advantages.

Perseus 1.0, for example, contains detailed images of sixty-four vases that we photographed ourselves. Many of these vases had never before been published in any form (that is one reason why a total of only 137 vases appear in 1.0—we needed to do much more work on basic documentation than we had anticipated). Some had been published, but not adequately. Harvard 1960.236, for example, is a large kalyx krater by the Kleophrades Painter and an important illustration of his work. We include 111 views of this object and Perseus 1.0 will allow serious students of Greek vase painting—whether professional researchers or those who find themselves attracted to the subject—to study this object in much greater detail than was possible before. More important, perhaps, Perseus 1.0 contains the first full and detailed photography (c. 400 views) of both the east and the hitherto unpublished west pediments from the Temple of Aphaia on Aegina. If we had published these photographs as a separate, $125 paper catalogue, no one would question the fact that we had contributed to the published record; those publishing research which touched upon these objects would, even if they had their own private photographs with which to work, still cite our publication for its visual documentation. Likewise, if we had dumped out the morphological data and just printed a “semi-lemmatized” index to Pausanias that tied, for example, all the forms of φέρω under a single entry, everyone would accept that this was a research tool, but they would object that it was too specialized and boring to serve in teaching.

The problem that we face is, of course, that we are not trying to build a database with limited and well-defined functionality. The ambitions which drive our work not only pose technical problems, but raise expectations so high that users inevitably find themselves, at least for a time, disappointed with the reality.

Clearly, we need to do more to make our intentions and assumptions clearer. While the vastly greater scale of Perseus 2.0 will obviate some criticisms, our experience suggests that the more we provide, the better people will understand what is possible and the more they will want. Electronic tools, by their size and flexibility, breed curiosity rather than satisfaction.

Several of the BMCR reviewers discussed the lack of focus in the presentation of the Perseus materials, if it is to be used as a teaching tool; they point out that it lacks extensive secondary sources, a unifying vision of a problem or set of problems, and specific links across different parts of the material. First, we should make it clear that we are not trying to replace the instructor or give Perseus a dominant authorial voice—we talked to a great many classicists early on in our work and the vast majority made it clear that they had their own ideas of how and what they wanted to teach. We have thus sought to provide a multivalent system with basic tools. Since it is, as one reviewer points out, too early to decide exactly how such electronic databases are best used in teaching and research, we thought it best to leave that prerogative up to our users. To this end, we have emphasized the collection and organization of primary material, and leave it to the human instructor to decide how best to integrate this resource into his or her work. In addition, we tried to provide some tools that will facilitate navigation for all users, such as the lemmatized Greek word search and the keyword database for archaeological objects. It is these, specifically electronic, search and retrieval tools that make Perseus different from the books on Jerome’s or a student’s desk.

Any humanist developing a complex academic database must confront two opposing forces. First, we had to build for the long term. Classicists create documents that must last for decades. Unlike many scientists, for example, we do not spend most of our time working with publications that are less than five years old. We designed Perseus from the start to be, as much as possible, independent of any particular system or program. Pictures are stored archivally as slides and can be redigitized as technology improves. Texts are encoded in a powerful format called SGML. Tabular data is stored in standard relational databases. Nothing in Perseus 1.0 is tied to any one program or computer, and much of our energy has gone into making sure that the data that we collected would become increasingly useful as computer systems became more powerful, rather than drift into obsolescence. Equally important, Perseus was built to grow larger. It is easy to create a useful, small database that collapses under its own weight as it becomes larger. Much of our effort has gone into making Perseus “scale” properly, so that Perseus as a whole as well as its various components can expand over time.

Second, we wanted to create something that served a significant group of people in a reasonable period of time. The long term always extends into the future, and visionary projects have a tendency to remain visions. After much hard thought and debate, we chose to develop a tool that would run on a system that was as flexible as possible but that was also accessible to a wide range of individuals. On the one hand, we had for years developed searching tools in the Unix operating system and we considered developing Perseus on powerful, but expensive, Unix workstations. Had we done this, we would now have a much more flexible database (and one better suited to the needs of researchers); but even today few classicists have Unix workstations with graphics capabilities on their desks, and our audience would have been small. Instead of helping democratize information, we would have appealed to a tiny elite. On the other hand, we could have built something that would run on a very inexpensive machine running DOS (and, ultimately, Windows), but we could not have created anything nearly as useful as we have on the Macintosh. The software tools and general environment in the DOS world were not as powerful as those offered by the Macintosh when we began work in 1987.

Our goal in 1987 was to create something that would run on a Macintosh system that cost less than $3,000. As of November 1992, we have exceeded that goal. A thrifty shopper could put together a Macintosh LC II, color monitor and CD ROM player for about $2,000. Ultimately, we hope that not only all professors of classics, but every serious student as well, will be able to own their own copies of Perseus.

These requirements have forced us to be restrained and to emphasize fundamental, although often unglamorous, tasks. Anyone who compares Perseus 1.0 with the Perseus Sampler that we distributed in 1988 and 1989 as an example of what we planned to build will find that Perseus 1.0 is, in many ways, much simpler than what we had planned. We actually began work in 1987/1988 by building the kinds of dynamic maps, tools for automatically juxtaposing vase paintings, talking documents, detailed tutorials, etc. that Lee Pearcy calls for. His criticisms came as something of a surprise, and brought home to us how far our thinking had evolved in the past five years. It was as if someone had studied our earlier plans and were chiding us for streamlining our immediate intentions.

The problem with many attractive applications revolves around standards. Hundreds of useful software tools developed in the 1980s have vanished, because software requires constant maintenance and periodic major revision. Until well-developed, system-independent standards for multimedia documents appear, the kinds of applications suggested by Lee Pearcy will tend to be tied to a particular system or application. If Perseus is going to have a long-term existence, then it must be constructed in such a way that we can move it from one system to another with the least possible effort. Converting generic databases into a version of Perseus for the Apple Macintosh running HyperCard has grown increasingly automated, but it still requires at least a month of concentrated effort by several people. As the basic textual and visual materials are completed, and as the overall architecture of Perseus becomes more fully developed, it will be possible to develop even more sophisticated tools. If we do not show restraint and discipline now, however, then the database will grow too complex and will collapse under its own weight.

To us, the question is not whether we should build a large database or create a highly focused tool, such as Robert Winter’s splendid hypertextual introduction to Beethoven’s Ninth Symphony, published by the Voyager Company. Rather, we built Perseus to provide the infrastructure that would, among other things, allow someone to build such a focused tool for a specific topic such as Aristophanes or Greek religion. We want to provide a foundation that increases in size but is stable enough so that others can extend it. Classicists do not use textbooks, and our early research made it clear that those teaching about ancient Greece were interested in examples of pedagogy, but that they create their own syntheses. Perseus is thus not a curriculum or even a course, but a tool whereby others can reconceptualize what they do and develop their own courses.

Lee Pearcy raised a major problem that all designers of hypertextual databases confront. He asks whether we are not creating a “horseless carriage,” and failing to look beyond the limiting paradigms of print.

Anyone who has tried to introduce electronic tools into the curriculum has, however, encountered the problem that computers are already too different and disorienting for novice users. In the late 1980s, the Apple Macintosh redefined the paradigm of computing because its interface appealed to what was familiar and sought to ease the transition from print to the electronic medium. Computers are revolutionizing the way we work and even think, but they must do so incrementally, building on prior experience, solving old problems even as they open up new vistas. The BMCR review reminded us how different and foreign Perseus 1.0 is, and how difficult it is for those who have not worked with it for years to become familiar with what it can and cannot do.

Many of those who work with Perseus are anxious to define it as either a teaching tool or as a research tool. Underlying this distinction is a pernicious dichotomy that splits the world into two extreme groups: the undergraduate population which needs (but is only imperfectly capable of) enlightenment vs. the professional classicists, who have laboriously mastered the tools of our trade, assimilated a deep understanding of the ancient languages, familiarized ourselves with the secondary scholarship and become capable of evaluating and even altering the sum of received wisdom. In the middle, graduate students struggle painfully upwards in their long and dangerous ascent from the general populace to the elite. In this model, the omniscient specialists stride across their areas of expertise, carefully developing their own work, keeping to their own turf and pumping any interlopers full of intellectual birdshot.

The first part of this piece pointed to some specific electronic tools in Perseus 1.0 that allow the researcher and the undergraduate to perform their traditional tasks more effectively, and such resources will only increase in number and in power, both in Perseus and elsewhere. Classicists stand to gain a tremendous amount from the new technology. But if we are to understand the broader implications of such technological tools, we might consider a third category of thinker whom we all know and may admire, but who is often (as in the BMCR review) overlooked: the specialist in one field who wishes to work with material from another.

No classicist can master all those ideas relevant to our day-to-day work developed in anthropology, literary criticism, economic history, cognitive science, etc. Conversely, scholars concentrating in other disciplines simply do not have the time to become fluent masters of Greek. At Harvard, for example, the history of science graduate seminars on Aristotle explicitly state that knowledge of Greek is not necessary. If an historian of biology writes a book which covers Aristotle’s Historia Animalium, are we to dismiss the strengths of this research—such as a knowledge of the history of biology that no full time classicist could ever match—because the author does not also know Greek? Likewise, the eminent sociologist Orlando Patterson recently published a book called Freedom in the Making of Western Culture (Basic Books 1991). It would be easy for a classicist to work through this book and underline statements with which specialists in Greek might quibble, but no classicist could replace Patterson’s particular angle of vision on the problem of freedom. Far from dismissing or discouraging such forays into our field, we need to do everything in our power to attract more of them and to help such work be as good as possible. In the United States, at least, classics only begins to assume its proper role when it extends beyond the thin ranks of professional classicists.

When all is said and done, Perseus has already begun to meet its most important goal. When we began planning Perseus in 1985, we wanted to see what would happen when we created a single, heterogeneous database with many kinds of evidence about a particular culture. We wanted to see what would work and what would fall short of our expectations. Above all, we wanted to give classicists and students of all cultures a concrete object of analysis, to begin moving the discussion out of the subjunctive and optative. Electronic tools of various kinds will become more numerous and significant in the years to come, but these tools can evolve in very different ways and can cause harm as well as good. The better we understand what we want (or what we don’t want), the better prepared we shall all be to see that these new tools support the values in which we individually believe.

Response to 1992.05.04

Response by Gregory Crane