The Duke Papyri on the Internet

by Peter van Minnen

The Duke Papyrus Archive (DPA) was first put online in June, 1995. It included both records describing and images illustrating 153 papyri from the Duke papyrus collection, 2 published and 151 unpublished, as well as a number of introductory materials and bibliographies, some for beginners, others for more advanced users. In December, 1995, the DPA was expanded to include the entire collection, 43 published and 1,330 unpublished. Although records for all papyri are already available, images for 219 of them are still lacking. The DPA is expected to be completed by March, 1996.

This summary limits itself to the bare essentials and does not convey an adequate idea of the work involved in the creation of the DPA and the complex history of the ideas that guided its creators. The DPA is the end product of a series of activities that were put in motion long before there was such a thing as the Internet. To understand what the DPA is all about it is necessary to say something about its prehistory.

About five years ago Professor John Oates of Duke University conceived the idea of cataloguing the Duke papyrus collection. The last major purchase of papyri was made in 1988 and the body of mostly unexplored materials had grown substantially in the preceding twenty years. Few papyri had been edited and many had not yet been properly conserved. To make a successful bid for funding to the Preservation and Access division of the National Endowment for the Humanities Oates enlisted the cooperation of Steve Hensen of the Special Collections Library at Duke. The project, which was eventually funded in 1992, provided for the conservation of the entire collection by a papyrologist, the preparation of catalogue records for the online catalogue of the Duke University Libraries as well as for OCLC and RLIN by a librarian in close cooperation with the papyrologist, archival photography and the publication of a printed catalogue containing excerpts from the catalogue records. This would ensure physical as well as intellectual control over the collection.

The project introduced some novel ideas about accessing information about papyri. The catalogue records would be presented in a standard format and become part of local and international databases, where they could be accessed by a wide variety of users, not just professional papyrologists for whom the printed catalogue was intended. The standard format, derived from the second edition of the Anglo-American Cataloguing Rules, would make searches for certain subjects more predictable or at least less idiosyncratic. The databases would provide their own search engines, which would not require the use of special or system-dependent software on the part of the users. Since there was open access to the online catalogue of the Duke University Libraries, no user would need to acquire an expensive set of CD ROMs or another medium containing the records.

By September, 1992, the need for a printed catalogue quickly evaporated. More and more papyrologists were becoming computer literate and they could also be expected to use the online databases to get at the information they wanted. In addition experiments at The University of Michigan had shown that scanning papyri rather than photography was the way of the future. Scans could be manipulated and enlarged depending on the need of the scholar or the teacher: changing brightness and contrast could help read difficult passages in the text, enlargement and projection onto a screen could improve instruction in papyrology. Since the conservation work on the Duke papyri would have to be done first, it was decided to postpone the scanning for a year.

In September, 1993, Suzanne Corr, the librarian, started to make catalogue records and scans simultaneously. For the catalogue records she used notes made while conserving, transcribing and interpreting the papyri. The colour scans were made with a flatbed scanner at 300 dpi. This was deemed sufficient for the practical uses the scans might be put to. They could be blown up four times without losing sharpness. The files thus created were not too big. It did not take long before 600 dpi became the generally accepted standard for archival scanning and the files, though extremely large, could be accommodated on ever cheaper storage media. When the switch was made from 300 to 600 dpi a differentiation was introduced between archival scans, which would not be tampered with, and scans for general use, which would be compressed and disseminated on CD ROMs. Over time further improvements in the scanning technology were adopted and this has resulted in better colour quality and a somewhat faster scanning process.

Midway in the project the full implications of the Internet became clear and the pieces of the puzzle started to come together. Until that moment separate dissemination of the information and the images had been contemplated. The individual user would have to put the two together from different sources. The catalogue records would be available through the online catalogue of the Duke University Libraries and at least one international database, OCLC. The images were still few in number and would become a problem only when they would fill up the storage medium. But with the evolution of the Internet, the World Wide Web in particular, it became clear that the information and the images could be dynamically linked. The cumbersome method of searching an online database and then turning to a set of CD ROMs or vice versa had become obsolete.

This is the major advantage of the Internet over traditional media and it cannot be overemphasized. Online bibliographical databases such as OCLC and RLIN, which both contain millions of records, have been with us for some time, but they are dead ends in themselves: once the relevant bibliographical record has been found, one still has to locate an actual copy of the book in the library of one's own institution or, if one is less fortunate, through interlibrary loan. Likewise large sets of pictures of items of special interest to students of the ancient world, such as Greek vases, have been with us for some time, usually in print such as the Corpus vasorum antiquorum or sometimes in microform such as the Dead Sea Scrolls. But these media are also dead ends in themselves. There is no way to search the Corpus vasorum antiquorum but to browse through all the volumes. Indices to some of the Dead Sea Scrolls have only become available in recent years, but these are again not linked up with the pictures. Only the Internet allows links between large sets of descriptive data and large sets of images. The Duke papyrus project could make a smooth transition to the Internet, because it had been producing just such a large set of descriptive data - the catalogue records describing the papyri - and was in the process of producing just such a large set of images.

This is an essential point: the production of large sets of images for the Internet has to go hand in hand with the production of large sets of descriptive data. Without the catalogue records the images of the Duke papyri would not only be meaningless, but also inaccessible. Of course, one could call them up one by one and try to make sense of what one sees, but this would be impractical. That the DPA provides both content and pictures in fully searchable and browsable form is the result of more painstaking work than meets the eye. Eventually only four months out of a total of sixty-four months of work will have been spent on the DPA proper. The rest of the time was spent conserving, transcribing, interpreting, cataloguing and imaging the papyri.

With assistance from Paul Mangiafico, library systems coordinator of the Special Collections Library at Duke, work on the DPA started in April of 1995 and a first sampling was put online in June. The catalogue records were downloaded from the online catalogue of the Duke University Libraries into dummy text records that provide links to the images. These records are the core of the DPA, because they provide the essential contextual and descriptive information about the papyrus. One can get at these records in two ways. A keyword search will take a user straight to a particular record. Thumbnails of the images then provide access to images of the papyrus. This is the way to search for those who already know exactly what they are looking for. One can also approach the records by browsing the various categories into which the material has been organized. This search mode seems attractive for those who have as yet only vague ideas about what they are looking for. There are some topics of general interest such as "cultural aspects" and "women." Or one can limit oneself to one of the languages represented in the Duke papyrus collection and to a particular type of text within certain time periods. Either way one will end up with a list of items with short titles indicating what to expect. This should be enough to make up one's mind. Choosing one of the items from the list will immediately take one to the record and the thumbnails of the images. There are only a limited number of steps involved in searching the DPA. In addition, the pages are all fairly small and do not take a long time to download. The records are also separate files of modest size. All this contributes to a speedy downloading process and a relatively seamless user interface.
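
For readers who think in terms of data structures, the two routes to a record described above can be pictured roughly as follows. This is only an illustrative sketch in Python, not the software behind the DPA: the `Record` fields, the function names `keyword_search` and `browse`, and the two sample entries (identifiers, titles and file names alike) are all invented for the purpose.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A catalogue record that carries links to its images (hypothetical layout)."""
    inventory: str                     # invented identifier, not a real Duke number
    title: str                         # short title shown in result lists
    description: str                   # descriptive and contextual notes
    language: str
    categories: set[str] = field(default_factory=set)
    images: list[str] = field(default_factory=list)   # links to image files


def keyword_search(records, keyword):
    """Route 1: go straight to the records whose text mentions the keyword."""
    needle = keyword.lower()
    return [r for r in records
            if needle in f"{r.title} {r.description}".lower()]


def browse(records, category=None, language=None):
    """Route 2: narrow the collection down by topic and/or language."""
    return [r for r in records
            if (category is None or category in r.categories)
            and (language is None or r.language == language)]


# Invented placeholder records, for illustration only.
records = [
    Record("EXAMPLE-1", "Letter concerning a loan",
           "Private letter mentioning a woman lender", "Greek",
           {"women"}, ["example-1-72dpi.gif", "example-1-150dpi.gif"]),
    Record("EXAMPLE-2", "Fragment of a tax receipt",
           "Receipt for a payment", "Demotic",
           set(), ["example-2-72dpi.gif"]),
]

for hit in browse(records, category="women"):
    print(hit.title, hit.images)
```

Either route ends in the same place, a record together with the links to its images, which is the point of the arrangement: the description and the pictures never have to be put together by the user.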

The DPA makes both 72 and 150 dpi images of the Duke papyri available online. These were derived from the 300 and 600 dpi archival scans, which were made before. The 72 dpi images display lifesize and in full colour on a 72 lpi screen, the most common kind in use today. On a good monitor they give an adequate idea of what the papyri look like in real life, but they can still be difficult to read. The 150 dpi images display a little more than twice enlarged on a 72 lpi screen and are quite legible. Ninety-five percent of all papyri can be read adequately from these 150 dpi scans, which are far superior to black-and-white photos. When the papyri are substantial the size of the image files can be quite large, but never too large to handle. If one is downloading them through a modem it may be a good idea to read the records carefully before deciding to call up the images!
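
The relation between scan resolution and apparent size on the screen is simple arithmetic, and a small sketch may make it concrete. It assumes that the "72 lpi screen" shows 72 picture elements per inch and that each picture element of the image is shown by exactly one on the screen; the function name `on_screen_magnification` is invented for this illustration.

```python
def on_screen_magnification(image_dpi: float, screen_dpi: float = 72.0) -> float:
    """Return how many times life size an image appears on the screen.

    Assumes one image pixel is displayed per screen pixel, so the
    magnification is simply the ratio of the two resolutions.
    """
    if image_dpi <= 0 or screen_dpi <= 0:
        raise ValueError("resolutions must be positive")
    return image_dpi / screen_dpi


# Resolutions mentioned in the text: online images and archival scans.
for dpi in (72, 150, 300, 600):
    print(f"{dpi} dpi -> {on_screen_magnification(dpi):.2f} times life size")
```

On these assumptions a 72 dpi image comes out at life size and a 150 dpi image at a little more than twice life size, while a 300 dpi scan can be enlarged about four times before the individual picture elements begin to show.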

It was also decided to make records and images of all papyri accessible in both search modes, not just a "best of Duke" selection. This is essential for scholarly purposes, as Theodor Mommsen put it in 1858 describing his work on the corpus of Latin inscriptions: "Ob jedes Stück, das er aufhebt und aufheben muss, auch wirklich des Aufhebens wert sei, danach fragt der Archivar zunächst nicht." Small fragments may join others and comprehensiveness avoids the effect of creating a "canned" exhibit. Also included are a number of text files about papyrology in general and about the Duke papyri in particular. From these users can get some help in formulating their search requests and also gain an insight into the kind of context they can put the material in. This can be a scholarly context, through the annotated bibliographies, or a general context, through the blurbs written for the non-specialist. This should facilitate greater user control once users familiarize themselves with the options available to them.

It was decided not to scan the printed editions of some of the texts, because this is better done in the context of capturing the entire papyrological apparatus digitally. It would of course be extremely useful to have these materials online as well, especially for those working at papyrologically challenged institutions such as Oshkosh and Harvard. To scan in all text editions is a major undertaking, which will have to be part of the Advanced Papyrological Information System (APIS), a joint project of a number of institutions to create for all major papyrus collections in the United States an apparatus of the sort pioneered at Duke.

With the basic structure in place, the material in the DPA was increased ninefold in the last two months of 1995. It will take only a few more weeks to complete the DPA in March of 1996 once the remaining images have been made. The speed with which it has been put together should raise hopes for the future. One should expect other papyrus collections to be able to add their holdings in an expeditious manner, particularly if APIS is funded. Other projects, such as the online database of Greek vases and that of squeezes (paper impressions) of Greek inscriptions, both undertaken at Oxford, can proceed even more expeditiously. The vases and inscriptions have almost all been published before and some form of cataloguing already exists for them. Oxford has a large collection of squeezes of Ptolemaic inscriptions, which would be a welcome complement to APIS.

It should be clear that the creation of databases such as the DPA requires the investment of large amounts of time and money. The Internet, however, has clear advantages over traditional media when it comes to cost. Only the creators of online databases need to spend time and money on the catalogue: they have already put the information and the images together and the materials are available to the users at the cost of a very short telephone call. There is no need for every individual scholar to construct meaning every time an image is called up, because the essential information is available in the catalogue record. There is also no need for every research library to buy expensive picture databases such as the Corpus vasorum antiquorum. Individual scholars with a modem will eventually be able to mimic such a research library at home.

Another advantage of the Internet over traditional catalogues is the open-ended character of online databases. They are dynamic - provided there is an institution or a scholar behind the scene to refresh the data on a regular basis. This helps produce the initial set of data more quickly and also facilitates updates. When a catalogue of pictures with descriptions is given in print, whether on paper, in microform or on CD ROM, it almost has to be final. An online database, however, does not have to be perfect right away and can therefore be made accessible at an early stage. This allows extensive screening of preliminary work by colleagues worldwide, who will all bring their expertise to bear on problematic pieces. The DPA is ninety-five percent original scholarship and it is simply impossible that the "final" catalogue for all 1,373 papyri was produced in merely three years. Ongoing research will always create the need to update the material, whether it is presented as a printed catalogue or as an online database, and it is much easier to update the latter. Revisions of materials in the DPA are dated at the bottom of the page. The need to service an online database can be a powerful argument to persuade university administrators to continue to fund the expertise that originally created the data.

There are a number of virtues of the DPA that can be reproduced at other institutions and with other materials. First the images. What does it mean to have a database of over a thousand papyri at your fingertips? The instant accessibility of the images obviously holds great promise for scholars. Almost any kind of papyrological research involves the consultation of hundreds of different texts. Only very few are illustrated in the old-fashioned way and one usually does not bother to acquire prints of every text one wants to use or quote. If the readings are in doubt or the character of the script is the feature one is interested in, often mediocre prints have to be ordered from the institutions holding the items at sometimes forbidding costs. This may take a long time and the alternative is to change one's travel plans to accommodate visits to the institutions holding the items. Now, if the DPA is replicated at more and more institutions, the costs of one's everyday research will go down and the quality as well as the speed (or quantity) will go up. Eventually one will be able to put materials dispersed over several collections together or at least check published and unpublished holdings for possible links.

The instant accessibility of the images is a boon to students. They have a large selection of material at their disposal to test their skills. Relatively few institutions hold significant numbers of papyri - only about seven in the United States. Until recently students at other institutions have had to make do with black and white plates. The DPA now offers them study material in full colour with helpful comments on the individual texts as well as more general information and annotated bibliographies. Since the Duke papyrus collection holds a significant number of literary texts, which are often written in easy-to-read scripts, there is enough practice material for beginners. The instant accessibility of the images is also a boon to all non-specialists who want to see an example of a certain kind of text or an illustration they can use in their classes or on their own home pages. There is great variety in the Duke papyrus collection and it appeals to a similarly great variety of interests. Examples of the various scripts used in ancient and early medieval Egypt can be called up at a moment's notice, fragments of the sacred writings of Christianity and Islam are instantly accessible, Greek and Latin classics from Homer to Cicero can be inspected without a big fuss, and the curiosity about life in the past can be satisfied by reading the notes provided in the catalogue records or the blurbs written specifically for non-specialists, all in plain English.

Now the ideas behind the DPA or its substance. Early on the decision was made to catalogue the entire collection, not just the Greek texts. Every major papyrus collection holds items in the various languages and scripts used in ancient Egypt, but usually the Greek material is in better shape than the rest, because it has attracted the attention of classicists or ancient historians in the past. Thanks to the inclusive character of the project at Duke, the DPA gives the material in languages other than Greek a chance. This greatly enhances the appeal of the papyri to those interested in ancient Egypt (the Hieratic and the Demotic papyri), in early Christianity (the Coptic papyri) and early Islam (the Arabic papyri). By including both literary and documentary texts another fence is broken down. Of course, most scholars interested in Greek literary texts will not always bother to look at the documents and those interested in early Christianity will not always bother to look at the Ptolemaic documents and so on, but at least the DPA does not automatically preclude them from doing so.

The decision to use standard cataloguing rules has resulted in greater consistency and makes searches of the material more rewarding to specialists and non-specialists alike. Transliterated Greek has been avoided, but the translation of certain papyrological terms is not a problem for specialists: sapienti sat. As noted earlier, one can search the web site as a whole by keywords. This search mode is not very precise, but the great advantage is that one can go straight to the records linked to the images. One can also search the online catalogue of the Duke University Libraries, which allows more sophisticated searches. A list of Library of Congress subject headings used in the catalogue records is available as part of the DPA and users should familiarize themselves with them and with the genre headings from the Art and Architecture Thesaurus, a list of which is also provided. The records in the online catalogue do not currently link up with the DPA, but about half the records have links in them that only need to be activated. The Duke University Libraries will add the links to the other half of the records in due course.

The original grant from NEH came with the proviso that the Duke papyri would become more accessible in a variety of ways. A large part of the collection had not yet been conserved, let alone explored. This has now been taken care of. The catalogue records are available to scholars and others through the online catalogue and one international database. The DPA presents these catalogue records together with images of the papyri. The number of people on the Internet who all have access to the DPA doubles every year. Its audience is fast becoming the world. The questions sent to the papyrus mailbox, which is part of the DPA, range from scholarly inquiries from colleagues to requests from grade school students for their projects. Thousands of users have logged in from around the world.

A few years ago the Dead Sea Scrolls affair raised the issue of who should have access to ancient manuscripts. This has resulted in the release of this material in microform. It has also created the impression that scholars working on ancient manuscripts do not want to share their privileges with others. Occasionally a sensational find of a Greek literary papyrus is indeed kept from the eyes of scholars for years, but this is bad for papyrology, as the disastrous publication (or rather non-publication) history of the Derveni papyrus shows. There are really only a few scholars who are up to publishing papyri and they all know and trust each other.

The DPA puts the Society of Biblical Literature's Statement on Access to Ancient Written Materials into practice. To quote: "Those who own or control ancient written materials should allow all scholars to have access to them. If the condition of the written materials requires that access to them be restricted," - of course it does - "arrangements should be made for a facsimile reproduction that will be accessible to all scholars." Those who formulated this statement were thinking of the Dead Sea Scrolls and the checkered history of their publication, but there are a few examples of ancient written materials made accessible in facsimile almost instantly after they were acquired. In fact these examples suggest that only good can come from increased accessibility. Restrictions have never resulted in anything good; the Derveni papyrus and the Dead Sea Scrolls disasters offer instructive warnings. Not long after the Nag Hammadi codices with Coptic gnostic texts were found a facsimile was published, which enabled a team of scholars from around the world to start producing reliable editions of these texts, kept at the Coptic Museum in Cairo. Editions with full philological apparatus as well as translations have been widely available for many years now and these texts have thus become part and parcel of scholarship on early Christianity. The contrast with the pirated transcription of the Derveni papyrus is painful, because that text is so important to classicists.

To conclude, the DPA not only provides a challenge to other institutions holding collections of both published and unpublished papyri, but also appeals to the wider world of scholarship. The Duke papyri present an interesting mix of materials for as many contexts as there are scholars. As noted earlier, the work on the ninety-five percent unpublished pieces has been preliminary at best. The DPA invites papyrologists to finish the job of editing the texts. It invites others to become papyrologists.

