Here is the SGML posting I mentioned in my previous message.

--David Gants

Date: Wed, 28 Aug 1996 12:00:26 CDT
From: Murray McGillivray <mmcgilli@acs.ucalgary.ca>
Subject: Textual Representation Group

The following message is being widely posted. Please forgive me if you receive multiple copies; please copy to listservs and newsgroups I may have ignored but that you know would be interested.

Below you will find the Draft Terms of Reference of a new Working Group formed this spring. Most of the tasks listed were discussed as desiderata at the physical sessions of the Electric Scriptorium conference in Calgary (November 1995), and the initial members of the working group were attendees at that conference. We have since been joined by prominent textual scholars and members of the international text-encoding community.

The draft terms of reference are being posted to several different scholarly lists for comment. We would appreciate it if you would post your comments to scriptorium@morgan.ucs.mun.ca (normal listserv subscription procedures apply: send a single-line message, SUBSCRIBE SCRIPTORIUM <Your Name> to listserv@morgan.ucs.mun.ca before attempting to post to the list). We are especially interested to hear from people who think that our plan (or some part of it) is badly thought-out or just a bad idea. We also would welcome new members and offers of assistance.

Murray McGillivray
University of Calgary

______________________________________________________

DRAFT TERMS OF REFERENCE

WORKING GROUP ON ELECTRONIC REPRESENTATION OF HANDWRITTEN AND PRINTED MATERIALS

Preamble:

The Working Group on Electronic Representation of Handwritten and Printed Materials has been formed to devise standards and guidelines to be used by scholars who wish to prepare electronic editions, textbases, image-bases, or other electronic objects which have as a purpose the exact representation of a physical print or manuscript of whatever form, period, language, or script.

Significant work has been done towards this goal by the Text Encoding Initiative in its guidelines published in 1994 (hereafter "P3"). There remain, however, several areas which the Text Encoding Initiative did not have the time or resources to explore fully. Chief among these is the area of exact bibliographic description, with respect to which P3 explicitly solicits further work, but there are other areas where P3 is either merely suggestive or where it gives too many possible answers, so that scholars may, fairly arbitrarily, make completely different decisions about essentially similar problems, with resulting incompatibility between what ought to be compatible projects.

The primary focus of the new Working Group is the exact electronic representation of physical books (and other printed or handwritten materials), a focus both narrower than and different from the main goal of P3. It is therefore not so much a continuation of the TEI effort as it is a new initiative, which will, however, aim for compatibility with the TEI recommendations wherever possible.

Draft Statement of Goals:

The Working Group sees a need for standards both for text-file representations and for graphics-file representations of printed and handwritten materials, since it is clear that scholars can make use in different ways of electronic transcriptions or encodings of physical prints and manuscripts and of electronic images of those prints and manuscripts. Some goals are certain to emerge in the course of the work. However, there are a certain number of goals that the Working Group adopts explicitly from the beginning:

For text-files:

1. To devise standard encodings for the most common of those marks used by European and American scribes and printers that are not contained in modern standard character sets (such as brevigraphs, ligatures, and variant letter forms) in the form of SGML-type entity sets. These entity sets would supplement the work of the Text Encoding Initiative and serve as a resource for scholars using P3 for transcription of manuscript or print materials.

2. To recommend procedures for scholars whose encoding needs are not met by the entity sets mentioned in 1).

3. To devise standard ways, likely using SGML-type tags, of encoding structural and other features of physical books and documents that do not fall under a strict definition of "text," such as: page layout; font, script or hand; condition of the carrier or ink; relationship of leaves, quires, signatures, etc. to one another; interlineal or marginal annotation; damage, whether affecting the text or not; and so on. Many of these features are treated suggestively in P3, but not considered as fully as the working group intends to do.

4. To develop appropriate definitions describing the possible relationships between the tags in the proposed tag-sets. The new encoding procedures should be as compatible as possible with the work of TEI: a DTD fragment should be developed that will make documents encoded using the new tags TEI-conformant, and further DTD fragments or other encoding stratagems should be developed that will make it possible for encoders to combine tags from the existing TEI sets with tags from the proposed set(s) within the same document without loss of structural information.

5. To develop a standard procedure for referencing a graphics file from a text file when the graphics file contains an image of the page that is transcribed in the text file.

For graphics files:

1. To develop for graphics files containing images of printed or handwritten pages a standard text header that would specify such things as: bibliographic information, page of book, copy number, library where kept, size, and so on of the original the image is of; by whom and how the image was created (camera or other device, lighting used, method of digitizing, etc.); what transformations it has undergone since digitized and who was responsible (resizing, compression, dithering, adjustments of colour, etc.); the format of the image (e.g. GIF, TIFF); copyright information for the image where relevant.

2. To recommend standard ways of indicating within a graphic image itself the size and colour of the original it is a picture of.

3. To recommend standard ways of referring to precise locations or areas within a graphic image of a printed or handwritten text.

4. To develop a standard procedure for referencing a text file from a graphics file when the graphics file contains an image of the page that is transcribed in the text file.

5. To investigate and, if possible, recommend ways of annotating images by attaching text files or other graphics files to them.

6. To recommend standard methods for mutually referencing several graphics files when those graphics files are images of successive pages of a single book or manuscript, and for indicating the relationships between: images of pages that succeed one another; the images of the two sides of a leaf; images of pages related by membership in a gathering or signature; and so on.

7. To investigate ways of assuring the integrity of images contained in graphics files, so that users can be reasonably certain that they are working with images that have not been altered without their knowledge.
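
As a purely illustrative aside on graphics-file goals 1 and 7, the sketch below shows one way a text header for a page image might carry both descriptive fields and a digest of the image bytes, so that a later alteration can be detected. Nothing here is proposed by the Working Group: the record name ImageHeader, every field name, the helper functions, and the choice of SHA-256 as the digest are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ImageHeader:
    """Hypothetical text header for one page image (all field names are illustrative)."""
    source: str          # bibliographic description of the original
    page: str            # page or folio that was imaged
    repository: str      # library where the original is kept
    image_format: str    # e.g. "TIFF" or "GIF"
    created_by: str      # who made the image and how
    transformations: list = field(default_factory=list)  # resizing, compression, etc.
    sha256: str = ""     # digest of the image bytes, used for the integrity check


def digest(image_bytes: bytes) -> str:
    """Return the SHA-256 digest of the image bytes as a hex string."""
    return hashlib.sha256(image_bytes).hexdigest()


def make_header(image_bytes: bytes, **fields) -> ImageHeader:
    """Build a header whose digest is computed from the image as it exists now."""
    return ImageHeader(sha256=digest(image_bytes), **fields)


def is_unaltered(header: ImageHeader, image_bytes: bytes) -> bool:
    """True if the image bytes still match the digest recorded in the header."""
    return header.sha256 == digest(image_bytes)
```

A digest of this kind only shows whether the bytes have changed since the header was written; it does not say who changed them, and any legitimate transformation (for example resizing) would need a new header entry and a new digest.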
Robert W. Allison
Dept. of Philosophy & Religion, Bates College and
James Hart
Information Services, Bates College Lewiston, Maine, 04240