=========================================================================
Date:         Wed, 2 Dec 1992 16:09:40 CST
Reply-To:     PETERR@VAX.OXFORD.AC.UK
Sender:       "TEI-L: Text Encoding Initiative public discussion list"
From:         PETERR@VAX.OXFORD.AC.UK
Subject:      of page breaks and TEI

Of page-breaks and TEI

I have been having some small struggles with encoding page breaks in various 18th and 19th century printed books (they allowed me off manuscripts for this one). Take the case of pages 346-7 of the reprint of Hannah More's "Village Politics" in her 1818 "Collected Works". At the base of p. 346 against the right margin we have the catch-word "Tom."; at the top of the next page we have the page number 347 against the right margin and a running header "Village Politics", centred and in small caps; at the base of this page we have the signature "Q6", centred and in a slightly smaller type face than the main text, and the catch-word "merly".

So far as I can see, P1 suggests capturing all this information within the "page.break" tag. Adapting the example on p. 125 of P1, you might code this page break between pages 346 and 347 in Hannah More:

There seem to be a few problems in this.

Firstly, you would not know, just from looking at this coding, whether the 347 referred to the preceding or following page. In fact, P1 states clearly (on p. 125) that the page.break must be placed "at the start of every new page", and so the 347 refers to the following page. This is fine if one knows the documentation, but it would seem better if the page.break mechanism were so designed that one could tell from its use alone just which page was which, and not rely on everyone knowing and obeying the administrative fiat by which P1 declares that page breaks do not come at the end of pages, nor between them, but at their beginning (I believe some time in P1's dark past there was an intense theological debate over just where page breaks happened. Oh to have been there).
Actually, it would make better sense if P1 called this element "new.page" (which is what it is) and not "page.break".

There is another problem with the dictum that page-breaks come at the beginning of the page. According to P1, this means that catch-words and signatures (which appear at the bottom of the page) must be captured in the page.break tag which is placed at the top of the page. Thus, the catch-word "merly", which appears at the bottom of the page and connects this page to the next, is bundled with the page.break tag at the top of the page. This might suggest to the unwary reader that this catch-word actually connects this page to the last, not this page to the next. It certainly seems counter-intuitive, to me at least.

The P1 page.break mechanism also gives no indication as to where in the text the page number actually appears: whether at the bottom or top of the page, right or left or centred, surrounded by printer's ornaments, etc.

Yet another problem is the insistence, it seems, that the text of all these (page numbers, signatures, catch-words) must be captured as attributes of all-powerful page.break. What of the case where one wishes to include tags within the text of (say) a catch-word? A catch-word might contain an interesting abbreviation which one might wish to register with or ; it might be italicized (as in the catch-word "Tom." at the bottom of p. 346); one might wish to attach a note to an unusual signature; formatters will find it more difficult to extract the text of the attributes and display it correctly.

The greatest problem looks to me to be the running header, "Village Politics". There seems no place for this in the page.break mechanism, or anywhere else in P1 for that matter. Given that running headers can be very important in all sorts of ways, we need them.
Dickens delighted in clever running headers, a sort of marginal commentary on his text, and I seem to recall a nice Randall McLeod (I think) article in which he drew elegant conclusions about the printing of Shakespeare's Sonnets from the changing spacing in the running headers from compositor to compositor. We need running footers too, for that matter. Indeed, it would be rather difficult to encode the page.breaks in P1 with P1's own mechanisms: P1 has BOTH running headers and footers!

So now for some mild suggestions. Instead of a single page.break tag, I suggest we have a small swag of tags, all grouped within a page.break element. Thus:

     A page break is defined as occurring not at the beginning or end of a page but between two pages;

     A page break may contain two elements,