=========================================================================
Date: Fri, 2 Jul 1993 16:34:56 CDT
Reply-To: "Brian S. Baigrie"
Sender: "TEI-L: Text Encoding Initiative public discussion list"
From: "Brian S. Baigrie"
Subject: Descartes Project

The DESCARTES PROJECT
At the University of Toronto

I would like to alert members of the list to a new project and to seek advice, suggestions, criticisms, and the like. We have applied for and received funding to help us (Brian Baigrie, Calvin Normore, Andre Gombay) to produce three "tools":

(1) a machine readable version of Descartes' collected works
(2) a Descartes biographical dictionary
(3) and a book with the title -- _Truth and Fabrication: Descartes on Scepticism, Automata, and Rights_.

The third tool is not relevant to this discussion group, so I will stick to (1) and (2).

(1) The canonical text is Adam and Tannery (14 vols). We intend to prepare a machine readable version of this along with correspondence found elsewhere.

(2) We will extract a list of entries for the Dictionary from the e-text. This list will include all persons, institutions, groups, and places mentioned by Descartes. We will establish a set of fields for each kind of entry and our research assistants will track down the relevant information.

Our original proposal was to produce a working (not perfect) electronic text. Its function was to yield a set of entries for the Dictionary. However, we are now feeling more ambitious. We are negotiating with Vrin (the publisher of the canonical text) for permission to produce an electronic text. We now intend to produce a near-perfect e-text and to donate copies of this to ARTFL and other lists. We also plan to produce a commercial version bundled with the electronic version of the Descartes Dictionary.

What I'm seeking is advice about how to proceed. What we envision is a hypertext-like environment with links between the e-text and dictionary.
Find a word -- say, van Schooten -- hit a key, and the appropriate part of the Dictionary opens up on screen. Hit another key, and return to the e-text. We'd also like to have ports open to other possible add-ons. We are applying for funds to look at other Cartesians. We are also interested in establishing links with other relevant e-texts (e.g., the Mersenne correspondence).

The most important element for us is the text itself. Philosophy is text driven and we need to retain cross-references to the canonical text each step of the way. Any encoding, therefore, will be regulated by the authority of the text, first and foremost.

If you have thoughts that you'd care to share with us, or knowledge of related projects, we'd be grateful to hear about them.

Brian Baigrie
IHPST
University of Toronto
Toronto, Canada M5S 1K7
baigrie@epas.utoronto.ca

P.S. Yesterday, at Willard McCarty's suggestion, I started a journal for our project on the assumption that it might be of interest to specialists in TEI.
=========================================================================
Date: Tue, 6 Jul 1993 18:55:36 CDT
Reply-To: "Prof. Dr. Koehler"
Sender: "TEI-L: Text Encoding Initiative public discussion list"
From: "Prof. Dr. Koehler"
Subject: Dateitransfer

(Please redistribute on relevant bulletin boards)

CALL FOR PAPERS

A new journal on QUANTITATIVE LINGUISTICS will be launched in 1994. Authors are invited to submit 4 copies of their article, provided it is within the scope of the journal.

Scope

The Journal of Quantitative Linguistics is a new international forum for the publication and discussion of research on the quantitative characteristics of language and text in an exact mathematical form.
This approach, which is of growing interest, opens up important and exciting theoretical perspectives, as well as solutions for a wide range of practical problems such as machine learning or statistical parsing, by introducing into linguistics the methods and models of advanced scientific disciplines such as the natural sciences, economics and psychology.

Specifically, JQL will publish papers on:

1) Observations and descriptions of all aspects of language and text phenomena, including the areas of psycholinguistics, sociolinguistics, dialectology, pragmatics, etc. as far as they use quantitative mathematical methods (probability theory, stochastic processes, differential and difference equations, fuzzy logics and set theory, function theory etc.), on all levels of linguistic analysis.

2) Applications of methods, models, or findings from quantitative linguistics to problems of natural language processing, machine translation, language teaching, documentation and information retrieval.

3) Methodological problems of linguistic measurement, model construction, sampling and test theory.

4) Epistemological issues such as explanation of language and text phenomena, contributions to theory construction, systems theory, philosophy of science.

Audience

The Journal of Quantitative Linguistics will be important reading for all researchers in the following disciplines who are interested in quantitative methods and observations: linguistics, mathematics, statistics, artificial intelligence, cognitive science, and stylistics.

The Journal is edited by Reinhard Koehler, linguistische Datenverarbeitung, University of Trier, Germany and will initially have 3 issues a year.

Associate Editors:
Gabriel Altmann, Bochum
Sheila Embleton, York

Assistant Editor:
Peter Schmidt, Trier

Editorial Board:
Harald Baayen, Nijmegen
Kenneth Church, Murray Hill, NJ
Jacques Guy, Clayton (Australia)
Christian Delcourt, Liege
Luděk Hřebíček, Prague
Tatsuo Miyajima, Osaka
Hiroshi Nakano, Tokyo
Rajmond G. Piotrowski, St. Petersburg
Anatolij A. Polikarpov, Moscow
Roland Posner, Berlin
Burghard Rieger, Trier
Jadwiga Sambor, Warsaw
Pauli Saukkonnen, Oulu/Helsinki
Royal Skousen, Provo, Utah

How to Submit

Send 4 copies to one of these addresses:

The Journal of Quantitative Linguistics
Editorial Office
Swets & Zeitlinger / SPS
P.O. Box 825
2160 SZ Lisse
The Netherlands

or

The Journal of Quantitative Linguistics
Editorial Office
Swets & Zeitlinger / SPS
P.O. Box 613
ROYERSFORD, PA 19468
U.S.A.

For further information please contact by e-mail: koehler@ldv01.uni-trier.de or Scrivy@swets.nl
----------------------------------------------------------------------------
=========================================================================
Date: Tue, 6 Jul 1993 18:56:16 CDT
Reply-To: "C. M. Sperberg-McQueen"
Sender: "TEI-L: Text Encoding Initiative public discussion list"
From: "C. M. Sperberg-McQueen"
Subject: TEI Advisory Board has met

TEI Advisory Board Meeting

The Advisory Board of the Text Encoding Initiative met on Monday and Tuesday, 28-29 June, in Chicago. After an intensive review of the technical content of the draft Guidelines for Text Encoding and Interchange (TEI P2), and of the TEI's plans for future dissemination, introductory manuals, workshops, evaluation efforts, and further development, the board approved the work done to date.

With the completion of this review by the advisory board, the TEI has nearly completed the preparation of its Guidelines for the encoding and interchange of machine-readable texts.
After completion of further editorial revision to ensure consistency and clarity, and some further substantive improvements, the Guidelines will be published late this year as document TEI P3. (Further announcements will be made on this and other lists.)

At that point, the focus of the TEI will shift to active dissemination of the Guidelines, including the organization of workshops, the preparation of short introductory manuals, and consulting with projects who can use assistance applying the TEI to their materials. Technical work will continue on both new and continuing topics, so that the Guidelines can remain current and useful in as broad a range of research-oriented applications as possible.

Many thanks are due on behalf of the entire research community to those who have served on the TEI work groups and working committees, who have made the Guidelines a serious contribution to the field.

-C. M. Sperberg-McQueen
 Lou Burnard
 Editors, Text Encoding Initiative
=========================================================================
Date: Sun, 11 Jul 1993 00:10:43 CDT
Reply-To: "Henry S. Thompson"
Sender: "TEI-L: Text Encoding Initiative public discussion list"
From: "Henry S. Thompson"
Subject: Re: TEI Advisory Board has met
In-Reply-To: "C. M. Sperberg-McQueen"'s message of Tue, 6 Jul 1993 18:56:16 CDT

I am concerned that it now appears that P2 will not be available for public comment (we have something like 8 of a promised 80 fascicles) before P the 3rd and last is brought down from the mountain. Please either correct my misunderstanding or justify this change of plan.

ht
=========================================================================
Date: Mon, 12 Jul 1993 10:39:01 CDT
Reply-To: Lou Burnard
Sender: "TEI-L: Text Encoding Initiative public discussion list"
From: Lou Burnard
Subject: Wot? no fascicles?

Spurred on by Henry Thompson's recent enquiry about the absence of any published fascicles of P2 over the last three months, I take this opportunity to apologize to readers of this list for the regrettable hiatus in their summer reading.

Just to set the record straight: in fact eleven fascicles of P2 have so far appeared (not 8!), out of an expected total of 42 (not 80!). It should also be noted that the fascicles vary very greatly in size and significance: if you have collected the complete set so far, you have considerably more than a quarter of what will eventually appear as P2, both in terms of bulk and in terms of intellectual content.

The last fascicle published was ST (chapter 3: structure of the TEI dtd) which appeared at the beginning of April. During May and June, the TEI editorial offices have been more or less entirely devoted to the task of organizing complete drafts of P2 for review by first the TEI Technical Review Committee (meeting in mid May) and second the TEI Advisory Board (meeting at the end of June). This does not mean that work on individual chapters has ceased: only that we have not had time to push them out for publication individually. The process of creating for the first time an integrated draft of the whole of P2 (and still more the process of creating the TEI dtd) has inevitably shown up inconsistencies between individual chapters, which has further discouraged us from issuing them as free-standing objects.

It's our intention to publish all of P2 as fascicles eventually. As we now have only a couple of months before copy is locked for inclusion in the full published draft, you may expect to see a flurry of fascicles along in the near future. Any resemblance to London buses is entirely accidental!

Lou Burnard

p.s. For convenience, I append a list of currently-published fascicles.
These are all currently available either from Listserv or by anonymous ftp from sgml1.ex.ac.uk (and elsewhere)

DATE    FILE  CHAPTER
Apr 92  34    10  Base for Transcriptions of Spoken Texts (renamed TS)
Jul 92  21     4  Characters and character sets (renamed CH)
Aug 92  22     5  The TEI Header (renamed HD)
Nov 92  GR    42  SGML Grammar
Dec 92  TE    13  Base for Terminological Data
Dec 92  CO     6  Elements available in all TEI documents
Jan 93  SA    16  Segmentation and Alignment
Mar 93  AB     1  About these Guidelines
Mar 93  CC    26  Language Corpora
Apr 93  DS     7  Default Text Structure (replaces version of PR released Oct 92)
Apr 93  ST     3  Structure of the TEI dtd
=========================================================================
Date: Mon, 12 Jul 1993 11:17:38 CDT
Reply-To: "Wendy Plotkin, TEI (312) 413-0331"
Sender: "TEI-L: Text Encoding Initiative public discussion list"
From: "Wendy Plotkin, TEI (312) 413-0331"
Subject: Obtaining P2 Fascicles

As a follow-up to Lou Burnard's note, I would like to remind TEI-L subscribers of the two documents that include information on how to obtain the electronic versions of the published P2 fascicles.

The TEI-L filelist lists each of the fascicles that are currently published, in a variety of formats (e.g. P2CH Doc for the ASCII, P2CH PS for the Postscript). To obtain the filelist, send a note to Listserv@uicvm or Listserv@uicvm.uic.edu with the message:

    Index TEI-L

More detailed information on how to obtain the fascicles from the Listserv in the United States and the FTP and Listserv sites in Europe and Asia is available in the document TEI ED J8. To obtain this document in ASCII, send a note to Listserv@uicvm or Listserv@uicvm.uic.edu with the message:

    Get EDJ8 Memo

If you have any questions about this procedure, please contact me at U49127@uicvm or U49127@uicvm.uic.edu.

Wendy Plotkin
=========================================================================
Date: Tue, 13 Jul 1993 15:58:58 CDT
Reply-To: Jorn Barger
Sender: "TEI-L: Text Encoding Initiative public discussion list"
From: Jorn Barger
Subject: TEI *Email* DTD?

Are there plans for a DTD for email? It's not like I need one, but I'm curious....

(In fact, what set me thinking that was I noticed how in email sometimes *emphasis* will be added to quoted text:

> ...when I was speaking of the blablablah...
                                ^^^^^^^is this a hidden slur on Blab-Lab? ;^)

and I wondered what the terminology was for this...

jorn barger
jorn@chinet.com ;^)
=========================================================================
Date: Thu, 15 Jul 1993 14:49:29 CDT
Reply-To: Peter Flynn
Sender: "TEI-L: Text Encoding Initiative public discussion list"
From: Peter Flynn
Subject: Re: TEI *Email* DTD?

> Are there plans for a DTD for email?

Already there. MIME (Multipurpose Internet Mail Extensions).

///Peter
=========================================================================
Date: Thu, 15 Jul 1993 14:55:40 CDT
Reply-To: sjd@ebt.com
Sender: "TEI-L: Text Encoding Initiative public discussion list"
From: sjd@ebt.com
Subject: Re: TEI *Email* DTD?

At 3:58 PM 7/13/93 -0500, Jorn Barger wrote:
>Are there plans for a DTD for email?
>

I don't think we addressed that; but if you need one I've written a simple one that I've found quite useful -- especially because it comes with C code to convert mail files into it (it's fairly good at detecting components in the mail bodies, though I forget if I wrote it to trap words surrounded by underscores/asterisks/etc. -- I know I thought about it at the time).

Steve DeRose
=========================================================================
Date: Fri, 16 Jul 1993 14:24:14 CDT
Reply-To: Erik Naggum
Sender: "TEI-L: Text Encoding Initiative public discussion list"
From: Erik Naggum
Subject: Re: TEI *Email* DTD?
In-Reply-To: <9307141106.AA18218@curia.ucc.ie>

----------------------------Original message----------------------------
[Peter Flynn] :
| > Are there plans for a DTD for email?
|
| Already there. MIME (Multipurpose Internet Mail Extensions).

Whatever MIME is, it is not a DTD, and it has nothing to do with SGML.

I don't know of any specific plans for a DTD for e-mail.

Best regards,
--
Erik Naggum                                   ISO 8879 SGML
Chairman, SGML SIGhyper                       ISO 10744 HyTime
"Memento, terrigena. Memento, vita brevis."   ISO 10646 UCS
=========================================================================
Date: Fri, 16 Jul 1993 14:24:42 CDT
Reply-To: Peter Flynn
Sender: "TEI-L: Text Encoding Initiative public discussion list"
From: Peter Flynn
Subject: Re: TEI *Email* DTD?
In-Reply-To: <19930715.021@sfo.naggum.no> from "Erik Naggum" at Jul 15, 93 02:39:39 pm

>
> Whatever MIME is, it is not a DTD, and it has nothing to do with SGML.

I've obviously been misled by the discussions on comp.mail.mime for which my apologies: I thought there actually was a MIME.DTD around.

> I don't know of any specific plans for a DTD for e-mail.

In that case, nor do I, and I'm not sure that mail as a concept could easily be made the subject of a DTD. Mail messages vary so widely (and wildly) in their content that even a single all-embracing DTD probably couldn't cover all the things people want to do in mail messages.

What might be practicable though would be a header like X-DTD: which could be detected by MUAs and cause a switch to a display system which would implement the specified .dtd, but I'm still not convinced we need it.

///Peter
=========================================================================
Date: Fri, 16 Jul 1993 14:25:13 CDT
Reply-To: Erik Naggum
Sender: "TEI-L: Text Encoding Initiative public discussion list"
From: Erik Naggum
Subject: Re: TEI *Email* DTD?
In-Reply-To: <9307152250.AA13732@curia.ucc.ie>

[Erik Naggum] :
| Whatever MIME is, it is not a DTD, and it has nothing to do with SGML.

[Peter Flynn] :
| I've obviously been misled by the discussions on comp.mail.mime for
| which my apologies: I thought there actually was a MIME.DTD around.

I have not been following those discussions closely of late, but MIME used to be concerned only with the body of mail messages, and I tacitly assumed that we were talking about e-mail as a whole, not just the bodies. In fact, I do not consider the body part specifications in MIME to be very significant, but I also know I'm in the minority here.

However, the mail headers can relatively easily be expressed in SGML, and a DTD for that should be possible to design without much effort. The biggest problem is the ability to add headers that are not specified in RFC 822 or updates, so that there are headers for which the semantics is known, and which should have their own element types to capture their structure, and headers for which no meaning is defined, and whose header names need to be preserved. This distinction tends to complicate the picture somewhat.

| In that case, nor do I, and I'm not sure that mail as a concept could
| easily be made the subject of a DTD. Mail messages vary so widely (and
| wildly) in their content that even a single all-embracing DTD probably
| couldn't cover all the things people want to do in mail messages.

For the body part stuff, I'm sure a DTD can be written up for MIME, but I'm not so certain about the other stateful stuff that MIME talks about in those body parts. Character sets is one of them.

| What might be practicable though would be a header like X-DTD:
| which could be detected by MUAs and cause a switch to a display system
| which would implement the specified .dtd, but I'm still not
| convinced we need it.

This is not a very good solution. It is better to allow any DTD as long as it comes with a link process definition (LINK) that uses presentation attributes defined for the MIME MUA. Then people can use whatever DTD they want, and still get useful things out of it.
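
[Illustration added in editing; not part of the original message.] The distinction drawn above, between headers with known semantics that get their own element types and headers whose names are merely preserved, might be sketched roughly as follows. The element names (mailhdr, header), the choice of "known" headers, and the function name headers_to_markup are all assumptions made for this sketch; no such DTD is specified anywhere in the thread.

```python
from email import message_from_string
from xml.sax.saxutils import escape, quoteattr

# Hypothetical mapping: headers assumed to have defined semantics are
# given their own element type. The selection here is illustrative only.
KNOWN_HEADERS = {
    "from": "from",
    "to": "to",
    "subject": "subject",
    "date": "date",
}


def headers_to_markup(raw_message: str) -> str:
    """Render the headers of an RFC 822 style message as SGML-like markup.

    Known headers become dedicated elements; any other header becomes a
    generic <header> element that keeps the original name in an attribute.
    """
    msg = message_from_string(raw_message)
    lines = ["<mailhdr>"]
    for name, value in msg.items():
        element = KNOWN_HEADERS.get(name.lower())
        if element is not None:
            lines.append(f"<{element}>{escape(value)}</{element}>")
        else:
            # Unknown header: no semantics assumed, name preserved verbatim.
            lines.append(
                f"<header name={quoteattr(name)}>{escape(value)}</header>"
            )
    lines.append("</mailhdr>")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "From: a@example.org\nSubject: TEI & SGML\nX-DTD: tei.dtd\n\nbody\n"
    print(headers_to_markup(sample))
```

In this sketch an unrecognised header such as X-DTD falls through to the generic element, which is one way of preserving header names for which no meaning is defined.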

Best regards,
--
Erik Naggum                                   ISO 8879 SGML
Chairman, SGML SIGhyper                       ISO 10744 HyTime
"Memento, terrigena. Memento, vita brevis."   ISO 10646 UCS
=========================================================================
Date: Fri, 16 Jul 1993 14:25:39 CDT
Reply-To: Peter Flynn
Sender: "TEI-L: Text Encoding Initiative public discussion list"
From: Peter Flynn
Subject: Re: TEI *Email* DTD?
In-Reply-To: <19930715.027@sfo.naggum.no> from "Erik Naggum" at Jul 15, 93 04:28:41 pm

Erik writes:

> I have not been following those discussions closely of late, but MIME used
> to be concerned only with the body of mail messages, and I tacitly assumed
> that we were talking about e-mail as a whole, not just the bodies, in
> fact, I do not consider the body part specifications in MIME to be very
> significant, but I also know I'm in the minority here.

I had never actually considered anything _except_ the body-part. It never dawned on me that someone might want to express headers in SGML (why would someone want to do that? unless building a textbase of email messages, which is not at all what I was thinking of).

> However, the mail headers can relatively easily be expressed in SGML, and a
> DTD for that should be possible to design without much effort. The biggest
> problem is the ability to add headers that are not specified in RFC 822 or
> updates, so that there are headers for which the semantics is known, and
> which should have their own element types to capture their structure, and
> headers for which no meaning is defined, and whose header names need to
> be preserved. This distinction tends to complicate the picture somewhat.

Yes, this would certainly be possible. Looking a bit further into the future, it might be more productive to tackle X.400 structures, which are better-defined, and probably the way things will ultimately go. I know it doesn't solve the problem of what to do with RFC822 headers, but if those are going to die, do we need to spend a lot of effort on them?
Unless, as said, one is constructing a corpus of email...

> For the body part stuff, I'm sure a DTD can be written up for MIME, but I'm
> not so certain about the other stateful stuff that MIME talks about in
> those body parts. Character sets is one of them.

My understanding (already shewn to be faulty :-) was that MIME and HTML held a number of things in common. The spur for development was the perceived need to be able to represent purely visual attributes (italics, bold and the like) in email, and to add a hypertext dimension, enabling some form of live x-reference between messages, or between a message and a URL elsewhere. But I really haven't gotten into it: I downloaded a copy of metaMail because MIME is turned ON in my version of elm, and I wanted to see what it did. It failed to compile tho, so I left it.

> This is not a very good solution. It is better to allow any DTD as long as
> it comes with a link process definition (LINK) that uses presentation
> attributes defined for the MIME MUA. Then people can use whatever DTD they
> want, and still get useful things out of it.

I'd like to see this perform at an acceptable speed if it has to read and act on a DTD in real time :-)

///Peter
=========================================================================
Date: Tue, 20 Jul 1993 19:22:11 CDT
Reply-To: christian wittern
Sender: "TEI-L: Text Encoding Initiative public discussion list"
From: christian wittern
Subject: TEI for Buddhist Text Database??

Hello,

At the shaping stages for a large text database project, which eventually might include the whole Buddhist Canon in Chinese in several hundred volumes, I'm trying to establish whether the TEI Guidelines are applicable to our project, and I need some advice.

1. After scanning through some megabytes of DOC's, DTD's and Drafts I got the impression that coding the text in Chinese characters would require the proper writing system declaration and character set declaration somewhere in the beginning.
Does this mean that all the rest, i.e. the tags etc., will also be in this writing system (that is, in double-byte characters)? Is there any software that can handle this?

2. For some texts, we want to incorporate a translation in the e-text. Are there some special tags available for doing so, or do we have to use the usual pointer-references?

3. Apparently the part of the P2 draft that deals with critical editions is not yet released. What approach is recommended in the meantime to code such texts?

4. Are there any complete examples of TEI-conformant coded texts that could be studied for reference?

Any help would be appreciated,

Christian Wittern, Kyoto
International Research Institute for Zen-Buddhism, Hanazono College
=========================================================================
Date: Tue, 27 Jul 1993 14:48:59 CDT
Reply-To: Origami Ltd <71461.2021@CompuServe.COM>
Sender: "TEI-L: Text Encoding Initiative public discussion list"
From: Origami Ltd <71461.2021@CompuServe.COM>
Subject: ISO Standard 12 620 - information wanted

I have attempted to get a hold of ISO 12 620 (referred to extensively in "Base Tag Set for Terminological Data" of TEI P2) through ANSI and was unable to. They said that it is still in the working group stage and they are unable to get any information on it.

Is there a way to get a draft copy of this standard? If not, is there at least a document describing the goals that standard is trying to meet? Does anybody know how close this standard is to publication?

Any information would be greatly appreciated,

Craig Ogg (cogg@attmail.com)
Dir. of Information Products
Origami, Ltd.