OSIS™ 2.0 User's Manual (draft)
Draft Version of OSIS User's Manual
Note the updated schema and user's manual numbers. One of
our users has already spotted a bug, which has been
corrected. Make sure that the schema and the user's manual you are
consulting carry matching version numbers.
As you go through this guide to the OSIS 2.0.1 schema, you are going
to notice mistakes, omissions and examples you don't find
useful. Those were not left as an exercise for the reader.
The editors discussed having a registry of Bible verses for people
who contribute corrections, supply omissions or examples but
feared that there might be more corrections, supplied omissions
or examples than there are verses in the Bible. Not to mention
that some verses are more popular than others.
So, as an alternative, future versions of the OSIS User's
Manual will have a Contributor's section, which will list your
name and the number of corrections or supplied omissions/examples
that you have contributed to the manual. Counting by the editors
will be final but generous and credit given for duplicates or
suggestions not ultimately used in the form submitted. Please
specify if you want your email contact information included as
well. Address your comments, corrections, supplied
omissions/examples, to osis-editors@bibletechnologieswg.org.
This manual is meant to be a guide for all users of the OSIS
schema and your assistance will be appreciated by both the editors
and the community of OSIS users.
1. Introduction to OSIS™
Welcome to the OSIS (Open Scriptural Information Standard™)
User's Manual. OSIS is a set of XML structures that can be used to
produce Bibles, commentaries, and related texts that can be easily
interchanged with other users, formatted as HTML, PDF, Postscript or
any other desired format, and searched on any personal computer. It
provides a standard way to express such documents, which is important
because it saves time, money, and effort for:
- authors, who will have less need to adjust their manuscripts for each
different potential publisher;
- publishers, who will gradually come to experience lower costs by not
having to manage converting texts presented by authors in so wide a
variety of formats, and by not having to provide texts in a different
form to each electronic-book system vendor out there (or pay
indirectly for those vendors to do the conversions);
- and software vendors, who can avoid writing a lot of code to manage
different formats, and thus make their programs smaller, faster, and
more reliable.
The OSIS development team closely studied previous Bible encoding
forms, as well as tools for literary encoding in general. By doing
this we hope we have avoided some weaknesses, and gained from some
strengths, of each one, and we thank the many people who worked on
those prior specifications, as well as those who have provided help
and feedback in developing OSIS itself, and testing it by encoding
large numbers of Biblical and related texts. A list of participants
may be found in an Appendix.
Users familiar with the Text Encoding Initiative will find OSIS
markup quite familiar, because the bulk of the elements we define
correspond directly to TEI elements, and almost always have the same
name (though often simplified content). The schema also provides a
TEIform attribute for such elements, so they can be recognized by
form-aware processors as equivalent to their TEI counterparts. We
have attempted to point out any elements below that do not have TEI
equivalents, for the sake of anyone using both systems.
OSIS is provided as a free resource by the Bible Technologies
Group™ (or BTG™), which is a collaborative effort of the
American Bible Society, the Society of Biblical Literature, the
Summer Institute of Linguistics, the United Bible Societies, other
Bible Societies and related groups, and individual volunteers around
the world. OSIS is designed to meet the needs of diverse user
communities who read, study, research, translate or distribute
biblical texts. This introduction gives a brief overview of OSIS
before leading you step by step through producing your first OSIS
text.
For more information on OSIS, you may wish to join the OSIS Users'
Group. To do so, send mail to osis-user@whi.wts.edu, setting the
Subject line to "subscribe". Online information about OSIS is also
available at http://www.bibletechnologies.org and
http://www.bibletechnologieswg.org.
2. Getting started
The first question that is often asked when learning that OSIS
uses XML (a markup language) is: "I'm not a computer person. Can
I learn to use OSIS?" If you can type and use even the most
basic word processor or computer text-editing program, the answer is
clearly "Yes!" OSIS was designed to offer the beginning
user a simple way to do the basic "markup" required for a standard
biblical text. "Markup" refers to markers placed within the text,
that indicate where useful units (or "elements") such as verses,
quotations, cross-references, and other things begin and end.
If you know HTML, you already know most of what you need to know
to use OSIS; OSIS uses the same pointy-bracket syntax as HTML (or
XHTML to be completely precise). It merely provides a different set
of element and attribute names. A few names such as "p" and "div" are
the same; others are new, such as "verse". The core set of elements
for OSIS is actually smaller than the set for HTML 3.2. To be sure,
there are some complex cases that we deal with later, but you can do
useful work with no more information than is provided in this basic
manual.
The second question that is most often asked is: ‘Do I need an
XML editor to do OSIS?’ This question often comes up after a
friend of a friend has recommended some editor, and you then checked
its price. XML editors vary from free to over $10,000.00 (US), and
many are difficult to use (though XMetal™ is a notable
exception, and not very expensive).
The basic answer is no, you do not need any special software. You can
use any text editor you like to create OSIS documents (or any other
XML documents, for that matter). Many will even color the tags for
you, because they know how to color HTML tags and the languages are
similar enough. However, you should have a way to check your documents
for errors -- if your editor doesn't know enough about XML to warn you
if you misspell a tag, or forget to end some element that you started,
you will want to check for errors periodically using an "XML
validator". Many such programs are available for various computers;
some are available as Web services. (See Appendix, Validating Your
OSIS Document for pointers and instructions on web based validation
services.) Both Internet Explorer and Netscape can also validate an
OSIS file once you have installed the OSIS rules file (called a
"schema") and an appropriate stylesheet.
An OSIS-aware text editor will do this checking for you, either on
demand or continuously. A friendlier OSIS-aware text editor will
provide help by showing you just which elements are permitted at any
given place. The friendliest editors also give you the option to see
and edit a fully-formatted view on demand, rather than staring
directly at pointy-brackets. The choice between the many tools is a
personal one, dictated by your working style, level of technical
sophistication, goals, budget, and other factors.
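The periodic error check described above can be approximated with nothing more than a scripting language. The following is a minimal sketch in Python, assuming only the standard library; the function name is hypothetical. It tests whether a file's text is well-formed XML (every element that was started is ended, tags nest properly). It does not check the text against the OSIS schema, so a misspelled but properly closed element name would still pass; a real XML validator is needed for that.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(xml_text):
    """Return None if xml_text is well-formed XML, else a short error message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # err.position is a (line, column) pair reported by the parser
        line, column = err.position
        return f"line {line}, column {column}: {err}"
    return None


# The second string has a misspelled end-tag, so the element is never closed
good = "<work><identifier type='ISBN'>0809617250</identifier></work>"
bad = "<work><identifier type='ISBN'>0809617250</identifer></work>"

print(well_formedness_error(good))
print(well_formedness_error(bad))
```

The first call prints None; the second prints a message locating the mismatched tag.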
3. Some authoring tools
The OSIS team is working even as this manual is being written to
adapt free authoring tools that will hide most if not all of the
markup from the casual user of OSIS. In the meantime, the best way to
learn OSIS is to use a simple text editor, such as WordPad or Kedit
on Windows, BBEdit or Alpha on MacOS, or vi or emacs on Linux. You
can even use a word processor, though any formatting that you do in
it won't matter (you would simply save the file as "text only").
The examples in this manual have been kept deliberately short and
can be downloaded as a package from the OSIS website. After you have
gained some basic skill using OSIS, you may want to try out more
sophisticated editors.
Editing is much easier with an editing program that is aware of XML
rules in general, and OSIS in particular. For example, rather than
seeing literal tags with pointy-brackets, you can have a choice of
seeing that, or structural views of your document (say, as a tree or
expandable outline), or fully-formatted views to facilitate print
layout.
Many products are available that can help you edit XML documents. One
style shows the literal XML source file, but colors tags, attributes,
and other things to make them stand out. Most such programs also read
an XML schema and ensure that you only insert elements and attributes
that are permitted by the OSIS schema (schemas, such as the OSIS one,
declare what elements and attributes are permitted where in documents
of a particular kind). One free and helpful tool of this kind is
jEdit, which runs on most platforms. It can be set up to know about
many kinds of files, including XML files, and OSIS in particular.
With such an editor, you can see or print a basic formatted view by
using most any Web browser. Later in this manual are instructions for
setting up an OSIS file with a style sheet (generally in CSS) so that
typical browsers can deal with it.
There are also more word-processor-like XML editors, which primarily
show a formatted view defined by some style sheet. These are mainly
commercial. XML Spy is one such tool (see http://www.xmlspy.com/);
XMetal (see
http://www.corel.com/servlet/Satellite?pagename=Corel/Products/productInfo&id=1042152754863)
is another.
For high-end layout and typesetting from XML source files, usually a
stylesheet language called XSL-FO is used. Two of the more popular
commercial XSL-FO solutions are 3b2 (see http://www.3b2.com/), and
Antenna House (see http://www.antennahouse.com/). Non-XML-based
composition systems such as Quark™ and TeX generally have ways
to import XML, but using them for XML composition requires
substantial expertise and effort.
4. Your First OSIS Document
Like HTML documents, an OSIS document starts with a header,
and then goes on to the actual text content. The header identifies
the file as being XML, and as using the OSIS schema. It also
provides places to declare a bibliographical description of the work
and of any other works cited; and a place to record a history of
editing changes. Here is a short, but valid, OSIS document:
<?xml version="1.0" encoding="UTF-8"?>
<osis xmlns="http://www.bibletechnologies.net/2003/OSIS/namespace"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.bibletechnologies.net/2003/OSIS/namespace osisCore.2.0.1.xsd">
<osisText osisIDWork="thisWork" osisRefWork="bible" xml:lang="en-US">
<header>
<work osisWork="thisWork">
<title>Contemporary English Version</title>
<type type="OSIS">Bible</type>
<identifier type="OSIS">Bible.en.CEV.1995</identifier>
<rights type="x-copyright">Copyright 1995 American Bible Society</rights>
<scope>Esth.1.1-Esth.1.4</scope>
<refSystem>Bible</refSystem>
</work>
<work osisWork="bible">
<type type="OSIS">Bible</type>
<refSystem>Bible</refSystem>
</work>
</header>
<div type="section" scope="Esth.1.1-Esth.1.4">
<title>Queen Vashti Disobeys King Xerxes</title>
<p>
<verse sID="Esth.1.1-Esth.1.2" osisID="Esth.1.1 Esth.1.2" n="1-2"/>
King Xerxes of Persia lived in his capital city of Susa and ruled one
hundred twenty-seven provinces from India to Ethiopia.
<verse eID="Esth.1.1-Esth.1.2"/>
<verse sID="Esth.1.3" osisID="Esth.1.3"/>
During the third year of his rule, Xerxes gave a big dinner for all
his officials and officers. The governors and leaders of the provinces
were also invited, and even the commanders of the Persian and Median
armies came.
<verse eID="Esth.1.3"/>
<verse sID="Esth.1.4" osisID="Esth.1.4"/>
For one hundred eighty days he showed off his wealth and spent a lot
of money to impress his guests with the greatness of his kingdom.
<verse eID="Esth.1.4"/>
</p>
</div>
</osisText>
</osis>
5. XML and OSIS declarations
The first several lines of any OSIS document will generally be identical:
The first line above identifies the document as being XML; this
is required in exactly the form shown, and enables computers to
identify how to process the rest of the document.
The second through fourth lines are a very long start-tag for the
outermost OSIS element, which is called "osis." All elements in an
OSIS document must be declared within the OSIS namespace. There
are two ways to achieve this and other than remembering to pick
one of the two following methods, that is all you need remember
about it to start encoding texts using OSIS 2.0.
OSIS Namespace, Method 1: Copy the
following lines just after <?xml version="1.0"
encoding="UTF-8"?>:
<osis xmlns="http://www.bibletechnologies.net/2003/OSIS/namespace"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.bibletechnologies.net/2003/OSIS/namespace osisCore.2.0.1.xsd">
OSIS Namespace, Method 2: Copy the
following lines just after <?xml version="1.0"
encoding="UTF-8"?>:
<osis:osis xmlns:osis="http://www.bibletechnologies.net/2003/OSIS/namespace"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.bibletechnologies.net/2003/OSIS/namespace osisCore.2.0.1.xsd">
Note that with the second method every OSIS element in the document
must carry the osis: prefix, so the last closing element must be:
</osis:osis>. The first method is simpler, but both are
legitimate.
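To see why the two methods are interchangeable, it can help to look at what a namespace-aware parser reports for each. The sketch below uses Python's standard xml.etree.ElementTree module. The two documents are stripped-down fragments made up for illustration (they omit the header and the schemaLocation, so they are not valid OSIS documents).

```python
import xml.etree.ElementTree as ET

OSIS_NS = "http://www.bibletechnologies.net/2003/OSIS/namespace"

# Method 1: default namespace, no prefixes on element names
method1 = f'<osis xmlns="{OSIS_NS}"><osisText/></osis>'

# Method 2: explicit "osis:" prefix on every element
method2 = f'<osis:osis xmlns:osis="{OSIS_NS}"><osis:osisText/></osis:osis>'

root1 = ET.fromstring(method1)
root2 = ET.fromstring(method2)

# ElementTree reports names as "{namespace-uri}localname", so the prefix disappears
print(root1.tag)
print(root1.tag == root2.tag)        # True
print(root1[0].tag == root2[0].tag)  # True
```

Both forms produce the same namespace-qualified element names, which is what OSIS software is expected to look at.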
At this point, the OSIS document has begun. This sample is a single
document rather than a collection of documents, so the next element
opened is osisText:
<osisText osisIDWork="CEV" osisRefWork="Bible" xml:lang="en">
Every osisText needs to supply an osisIDWork attribute and value. The
value will generally be the short name of what is being encoded, in
this case the Contemporary English Version, or CEV. The short name is
defined in the work declaration for the work, described later. The
work element that identifies the work being encoded should be the
first work element, if the text has more than one. This sets things
up for some of the later elements nested within the osisText element.
One such element is work. It requires an osisWork attribute. That
attribute's value has to be the same as the value found on the
osisIDWork attribute of osisText (see line 7 of the sample). Other
elements use or require an osisID attribute, which refers back to the
osisIDWork attribute of osisText (see the verse elements in the
sample).
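The link between osisIDWork and the work declarations can be checked mechanically. The following is a minimal sketch in Python using the standard library; the function name and the trimmed-down sample document are hypothetical, and the check covers only the two points made above (a matching work element exists, and it comes first).

```python
import xml.etree.ElementTree as ET

OSIS_NS = "http://www.bibletechnologies.net/2003/OSIS/namespace"
NS = {"o": OSIS_NS}


def check_work_link(osis_xml):
    """Return a list of problems with the osisIDWork / osisWork link (empty if none)."""
    root = ET.fromstring(osis_xml)
    problems = []
    for text in root.iter(f"{{{OSIS_NS}}}osisText"):
        id_work = text.get("osisIDWork")
        declared = [w.get("osisWork") for w in text.findall("o:header/o:work", NS)]
        if id_work is None:
            problems.append("osisText has no osisIDWork attribute")
        elif id_work not in declared:
            problems.append(f"no work element declares osisWork={id_work!r}")
        elif declared[0] != id_work:
            problems.append(f"work {id_work!r} is declared, but not first")
    return problems


sample = f"""<osis xmlns="{OSIS_NS}">
  <osisText osisIDWork="CEV" osisRefWork="Bible">
    <header>
      <work osisWork="CEV"/>
      <work osisWork="Bible"/>
    </header>
  </osisText>
</osis>"""

print(check_work_link(sample))  # []
print(check_work_link(sample.replace('osisIDWork="CEV"', 'osisIDWork="KJV"')))
```

The second call reports a problem because no work element declares the short name KJV.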
Every osisText also needs to specify what reference or versification
scheme any osisRefs within it refer to. This may or may not be the
same work. Depending on how finely you distinguish things, there are
several major versification traditions, and countless fine-grained
variations. For the present, we identify and reserve names for these
major traditional reference systems:
- NRSVA New Revised Standard Version with Apocrypha
- NA27 Nestle-Aland, 27th Edition of the Greek New Testament
- KJV King James Version or Authorized Version (AV)
- LXX Septuagint
- MT Masoretic Text. Hebrew
tradition varies in several respects, the best known being that
it numbers what is given as a title for Psalms in most English
translations as verse 1, and the beginning of the psalm
in such a translation as verse 2.
- SamPent the Samaritan Pentateuch used a quite different
numbering system.
- Synodal Russian
- Vulg Vulgate
- Loeb This system is used for most classical literature,
though many major works have other systems as well.
OSIS is developing a schema for declaring versification systems
formally, and for declaring some systems in terms of others. This
will enable programs to map between systems. However, at this time we
merely reserve the names above for some systems we know to be
substantially different and important.
6. Canonical vs. non-canonical parts of a work
The element osisText has one other important attribute that is not shown above.
It is called "canonical", and always has a value of "true" or
"false". When true, it asserts that the content is a part of the text
being encoded. For example, the "text" of the Bible includes the content
of books, chapters, and verses but does not include notes,
section-headings added by editors or translators, etc.
The canonical attribute is available on all elements. Its value
inherits in the same manner as xml:lang. Because of this inheritance,
encoders will seldom need to make this attribute explicit. In osisText
this attribute is set to a default value of "true", while in header,
note, and reference that setting is overridden by setting the value of
that attribute to "false."
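The inheritance rule can be pictured as a walk down the document tree. The sketch below, in Python with the standard library, computes the effective canonical value of every element under the assumptions stated in the text: an explicit attribute wins, the four elements named above have built-in defaults, and everything else inherits from its parent. The function name is hypothetical, and the namespace is left off the tiny test document for brevity.

```python
import xml.etree.ElementTree as ET

# Defaults named in the text: osisText is "true"; header, note and reference are "false"
DEFAULTS = {"osisText": True, "header": False, "note": False, "reference": False}


def effective_canonical(elem, inherited=None, out=None):
    """Walk the tree and record the effective canonical value for every element."""
    if out is None:
        out = []
    explicit = elem.get("canonical")
    if explicit is not None:
        value = explicit == "true"   # an explicit attribute always wins
    elif elem.tag in DEFAULTS:
        value = DEFAULTS[elem.tag]   # element-specific default
    else:
        value = inherited            # otherwise inherit from the parent
    out.append((elem.tag, value))
    for child in elem:
        effective_canonical(child, value, out)
    return out


doc = ET.fromstring(
    "<osisText><header><work/></header>"
    "<div><p>text<note>editor's note</note></p></div></osisText>"
)
for tag, value in effective_canonical(doc):
    print(tag, value)
```

Here work comes out as False because it sits inside header, while div and p inherit True from osisText and the note inside p switches back to False.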
In books other than the Bible, a similar distinction holds: the text
proper of Herodotus' Histories must be contained in elements with
canonical="true", while notes, header data, and the like must not.
The meaning of this attribute is limited. It must not be used to
encode interpretive or theological judgements about canonicity. For
example, encoders who include the apocryphal books of the Bible, or
the alternate longer ending to the Gospel of Mark, must mark them as
canonical (whether by default or explicitly). This is simply because
they are part of the text being encoded. Users of a text are never
justified in drawing conclusions about a translator's, editor's, or
encoder's position on questions of inspiration or other theological
questions based on how they set the canonical
attribute, because the attribute does not mean that.
In most cases use of the canonical attribute is
straightforward, and we expect that the default values will almost
always produce the intended result. However, there will arise truly
difficult cases: for example, one may be encoding an ancient text
with annotations of its own. In that case those notes would be
canonical, while any added by the current editor would not be. In
such cases, the practice chosen and its rationale should be described
in the work's documentation.
7. The OSIS text header
The first element within every osisText must be a
header. The header declares
various works (including the work being encoded and any that are
being referenced), and provides a place to keep a revision history of
the text.
7.1. The Revision Description
To record changes or edits to the text, authors and editors are
encouraged to insert a change element every time
significant editing is done. Each change element
should contain a date element which says when
those edits were completed, in the form
yyyy-mm-ddThh:mm:ss
Note that all fields must have exactly the number of digits shown
(4-digit year, 2-digit month, etc.). It is permissible to omit the
time and the preceding "T", thus giving just a date. For example,
December 25th of 1999 CE would be:
1999-12-25
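A date in this form is easy to check by program. The following minimal sketch in Python (standard library only; the function name is hypothetical) first checks the exact digit counts with a regular expression and then asks the datetime module whether the fields name a real date and time.

```python
import re
from datetime import datetime

# Exactly 4-2-2 digits, optionally followed by "T" and 2:2:2 digits
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}:\d{2})?")


def is_valid_change_date(value):
    """True if value has the digit counts shown above and names a real date/time."""
    if not DATE_PATTERN.fullmatch(value):
        return False
    fmt = "%Y-%m-%dT%H:%M:%S" if "T" in value else "%Y-%m-%d"
    try:
        datetime.strptime(value, fmt)  # rejects month 13, day 32, hour 25, ...
    except ValueError:
        return False
    return True


print(is_valid_change_date("1999-12-25"))           # True
print(is_valid_change_date("1999-12-25T08:30:00"))  # True
print(is_valid_change_date("1999-12-5"))            # False: day needs two digits
print(is_valid_change_date("1999-13-25"))           # False: there is no month 13
```

The regular expression is needed because strptime on its own accepts one-digit months and days.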
A date element in the revision description is
followed by any number of p (paragraph) elements,
in which the changes made are summarized. The person responsible for
making the changes should also be identified, using the resp attribute on the change
element.
Recommended practice is that more recent change
elements appear earlier in the document. That is, entries should
occur in reverse chronological order. For example:
<change><date>2003-09-11</date>
<p>sjd: Filling in the gaps. Adding some info for 2.0 as defined
at the Calvin College meetings.</p>
</change>
<change><date>2003-07-01</date>
<p>sjd: Annotated alpha list of elements. Reworked reference and
work sections and added type, scope, and explanations of type and
subtype for work. Explained more elements and attributes.</p>
</change>
<change><date>2003-06-17</date>
<p>sjd: Wrote conformance section. Added lists of elements and
attributes, USMARC list. Inserted placeholders for doc on all element
types. Got document back to XML WF. Wrote CSS stylesheet.</p>
</change>
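Because the date form is fixed-width, reverse chronological order can be verified by comparing the date strings directly. This is a small Python sketch with a hypothetical function name; it assumes all the dates use the same form (all date-only, or all with times).

```python
def is_reverse_chronological(dates):
    """True if each date is no later than the one listed before it.

    Dates in the fixed-width yyyy-mm-dd form sort correctly as plain strings.
    """
    return all(newer >= older for newer, older in zip(dates, dates[1:]))


# The dates of the three change elements shown above, in document order
print(is_reverse_chronological(["2003-09-11", "2003-07-01", "2003-06-17"]))  # True
print(is_reverse_chronological(["2003-06-17", "2003-09-11"]))                # False
```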
7.2. Work Declarations
A work element is a declaration. It provides
information comparable to that found on the title page of a printed
work, using the fields defined by the Dublin Core Initiative (see
http://dublincore.org/).
The work element serves two purposes. The
work element in the header with an osisWork
attribute that matches the osisIDWork in the
osisText element identifies the work in which it occurs -- much
like the title page in a printed work. For example:
<osisText osisIDWork="CEV" osisRefWork="Bible" xml:lang="en">
<header>
<work osisWork="CEV">
Note that the match between osisIDWork="CEV"
in osisText and osisWork="CEV" in the work
element links this osisText to this
particular work element.
Subsequent work elements identify other
works -- much like a citation in a footnote or bibliography in a
printed work. Each assigns a local short name to the work it
declares, using the osisWork attribute. Works so
declared can then be referred to from osisIDs or
osisRefs throughout the text. For Bibles, this short name
should generally be the accepted acronym or abbreviated form of the
translation's name (some standard version abbreviations are listed in
an appendix). No periods, hyphens, spaces, or colons are allowed in
short names.
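The restriction on short names can be expressed as a one-line test. The following Python sketch (hypothetical function name) checks only the four characters ruled out above; the schema itself may impose further constraints that are not covered here.

```python
import re

# Characters the text above rules out: period, hyphen, space, colon
FORBIDDEN = re.compile(r"[.\- :]")


def is_valid_short_name(name):
    """True if name is non-empty and contains none of the forbidden characters."""
    return bool(name) and FORBIDDEN.search(name) is None


for candidate in ["CEV", "NRSVA", "K.J.V", "Cotton-Patch", "My Bible", "CEV:1995"]:
    print(candidate, is_valid_short_name(candidate))
```

Only the first two candidates pass; each of the others contains one of the forbidden characters.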
Note: This mechanism of declaring a short name and using it later
as a prefix, is very similar to the XML Namespace mechanism defined
at http://www.w3.org/TR/xml-names11/.
7.3. The Dublin Core
Each work element describes a single
publication using several pieces of information, primarily title,
creator, date, publisher, identifier and language. All of the standard
"Dublin Core" fields may be used, plus a few OSIS-specific additions
(further information on the Dublin Core system may be found at
http://www.dublincore.org). All of the Dublin Core fields may be
repeated as necessary, but must be encoded in the order shown
here. For example:
<work osisWork="EG">
<title>Egyptian Grammar</title>
<creator role="aut">Alan Gardiner</creator>
<contributor role="dte">Francis Llewellyn Griffith</contributor>
<date event="original" type="gregorian">1927</date>
<date event="eversion" type="gregorian">2003</date>
<type type="x-grammar">Grammar</type>
<publisher>Griffith Institute, Ashmolean Museum, Oxford</publisher>
<language type="ISO-639">EN</language>
<language type="Ethnologue">EG-ancient</language>
<identifier type="ISBN">0900416351</identifier>
<identifier type="LCCN">95230980</identifier>
</work>
<work osisWork="CPV">
<title>Cotton Patch Version of Luke and Acts: Jesus' Doings and
the Happenings</title>
<creator role="aut">Clarence Jordan</creator>
<date event="original" type="gregorian">1969</date>
<date event="eversion" type="gregorian">2003</date>
<type type="x-bible">Bible</type>
<publisher>Association Press
<name type="place">New York, NY</name></publisher>
<language type="ISO-639">EN</language>
<identifier type="ISBN">0809617250</identifier>
<identifier type="LCCN">69-18840</identifier>
<scope osisRef="Luke" />
<scope osisRef="Acts" />
</work>
7.3.1. title
A title element must be
provided in the work element and contain the main
title of the work. Additional titles may also be specified, using the
type attribute to identify them as main, sub,
part, monographicSeries, or another kind of title. No OSIS-specific
types are established for this type attribute.
7.3.2. creator
The creator element is used to specify the person(s) or
organization(s) who are primarily responsible for the intellectual
content of a work. The role attribute must specify the particular role
the primary responsible party played. The most common values would be
aut (author), edt (editor), cmm (commentator), trl (translator). A
short list of such codes appears in Appendix D: Contributor Roles, with the complete set being
found in Appendix G: USMARC Relator Codes. This
list covers an enormous range, and it should seldom if ever be
necessary to use a code not from this list.
7.3.3. contributor
Many people may contribute to a work in roles other than the
primary role listed under creator. They should be listed using the contributor element. Their specific role should be
recorded in the role attribute of their contributor element. See Appendix G: USMARC Relator Codes for the complete list of role
codes provided by the USMARC organization.
7.3.4. date
Date elements in the work element record significant dates in the
production or publication process. Use the event
attribute, as in the examples above, to identify the particular date
contained in each of the date elements. The defined values are:
- original The original publication date of the first
edition
- edition The date of publication of the referenced or
source edition
- imprint The printing date of the referenced or source
edition
- eversion The revision date of the present electronic
edition
The type attribute is used, instead, to
identify the calendrical system in which the date is expressed, from
the list: Chinese, Gregorian, Islamic, ISO, Jewish, and Julian. At
this time, OSIS only defines a syntax for Gregorian dates:
yyyy-mm-dd. See the later section on "Date Formats".
7.3.5. publisher
The publisher element in the work element is
used to identify the publisher of a particular work. If a work was
published by more than one publisher and that publication record needs
to be recorded, use multiple publisher elements and distinguish them
using the type attribute. The description given
in this attribute is not constrained but it is suggested that values
that tie a publisher to a particular edition, such as <publisher
type="1848Edition"> should be used. For cases where full
identification of a publication history is essential, use of multiple
work elements is suggested.
7.3.6. language
A language element must be provided for each
language used substantially in a work. The language may be specified
using ISO 639, ISO 639-2, or SIL Ethnologue codes. The type
attribute must be set to IANA, IETF, ISO-639-1, ISO-639-2, ISO-639-2-B, ISO-639-2-T, LINGUIST, or SIL. In the rare case that none of these is
sufficient, a prose description should be inserted in the element and
the type attribute set to other.
7.3.7. type
The nature or genre of the content of the resource. This element
includes terms describing general categories, functions, genres, or
aggregation levels for content. Dublin Core's recommended best
practice is to select a value from a controlled vocabulary (for
example, the DCMI Type Vocabulary -- see
http://dublincore.org/documents/dcmi-type-vocabulary/). OSIS does not
provide such a controlled vocabulary at this time. If you encode this
element, the controlled vocabulary in use should be identified via
the type attribute (for example, <type type="DCMI">). To describe the physical or
digital manifestation of the resource, use the format element instead.
Note that the Dublin Core type element is distinct from the OSIS type attribute (the latter can occur on any OSIS element,
to distinguish relevant subdivisions of the type).
7.3.8. identifier
The identifier elements provide one or more
formal identifiers for the work. The values to be entered for the type attribute on the identifier element are listed below. Note
that these values must be entered exactly as shown. XML is case
sensitive, that is to say, DEWEY is not
equal to Dewey. Enter the latter one and
you will get an error message.
- DEWEY Dewey Decimal System
- DOI Digital Object Identifier
- ISBN International Standard Book Number
- ISSN International Standard Serial Number
- LCCN Library of Congress Control Number
- OSIS Open Scriptural Information Standard
- SICI Serial Item and Contribution Identifier
- URI Uniform Resource Identifier
- URL Uniform Resource Locator
- URN Uniform Resource Name
ISBN and LCCN numbers must be recorded without spaces or hyphens.
ISBNs must contain ten digits (that is, they must include the final
check digit).
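The ISBN rules above lend themselves to an automatic check. The sketch below, in Python with hypothetical function names, strips the spaces and hyphens that printed ISBNs usually carry and then verifies the standard ISBN-10 check digit (weights 10 down to 1, total divisible by 11). One detail not stated above but part of the ISBN-10 scheme: the final check character may be the letter X, standing for the value 10.

```python
def normalize_isbn10(raw):
    """Strip the spaces and hyphens that printed ISBNs usually carry."""
    return raw.replace("-", "").replace(" ", "")


def is_valid_isbn10(isbn):
    """True if isbn is ten characters long with a correct ISBN-10 check digit."""
    if len(isbn) != 10:
        return False
    values = []
    for position, char in enumerate(isbn):
        if char in "0123456789":
            values.append(int(char))
        elif char == "X" and position == 9:
            values.append(10)  # "X" is allowed only as the check character
        else:
            return False
    # Weighted sum with weights 10, 9, ..., 1 must be divisible by 11
    total = sum(weight * value for weight, value in zip(range(10, 0, -1), values))
    return total % 11 == 0


print(normalize_isbn10("0-8096-1725-0"))  # 0809617250
print(is_valid_isbn10("0809617250"))      # True
print(is_valid_isbn10("0900416351"))      # True
print(is_valid_isbn10("0809617251"))      # False: wrong check digit
```

The two valid numbers are the ISBNs used in the work declarations earlier in this section.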
We strongly recommend the assignment of an ISBN to each
published work using OSIS. This number must, if available, be
specified in the identifier field for the work.
The following examples show identifier
elements used along with their type
attribute to provide an identifier for a work, in this case,
the "Cotton Patch Version of Luke and Acts" noted above:
<identifier type="ISBN">0809617250</identifier>
<identifier type="LCCN">69-18840</identifier>
Note that without the proper type attribute, a
reader or computer only has a string of numbers, which could be
from almost any system of identifiers. The type attribute plays an important role in
making sure the information you so carefully record is
understandable to others or even yourself, after a few months
have lapsed since you looked at the text.
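The case-sensitivity point made earlier in this section (DEWEY is accepted, Dewey is not) can be illustrated with a small helper. This is a Python sketch with a hypothetical function name; the set simply repeats the type values listed above, and the hint it produces is an illustration rather than the message any particular validator gives.

```python
# The type values listed above, spelled exactly as the manual gives them
IDENTIFIER_TYPES = {
    "DEWEY", "DOI", "ISBN", "ISSN", "LCCN", "OSIS", "SICI", "URI", "URL", "URN",
}


def check_identifier_type(value):
    """Return None if value is a listed type, else a hint about what went wrong."""
    if value in IDENTIFIER_TYPES:
        return None
    if value.upper() in IDENTIFIER_TYPES:
        return f"{value!r} differs only in case; use {value.upper()!r}"
    return f"{value!r} is not one of the listed identifier types"


for value in ["DEWEY", "Dewey", "Barcode"]:
    print(value, "->", check_identifier_type(value))
```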
7.3.9. coverage
This element may be used to specify the spatial location (a place
name or geographic coordinates), temporal period (a period label,
date, or date range) or jurisdiction (such as a named administrative
entity) to which the work applies. For example, an edition of
Herodotus could be specified as Greek/Hellenic, Classical Period. Or
a study of medieval Bibles could declare coverage as "medieval".
7.3.10. description
An account of the content of the resource.
Examples of description include, but are not limited to: an
abstract, table of contents, reference to a graphical representation
of content or a free-text account of the content.
7.3.11. format
The physical or digital manifestation of the resource.
Typically, format may include the media-type or
dimensions of the resource. Format may be used to identify the
software, hardware, or other equipment needed to display or operate
the resource. Examples of dimensions include size and
duration. Recommended best practice is to select a value from a
controlled vocabulary (for example, the list of Internet Media Types
[MIME] defining computer media formats).
7.3.12. relation
A reference to a related resource.
Recommended best practice is to identify the referenced resource
by means of a string or number conforming to a formal identification
system.
7.3.13. rights
Information about rights held in and over the resource.
Typically, rights will contain a rights
management statement for the resource, or reference a service
providing such information. Rights information often encompasses
Intellectual Property Rights (IPR), Copyright, and other property
rights. The rights element is informative
only. Legal rights and penalties for violation of those rights vary
from jurisdiction to jurisdiction. Reuse of any resource should be
done only after obtaining the necessary rights and permissions or
ascertaining that none is required.
7.3.14. subject
A topic of the content of the resource.
Typically, subject will be expressed as keywords, key phrases or
classification codes that describe a topic of the resource.
Recommended best practice is to select a value from a controlled
vocabulary or formal classification scheme.
7.3.14.1. subject classification systems
The type attribute on subject
allows the user to specify the classification system in which the
subject entered can be found. For example,
<subject type="ATLA">Fathers of the Church</subject>
means that the subject "Fathers of the Church" is a subject found
in the listing of subjects maintained by the American
Theological Libraries Association (ATLA). To assist users, an
admittedly partial list of the better-known subject
classification systems has been prepared by the OSIS
project. Those systems with their abbreviations for use with
an OSIS encoded text are as follows:
- ATLA American Theological Libraries Association
- BILDI Bibelwissenschaftliche Literaturdokumentation Innsbruck
- DBC Dutch Basic Classification
- DDC Dewey Decimal Classification
- EUT Estonian Universal Thesaurus
- FGT Finnish General Thesaurus
- LCSH Library of Congress Subject Heading
- MeSH Medical Subject Headings
- NLSH National Library Subject Headings (National Library
of Poland)
- RSWK Regeln für den Schlagwortkatalog
- SEARS Sears List of Subject Headings
- SOG Soggettario
- SWD_RSWK Swiss National Library
- UDC Universal Decimal Classification
- VAT Vatican Library
For classification systems not listed, insert the classification
system with a leading "x-" in the type attribute
and notify the OSIS team if that system should be added in a future
revision of the schema.
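The "x-" convention for unlisted systems can be sketched as follows. This Python example uses a hypothetical function name; the set repeats the abbreviations listed above, and "AAT" is used purely as an example of a system that is not on the list.

```python
SUBJECT_SYSTEMS = {
    "ATLA", "BILDI", "DBC", "DDC", "EUT", "FGT", "LCSH", "MeSH",
    "NLSH", "RSWK", "SEARS", "SOG", "SWD_RSWK", "UDC", "VAT",
}


def subject_type(system):
    """Return the value to use in the type attribute of a subject element."""
    if system in SUBJECT_SYSTEMS or system.startswith("x-"):
        return system       # listed system, or already marked as an extension
    return "x-" + system    # unlisted system: add the leading "x-"


print(subject_type("LCSH"))   # LCSH
print(subject_type("AAT"))    # x-AAT
print(subject_type("x-AAT"))  # x-AAT
```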
7.3.14.2. source
A reference to a resource from which the present resource is derived.
The present resource may be derived from the source resource in whole or in part. Recommended best
practice is to identify the referenced resource by means of a string
or number conforming to a formal identification system.
7.3.14.3. type
The nature or genre of the content of the resource.
Type includes terms describing general categories, functions,
genres, or aggregation levels for content. Recommended best practice
is to select a value from a controlled vocabulary (for example, the
DCMI Type Vocabulary [DCT1]). To describe the physical or digital
manifestation of the resource, use the format element.
7.3.15. Non-Dublin Core Elements and Attributes in the Work Declaration
7.3.15.1. scope
The scope element(s) must have an osisRef attribute, which defines what part of the
titled work occurs in this electronic edition. For example, an
edition may consist of only the New Testament and Psalms, or of only
a single book. Contiguous ranges may be specified using the hyphen
notation described later for osisRefs in general; discontiguous
ranges must be specified by including multiple scope element(s), as shown in the second example
above. These should be, but are not required to be, in canonical
order.
7.4. Identifying a Work given a work declaration element
The six elements already described are the primary means of
identifying a referenced work.
If a publication matches all of the above elements within work, it
is presumed to be an acceptable resolution for any reference to that
work as declared.
If no perfect match can be found, applications may, indeed should,
attempt to fall back to the closest available publication. OSIS does
not define a required method of fallback, or define what "closest"
must mean in all contexts. However, one possible approach is to
successively ignore particular elements in this order:
- Identifier: because identifiers are often
ambiguous. For example, hardcover and softcover editions of a book
typically have different ISBNs, and occasionally publishers re-use an
old ISBN for a completely different book.
- Date: because a different imprint or edition of
the same conceptual work is typically adequate. Precisely targeted
links, however, may not refer to the exact location desired.
Applications may wish to ignore all dates except for the original
publication date.
- Publisher: because several publishers may
publish a given work (particularly older works), publishers may change
name, etc.
- Language: Accepting a publication that
does not match in language is a substantial concession. However, some
variations of language are greater than others. For example, some
modern Bible translations are available in separate American and
British English versions, and substituting one for the other is not
unreasonable. This is particularly true because translations
generally use translated titles as well, and so if the language is
not closely related, the title will probably not match either.
Applications may wish to encode some knowledge of language and
dialect similarities to implement more sophisticated
fallback.
- Creator: because some authors have
multiple forms of name: St. Augustine vs. Augustine of Hippo vs.
Augustine. The Bible Technology Group intends to develop an
authority list of normative name-forms for relevant authors, and once
such a list is available, using it will help to avoid such problems.
As with other elements, more sophisticated applications may wish to
attempt some kind of approximate matching in order to achieve better
fallback.
- Title: the final item to discard is
probably title. If a work's title differs, it is probably a different
work, or at least a translation into a non-close language. On the
other hand, some titles have been used by multiple authors, and so a
match on title alone should be considered suspect.
Arguments can easily be made for a variety of other fallback
methods. For example, if the identifier element matches, the work is
probably right, even though an identifier mismatch is not good
evidence that the work is wrong.
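The successive-ignore approach described above can be sketched in Python. This is only an illustration under stated assumptions, not a method defined by OSIS: the function name find_closest, the representation of a work declaration as a plain dictionary, and the use of exact string comparison are all hypothetical. The sketch also stops at a title-only match rather than discarding the title as well, since discarding every element would accept any publication.

```python
# Hypothetical sketch of the "successively ignore elements" fallback.
# Works are plain dicts keyed by element name; none of this is defined by OSIS.

# Order in which elements are ignored, as suggested in the text.
IGNORE_ORDER = ["identifier", "date", "publisher", "language", "creator", "title"]


def find_closest(declared, available):
    """Return (publication, ignored_elements), or (None, None) if nothing matches."""
    # Never ignore the last element (title): a title-only match is the weakest accepted.
    for n_ignored in range(len(IGNORE_ORDER)):
        kept = IGNORE_ORDER[n_ignored:]
        for pub in available:
            # A publication matches if every element still considered is equal.
            if all(pub.get(name) == declared.get(name) for name in kept):
                return pub, IGNORE_ORDER[:n_ignored]
    return None, None
```

A real application would probably replace the equality test with approximate matching for creator names and with some knowledge of related languages, as the list above suggests.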
7.5. Date formats
All dates in the header and in attributes should be in this standard
format, which is based on IETF RFC 3339. However, it uses period
rather than colon as the field separator (for consistency with other
OSIS types), and adds features to allow for dates BCE, for
approximate dates, for date ranges, for yearless dates (as used in
many daily devotionals), for weekly dates, and for named times of day
(such as used in many prayer books). There are 3 standard date
formats; the prefixes that identify them are reserved, and may not be
redefined via the refSysId attribute of any work element:
- yearly:yyyy-mm-ddThh.mm.ss
Any number of fields may be left off from the
right end; for example, if the seconds are dropped (along with the
preceding period), the time refers to the entire minute specified; if
the entire time section is left off (along with the preceding "T"),
the string refers to the entire day.
The year must always have 4 digits. However, the year may be
entirely omitted to indicate dates that apply to any year, such as in
a book of 365 daily readings.
To indicate years before the common era, add an underscore ("_")
before the first digit of the year (immediately following the colon).
A hyphen would be preferable, but it is already in use to indicate
ranges in osisRefs.
The entire date/time string (possibly including a leading
underscore) may be preceded by "~", indicating that the time is
approximate. No means is provided to express just how approximate a
time may be.
- weekly:n
When readings or other materials are specified as being for
particular days of the week, this form must be used. The 'n' value
may range from 1 to 7; 1 indicates Monday, in accordance with ISO
8601:2000.
As an alternative to quantitative times, a small set of named
times is provided, which can be specified in place of the entire
(post-"T") time section (the "T" itself remains). For example:
yearly:06-04T~(Vespers)
would be the identifier for a prayer, reading, or other work to be
used at Vespers on June 4 of any year. The named times (which are
case-sensitive) include: Vigils, Matins, Lauds, Terce, Sext, None,
Vespers, Compline; Sunrise, Sunset; Morning, Afternoon, Evening,
Night; AM, PM;
Fajr, Zuhr, _Asr, Maghhrib, _Isha, Lail, Dzuha, _Id.
Some works will be primarily organized by dates and times: for
example, lectionaries, daily devotionals, prayer books, historical
time lines, etc. In such works, use the osisID
attribute to identify the retrievable portions; the value should be
the applicable time in one of the formats just shown.
Typically, such works are organized in chronological order of the
times specified; however, OSIS does not impose that requirement.
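To make the field rules for the numeric formats concrete, here is a minimal Python sketch of a parser for them. It is an assumption-laden illustration, not part of OSIS: the names parse_osis_date and _fields and the dictionary it returns are hypothetical, and the sketch deliberately does not handle named times of day or date ranges.

```python
# Hypothetical parser for numeric "yearly:" and "weekly:" values (sketch only).
# Named times of day and date ranges are NOT handled here.


def _fields(text, sep, names):
    """Split text on sep and pair the pieces, left to right, with names."""
    parts = text.split(sep) if text else []
    if len(parts) > len(names) or not all(p.isdigit() for p in parts):
        raise ValueError(f"unrecognized fields: {text!r}")
    return {name: int(part) for name, part in zip(names, parts)}


def parse_osis_date(value):
    """Parse a numeric 'yearly:' or 'weekly:' string into a dict."""
    prefix, colon, rest = value.partition(":")
    if prefix == "weekly" and colon:
        if not rest.isdigit() or not 1 <= int(rest) <= 7:
            raise ValueError(f"weekly value must be 1-7: {value!r}")
        return {"kind": "weekly", "day": int(rest)}  # 1 = Monday
    if prefix != "yearly" or not colon:
        raise ValueError(f"unknown date prefix: {value!r}")

    # "~" (approximate) comes first, then "_" (before the common era).
    approximate = rest.startswith("~")
    rest = rest[1:] if approximate else rest
    bce = rest.startswith("_")
    rest = rest[1:] if bce else rest

    date_part, _, time_part = rest.partition("T")
    # A year always has 4 digits; a 2-digit first field means the year is omitted.
    has_year = len(date_part.split("-")[0]) == 4
    date_names = ["year", "month", "day"] if has_year else ["month", "day"]
    if bce and not has_year:
        raise ValueError(f"'_' needs a year: {value!r}")

    result = {"kind": "yearly", "approximate": approximate, "bce": bce}
    result.update(_fields(date_part, "-", date_names))
    result.update(_fields(time_part, ".", ["hour", "minute", "second"]))
    return result
```

Fields left off from the right end are simply absent from the result, so parse_osis_date("yearly:06-04") yields a month and a day but no year, matching the yearless form used for daily readings.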
8. Title Pages
In order to make the encoding of title pages as found in standard
works easier, OSIS 2.0 introduced the titlePage
element. This element contains the following elements from the header:
title, contributor, creator, subject, date, description, publisher, type, format, identifier, source, language, relation, coverage, which are
explained in the material on the header
section. Three additional elements are allowed: figure, milestone, and p. Due to the complexity of title pages, all of these
elements may occur in any order inside the titlePage element.
The titlePage element can occur within the osis, osisText, and osisCorpus elements.
Users just starting with OSIS should use a minimal header and a
simple titlePage element until they have gained
some experience with text encoding and determining what is, or perhaps
more importantly, what is not useful to have encoded in a work.
9. Basic Elements
While book, chapter, and verse numbers are a familiar and useful way
of referring to locations in the Bible, they often conflict with the
boundaries of parables, stories, genealogies, paragraphs, quotations,
and other important units of understanding. Even to print a
well-formatted Bible edition, and much more to support high-end
search, annotation, and other capabilities, these meaningful units
must also commonly be marked.
It is possible to encode a Bible using only book, chapter, and verse
markup. However, most encoders also want to represent sections,
paragraphs, quotations, and so on. Higher-level structures are tagged
as div, for "division", with a type attribute to specify the particular significance.
div elements can occur within other div elements to any number of levels. The first and
outermost div should occur immediately after the
end of the header. For example,
<div type="book" osisID="Gen">
<head>Genesis</head>
<chapter osisID="Gen.1">
<head>1</head>
<verse osisID="Gen.1.1">In the beginning,...</verse>
<verse osisID="Gen.1.2">The earth was formless and void...</verse>
...
</chapter>
</div>
The div element is used for many top-level
components, and so makes heavy use of the type
attribute. The pre-defined types include the most common major
divisions found in present-day Bibles and related works:
acknowledgement, afterword, annotant, appendix, article, back, body, book, bookGroup,
chapter, colophon, commentary, concordance, coverPage, dedication, devotional,
entry, front, gazetteer, glossary, imprimatur, index, introduction,
majorSection, map, outline, paragraph, part, preface, section, subSection, titlePage.
The main body of a Bible will typically consist of div elements of type="bookGroup"
(such as each Testament, the Apocrypha, and perhaps smaller groups
such as the Pentateuch, the Minor Prophets, etc), plus any front and
back matter divisions (the selection of which varies greatly between
editions).
Within each bookGroup div, there will typically
be book divs corresponding to each included
canonical or deuterocanonical book. Some books are divided into
majorSections (such as the sub-books in Psalms), sections (typically
topical divisions with headings), and subSections (occasional minor
divisions within sections). A specific chapter
element is provided and encouraged, though div
type="chapter" is also permissible.
Below this point typical texts switch from successive levels of
div elements, to more specific markup such as
paragraphs, lists, quotations, inscriptions, and the like. Also at
this level, the markup commonly begins to interact with verse markup.
Use of the types defined for div is mandatory
when a provided type is applicable. For example, a colophon must be
marked up as <div type='colophon'>.
If types not provided are needed, they may be added but must begin
with "x-", to distinguish them from OSIS-standard values.
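As a minimal sketch of the two rules above (use a predefined type when one applies; otherwise begin the name with "x-"), a processing tool might check div type values as follows. The function name and the idea of a standalone check are assumptions made for illustration; the list of predefined types is copied from this section, and validation against the schema remains the authoritative test.

```python
# Hypothetical check for div type values (sketch only; not an OSIS tool).
# The predefined names are copied from the list in this section.
DIV_TYPES = frozenset("""
    acknowledgement afterword annotant appendix article back body book
    bookGroup chapter colophon commentary concordance coverPage dedication
    devotional entry front gazetteer glossary imprimatur index introduction
    majorSection map outline paragraph part preface section subSection
    titlePage
""".split())


def is_acceptable_div_type(value):
    """True for a predefined type or a non-empty extension name after 'x-'."""
    return value in DIV_TYPES or (value.startswith("x-") and len(value) > 2)
```

The check cannot tell whether an "x-" type duplicates a predefined one under another name; that judgement is left to the encoder.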
Such markup forms the primary backbone of an OSIS document. Chapter
and verse elements are important (particularly for retrieval), but
considered to be an overlay onto the more linguistic or thematic
structure. Therefore, so long as verses or chapters do not cross the
boundaries of other elements, they may be expressed in the normal
fashion (NASB):
<chapter osisID="Mark.10">
<head>Mark Chapter 10</head>
<div type="section"><head>Divorce</head>
<verse osisID="Mark.10.1">Jesus then left that place and went into
the region of Judea and across the Jordan. Again crowds of people
came to him, and as was his custom, he taught them.
</verse>
<verse osisID="Mark.10.2">Some Pharisees came and tested him by
asking, "Is it lawful for a man to divorce his wife?"
</verse>
<verse osisID="Mark.10.3">"What did Moses command you?" he replied.
</verse>
<verse osisID="Mark.10.4">They said, "Moses permitted a man to write
a certificate of divorce and send her away."
</verse>
<verse osisID="Mark.10.5">"It was because your hearts were hard that
Moses wrote you this law," Jesus replied. </verse>
<verse osisID="Mark.10.6">"But at the beginning of creation God 'made
them male and female.' </verse>
<verse osisID="Mark.10.7">'For this reason a man will leave his
father and mother and be united to his wife,</verse>
<verse osisID="Mark.10.8">and the two will become one flesh.' So they
are no longer two, but one. </verse>
<verse osisID="Mark.10.9">Therefore what God has joined together, let
man not separate."</verse>
<verse osisID="Mark.10.10">When they were in the house again, the
disciples asked Jesus about this. </verse>
<verse osisID="Mark.10.11">He answered, "Anyone who divorces his wife
and marries another woman commits adultery against her. </verse>
<verse osisID="Mark.10.12">And if she divorces her husband and
marries another man, she commits adultery." </verse>
</div>
...
</chapter>
10. Simple paragraphing, quotes, and notes
Paragraphs (element p), quotations (element q), and other grouping elements can be inserted around
groups of verses, as shown below. Likewise, note
elements can be inserted where needed. The paragraph need not give an
osisID for the set of verses it contains, since
they are typically provided on the verse elements
themselves:
...
<p>
<verse osisID="Esth.4.10">Then Esther spoke to Hathach, and gave him
a command for Mordecai: </verse>
<verse osisID="Esth.4.11"><q>All the king's servants and the people
of the king's provinces know that any man or woman who goes into the
inner court to the king, who has not been called, he has but one law:
put all to death, except the one to whom the king holds out the
golden scepter, that he may live. Yet I myself have not been called
to go in to the king these thirty days.</q> </verse>
<verse osisID="Esth.4.12">So they told Mordecai Esther's words.
</verse> </p>
<p>
<verse osisID="Esth.4.13">And Mordecai told them to answer Esther:
"Do not think in your heart that you will escape in the king's palace
any more than all the other Jews. </verse>
</p>
<p>
<verse osisID="Esth.4.14">For if you remain completely silent at this
time, relief and deliverance will arise for the Jews from another
place, but you and your father's house will perish. Yet who knows
whether you have come to the kingdom for such a time as this?"
</verse>
</p>
<p>
<verse osisID="Esth.4.15">Then Esther told them to reply to Mordecai: </verse>
<q>
<verse osisID="Esth.4.16">Go, gather all the Jews who are present in
Shushan, and fast for me; neither eat nor drink for three days, night
or day. My maids and I will fast likewise. And so I will go to the
king, which is against the law; and if I perish, I perish!
</verse></q>
</p>
<p><verse osisID="Esth.4.17">So Mordecai went his way and did
according to all that Esther commanded him.<note
type="textual">Septuagint adds a prayer of Mordecai
here.</note></verse> </p>
Notice in this example that all the paragraphs and quotations still
enclose a whole number of verses; there are exceptions to this
elsewhere in the Bible, which need special handling as explained later.
When tagging quotations, do not also include quotation marks. They
will be generated in the typesetting or display process. This is
important for several reasons. First, if some people use q, some use punctuation marks, and some use both,
anyone processing OSIS texts will have to check every text and
account for all the variations -- this is expensive and
time-consuming: that is, it will make the Bibles cost more (to
someone), and be delivered later. Another reason is that punctuation
for quotes differs around the world; so any given quotation mark may
be meaningless to other communities. In Spanish, for example, there
are special rules about how to mark quotes that continue after an
interruption -- such cases can be distinguished by adding a type
attribute to the q element, with values such as initial, medial,
and final.
Many editions of the Bible have accompanying notes, often of several
distinct types. A number of predefined types, and some additional
internal structure, are discussed later. It is customary to include
the notes directly within the text, at the point to which they apply.
This can be done via the note element, which can
be placed almost anywhere. In the future, it is likely that notes
will more commonly reside outside of the text, instead residing in
special notes-files that can be attached (via osisRef) to any Bible
edition on request.
Every note should have a type
attribute to indicate its purpose; many Bible editions show different
kinds of notes in different places. The pre-defined note types are
listed below; they are not sharply-defined, wholly distinct
categories. In addition, if none of these categories suffice,
encoders may create their own so long as their names begin with "x-".
- allusion The note explains an implicit reference the text
makes to another text or concept.
- alternative The note records an alternate possible reading of
the text, whether due to ambiguity in translation or to manuscript
variation.
- background The note provides background information, such as
cultural norms, explanations of geographic or other information
original readers would have known, and so on.
- citation The note cites a supporting text or further
explanation of some kind.
- crossReference The note provides a cross-reference to a related
passage or other text.
- devotional The note includes information of interest for
devotional reading.
- exegesis The note discusses a relevant point of exegesis or
interpretation.
- explanation The note explains implicit, ambiguous, or
otherwise non-obvious aspects of the passage.
- speaker [2.0] This type is intended mainly for use in sermons
and other performance texts, where the performer may wish to make notes to him or herself. For
example, "tell joke here".
- study The note provides helps for a deeper study of the
passage.
- translation The note discusses an issue of translation, such
as a word whose meaning is unclear in the original, or the reasons
for the translator's choice of phrasing. Bible translation projects
will likely use this heavily, using the subtype attribute to mark the
status of each note as resolved or unresolved, the person responsible
for the note, and so on.
- variant The note records a textual variation in manuscript
tradition, relevant at its location.
Sometimes a verse or chapter
starts or ends in the middle of some other unit, such as a poetic line
group, paragraph, quotation, or speech. In such cases an alternate
form of the verse or chapter
tag must be used. This usage is explained in the next section.
11. Elements that cross other elements
The normal form of an element is a start tag and an end tag: <verse>...</verse>. For handling markup
that crosses boundaries, however, a special form must be used. It
consists of two totally empty instances of the same element type: one
to mark the starting point, and one to mark the ending point. The two
empty elements identify themselves as to which is the start and which is
the end, and co-identify themselves by the sID
attribute (the start of the traditional element) and the eID attribute (the end of the traditional
element), the values of which must match.
Empty elements are indicated in XML by a tag with "/" preceding the
final ">": thus "<verse/>" rather than <verse> or
</verse>. Elements used in this way are commonly called
"milestones", and those particular elements in OSIS that permit
this alternate encoding are thus called "milestoneable". Elements
that are "milestoneable" in the OSIS schema are:
- abbr
- chapter
- closer
- div
- foreign
- l
- lg
- q
- salute
- seg
- signed
- speech
- verse
This is particularly useful where modern translations break up
verses or other traditional divisions in a Bible text. For
example, a paragraph-based encoding of part of the Book of Esther
would appear as follows:
<p>
<verse sID="Esth.2.7" osisID="Esth.2.7"/>Mordecai had a very beautiful cousin named Esther, whose Hebrew name was Hadassah. He had raised her as his own daughter, after her father and mother died.<verse eID="Esth.2.7"/>
<verse sID="Esth.2.8" osisID="Esth.2.8"/>When the king ordered the search for beautiful women, many were taken to the king's palace in Susa, and Esther was one of them.</p>
<p>Hegai was put in charge of all the women,<verse eID="Esth.2.8"/>
<verse sID="Esth.2.9" osisID="Esth.2.9"/>and from the first day, Esther was his favorite. He began her beauty treatments at once. He also gave her plenty of food and seven special maids from the king's palace, and they had the best rooms.<verse eID="Esth.2.9"/>
</p>
There are two things to note about the Esther example:
- Esther 2:8 is divided by a paragraph boundary (the p element) and so must be marked using the verse element as milestones, with the sID and eID attributes
linking those two milestones together.
- Where overlapping elements are necessary, the milestoneable
element technique must be used for the entire text. That is, it
is an error to mark some verses in Esther with traditional verse elements (i.e., as containers) and others
with the milestoneable verses. The reason is quite simple:
inconsistent markup is more difficult to process and makes the
encoded text less useful for everyone.
This is equivalent to the TEI "milestone" method for marking such
phenomena. It has the advantage that milestones representing a given
type of element have the same name as the element, and automatically
have the same attributes. Although XML itself will not detect a
validation error if attributes other than eID are
specified on the ending milestone, eID is specified on the starting milestone, or the start and end milestones are in the
wrong order, each of these conditions is an OSIS error.
For OSIS purposes, there is no semantic difference between marking up
a chapter or verse as a container using a start and end tag, versus
marking it up as a "milestone pair" consisting of two empty tags.
Note: Typesetting and layout systems vary in their ability to
accommodate non-hierarchical markup such as this. Fortunately, in
most Bible editions the only formatting consequence of a verse element is insertion of the verse number, and
perhaps insertion of a line-break; these are within the capabilities
of most layout and style systems even though the verse is not a
container in XML terms.
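Because an XML parser will not report these OSIS errors by itself, a separate check is useful. The following Python sketch uses the standard xml.etree.ElementTree module to test that milestones are empty, that each eID has an earlier start of the same element type with the same value, and that end milestones carry no other attributes. The function name check_milestones and the wording of the messages are assumptions made for illustration; this is not an official OSIS tool, and it does not test for the mixing of containers and milestones.

```python
import xml.etree.ElementTree as ET


def check_milestones(xml_text):
    """Return a list of problems with sID/eID milestone pairs (sketch only)."""
    problems = []
    open_ids = {}  # sID value -> element name of the start milestone
    # iter() visits elements in document order.
    for elem in ET.fromstring(xml_text).iter():
        sid, eid = elem.get("sID"), elem.get("eID")
        if sid is None and eid is None:
            continue  # ordinary container element
        if len(elem) or (elem.text or "").strip():
            problems.append(f"milestone <{elem.tag}> is not empty")
        if sid is not None and eid is not None:
            problems.append(f"<{elem.tag}> has both sID and eID")
        elif sid is not None:
            if sid in open_ids:
                problems.append(f"duplicate start sID={sid!r}")
            open_ids[sid] = elem.tag
        else:
            if len(elem.attrib) > 1:
                problems.append(f"end milestone eID={eid!r} has extra attributes")
            # The start must come earlier and be the same element type.
            if open_ids.pop(eid, None) != elem.tag:
                problems.append(f"eID={eid!r} has no earlier matching start")
    problems.extend(f"sID={sid!r} is never closed" for sid in open_ids)
    return problems
```

An empty list means that none of the conditions tested was found; the text must still be well-formed XML for the parser to read it at all.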
12. Special Text Types
The bulk of the remaining OSIS elements fall into a few simple
classes: First, markup for special text types, such as epistles and
drama. Second, generic structures such as lists, tables and
glossaries (typically found in appendixes of printed Bibles). And
finally, small-scale elements that mark quotations, notes, names,
index entries, and the like.
12.1. Markup for epistles and similar materials
Letters, epistles, and similar texts are marked up in basically the
same way as any other text. However, three special elements are
available for marking portions unique to this genre:
12.1.1. salute
The salute element encloses the salutation or
greeting, typically at the very beginning of a letter. It should
include the whole salutation, including (if present) the "to",
"from", and any following greeting or blessing. If the boundaries of
a salutation are the same as the boundaries of a paragraph, section,
or other unit, that unit should be placed outside, with the salute
element directly within. For example (LBP):
<div type="book" osisID="1Tim">
<head>The First Epistle to Timothy</head>
<chapter osisID="1Tim.1">
<salute>
<verse osisID="1Tim.1.1">FROM: PAUL, a missionary of Jesus Christ,
sent out by the direct command of God our Savior and by Jesus Christ
our Lord -- our only hope.</verse>
<verse osisID="1Tim.1.2">To: Timothy. Timothy, you are like a son
to me in the things of the Lord. May God our Father and Jesus Christ
our Lord show you his kindness and mercy and give you great peace
of heart and mind.</verse>
</salute>
<verse osisID="1Tim.1.3">...</verse>
</chapter>
...
</div>
12.1.2. signed
The signed element surrounds the name of the
author and/or amanuensis of a letter and its immediately surrounding
phrase of opening or closing (if any). In Biblical epistles, it is
common for the author to be named only at the beginning; this should
still be marked up with the signed element.
signed may appear with or without an
accompanying closer or salute
element, and the name may or may not also be tagged as a name (if it is, the name should be
the inner element even if it includes all the text content of the signed element). In New Testament epistles, there is not
generally an obvious, final signature. However, this element may be
used somewhat more broadly of a phrase or portion judged as intended
to identify the writer. As shown below, the signature of an
amanuensis may also be marked up in this way. For example:
- <verse osisID="Rom.16.22"><signed>I Tertius
salute you which wrote this epistle in the Lorde.</signed></verse>
[English, Tyndale, 1525/1530]
- <verse
osisID="1Cor.16.21"><signed>I, Paul, write this greeting with
my own hand.</signed></verse>
[English, RSV]
- <verse
osisID="2Cor.1.1"><signed>Paul, an apostle of Jesus Christ by
the will of God, and Timothy [our] brother, to the church of God
which is at Corinth, with all the saints who are in all
Achaia:</signed></verse>
[English, Webster]
- <verse
osisID="Gal.6.11"><signed>See with what large letters I am
writing to you with my own hand.</signed></verse>
[English, RSV]
- <verse
osisID="Eph.1.1"><signed>Paul, an apostle of Christ Jesus
through the will of God, to the saints that are at Ephesus, and the
faithful in Christ Jesus:</signed></verse>
[English, American Standard Version, 1901]
- <verse osisID=""><signed>Paul, and
Silvanus, and Timothy, to the church of the Thessalonians which is in
God the Father and in the Lord Jesus Christ: Grace to you, and peace.
</signed></verse>
[English, RKJNT]
- <verse osisID="1Tim.1.1"><signed>Paul,
an apostle of Jesus Christ, according to the commandment of God our
Savior, and of Christ Jesus our
hope:</signed></verse>
[English, Douay-Rheims Bible, Challoner Revision]
- <verse osisID="Phm.1.1"><signed>Mimi
Paulo, mfungwa kwa ajili ya Kristo Yesu, na ndugu
Timotheo,</signed> ninakuandikia wewe Filemoni mpendwa,
mfanyakazi mwenzetu</verse>
<verse osisID="Phm.1.2">na kanisa linalokutana nyumbani kwako, na
wewe dada Afia, na askari mwenzetu Arkupo.</verse>
[Swahili NT]
12.1.3. closer
The closer element surrounds the closing portion of a letter,
typically consisting of final greetings or blessing, and a signature
(see signed). It is a matter of judgement just where a closer begins and ends. For example:
- <closer><verse osisID="1John.5.21">Dear
children, keep away from
anything that might take God's place in your hearts. Amen.
Sincerely, <signed>John</signed></verse></closer>
[LBP]
12.1.3.1. benediction
OSIS presently provides no special markup for benedictions and
blessings. Recommended practice at this time, if an encoder wishes to
identify them in a text, is to use seg
type="benediction". For example:
- <verse osisID="2Cor.13.14"><seg
type="benediction">The grace of the Lord Jesus Christ, and the love
of God, and the communion of the Holy Spirit, [be] with you all.
Amen.</seg></verse>
[Webster]
12.2. Dramatic texts
OSIS provides two main features for marking up dramatic texts: A way
to declare the list of characters, or castList;
and a way to identify speeches and speakers in the body of a dramatic
text.
A castList element contains a structured list of
the roles, or cast, of a dramatic work. It is drawn directly from the
TEI structure for the same thing. For example, in the Song of Songs,
some translations may present the list of characters at the start of
the book: lover, beloved, and friends. The same might be done for
Job. However, these elements will be most commonly used for
extra-Biblical materials, such as a play based on the Bible, or
dramas in classical or other literature.
A simple example of a castList is shown below, perhaps for a dramatic
re-enactment of Job:
<castList>
<castGroup>
<head>Cast of characters</head>
<castItem>
<actor>Patrick Durusau</actor>
<role>Job</role>
<roleDesc>A man of God who suffers greatly</roleDesc>
</castItem>
<castItem>
<actor>(a whirlwind)</actor>
<role>God</role>
<roleDesc>The Almighty, who permits Job's suffering, and
responds to his questions about it.</roleDesc>
</castItem>
<castItem>
<actor>(a disembodied voice)</actor>
<role>Satan</role>
<roleDesc>The instigator of Job's suffering</roleDesc>
</castItem>
<castItem>
<actor>Todd Tillinghast</actor>
<role>Eliphaz</role>
<roleDesc>The first of Job's friends to speak</roleDesc>
</castItem>
<castItem>
<actor>Chris Little</actor>
<role>Bildad</role>
<roleDesc>The second of Job's friends to speak</roleDesc>
</castItem>
<castItem>
<actor>Steve DeRose</actor>
<role>Zophar</role>
<roleDesc>The third of Job's friends to speak</roleDesc>
</castItem>
<castItem>
<actor>Troy Griffiths</actor>
<role>Elihu</role>
<roleDesc>The youngest and last of Job's friends to speak,
who was slightly less clueless than the rest.</roleDesc>
</castItem>
</castGroup>
</castList>
The castList element contains the entire cast
list, and consists of one or more castGroup
elements. Multiple castGroups, each with its own head, would be used
if there were multiple sub-groups of the cast to be listed
separately; more typically there will be only one castGroup within a castList.
At this time, castList can only occur in a work declaration, after the Dublin Core elements.
Thus, if a Bible encoder wishes to include the casts of Song of Songs
and of Job, they would each need to be marked as a separate castGroup within that one castList.
The castItem element contains the full information
for a single character. This must include a name for the role being played, and should include a roleDesc, that is, a description of that role. It may
also include the name of an actor, if the text
being encoded represents a particular enactment rather than, say, a
libretto or script.
In general there is no need to also encode an actor name or role name
with an explicit name element, unless the encoder
wishes to provide a normalized form for later reference; in that
case, the name element would be placed just within the actor or role element, not surrounding it.
It is strongly recommended that each castGroup and
castItem have an ID attribute.
Since IDs must be unique across all element types in a document,
encoders may wish to prefix certain kinds of IDs to separate them and
avoid conflicts. For example, an appropriate ID for a castItem representing the Friends in Song of Songs
would be "cast.friends", or perhaps "cast.song.friends".
12.3. speaker
The speaker element is used to identify the person or role that is
uttering the content of an associated speech.
<div osisID="NRSV.Song.2">
<speech>
<speaker>woman</speaker>
<verse osisID="NRSV.Song.2.1">I am a rose of Sharon, a lily of the valleys.</verse>
</speech>
</div>
This is equivalent to:
<div osisID="NRSV.Song.2">
<speech who="woman">
<verse osisID="NRSV.Song.2.1">I am a rose of Sharon, a lily of the valleys.</verse>
</speech>
</div>
Either method is correct, but careful encoders will choose one
and use it consistently. Other than document invalidity, nothing makes use of
an encoded document more difficult than correct but
inconsistent encoding.
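One way to check a text for this kind of inconsistency is sketched below in Python, using the standard xml.etree.ElementTree module. The function name speaker_styles and the two labels it returns are hypothetical, and the sketch assumes element names without an XML namespace prefix.

```python
import xml.etree.ElementTree as ET


def speaker_styles(xml_text):
    """Return the set of ways speakers are identified on speech elements."""
    styles = set()
    # Assumes un-namespaced element names (an assumption of this sketch).
    for speech in ET.fromstring(xml_text).iter("speech"):
        if speech.get("who") is not None:
            styles.add("who-attribute")
        if speech.find("speaker") is not None:
            styles.add("speaker-element")
    return styles
```

A result containing both labels indicates that the text mixes the two methods and may deserve a second look.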
12.4. speech
The speech element is used to indicate quoted direct speech. In
that sense it represents a kind of quotation. However, the q element is to be used for quotations in general,
whereas the speech element is limited to accounts of
an individual making an actual speech in some kind of performance
context. In general, both elements should not be applied to the same
text portion. Just as with the q element, using
the speech element makes quotation marks
unnecessary, and they must not be used. For example:
<chapter osisID="Acts.7">
<head>Stephen's Speech to the Sanhedrin</head>
<verse osisID="Acts.7.1" sID="a71"/>Then the high priest asked him, <speech>Are
these charges true?</speech>
<verse eID="a71">
<verse osisID="Acts.7.2" sID="a72"/>To this he replied:
<speech>Brothers and fathers, listen to me! The God of glory appeared
to our father Abraham while he was still in Mesopotamia, before he
lived in Haran. <verse eID='a72'/>
<verse osisID="Acts.7.3" sID="a73"/>'Leave your country and your people,' God
said, 'and go to the land I will show you.'<verse eID="a73"/>
<verse osisID="Acts.7.4" sID="a74"/>So he left the land of the Chaldeans and
settled in Haran. After the death of his father, God sent him to this
land where you are now living. <verse eID="a74"/>
<verse osisID="Acts.7.5" sID="a75"/>He gave him no inheritance here, not even a
foot of ground. But God promised him that he and his descendants
after him would possess the land, even though at that time Abraham
had no child. <verse eID="a75"/>
<verse osisID="Acts.7.6" sID="a76"/>God spoke to him in this way: 'Your
descendants will be strangers in a country not their own, and they
will be enslaved and mistreated four hundred years. <verse eID="a76"/>
<verse osisID="Acts.7.7" sID="a77"/>But I will punish the nation they serve as
slaves,' God said, 'and afterward they will come out of that country
and worship me in this place.'<verse eID="a77"/>
<verse osisID="Acts.7.8" sID="a78"/>Then he gave Abraham the covenant of
circumcision. And Abraham became the father of Isaac and circumcised
him eight days after his birth. Later Isaac became the father of
Jacob, and Jacob became the father of the twelve
patriarchs.<verse eID="a78"/>
...
<verse osisID="Acts.7.53" sID="a79"/>you who have received the law that was put
into effect through angels but have not obeyed it.
<verse eID="a79"/>
</speech>
...</chapter>
Note that in this example the high priest's short speech in verse 1
is marked up as a normal container element with normal start- and
end-tags, as is Stephen's reply. Note also that all the verse
boundaries have been represented with the milestone form of the verse
element. The reason is simple: Stephen's speech starts inside a verse,
so that verse cannot be a container, and if the encoding used containers
for most verses and switched to milestones only on occasion, the file
would become very difficult to process reliably. When a conflict arises
between the scope of chapter/verse units and other units, the chapter/verse
units give way by being represented as milestones. If a conflict
arises between two other units (say, a quote that encompasses part but
not all of each of two paragraphs), it is left to the encoder's
discretion which of them is represented via milestones.
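As a hedged illustration of that last case, the sketch below assumes that the q element may be written in milestone form (with sID and eID attributes, as the verse elements are above) so that a quotation can begin in one paragraph and end in the next. The text and the identifier q1 are invented for the example.
<p>The elder rose and said, <q sID="q1"/>We have waited long enough.</p>
<p>Let us go up at once.<q eID="q1"/> And all the people agreed.</p>
Here the paragraphs remain ordinary containers and only the quotation gives way; an encoder could equally have chosen the reverse.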
12.5. Marking up poetic material
Although poetic material is commonly called "verse" material, OSIS
avoids that term because of potential confusion with the
book/chapter/verse reference system. Thus, like TEI, OSIS markup of
poetry refers to lines and line groups.
In addition, OSIS provides a typographic line-break element. This is
because in at least some editions of the Bible, the exact placement
of typographic line-breaks within poetic lines is considered very
important; while on the other hand it is determined in part by
presentational concerns (for example, column width), rather than by
linguistic characteristics of either the source or target language.
OSIS provides three main elements for marking up poetic material:
12.5.1. lg
The lg or "line group" element is used to contain any group of
poetic lines. It thus covers units such as a couplet, a stanza, or an
entire poem. Line groups can contain smaller line groups as well.
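A minimal sketch of nested line groups follows. It uses the l element described in the next section; the type values x-stanza and x-couplet are hypothetical labels chosen for illustration, and the line text is placeholder wording, not a quotation.
<lg type="x-stanza">
<lg type="x-couplet">
<l>First line of the first couplet</l>
<l>Second line of the first couplet</l>
</lg>
<lg type="x-couplet">
<l>First line of the second couplet</l>
<l>Second line of the second couplet</l>
</lg>
</lg>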
12.5.2. l
The l element is used to mark poetic lines, as determined by the
linguistic nature of poetry in the language of the work. For example,
much English poetry consists of lines that can be located by the
position of rhyming words, and/or by counting syllables; Hebrew poetry
can often be divided into lines based on parallelism of thought or
meaning.
The following example shows an encoding of the first two verses of
Psalm 7 from the CEV which uses the lg
and l elements to mark poetic
material.
<div type='section' scope='Ps.7.1-Ps.7.17'>
<title>The <divineName type='x-yhwh'>LORD</divineName> Always Does Right</title>
<lg>
<l>
<verse sID='Ps.7.1' osisID='Ps.7.1'/>You, <divineName type='x-yhwh'>LORD</divineName> God,<lb type='x-secondLine'/>are my protector.</l>
<l>Rescue me and keep me safe<lb type='x-secondLine'/>from all who chase me.<verse eID='Ps.7.1'/>
</l>
<l>
<verse sID='Ps.7.2' osisID='Ps.7.2'/>Or else they will rip me apart</l>
<l>like lions<lb type='x-secondLine'/>attacking a victim,<lb type='x-secondLine'/>and no one will save me.<verse eID='Ps.7.2'/>
</l>
</lg>
</div>
12.5.3. lb
The lb element, or "line break", is used to mark line breaks that
are not the result of linguistically or poetically significant
structure, but are primarily part of the typography and layout. For
example, a long line might be broken to fit into a narrow column. The
lb element is an empty element used to mark where such breaks
occurred in an important copy text, or where they should be placed in
a text to be rendered.
Bible typesetting has a long tradition involving placement of such
breaks. In some cases, translators have carefully decided preferred
or required break-points for various set widths. These can be
accommodated by using the type attribute of lb. For example,
type="wide-pref" and type="narrow-pref" might be used to identify the
locations of preferred line-breaks for wide and narrow column
layouts. Similarly, type might be used to distinguish various levels
of indentation following the break, or other typographic factors
deemed important.
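A sketch of how such preferences might be recorded, using the two type values named above. The line of text is a placeholder, and how a layout engine chooses between the two breaks is an assumption left to the application.
<l>A poetic line that is<lb type="narrow-pref"/> long enough to need<lb type="wide-pref"/> breaking in print</l>
A narrow-column layout would be expected to honor the first break, and a wide-column layout the second.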
The lb element should not be used merely to record where line breaks
in general happened to occur in a source edition. For most source
editions this information is unimportant; for manuscripts it may be
important, but it must then be marked up using the milestone element instead.
12.6. Lists, tables, genealogies, figures and other material
Simple glossaries, such as those that appear at the back of many Bibles,
may be encoded at this time using the simple list, label, and item elements
described below. A dictionary extension is well along in development,
and should be available as an extension module within the next few
months. That module should be used for any but the simplest lexical
tools; and once it is available, OSIS may decide to recommend against
further use of list to represent even simple glossaries.
12.6.1. list
All types of lists are marked using the list element; they can be
distinguished by type attribute values such as "ordered",
"unordered", "compact", and "definition". A list consists of
any number of items, some or all preceded by labels, which
correspond to the definition-terms of definition lists in various
schemas.
12.6.2. label
A leading label for a given list item. Labels are optional.
12.6.3. item
The main content or description for each list item.
(list example forthcoming)
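Until that example is available, the following hedged sketch shows a small glossary-style list built from the three elements just described. The type value is one of those named under list above; the entries are invented for illustration, and the placement of each label immediately before its item is an assumption based on the description of list.
<list type="definition">
<label>Sabbath</label>
<item>The seventh day of the week, set apart for rest and worship.</item>
<label>Passover</label>
<item>The festival commemorating the deliverance of Israel from Egypt.</item>
</list>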
12.6.4. table
OSIS provides only very rudimentary tables: a table consists of
rows, which in turn consist of cells. Formatting and layout are not
part of the table markup; they can either be done automatically, as in
HTML browsers, or by inserting some signal to the layout engine, such
as type attributes or processing instructions. Note that a table can be
nested inside another table: simply start a new table element inside a
cell element.
12.6.6. cell
(table example forthcoming)
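Until that example is available, here is a minimal sketch of a two-column table. It assumes the element names table, row, and cell, following the description of table above; no formatting is implied, and treating the first row as a heading row is left to the application.
<table>
<row><cell>King</cell><cell>Kingdom</cell></row>
<row><cell>Rehoboam</cell><cell>Judah</cell></row>
<row><cell>Jeroboam</cell><cell>Israel</cell></row>
</table>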
12.6.7. figure
The figure element is used to insert graphic,
non-textual materials (maps, pictures,
drawings, and the like) into an encoded text. The figure
element in OSIS may contain a caption (see the
next section) along with optional index
and note elements.
An example of a figure in an OSIS text might be:
<figure src="Beckmann_1917.jpg" alt="Painting by Max Beckmann, titled
Christ and the Woman taken in Adultery"><caption>Christ and
the Woman Taken in Adultery by Max Beckmann,
1917</caption><index index="illustrations"
index1="Beckmann, Max"/>
</figure>
At first it may look odd that the material in the alt attribute is
repeated in the caption element. The alt
attribute is important for situations where the application
cannot display the image that has been inserted in the text, or where
the user (for example, a visually impaired reader) cannot see it. The
alt attribute is a friendly way of ensuring
that the encoded text will be understandable by the widest
range of both applications and users.
The index element allows the encoder to record
the information necessary to automatically create an index,
for either an online version of this material or a more
traditional back-of-the-book index. In the example above, the index
attribute gives the type of index where this
item will appear and index1 provides the material that
will appear in that index. See index
(below) for more information on this element.
12.6.8. caption
(see example above, fuller examples forthcoming)
12.7. milestone
The milestone element is an empty element, and so is represented as
<milestone/> rather than as a pair of start- and end-tags. It is
used to mark point events in a text, often involving the layout of
the original text, or special points of access into the electronic
text.
For example, when digitizing a manuscript, it may be considered
important to record where the page, column, and line boundaries of
the original manuscript fell. This would be done as shown here:
<milestone type="pb" n="37-verso"/>
<p>The Lord said to Eliphaz:<milestone type="line"/>
What my servant Job has said about me is true, <milestone type="line"/>
but I am angry with you and your two friends for <milestone type="line"/>
not telling the truth. <verse osisID="Job.42.8">So I want you to go
over to <milestone type="line"/>
Job and offer seven bulls and seven goats on an <milestone type="line"/>
altar as a sacrifice to please me. After this, Job <milestone type="line"/>
will pray, and I will agree not to punish you for <milestone
type="line"/>your foolishness.</verse><milestone type="line"/>
<verse osisID="Job.42.9">Eliphaz, Bildad, and Zophar obeyed the Lord,
and he answered Job's prayer.</verse></p>
Note that because milestone is an empty or point element, not a
container, it may be placed freely without concern about violating
the boundaries of other elements in the same region.
Where a break to be represented by a milestone occurs between other
units, such as verses or paragraphs, the milestone should be placed
between those units, rather than just within either one.
The n attribute on a milestone should indicate the
number of the unit starting, not the unit ending. For example,
<milestone type="pb" n="3"/> indicates the break between
pages 2 and 3, not between pages 3 and 4. Numbering does not need to
be unique across various types of milestones; for example, the 24th
line on page 5 of a manuscript may be marked simply n="24", rather
than n="5.24" or similar.
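A short sketch of this numbering convention, with placeholder text standing in for the manuscript content. Each milestone carries the number of the unit that starts at it, and the value 2 is used for both a page and a line without conflict because the two milestones differ in type.
<milestone type="pb" n="2"/>
<milestone type="line" n="1"/>text of the first line of page 2
<milestone type="line" n="2"/>text of the second line of page 2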
Several predefined types are provided for the milestone element
(each entry below begins with the value for the type attribute):
- pb
Marks the location of a page break in the source text.
- column
Marks the location of a column break in the source text. Assuming page
boundaries are also marked, the start of the first column need not be
marked unless something else (such as a footer) precedes it in the
encoding of the page. Columns should be numbered in the order of
reading (for example, right to left in Hebrew texts). In the case of,
say, an English/Hebrew diglot edition, where there is no principled
order of reading among the columns, the direction used for the pages
(Hebrew or English) should be considered the dominant direction, and the
same direction should be used for numbering columns.
- header
A milestone of type "header" should precede the encoding of the
page header if it is being included in the encoded text. This would
normally be true only for digitized editions of manuscripts or other
important copy editions, because in modern print Bibles headers are
typically automatically generated.
- footer
Type "footer" should be used just like type "header", except that
it marks the page footer area instead.
- line
Line milestones should be used to mark line breaks in the copy
text when they are considered significant. This will normally only be
true for important manuscripts, where line numbering may be needed
for paleographic or reference use. Line milestones must not be used
to represent linguistically significant line breaks, such as in
poetry, for which the lg and l elements are provided.
- halfLine
In certain languages it is important to mark half-line units, and
this type is provided for such cases.
- screen
The milestone of type "screen" is to be used to mark preferred
break points in an on-screen rendering of the text. For example, if
the user requests to be taken to the book of Psalms in a given
electronic edition, it may be best not to take them to Ps.1.1, but
to an earlier point, preceding any introductory material. In many
cases this can be accomplished by taking them to the appropriate div
(since the <div type="book" osisID="Ps"> should precede any
Psalms-specific introductory material); but this milestone type is
available for other cases. The OSIS specification does not impose
requirements on how applications make use of such
milestones.
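As a hedged sketch of one such other case, a screen milestone might be placed before a heading that introduces a group of psalms, so that a request for Psalm 42 lands on the heading rather than on the first verse. The placement and the heading text are assumptions made for illustration, not a requirement of the schema.
<milestone type="screen"/>
<title>Book Two (Psalms 42-72)</title>
<chapter osisID="Ps.42">
...
</chapter>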
13. Common elements in all texts
The elements found in this section can be found in almost any
encoded text.
13.1. a
The a element is exactly analogous to the HTML
a element, and likewise may be used to encode
links within a document. This eases integration of OSIS documents
into the Web environment. For example:
<p>See Edwards' famous treatise on <a
href="http://www.ccel.org/e/edwards/affections/religious_affections.html">religious
affections</a> for additional information.</p>
13.2. index
The index element may be placed at any point in
the document to indicate a topic under which that location should be
indexed. It is always an empty element. Multiple indexes (such as of
places, names, theological or ethical issues, etc) must be
distinguished via the name attribute.
Indexes with up to 4 levels of headings are supported. The primary
index entry name is specified on the level1
attribute, followed by sub-headings level2, level3, and level4. For example:
<head>On Justice<index name="topic" level1="Virtues"
level2="Justice"/></head>
There is also a see attribute, which may be
used to represent the need for a cross-reference to another index
entry; such elements should be placed together at the end of the
document body (since they do not refer to a particular location). For
example:
<index name="topic" level1="Virtues" level2="Justice" see="Fairness"/>
No separate "see also" type is provided at this time.
13.3. reference
The reference element is used to encode an
explicit cross-reference to another passage or work. The work referred
to need not be Biblical, but it must be declared via a work element in
the header, and be accessible via the
same canonical referencing scheme defined in osisID syntax. Reference
elements will often occur within notes, but may also occur freely in
text (the latter is more common when encoding non-Biblical works).
For example:
(example forthcoming)
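Until that example is available, the following sketch shows two references inside a note. It assumes that the target is carried in an osisRef attribute (as on the catchWord element shown later in this section), that a range is written with a hyphen between two osisID-style values, and that the Bible work being pointed into is declared in the header.
<note>Compare <reference osisRef="Rom.3.23">Romans 3:23</reference> and
<reference osisRef="Ps.14.1-Ps.14.3">Psalm 14:1-3</reference>.</note>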
13.4. abbr
Marks a portion of the content as an abbreviation. The expanded
value should be supplied as the value of the expansion attribute. For
example:
<abbr expansion="Journal of Biblical Literature">JBL</abbr>
Abbreviations are most often seen in notes, where citations are often
abbreviated and users may not be familiar with the abbreviation. Putting
the expansion in the expansion attribute allows software to choose to
display the expansion instead of the abbreviation, or to display it upon
request by the reader.
13.5. catchWord
Catchwords and catchphrases are those parts of notes that are
copied from the main text, to orient the reader as to the note's
precise applicability. Catchwords in notes must be marked when
present. For example:
<verse osisID="NRSV:Ezek.19.5">When she saw that she was thwarted,
that her hope was lost, she took another of her cubs and made him a
young lion.</verse> <note>It is uncertain to which king <catchWord
osisRef="Ezek.19.5">another of her cubs</catchWord> refers....</note>