The following statement of background and principles for
content designation in the USMARC formats was approved in 1982 and
revised in 1989 by the American Library Association's
RTSD/LITA/RASD Machine-Readable Bibliographic Information Committee
(MARBI), in consultation with representatives from United States
and Canadian national libraries and designated bibliographic
networks. The statement includes the principles under which the
USMARC formats were developed and constitutes a set of working
principles for the ongoing process of format development. This
document will be revised as necessary.
1. Introduction
1.1. The USMARC formats are standards for the representation
and communication of bibliographic and related
information in machine-readable form.
1.2. A USMARC record involves three elements: the record
structure, the content designation, and the data content
of the record.
1.2.1. The structure of USMARC records is an
implementation of national and international
standards, e.g., Bibliographic Information
Interchange (ANSI Z39.2) and Format for
Bibliographic Information Interchange on Magnetic
Tape (ISO 2709).
1.2.2. Content designation, the codes and conventions
established to identify explicitly and characterize
further the data elements within a record and to
support the manipulation of those data, is defined
in the USMARC formats.
1.2.3. The content of most data elements is defined by
standards outside the formats, e.g., Anglo-American
Cataloguing Rules, Library of Congress Subject
Headings, National Library of Medicine
Classification. The content of other data
elements, e.g., coded data (see section 9 below),
is defined in the USMARC formats.
1.3. A USMARC format is a set of codes and content designators
defined for encoding a particular type of machine-
readable record. USMARC formats are defined for the
following types of data: bibliographic, holdings, and
authority.
1.3.1. USMARC Format for Bibliographic Data contains
format specifications for encoding data elements
needed to describe, retrieve, and control various
forms of bibliographic material. The USMARC Format
for Bibliographic Data is an integrated format
defined for the identification and description of
different forms of bibliographic material. USMARC
specifications are defined for books, archival and
manuscripts control, computer files, maps, music,
visual materials, and serials. With the full
integration of the previously discrete
bibliographic formats, consistent definition and
usage are maintained for different forms of
material.
1.3.2. USMARC Format for Holdings Data contains format
specifications for encoding data elements pertinent
to holdings and location data for all forms of
material.
1.3.3. USMARC Format for Authority Data contains format
specifications for encoding data elements that
identify or control the content and content
designation of those portions of a bibliographic
record that may be subject to authority control.
1.4. The USMARC formats are maintained by the Library of
Congress in consultation with various user communities.
1.4.1. Through maintenance and revision, content
designation is added to and existing content
designation is made obsolete or deleted from
formats. Content designation is made obsolete when
it is found to be no longer appropriate or when the
data element involved is no longer needed. An
obsolete content designator may continue to appear
in records created prior to the date it was made
obsolete. Obsolete content designators are not
used in new records. A deleted content designator
is one that had been reserved in USMARC but had not
been defined, or one that had been defined but is
known with near certainty never to have been used.
1.4.2. The principles stated in this document have
developed over time. The formats contain
exceptions to the principles due to early format
development decisions. While many exceptions have
been made obsolete, others remain because of the
need to maintain upward compatibility of the
formats in current development.
2. General Considerations
2.1. The USMARC formats are communication formats, primarily
designed to provide specifications for the exchange of
bibliographic and related information between systems.
They are widely used in a variety of exchange and
processing environments. As communication formats, they
do not mandate internal storage or display formats to be
used by individual systems.
2.2. The USMARC formats, particularly the bibliographic and
authority formats, were developed to enable the Library
of Congress to communicate its catalog records to other
institutions. The formats have had a close relationship
to the needs and practices of United States libraries.
They reflect both the various cataloging codes applied in
the library community and the requirements of the
archives community.
2.3. The USMARC formats were designed to facilitate the
exchange of bibliographic and related information on
magnetic tape within the United States. An attempt has
been made to preserve compatibility with other national
and international formats, e.g., CANMARC and UNIMARC.
Lack of international agreement on cataloging codes and
practices has made complete compatibility impossible.
2.4. National agencies in the United States and Canada
(Library of Congress, National Agricultural Library,
National Library of Medicine, United States Government
Printing Office, and National Library of Canada) are
given special emphasis and consideration in the formats
because they serve as sources of authoritative cataloging
and as agencies responsible for certain data elements.
2.5. The institutions responsible for the content, content
designation, and transcription accuracy of bibliographic
and authority data within a USMARC record are identified
at the record level in field 008/39 (Fixed-Length Data
Elements: Cataloging source) and in field 040 (Cataloging
Source). This responsibility may be evaluated in terms
of the following rule.
2.5.1. Responsible Parties Rule:
2.5.1.1. Unmodified records: The institution identified as
the cataloging institution (field 040$a) is
considered responsible for data content in the
record except for agency-assigned data (see section
2.5.2.1 below). The institution identified as the
transcribing institution (field 040$c) is
considered responsible for content designation and
transcription accuracy for all data.
2.5.1.2. Modified records: Institutions identified as
cataloging or modifying institutions (field
040$a,$d) are considered collectively responsible
for data content in the record except for agency-
assigned and authoritative-agency data (see section
2.5.2 below). Institutions identified as
transcribing or modifying institutions (field
040$c,$d) are considered collectively responsible
for content designation and transcription accuracy.
2.5.2. Exceptions to Responsible Parties Rule:
2.5.2.1. Certain data elements are defined in the USMARC
formats as being exclusively assigned by particular
agencies, e.g., International Standard Serial
Number (field 022), Library of Congress Control
Number (field 010). The content of such agency-
assigned elements is always the responsibility of
the agency.
2.5.2.2. Certain data elements have been defined in the
USMARC formats in relation to one or more
authoritative agencies that maintain the lists or
rules upon which the data are based, e.g., Library
of Congress Call Number (field 050), National
Library of Medicine Call Number (field 060). Where
it is possible for other agencies to create similar
or identical content for these data elements,
content designation may be provided to distinguish
between content actually assigned by the
authoritative agency and that assigned by other
agencies. In the former case, responsibility for
content rests with the authoritative agency. In
the latter case, the Responsible Parties Rule
applies, and no further identification of the
assigning agency is provided.
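The Responsible Parties Rule in 2.5.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the agency code "XYZ" are hypothetical, and the exceptions in 2.5.2 (agency-assigned and authoritative-agency data) are deliberately not modeled.

```python
def responsible_parties(cataloging, transcribing, modifying=()):
    """Apply the Responsible Parties Rule to the agencies named in field 040.

    cataloging   -- agency in 040$a
    transcribing -- agency in 040$c
    modifying    -- agencies in 040$d (empty for an unmodified record)
    """
    modifying = list(modifying)
    return {
        # Data content: $a alone, or $a and $d collectively if modified.
        "data_content": [cataloging, *modifying],
        # Content designation and transcription accuracy: $c, plus $d if modified.
        "content_designation": [transcribing, *modifying],
    }


# Unmodified record: one agency cataloged and transcribed it.
print(responsible_parties("DLC", "DLC"))
# Modified record: a hypothetical agency "XYZ" later changed it.
print(responsible_parties("DLC", "DLC", ["XYZ"]))
```

For an unmodified record the two lists each hold a single agency; every agency in 040$d is added to both lists because the rule assigns modified records collective responsibility.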
2.6. The USMARC bibliographic format provides content
designation only for data that are applicable to all
copies of the bibliographic entity described.
2.6.1. Information which applies only to some copies (or
even to a single copy) of a title may be of
interest beyond the institutions holding such
copies. The USMARC formats provide limited content
designation for the encoding of this information
and for identifying the holding institution, e.g.,
subfield $5 in the 700-740 added entry fields in
the bibliographic format.
2.6.2. Information that does not apply to all copies of a
title, and is not of interest to other
institutions, is coded in local fields. For
instance, the 59X block is reserved for local notes
in the bibliographic format (see section 6.7
below).
2.7. Although a USMARC record is usually autonomous, data
elements are provided that contain information used to
link related records. These linkages may be implicit,
through identical access points in each record, or
explicit, through a linking entry field. The 76X-78X
linking entry fields in the bibliographic format may
contain either selected data elements that identify the
related item or a control number that identifies the
related record. In addition, an explicit code in the
leader identifies a record that is linked to another
record through a control number.
3. Structural Features
3.1. The USMARC formats are an implementation of
Bibliographic Information Interchange (ANSI Z39.2). The
formats also incorporate other relevant ANSI standards,
e.g., Magnetic Tape Labels and File Structure for
Information Interchange (ANSI X3.27).
3.2. All information in a USMARC record is stored in character
form. USMARC communications records are coded in
Extended ASCII, as defined in the USMARC Specifications
for Record Structure, Character Sets, Tapes.
3.3. The length of each variable field can be determined
either from the length-of-field portion of the directory
entry or from the occurrence of the field terminator
character [1E hex, 8-bit]. The length of a record can be
determined either from the logical record length element
in Leader/00-04 or from the occurrence of the record
terminator character [1D hex, 8-bit]. The location of each
variable field is explicitly stated in the starting
character position element in its directory entry.
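The two ways of finding a length described in 3.3 can be checked against each other. The Python sketch below builds a tiny record and measures it both ways. It rests on assumptions not stated in this section: that Leader/12-16 holds the base address of the data, that a directory entry is laid out as tag (3), length of field (4), starting position (5), and that starting positions are counted from the base address. The helper names and sample data are hypothetical.

```python
FIELD_TERMINATOR = b"\x1e"   # 1E hex
RECORD_TERMINATOR = b"\x1d"  # 1D hex
LEADER_LENGTH = 24
ENTRY_LENGTH = 12  # assumed layout: tag (3) + field length (4) + start (5)


def build_record(fields):
    """Assemble a minimal record from (tag, data) pairs. Illustrative only."""
    directory = b""
    body = b""
    for tag, data in fields:
        field = data + FIELD_TERMINATOR
        directory += tag.encode("ascii") + b"%04d%05d" % (len(field), len(body))
        body += field
    directory += FIELD_TERMINATOR
    base = LEADER_LENGTH + len(directory)
    total = base + len(body) + len(RECORD_TERMINATOR)
    # Leader/00-04 = record length, Leader/12-16 = base address (assumed),
    # Leader/20-23 = entry map; the remaining positions are placeholders.
    leader = b"%05d" % total + b"nam  22" + b"%05d" % base + b"   4500"
    return leader + directory + body + RECORD_TERMINATOR


def record_length_both_ways(record):
    """Return (length from Leader/00-04, length from the record terminator)."""
    by_leader = int(record[0:5].decode("ascii"))
    by_terminator = record.index(RECORD_TERMINATOR) + 1
    return by_leader, by_terminator


def field_lengths_both_ways(record):
    """Return (tag, length from directory, length from terminator) per field."""
    base = int(record[12:17].decode("ascii"))
    directory = record[LEADER_LENGTH:base - 1]  # drop the directory's terminator
    results = []
    for offset in range(0, len(directory), ENTRY_LENGTH):
        entry = directory[offset:offset + ENTRY_LENGTH].decode("ascii")
        tag, length, start = entry[0:3], int(entry[3:7]), int(entry[7:12])
        begin = base + start
        by_terminator = record.index(FIELD_TERMINATOR, begin) - begin + 1
        results.append((tag, length, by_terminator))
    return results


record = build_record([("001", b"12345"), ("245", b"10\x1faTitle")])
print(record_length_both_ways(record))
print(field_lengths_both_ways(record))
```

In a well-formed record the two measurements agree for the record and for every field. A disagreement is a cheap signal that the record was damaged or truncated in transit.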
4. Content Designation
4.1. The goal of content designation is to identify and
characterize the data elements that comprise a USMARC
record with sufficient precision to support manipulation
of the data for a variety of functions.
4.2. USMARC content designation is designed to support
functions that include:
a. Display: the formatting of data for display on a
CRT, for printing on 3x5 cards or in book catalogs,
for production of COM catalogs, or for other visual
presentation of the data.
b. Information retrieval: the identification,
categorization, and retrieval of any identifiable
data element in a record.
4.3. Some fields serve multiple functions. For example, field
245 (Title Statement) serves both as the bibliographic
transcription of the title and the statement of
responsibility and as an access point for the title.
4.4. The USMARC formats provide for display constants. A
display constant is a term, phrase, and/or spacing or
punctuation convention that may be system generated under
prescribed circumstances to make a visual presentation of
data in a record more meaningful to a user. Such display
constants are not carried in the data, but may be
supplied for display by the processing system. For
example, subfield $x in Series Statement field 490 (and
in some other fields) implies the display constant "ISSN";
also, the combination of tag 780 (Preceding Entry) and
second indicator value 3 implies the display constant
"Supersedes in part:".
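A processing system might supply the two display constants named in 4.4 roughly as follows. This is a minimal Python sketch, not a prescribed method: the function names, the (code, value) representation of subfields, and the sample series data are all hypothetical.

```python
def display_constant(tag, indicator2=None, subfield=None):
    """Return the display constant implied by the content designation, if any."""
    if tag == "490" and subfield == "x":
        return "ISSN"
    if tag == "780" and indicator2 == "3":
        return "Supersedes in part:"
    return ""


def render_490(subfields):
    """Format a Series Statement for display from (code, value) pairs.

    The constant is generated at display time; it is not stored in the data.
    """
    parts = []
    for code, value in subfields:
        prefix = display_constant("490", subfield=code)
        parts.append(f"{prefix} {value}" if prefix else value)
    return " ".join(parts)


print(render_490([("a", "Studies in cataloging,"), ("x", "1234-5678")]))
```

The stored field carries only the series title and the number. The label "ISSN" appears only in the rendered string, which is what allows different systems to display the same record differently.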
4.5. The USMARC formats support the sorting of data only to a
limited extent. In general, sorting must be accomplished
through the application of external algorithms to the
data.
5. Organization of the Record
5.1. A USMARC record consists of three main sections: the
leader, the directory, and the variable fields.
5.2. The leader consists of data elements that contain coded
values and are identified by relative character position.
Data elements in the leader define parameters for
processing the record. The leader is fixed in length (24
characters) and occurs at the beginning of each USMARC
record.
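Because leader elements are identified by relative character position, reading them amounts to slicing a 24-character string. The Python sketch below extracts only the leader positions mentioned in this document (Leader/00-04, Leader/10, Leader/11, and Leader/20-23). The type name, function name, and sample leader value are hypothetical.

```python
from typing import NamedTuple


class LeaderInfo(NamedTuple):
    record_length: int         # Leader/00-04, logical record length
    indicator_count: int       # Leader/10
    subfield_code_length: int  # Leader/11
    entry_map: str             # Leader/20-23


def parse_leader(leader: str) -> LeaderInfo:
    """Slice the leader elements named in this document out of a leader string."""
    if len(leader) != 24:
        raise ValueError("the leader must be exactly 24 characters long")
    return LeaderInfo(
        record_length=int(leader[0:5]),
        indicator_count=int(leader[10]),
        subfield_code_length=int(leader[11]),
        entry_map=leader[20:24],
    )


print(parse_leader("00066nam  2200049   4500"))
```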
5.3. The directory contains the tag, starting location, and
length of each field within the record. Directory
entries for variable control fields appear first, in
ascending tag order. Entries for variable data fields
follow, arranged in ascending order according to the
first character of the tag. The order of the fields in
the record does not necessarily correspond to the order
of directory entries. Duplicate tags are distinguished
only by location of the respective fields within the
record. The length of the directory entry is defined in
the entry map elements in Leader/20-23. In the USMARC
formats, the length of a directory entry is 12
characters. The directory ends with a field terminator
character.
5.4. The data content of a record is divided into variable
fields. The USMARC formats distinguish two types of
variable fields: variable control fields and variable
data fields. Control and data fields are distinguished
only by structure (see sections 7 and 8 below). The term
"fixed fields" is occasionally used in USMARC
documentation, referring either to control fields
generally or to specific coded-data fields, e.g., 007
(Physical Description Fixed Field) or 008 (Fixed-Length
Data Elements).
6. Variable Fields and Tags
6.1. The data in a USMARC record are organized into fields,
each identified by a three-character tag.
6.2. According to ANSI Z39.2, the tag must consist of
alphabetic or numeric ASCII graphic characters, i.e.,
decimal integers 0-9 or letters A-Z (uppercase or
lowercase, but not both). The MARC formats have used
only numeric tags.
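The tag rule in 6.2 can be expressed as a pair of patterns. In this Python sketch, "uppercase or lowercase, but not both" is read as a restriction on the letters within a single tag, which is an interpretation rather than something the text spells out. The function names are hypothetical.

```python
import re

# Three characters, digits plus letters of one case only (assumed reading of 6.2).
Z39_2_TAG = re.compile(r"[0-9A-Z]{3}|[0-9a-z]{3}")
# The formats described here have used only numeric tags.
NUMERIC_TAG = re.compile(r"[0-9]{3}")


def is_valid_z39_2_tag(tag: str) -> bool:
    return Z39_2_TAG.fullmatch(tag) is not None


def is_numeric_tag(tag: str) -> bool:
    return NUMERIC_TAG.fullmatch(tag) is not None


for candidate in ("245", "A45", "Aa5", "24"):
    print(candidate, is_valid_z39_2_tag(candidate), is_numeric_tag(candidate))
```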
6.3. The tag is stored in the directory entry for the field,
not in the field itself.
6.4. Variable fields are grouped into blocks according to the
first character of the tag, which identifies the function
of the data within a record, e.g., main entry, added
entry, subject entry. The type of information in the
field, e.g., personal name, corporate name, or title, is
identified by the remainder of the tag.
6.4.1. Bibliographic format blocks:
0XX = Control information, numbers, and
codes
1XX = Main entry
2XX = Titles and title paragraph (title,
edition, imprint)
3XX = Physical description, etc.
4XX = Series statements
5XX = Notes
6XX = Subject access fields
7XX = Added entries other than subject
or series; linking fields
8XX = Series added entries, etc.
9XX = Reserved for local implementation
6.4.2. Authority format blocks:
0XX = Control information, numbers, and
codes
1XX = Heading
2XX = Complex see references
3XX = Complex see also references
4XX = See from tracings
5XX = See also from tracings
6XX = Reference notes, treatment
decisions, notes, etc.
7XX = Not defined
8XX = Not defined
9XX = Reserved for local implementation
6.4.3. Holdings format blocks:
0XX = Control information, numbers, and
codes
1XX = Not defined
2XX = Not defined
3XX = Not defined
4XX = Not defined
5XX = Notes
6XX = Not defined
7XX = Not defined
8XX = Holdings and location data, notes
9XX = Reserved for local implementation
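Since the block is determined by the first character of the tag, a lookup table is enough to recover a field's function. The Python sketch below transcribes the bibliographic block list from 6.4.1. The function name is hypothetical, and tables for the authority or holdings formats could be passed in the same way.

```python
BIBLIOGRAPHIC_BLOCKS = {
    "0": "Control information, numbers, and codes",
    "1": "Main entry",
    "2": "Titles and title paragraph (title, edition, imprint)",
    "3": "Physical description, etc.",
    "4": "Series statements",
    "5": "Notes",
    "6": "Subject access fields",
    "7": "Added entries other than subject or series; linking fields",
    "8": "Series added entries, etc.",
    "9": "Reserved for local implementation",
}


def block_function(tag, blocks=BIBLIOGRAPHIC_BLOCKS):
    """Return the function of a field, taken from the first character of its tag."""
    return blocks[tag[0]]


print(block_function("245"))
print(block_function("650"))
```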
6.5. Certain blocks in the USMARC bibliographic and authority
formats contain data which may be subject to authority
control (1XX, 4XX, 6XX, 7XX, 8XX for bibliographic
records; 1XX, 4XX, 5XX for authority records).
6.5.1. In these blocks, certain parallels of content
designation are preserved. The following meanings
are generally given to the final two characters of
the tag:
X00 = Personal names
X10 = Corporate names
X11 = Meeting names
X30 = Uniform titles
X40 = Bibliographic titles
X50 = Topical terms
X51 = Geographic names
Further content designation (indicators and
subfield codes) for data elements subject to
authority control are defined consistently across
the bibliographic and authority formats. These
guidelines apply only to the main range of fields
in each block, not to secondary ranges, e.g., the
linking entry fields 760-787 in the bibliographic
format.
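The parallel meanings of the final two tag characters can be sketched as a second lookup, restricted to the bibliographic blocks listed in 6.5 and excluding the 760-787 linking entry range named above. This Python fragment is illustrative only. The function name is hypothetical, and because the text says these meanings are "generally" given, individual tags may not follow the pattern.

```python
HEADING_TYPES = {
    "00": "Personal names",
    "10": "Corporate names",
    "11": "Meeting names",
    "30": "Uniform titles",
    "40": "Bibliographic titles",
    "50": "Topical terms",
    "51": "Geographic names",
}

# Bibliographic blocks that may be subject to authority control (section 6.5).
CONTROLLED_BLOCKS = {"1", "4", "6", "7", "8"}


def heading_type(tag):
    """Return the general heading type for a bibliographic tag, or None."""
    if tag[0] not in CONTROLLED_BLOCKS:
        return None
    if "760" <= tag <= "787":  # secondary range: linking entry fields
        return None
    return HEADING_TYPES.get(tag[1:])


for tag in ("100", "710", "651", "780", "245"):
    print(tag, heading_type(tag))
```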
6.5.2. Within fields subject to authority control, data
elements may exist which are not subject to
authority control and which may vary from record to
record containing the same heading, e.g., subfield
$e, Relator.
6.5.3. In fields not subject to authority control, each
tag is defined independently. Parallel meanings
have been preserved whenever possible.
6.6. Principles have been established to assist in determining
when a separate field should be defined for note data and
when the data should be included in a general note field.
6.6.1. In the USMARC bibliographic format, a specific 5XX
note field is defined when at least one of the
following is true:
a. Categorical indexing or retrieval is required
on the data defined for the note. The note is
used for structured access purposes but does
not have the nature of a controlled access
point.
b. Special manipulation of that specific category
of data is a routine requirement. Such
manipulation includes special print/display
formatting or selection/suppression from
display or printed product.
c. Specialized structuring of information for
reasons other than those given in (a) or (b),
e.g., to support particular standards of data
content when they cannot be supported in
existing fields.
6.6.2. In the USMARC authority format, the specifications
for notes are covered in the following two
conditions:
a. A specific note field is needed when special
manipulation of that specific category of data
is a routine requirement. Such manipulation
includes special print/display formatting or
selection/suppression from display or printed
product.
b. Multiple notes are generally not established
to accommodate the same type of information
for different types of authorities. Notes are
thus not differentiated by or limited to
subject, name, or series if the same
information applies to more than one type.
6.7. Certain tags have been reserved for local implementation.
The USMARC formats specify no structure or meaning for
local fields. Communication of local fields between
systems is governed by mutual agreements on the content
and content designation of the fields communicated.
6.7.1. The 9XX block is reserved for local implementation.
6.7.2. In general, any tag containing the character 9 is
reserved for local implementation within the block
structure (see section 6.4 above).
6.7.3. The historical development of the USMARC formats
has left one exception to this general principle:
field 490 (Series Statement) in the bibliographic
format. There are several obsolete fields with
tags containing the character 9.
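For the bibliographic format, the general principle in 6.7.2 and the exception in 6.7.3 reduce to a short test. The Python sketch below is an approximation: the function name is hypothetical, and the obsolete fields with tags containing the character 9 are not modeled because this document does not list them.

```python
def is_local_tag(tag):
    """Return True if a bibliographic tag is reserved for local implementation."""
    if tag == "490":  # Series Statement, the stated historical exception
        return False
    return "9" in tag


for tag in ("245", "490", "590", "949"):
    print(tag, is_local_tag(tag))
```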
6.8. Theoretically, all fields, except field 001 (Control
Number) and field 005 (Date and Time of Latest
Transaction), may be repeated. The nature of the data,
however, often precludes repetition. For example, a
bibliographic record may contain only one field 245
(Title Statement) and an authority record may contain
only one 1XX heading field. The
repeatability/nonrepeatability of each field is defined
in the USMARC formats.
7. Variable Control Fields
7.1. The 00X fields in the USMARC formats are variable control
fields.
7.2. Variable control fields consist of data and a field
terminator. They contain neither indicators nor
subfield codes (see sections 8.3 and 8.4 below).
7.3. Variable control fields contain either a single data
element or a series of fixed-length data elements
identified by relative character position.
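The slash notation used in this document, such as 008/39 or Leader/00-04, names a zero-based character position or an inclusive range of positions. A minimal Python sketch of that addressing follows. The function name is hypothetical, and the sample 008 value (blanks with the code "d" in position 39) is made up for illustration.

```python
def element(field_data, first, last=None):
    """Return the characters at positions first..last (inclusive, zero-based)."""
    if last is None:
        last = first
    return field_data[first:last + 1]


field_008 = " " * 39 + "d"               # hypothetical 40-character field 008
leader = "00066nam  2200049   4500"      # hypothetical leader

print(repr(element(field_008, 39)))   # 008/39
print(repr(element(leader, 0, 4)))    # Leader/00-04
```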
8. Variable Data Fields
8.1. All fields except 00X are variable data fields.
8.2. Four levels of content designation are provided for
variable data fields in ANSI Z39.2:
a. a three-character tag, stored in the directory
entry;
b. indicators stored at the beginning of each variable
data field, the number of indicators being
reflected in Leader/10 (Indicator count);
c. subfield codes preceding each data element, the
length of the code being reflected in Leader/11
(Subfield code count); and
d. a field terminator following the last data element
in the field.
8.3. Indicators
8.3.1. Indicators contain values conveying information
that interprets or supplements the data found in
the field.
8.3.2. The USMARC formats specify two indicator positions
at the beginning of each variable data field.
8.3.3. Indicators are defined independently for each
field. Parallel meanings are preserved whenever
possible.
8.3.4. Indicator values are interpreted independently;
meaning is not ascribed to the two indicators taken
together.
8.3.5. Indicators may be any lowercase alphabetic or
numeric character or a blank (#). Numeric values
are defined first. A blank (#) is used in an
undefined indicator position or to mean "information
not provided" in a defined indicator position.
8.3.6. The value 9 is reserved for local implementation.
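The indicator rules in 8.3 can be checked mechanically. The Python sketch below assumes that the blank, written # in this document, is stored as an ASCII space, and that the field is passed without its terminator. The function names are hypothetical.

```python
import string

# Lowercase letters, digits, and the blank (assumed to be an ASCII space).
ALLOWED_INDICATOR_VALUES = set(string.ascii_lowercase + string.digits + " ")


def parse_indicators(field):
    """Return the two indicator values at the start of a variable data field."""
    indicators = field[:2]
    if len(indicators) != 2:
        raise ValueError("a variable data field begins with two indicators")
    for value in indicators:
        if value not in ALLOWED_INDICATOR_VALUES:
            raise ValueError(f"invalid indicator value: {value!r}")
    # The two values are interpreted independently, so return them separately.
    return indicators[0], indicators[1]


def is_local_indicator(value):
    """The value 9 is reserved for local implementation."""
    return value == "9"


print(parse_indicators("10\x1faTitle"))
```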
8.4. Subfield Codes
8.4.1. Subfield codes identify data elements within a
field that require (or might require) separate
manipulation.
8.4.2. Subfield codes in the USMARC formats consist of two
characters: a delimiter [1F hex, 8-bit], followed by
a data element identifier. A data element
identifier may be any lowercase alphabetic or
numeric character.
8.4.2.1. Numeric identifiers are defined for parametric data used
to process the field, or coded data needed to interpret
the field. (Note that not all numeric identifiers
defined in the past have followed this specification.)
8.4.2.2. Alphabetic identifiers are defined for the separate
elements that constitute the data content of the field.
8.4.2.3. The character 9 and the following graphic symbols are
reserved for local definition as data element
identifiers: ! " # $ % & ' ( ) * + , - . / : ; < = > ?
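Splitting a field into subfields follows directly from the two-character subfield code. The Python sketch below assumes the input is the field content after the two indicators and without the field terminator. The function names and sample title are hypothetical, and the graphic symbols reserved for local definition are left out for brevity.

```python
import string

DELIMITER = "\x1f"  # 1F hex


def split_subfields(field_body):
    """Return (data element identifier, data) pairs from a field body."""
    chunks = field_body.split(DELIMITER)[1:]  # nothing precedes the first delimiter
    return [(chunk[0], chunk[1:]) for chunk in chunks if chunk]


def identifier_kind(code):
    """Classify a data element identifier according to section 8.4.2."""
    if len(code) != 1:
        return "invalid"
    if code == "9":
        return "local"        # reserved for local definition
    if code in string.digits:
        return "numeric"      # parametric or coded data
    if code in string.ascii_lowercase:
        return "alphabetic"   # data content of the field
    return "invalid"


body = "\x1faTitle :\x1fbsubtitle /\x1fcauthor."
print(split_subfields(body))
```

The subfields come back in the order in which they occur, in keeping with the principle that subfield codes identify data elements and do not arrange them.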
8.4.3. Subfield codes are defined independently for each
field. Parallel meanings are preserved whenever
possible.
8.4.4. Subfield codes are defined for purposes of
identification, not arrangement. The order of
subfields is specified by content standards, e.g.,
cataloging rules. In some cases, however, such
specifications may be incorporated in the USMARC
format documentation.
8.4.5. Theoretically, all data elements may be repeated.
The nature of the data, however, often precludes
repetition. The repeatability/nonrepeatability of
each subfield code is defined in the USMARC
formats.
9. Coded Data
9.1. In addition to content designation, the USMARC formats
include specifications for the content of certain data
elements, particularly those that provide for the
representation of data by coded values.
9.2. Coded values consist of fixed-length character strings.
Individual elements within a coded-data field or subfield
are identified by relative character position.
9.3. Although coded data occur most frequently in the leader,
directory, and variable control fields, any field or
subfield may be defined for coded-data elements.
9.4. Certain common values have been defined whenever
applicable:
# Undefined (element not defined)
n Not applicable (element is not
applicable to the item)
u Unknown (record creator was unable to
determine value)
z Other (value other than those defined for the
element)
| Fill character (record creator has chosen not
to provide information)
Historical exceptions do occur in the formats. In
particular, the blank (#) often has been defined as not
applicable or has been assigned a specific meaning.
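The common values in 9.4 lend themselves to a default lookup that individual elements can override. In the Python sketch below, the blank (shown as # above) is assumed to be an ASCII space. The function name and the idea of a per-element override table are hypothetical conveniences for handling the historical exceptions.

```python
COMMON_CODED_VALUES = {
    " ": "Undefined",
    "n": "Not applicable",
    "u": "Unknown",
    "z": "Other",
    "|": "Fill character",
}


def describe_coded_value(value, element_specific=None):
    """Describe a coded value, letting an element's own definitions win."""
    if element_specific and value in element_specific:
        return element_specific[value]
    return COMMON_CODED_VALUES.get(value, "Element-specific value")


print(describe_coded_value("u"))
# A historical exception: an element that defines the blank as not applicable.
print(describe_coded_value(" ", {" ": "Not applicable"}))
```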
STANDARDS AND OTHER DOCUMENTS RELATED TO USMARC FORMATS
National and international standards:
These publications are available from the American
National Standards Institute, Inc., 1430 Broadway, New York, NY
10018.
Bibliographic Information Interchange (ANSI Z39.2-1985)
Format for Bibliographic Information Interchange on Magnetic Tape
(ISO 2709-1981)
Magnetic Tape Labels and File Structure for Information Interchange
(ANSI X3.27-1987)
USMARC standards:
These publications are available from the Library of
Congress, Cataloging Distribution Service, Washington, DC 20541.
USMARC Concise Formats
USMARC Format for Authority Data
USMARC Format for Bibliographic Data
USMARC Format for Holdings Data
USMARC Specifications for Record Structure, Character Sets, Tapes
USMARC Code List for Languages
USMARC Code List for Countries
USMARC Code List for Geographic Areas
USMARC Code List for Relators, Sources, Descriptive Conventions