The goal of MARC4J is to provide an easy to use Application Programming Interface (API) for working with MARC and MARCXML in Java. MARC stands for MAchine Readable Cataloging and is a widely used exchange format for bibliographic data. MARCXML provides a loss-less conversion between MARC (MARC21 but also other formats like UNIMARC) and XML.
MARC4J releases beta 6 through beta 8a where based on an event based parser like SAX for XML. The MARC4J project started as James (Java MARC events), but since there is already an open source project called James, the project is renamed to MARC4J to avoid confusion in open source communities.
The MARC4J library includes:
- An easy to use interface that can handle large record sets.
- Readers and writers for both MARC and MARCXML.
- A build-in pipeline model to pre- or post-process MARCXML using XSLT stylesheets.
- A MARC record object model (like DOM for XML) for in-memory editing of MARC records.
- Support for data conversions from MARC-8 ANSEL, ISO5426 or ISO6937 to UCS/Unicode and back.
- A forgiving reader which can handle and recover from a number of structural or encoding errors in records.
- Implementation independent XML support through JAXP and SAX2, a high performance XML interface.
- Support for conversions between MARC and MARCXML.
- Tight integration with the JAXP, DOM and SAX2 interfaces.
- Easy to integrate with other XML interfaces like DOM, XOM, JDOM or DOM4J.
- Command-line utilities for MARC and MARCXML conversions.
- Javadoc documentation.
MARC4J provides readers and writers for MARC and MARCXML. A org.marc4j.MarcReader
implementation parses input data and provides an iterator over a collection of org.marc4j.marc.Record
objects. The record object model is also suitable for in-memory editing of MARC records, just as DOM is used for XML editing purposes. Using a org.marc4j.MarcWriter
implementation it is possible to create MARC or MARCXML. Once MARC data has been converted to XML you can further process the result with XSLT, for example to convert MARC to MODS .
Although MARC4J is primarily designed for Java development you can use the command-line utilities org.marc4j.util.MarcXmlDriver
and org.marc4j.util.XmlMarcDriver
to convert between MARC and MARCXML. It is also possible to pre- or postprocess the result using XSLT, for example to convert directly from MODS to MARC or from MARC to MODS.
Crosswalking: Processing MARC in XML Environments with MARC4J is a concise book for library programmers who want to learn to use MARC4J to process bibliographic data. It is available through Lulu.com. Read the announcement or visit Lulu.com for more book details.
Check the releases tab to find and download the latest releases
Check here for changes.
A MARC4J tutorial is available in the Documents & Files section. Check the Tutorial folder.
Javadoc documentation is included in the distribution. Please note that since verison 2.1 the Javadoc for org.marc4j.marc provides extensive examples to get you started using the record object model.
MARC stands for MAchine Readable Cataloguing. The MARC format is a popular exchange format for bibliographic records. The structure of a MARC record is defined in the ISO 2709:1996 (Format for Information Exchange) standard (or ANSI/NISO Z39.2-1994, available online from NISO). The MARC4J API is not a full implementation of the ISO 2709:1996 standard. The standard is implemented as it is used in the MARC formats.
The most popular MARC formats are MARC21 and UNIMARC. The MARC21 format is maintained by the Library of Congress . If you’re not familiar with MARC21, you might want to read Understanding MARC Bibliographic , “a brief description and tutorial” on the standard. For more information on the MARC21 format, visit the MARC formats home page at the Library of Congress Web site. For more information about UNIMARC visit the UNIMARC Manual .