Useful XPaths

Amanda Ross edited this page Apr 14, 2018 · 43 revisions

The following XPaths may be useful when working in a single volume or across volumes. To use them in oxygenXML:

  1. Open the XPath/XQuery Builder [Window -> Show View -> XPath/XQuery Builder].
  2. Copy or type the XPath into the XPath/XQuery Builder, removing any line breaks and the indentation that follows them (they are shown below only for readability).
  3. In the Project pane, select/highlight desired volume(s).
  4. Press the RUN (red forward arrow) button to run the XPath across your selections.
  5. Results should appear in a new pane at the bottom of the oxygenXML window.
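
Step 2 can be tedious for the longer expressions on this page. The following is a minimal sketch, not part of the original workflow, of one way to collapse a wrapped XPath into a single line before pasting it into the builder. The function name `collapse_xpath` is hypothetical. It assumes that no line break falls at a point where a literal space is significant, which appears to hold for the expressions listed here.

```python
def collapse_xpath(expression: str) -> str:
    """Join a wrapped XPath into one line, dropping line breaks and per-line indentation."""
    return "".join(line.strip() for line in expression.splitlines())


# C.1.b from this page, wrapped over two lines purely for illustration.
wrapped = """
//body//date[not(attribute::*)]
  [not(matches(data(.),'[Uu]ndated'))]
"""

print(collapse_xpath(wrapped))
# prints: //body//date[not(attribute::*)][not(matches(data(.),'[Uu]ndated'))]
```

Spaces in the middle of a line are kept, so predicates such as `matches(., '...')` survive unchanged.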

To add as a favorite for re-use:

  1. In the XPath/XQuery Builder, enter the desired XPath/XQuery expression and click the star outline in the top-right toolbar.
  2. Enter a descriptive name for your tool (e.g. "Dates in (French)") and click OK.
  3. You may now re-use your favorited XPath/XQuery by clicking on the arrow to the right of the star and selecting your desired tool.

NOTE: Many of these XPaths have been used as the basis for Schematron Quick Fixes in dates-only-secondary-review.sch (currently in draft form).

Animated GIF of Secondary Date Review Schematron Quick Fix Example

A. Historical Documents

A.1. To find date candidates in postscripts of historical documents without date:

A.1.a. Dates in English-language text

//div[attribute::type='document']
  [not(attribute::subtype='editorial-note')][not(descendant::date)]
  //postscript[matches(.,'(the\s+)?\d{1,2}(st|d|nd|rd|th)?\s+(of\s+)?
    (January|February|March|April|May|June|July|August|September|October|November|December)
      ,?\s+\d{4}|
    ((January|February|March|April|May|June|July|August|September|October|November|December)
      \s+\d{1,2}(st|d|nd|rd|th)?
      ,?\s+\d{4})')]
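
To see which strings the date pattern accepts before running it across a volume, it can be tried outside oxygenXML. The sketch below rebuilds the A.1.a regular expression with Python's `re` module. Python's regex flavor is not identical to the XPath 2.0 flavor, but the constructs used here (`\d`, `\s`, `{1,2}`, groups, alternation) should behave the same way in both. The function name and sample sentences are made up for illustration.

```python
import re

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
ORDINAL = r"(st|d|nd|rd|th)?"

# Same two alternatives as A.1.a: "4th of July, 1950" or "July 4, 1950".
ENGLISH_DATE = re.compile(
    r"(the\s+)?\d{1,2}" + ORDINAL + r"\s+(of\s+)?(" + MONTHS + r"),?\s+\d{4}"
    + r"|((" + MONTHS + r")\s+\d{1,2}" + ORDINAL + r",?\s+\d{4})"
)


def has_date_candidate(text: str) -> bool:
    # XPath matches() succeeds on any substring, so re.search is the closest analogue.
    return ENGLISH_DATE.search(text) is not None


for sample in (
    "P.S. Written on the 4th of July, 1950.",   # day-month-year: accepted
    "P.S. I leave for Paris on July 4, 1950.",  # month-day-year: accepted
    "P.S. More to follow in July 1950.",        # no day of month: rejected
):
    print(has_date_candidate(sample), sample)
```

The pattern is case-sensitive and requires a day of the month, so a postscript that gives only a month and year is not reported.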

A.1.b. Dates in French-language text

//div[attribute::type='document']
  [not(attribute::subtype='editorial-note')][not(descendant::date)]
  //postscript[matches(.,'(le\s+)?\d{1,2}(eme|ème|re)?\s+(de\s+)?
    (janvier|février|fevrier|mars|avril|mai|juin|juillet|août|aout|septembre|octobre|novembre|décembre|decembre)
      ,?\s+\d{4}|
    ((janvier|février|fevrier|mars|avril|mai|juin|juillet|août|aout|septembre|octobre|novembre|décembre|decembre)
      \s+\d{1,2}(eme|ème|re)?
      ,?\s+\d{4})')]

A.1.c. Dates in Spanish-language text

//div[attribute::type='document']
  [not(attribute::subtype='editorial-note')][not(descendant::date)]
  //postscript[matches(.,'(el\s+)?\d{1,2}\s+((de|del)\s+)?
    (enero|febrero|marzo|abril|mayo|junio|julio|agosto|septiembre|setiembre|octubre|noviembre|diciembre)
      ,?\s+((de|del)\s+)?\d{4}|
    ((enero|febrero|marzo|abril|mayo|junio|julio|agosto|septiembre|setiembre|octubre|noviembre|diciembre)
      \s+\d{1,2}
      ,?\s+\d{4})')]

A.2. To find date candidates in last paragraphs of historical documents without date:

A.2.a. Dates in English-language text

//div[attribute::type='document']
  [not(attribute::subtype='editorial-note')][not(descendant::date)]
  //p[last()][matches(.,'(the\s+)?\d{1,2}(st|d|nd|rd|th)?\s+(of\s+)?
    (January|February|March|April|May|June|July|August|September|October|November|December)
      ,?\s+\d{4}|
    ((January|February|March|April|May|June|July|August|September|October|November|December)
      \s+\d{1,2}(st|d|nd|rd|th)?,?
      \s+\d{4})')]

A.2.b. Dates in French-language text

//div[attribute::type='document']
  [not(attribute::subtype='editorial-note')][not(descendant::date)]
  //p[last()][matches(.,'(le\s+)?\d{1,2}(eme|ème|re)?\s+(de\s+)?
    (janvier|février|fevrier|mars|avril|mai|juin|juillet|août|aout|septembre|octobre|novembre|décembre|decembre)
      ,?\s+\d{4}|
    ((janvier|février|fevrier|mars|avril|mai|juin|juillet|août|aout|septembre|octobre|novembre|décembre|decembre)
      \s+\d{1,2}(eme|ème|re)?
      ,?\s+\d{4})')]

A.2.c. Dates in Spanish-language text

//div[attribute::type='document']
  [not(attribute::subtype='editorial-note')][not(descendant::date)]
  //p[last()][matches(.,'(el\s+)?\d{1,2}\s+((de|del)\s+)?
    (enero|febrero|marzo|abril|mayo|junio|julio|agosto|septiembre|setiembre|octubre|noviembre|diciembre)
      ,?\s+((de|del)\s+)?\d{4}|
    ((enero|febrero|marzo|abril|mayo|junio|julio|agosto|septiembre|setiembre|octubre|noviembre|diciembre)
      \s+\d{1,2}
     ,?\s+\d{4})')]

A.3. To find dateline//date in documents whose head contains "Memorandum of Conversation" or a variant such as "Memorandum of a Telephone Conversation" (in order to add subtype="conversation-or-meeting-date"):

//div[attribute::subtype = 'historical-document'][matches(head/., '[Mm]emorandum\s+of\s+(a\s+)?((Trans-Atlantic|Transatlantic)?\s+)?(Telephone\s+)?[Cc]onversation')]/dateline[not(descendant::*[local-name() = 'attachment'])]//date
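
The head pattern in A.3 accepts several spellings of the heading. As a rough check, and assuming Python's `re` treats these constructs the same way the XPath 2.0 regex engine does, the sketch below runs the same pattern against a few hypothetical headings. The function name and sample strings are illustrative only.

```python
import re

# Same pattern as in the A.3 predicate on head.
HEAD_PATTERN = re.compile(
    r"[Mm]emorandum\s+of\s+(a\s+)?((Trans-Atlantic|Transatlantic)?\s+)?"
    r"(Telephone\s+)?[Cc]onversation"
)


def is_conversation_head(head: str) -> bool:
    # Substring search, mirroring XPath matches().
    return HEAD_PATTERN.search(head) is not None


for head in (
    "Memorandum of Conversation",
    "Memorandum of a Trans-Atlantic Telephone Conversation",
    "Memorandum of Telephone Conversation",
    "Memorandum for the Record",
):
    print(is_conversation_head(head), head)
```

Only the last heading is rejected, since it lacks the "of ... Conversation" sequence.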

B. Attachments

B.1. To find date candidates in last paragraphs of attachments without date:

B.1.a. Dates in English-language text

//div[attribute::type='document'][not(attribute::subtype='editorial-note')]
  //*[local-name()='attachment'][not(descendant::date)]
  //p[last()][matches(., '(the\s+)?\d{1,2}(st|d|nd|rd|th)?\s+(of\s+)?
   (January|February|March|April|May|June|July|August|September|October|November|December)
    ,?\s+\d{4}|
   ((January|February|March|April|May|June|July|August|September|October|November|December)
    \s+\d{1,2}(st|d|nd|rd|th)?
    ,?\s+\d{4})')]

B.1.b. Dates in French-language text

//div[attribute::type='document'][not(attribute::subtype='editorial-note')]
  //*[local-name()='attachment'][not(descendant::date)]
  //p[last()][matches(.,'(le\s+)?\d{1,2}(eme|ème|re)?\s+(de\s+)?
    (janvier|février|fevrier|mars|avril|mai|juin|juillet|août|aout|septembre|octobre|novembre|décembre|decembre)
      ,?\s+\d{4}|
    ((janvier|février|fevrier|mars|avril|mai|juin|juillet|août|aout|septembre|octobre|novembre|décembre|decembre)
      \s+\d{1,2}(eme|ème|re)?
      ,?\s+\d{4})')]

B.1.c. Dates in Spanish-language text

//div[attribute::type='document'][not(attribute::subtype='editorial-note')]
  //*[local-name()='attachment'][not(descendant::date)]
  //p[last()][matches(.,'(el\s+)?\d{1,2}\s+((de|del)\s+)?
    (enero|febrero|marzo|abril|mayo|junio|julio|agosto|septiembre|setiembre|octubre|noviembre|diciembre)
      ,?\s+((de|del)\s+)?\d{4}|
    ((enero|febrero|marzo|abril|mayo|junio|julio|agosto|septiembre|setiembre|octubre|noviembre|diciembre)
      \s+\d{1,2}
      ,?\s+\d{4})')]

B.2. To find date candidates in postscripts of attachments without date:

B.2.a. Dates in English-language text

//div[attribute::type='document'][not(attribute::subtype='editorial-note')]
  //*[local-name()='attachment'][not(descendant::date)]
  //postscript[matches(., '(the\s+)?\d{1,2}(st|d|nd|rd|th)?\s+(of\s+)?
    (January|February|March|April|May|June|July|August|September|October|November|December)
      ,?\s+\d{4}|
    ((January|February|March|April|May|June|July|August|September|October|November|December)
     \s+\d{1,2}(st|d|nd|rd|th)?
      ,?\s+\d{4})')]

B.2.b. Dates in French-language text

//div[attribute::type='document'][not(attribute::subtype='editorial-note')]
  //*[local-name()='attachment'][not(descendant::date)]
  //postscript[matches(.,'(le\s+)?\d{1,2}(eme|ème|re)?\s+(de\s+)?
    (janvier|février|fevrier|mars|avril|mai|juin|juillet|août|aout|septembre|octobre|novembre|décembre|decembre)
      ,?\s+\d{4}|
    ((janvier|février|fevrier|mars|avril|mai|juin|juillet|août|aout|septembre|octobre|novembre|décembre|decembre)
      \s+\d{1,2}(eme|ème|re)?
     ,?\s+\d{4})')]

B.2.c. Dates in Spanish-language text

//div[attribute::type='document'][not(attribute::subtype='editorial-note')]
  //*[local-name()='attachment'][not(descendant::date)]
  //postscript[matches(.,'(el\s+)?\d{1,2}\s+((de|del)\s+)?
    (enero|febrero|marzo|abril|mayo|junio|julio|agosto|septiembre|setiembre|octubre|noviembre|diciembre)
      ,?\s+((de|del)\s+)?\d{4}|
    ((enero|febrero|marzo|abril|mayo|junio|julio|agosto|septiembre|setiembre|octubre|noviembre|diciembre)
      \s+\d{1,2}
      ,?\s+\d{4})')]

C. Across all FRUS

C.1. To find date without attributes within the FRUS body:

C.1.a. Including date containing "undated"

//body//date[not(attribute::*)]

C.1.b. Excluding date containing "undated"

//body//date[not(attribute::*)][not(matches(data(.),'[Uu]ndated'))]
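
The difference between C.1.a and C.1.b can be illustrated on a small sample. The sketch below approximates both expressions with Python's standard `xml.etree.ElementTree`, which does not support XPath 2.0 functions such as `matches()`, so the filtering is done in plain Python. The sample markup is invented and, as an assumption made for brevity, carries no namespace, unlike real volumes.

```python
import re
import xml.etree.ElementTree as ET

# Invented sample; real volumes are namespaced and far larger.
SAMPLE = """<TEI><text><body>
  <div type="document">
    <dateline><date when="1950-07-04">July 4, 1950</date></dateline>
    <dateline><date>July 5, 1950</date></dateline>
    <dateline><date>Undated</date></dateline>
  </div>
</body></text></TEI>"""


def bare_dates(root, include_undated=True):
    """Return the text of each date under body that carries no attributes."""
    found = []
    for body in root.iter("body"):
        for date in body.iter("date"):
            if date.attrib:  # corresponds to [not(attribute::*)]
                continue
            text = "".join(date.itertext())  # rough equivalent of data(.)
            if not include_undated and re.search(r"[Uu]ndated", text):
                continue
            found.append(text)
    return found


root = ET.fromstring(SAMPLE)
print(bare_dates(root))                         # C.1.a
print(bare_dates(root, include_undated=False))  # C.1.b
```

The first call lists both attribute-less dates, while the second drops the one whose text is "Undated".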

C.2. To find compilations/chapters/subchapters without descendant documents:

//body//div[attribute::type = ('compilation', 'chapter', 'subchapter')][not(attribute::subtype = ('index', 'referral'))][not(descendant::div[attribute::type = ('document')])]
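
As a rough illustration of what C.2 selects, the sketch below applies the same three conditions to an invented, namespace-free sample using Python's standard `xml.etree.ElementTree`. The function name and the sample identifiers are hypothetical.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"  # how ElementTree exposes xml:id
CONTAINERS = {"compilation", "chapter", "subchapter"}
SKIPPED_SUBTYPES = {"index", "referral"}

# Invented sample: one chapter holds a document, one does not.
SAMPLE = """<body>
  <div type="compilation" xml:id="comp1">
    <div type="chapter" xml:id="ch1"><div type="document" xml:id="d1"/></div>
    <div type="chapter" xml:id="ch2"><p>No documents yet.</p></div>
  </div>
  <div type="compilation" subtype="index" xml:id="index"/>
</body>"""


def empty_containers(body):
    """Return xml:id values of container divs that have no descendant document div."""
    hits = []
    for div in body.iter("div"):
        if div.get("type") not in CONTAINERS:
            continue
        if div.get("subtype") in SKIPPED_SUBTYPES:
            continue
        # iter() also yields div itself; harmless, because its type is not 'document'.
        if any(d.get("type") == "document" for d in div.iter("div")):
            continue
        hits.append(div.get(XML_ID))
    return hits


print(empty_containers(ET.fromstring(SAMPLE)))
# prints: ['ch2']
```

The compilation `comp1` is not reported because one of its chapters contains a document, and the index is excluded by its subtype.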

C.3. To find div without frus:doc-dateTime-min:

//body//div[not(attribute::subtype = ('editorial-note','errata_document-numbering-error'))][not(attribute::*[local-name() eq "doc-dateTime-min"])]
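
C.3 tests the attribute by local name, so it does not depend on which prefix is bound to the frus namespace. The sketch below mimics that check with Python's standard `xml.etree.ElementTree` on an invented sample. The namespace URI `urn:example:frus` is a placeholder chosen only so the sample parses, not the real FRUS namespace, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"  # how ElementTree exposes xml:id
EXEMPT_SUBTYPES = {"editorial-note", "errata_document-numbering-error"}

# Invented sample; the frus namespace URI here is a placeholder.
SAMPLE = """<body xmlns:frus="urn:example:frus">
  <div type="document" xml:id="d1" frus:doc-dateTime-min="1950-07-04T00:00:00Z"/>
  <div type="document" xml:id="d2"/>
  <div type="document" subtype="editorial-note" xml:id="d3"/>
</body>"""


def local_name(key: str) -> str:
    # ElementTree stores namespaced attribute names as '{uri}local'.
    return key.rsplit("}", 1)[-1]


def divs_missing_min(body):
    """Return xml:id values of non-exempt divs lacking a doc-dateTime-min attribute."""
    hits = []
    for div in body.iter("div"):
        if div.get("subtype") in EXEMPT_SUBTYPES:
            continue
        if any(local_name(key) == "doc-dateTime-min" for key in div.attrib):
            continue
        hits.append(div.get(XML_ID))
    return hits


print(divs_missing_min(ET.fromstring(SAMPLE)))
# prints: ['d2']
```

Like the XPath, the sketch considers every div under body regardless of its type, so container divs without the attribute would also be reported.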