
Data description & contribution

Greta Franzini edited this page Feb 26, 2017 · 28 revisions

The Catalogue is continuously growing, and information related to previously processed entries is also updated or corrected retrospectively. This page describes the data categories and values.

Empty cells still need to be filled in. Cells containing the words "not provided" indicate that the project doesn't make this information available via its website (or at least not clearly).

How to contribute to this open repository

This repository contains data previously collected with Google Spreadsheets and now hosted here on GitHub (thank you Kyle for the suggestion!). The migration allows for controlled contributions and better performance. All the data is collected in a single comma-separated values (.csv) file. There are two ways for you to contribute an edition or suggest changes:

  1. If you're familiar with GitHub, you know what to do. Fork this repository, edit and send me a pull request.
  2. If you're not familiar with GitHub, you can create a GitHub issue with as much information about the edition as possible. The more information you provide, the sooner the edition will appear in the Catalogue.

Please be consistent and follow these guidelines so that the web application displays your contribution correctly.

Catalogue fields

The fields the Catalogue uses to classify information are listed below. Most of them take free text but some should only use predefined values. If you think the Catalogue should contain more fields, please add an issue with details and I'll get back to you!

NOTE: Blank cells in the Catalogue need to be filled in. The words "not provided" are used to indicate that the website or project does not provide the information.
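As a rough illustration of the note above, a contributor could scan the CSV for cells that are still empty before opening a pull request. This is only a sketch: the sample data and the choice of columns are made up for the example, and the Catalogue's own tooling may work differently.

```python
import csv
import io

# Hypothetical excerpt of the Catalogue CSV; the real file has many more columns.
SAMPLE = """Edition_ID,URL,Budget,Infrastructure
Example Edition,http://example.org,not provided,
"""

def blank_cells(csv_text):
    """Return (edition name, column) pairs for cells that are empty."""
    missing = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column, value in row.items():
            # "not provided" is a deliberate value, so only empty cells count.
            if value is None or value.strip() == "":
                missing.append((row["Edition_ID"], column))
    return missing

print(blank_cells(SAMPLE))
```

Here the Budget cell is left alone because it carries an explicit value, while the empty Infrastructure cell is reported.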

 

Historical Period

This field broadly categorises an edition by the following periods:

  • Antiquity [700 BC - 500 AD]
  • Middle Ages [500 - 1500]
  • Early Modern [1500 - 1789]
  • Long Nineteenth Century [1789 - 1914]
  • Modern [1914 - 1965]
  • Contemporary [1965 - Today]
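The period boundaries above can be sketched as a simple lookup. Note that the listed ranges share their boundary years (500 appears in both Antiquity and the Middle Ages); assigning a shared year to the later period is an assumption made for this example, as is representing BC years as negative integers.

```python
# Period boundaries as listed above. Each range is treated as
# start-inclusive and end-exclusive, which is an assumption.
PERIODS = [
    ("Antiquity", -700, 500),
    ("Middle Ages", 500, 1500),
    ("Early Modern", 1500, 1789),
    ("Long Nineteenth Century", 1789, 1914),
    ("Modern", 1914, 1965),
]

def historical_period(year):
    """Map a year (negative for BC) to one of the Catalogue's period names."""
    if year < -700:
        raise ValueError("year predates the earliest listed period")
    for name, start, end in PERIODS:
        if start <= year < end:
            return name
    return "Contemporary"  # 1965 to today

print(historical_period(1200))
```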

Time/Century

The specific year or century of the edition's source text. If the edition covers more than one text, a date range is used.

Edition_ID

The name of the edition (project).

URL

The URL of the edition (project).

Scholarly

Here the values 1 or 0 should be used to say whether the edition is scholarly in accordance with Patrick Sahle's definition of the term:

An edition must be critical, must have critical components - a pure facsimile is not an edition, a digital library is not an edition.

Digital

Here the values 1 or 0 should be used to say whether the edition is digital in accordance with Patrick Sahle's definition of the term:

A digital edition can not be converted to a printed edition without substantial loss of content or functionality - vice versa: a retrodigitized printed edition is not a Scholarly Digital Edition (but it may evolve into a Scholarly Digital Edition through new content or functionalities).

Edition

Here the values 1 or 0 should be used to say whether the project qualifies as an edition in accordance with Patrick Sahle's definition of the term:

An edition must represent its material (usually as transcribed/edited text) - a catalog, an index, a descriptive database is not an edition.
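The Scholarly, Digital and Edition fields each take 1 or 0. A minimal sketch of how a script might validate the three values and check whether a project meets all three of Sahle's criteria is given below. The function name and the idea of combining the flags are illustrative assumptions, not part of the Catalogue itself.

```python
SAHLE_FIELDS = ("Scholarly", "Digital", "Edition")

def meets_all_criteria(row):
    """Return True if all three fields are recorded as 1.

    `row` is a dict of field name -> cell text, e.g. one CSV row.
    """
    for field in SAHLE_FIELDS:
        if row[field] not in ("0", "1"):
            raise ValueError(f"{field} must be 0 or 1, got {row[field]!r}")
    return all(row[field] == "1" for field in SAHLE_FIELDS)

print(meets_all_criteria({"Scholarly": "1", "Digital": "1", "Edition": "0"}))
```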

Language

The language(s) of the source text. Three-letter ISO Codes are used.

Writing support

The nature of the source text (manuscript, letter, notebook, etc.). Use the singular form of the tag only (e.g. "Letter" even if the edition contains multiple) and capitalise the first letter.

Begin date

Year the project started. If not specified, use 0.

End date

Year the project ended. If the project is ongoing, type 'present'. If not specified, use 0.
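Because the date fields mix years with the special values 0 and 'present', anything reading them needs to handle all three cases. One possible way to do so is sketched here; the function name and the choice to return None for unspecified dates are assumptions made for the example.

```python
def parse_project_year(value):
    """Return an int year, the string 'present', or None when unspecified."""
    text = value.strip()
    if text.lower() == "present":
        return "present"
    year = int(text)  # raises ValueError for anything that is not a number
    return None if year == 0 else year

print(parse_project_year("2009"), parse_project_year("present"), parse_project_year("0"))
```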

Manager

Name of project manager(s). If multiple, separate the values with a semicolon. If not specified, use 0.

Institution(s)

Name(s) of institution(s) involved in the project. If multiple, separate the values with a semicolon. If not specified, use 0.
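The semicolon convention for the Manager and Institution(s) fields can be read back into a list with a few lines of code. This is a sketch only; the names in the usage line are invented placeholders.

```python
def split_names(cell):
    """Split a semicolon-separated cell into names; '0' means not specified."""
    text = cell.strip()
    if text == "0":
        return []
    # Drop surrounding spaces and ignore empty pieces such as a trailing ';'.
    return [name.strip() for name in text.split(";") if name.strip()]

print(split_names("Jane Doe; John Smith"))
```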

Audience

The target audience of the edition project (scholars, general public, etc.). If not specified, use 0.

Philological statement

  • 0: No information on the editorial methods and practices nor on the source (digital or printed) of the text.
  • 0.5: No information on the source, but some information about the author, date and accuracy of the digital edition.
  • 1: Complete information on the source of the text, as well as on the author, date and accuracy of the digital edition. Digital Humanities standards implemented, including modelling, markup language, data structure and software. Values may include a large range of standards used, including HTML, XML-TEI etc.

Account of textual variance

  • 0: No account of textual variance is given. The digital edition is a reproduction of a given print edition without any account of variants.
  • 0.5: The digital edition is a reproduction of a given print scholarly edition and reproduces the selected textual variants extant in the apparatus criticus of that edition, or the edition does not follow a digital paradigm, in that the variants, as encoded, are not automatically computable.
  • 1: This edition is “based on full-text transcription of original texts into electronic form” (see Proposition 2 in this article by P. Robinson).

Value of witnesses

  • N/A: Not applicable, as no information about the source of the text is given, though it can reasonably be assumed that the source is another digital edition or a printed edition (possibly even a scholarly edition).
  • 0: The only witness modelled digitally is a printed non-scholarly edition, used as a source for the digital edition.
  • 0.5: Same as above, but the witness/source is a scholarly edition.
  • 1: The witnesses are traditional philological primary sources (including manuscripts, inscriptions or papyri).

XML-TEI transcription

The source text is encoded in XML-TEI. Possible values:

  • 0: XML not used
  • 0.5: XML but not TEI
  • 1: XML-TEI is used

XML-TEI transcription to download

The XML-TEI encoded text is available for download.

  • 0: no
  • 0.5: partially
  • 1: yes
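Several of the fields described so far use the graded scale 0, 0.5 and 1, with Value of witnesses additionally allowing N/A. A hedged sketch of how a contribution could be checked against these predefined values follows; it assumes the CSV column headers match the field names on this page, which may not hold for the actual file.

```python
GRADED = {"0", "0.5", "1"}

# Allowed values per field, taken from the descriptions on this page.
ALLOWED = {
    "Philological statement": GRADED,
    "Account of textual variance": GRADED,
    "Value of witnesses": GRADED | {"N/A"},
    "XML-TEI transcription": GRADED,
    "XML-TEI transcription to download": GRADED,
}

def invalid_graded_fields(row):
    """Return the names of graded fields whose value is missing or not allowed."""
    return [
        field
        for field, allowed in ALLOWED.items()
        if row.get(field, "").strip() not in allowed
    ]

row = {
    "Philological statement": "1",
    "Account of textual variance": "0.5",
    "Value of witnesses": "N/A",
    "XML-TEI transcription": "2",
    "XML-TEI transcription to download": "0",
}
print(invalid_graded_fields(row))
```

In this made-up row only the XML-TEI transcription value falls outside the scale. Blank cells are reported as well, since they still need filling in.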

Images

The values 1 or 0 are used to tell us if the edition comes with images.

Zoom images

The values 1 or 0 are used to tell us if the user can zoom in or out of images within the edition.

Image manipulation

The values 1 and 0 are used to tell us whether the user can manipulate these images in any way within the edition (brightness, saturation, etc.).

Text-image linking

The values 1 and 0 are used to tell us whether the transcription and the image are linked so that clicking on a word in the image brings up the corresponding textual token and vice-versa.

Source text translation

Whether the project provides a translation (not necessarily in English!) of the source text. If so, the corresponding three-letter ISO code should be used. If not, type 0.

Website language

The language the project website is written in. Three-letter ISO Codes should be used.

Glossary

The values 1 or 0 are used to tell us if the edition provides a glossary.

Indices

The values 1 or 0 are used to tell us if the edition provides indices.

String matching search

The values 1 or 0 are used to tell us if the edition provides string matching (full text) search possibilities.

Advanced search

The values 1 or 0 are used to tell us if the edition provides an advanced search functionality.

Creative Commons License

The values 1 or 0 are used to specify if the project is protected by a Creative Commons License.

Open Source/Open Access

  • 0: Proprietary, all material is copyrighted. The ‘source’ is closed and not reusable by other research projects. To access the material, users must pay a subscription.
  • 0.5: Same as above, but the subscription is free of charge.
  • 1: Open Access. The texts may be accessed through specific software, but the source is not accessible.
  • 1.5: Open Access and Open Source. Part of the data underlying the digital edition (e.g. text but not images) is freely available for access, study, redistribution and improvement (reuse).
  • 2: Open Access and Open Source. All data underlying the digital edition is freely available for access, study, redistribution and improvement (reuse).
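The five-step scale above could be represented as a lookup table when analysing the data. The short labels below paraphrase the definitions in the list, and the helper names are hypothetical.

```python
# Short paraphrases of the scale defined above.
OPENNESS = {
    "0": "proprietary, paid subscription",
    "0.5": "proprietary, free subscription",
    "1": "open access, source not accessible",
    "1.5": "open access and open source, part of the data",
    "2": "open access and open source, all data",
}

def describe_openness(value):
    """Translate a cell value into a short label, rejecting unknown values."""
    key = value.strip()
    if key not in OPENNESS:
        raise ValueError(f"unexpected Open Source/Open Access value: {value!r}")
    return OPENNESS[key]

def has_open_data(value):
    """True when at least part of the underlying data is open (1.5 or 2)."""
    describe_openness(value)  # validates the value first
    return float(value) >= 1.5

print(describe_openness("1.5"), has_open_data("1"))
```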

Linked Open Data (LOD)

The values 1 or 0 are used to specify if the project makes use of LOD standards and if it is linked to other projects.

API

The values 1 or 0 are used to specify if the project comes with an API (Application Programming Interface).

Crowd-sourcing

The values 1 or 0 are used to specify if the project relies/relied on crowd-sourced contributions.

Feedback

The values 1 or 0 are used to specify if the project provides a feedback space or contact information for users to make comments or suggestions.

Technological statement

This category assesses whether the project provides explicit and complete information about the Digital Humanities standards implemented and the ‘openness’ policies listed above.

  • 0: no explicit statement on standards and openness policies.
  • 0.5: partial information.
  • 1: complete information.

Links to external resources

The values 1 or 0 are used to tell if the project provides links to external relevant resources.

OCR or keyed?

Whether the source text was digitised with Optical Character Recognition (OCR) software or manually keyed in.

Mobile compatibility/app

The values 1, 0 and 0.5 are used to tell if the project can be seamlessly consulted on mobile devices. 0.5 means that some functionality is mobile-friendly but not all.

Print-friendly view

The values 1 or 0 are used to tell if the project provides a print-friendly view of the text (e.g. PDF) or if the browser produces a suitable, printable version of the content.

Print facsimile (complementary output)

The values 1 or 0 are used to specify if the digital project is complemented by a printed facsimile.

Manuscript(s') repository

The institution in which the source text is kept today.

Manuscript(s') place of origin (if known)

If known, the location from which the source text originated or where it was produced.

Funding body

The name of the funding agency. N/A is used if the project isn't supported by third-party funding.

Budget

How much the project cost.

Infrastructure

The technologies used to run the project (Drupal, Omeka, MySQL, etc.).


Web application

For an interactive version of the Catalogue, use the Catalogue of Digital Editions web application, a collaboration with the Austrian Centre for Digital Humanities (ACDH).

