Data description & contribution

Greta Franzini edited this page Jun 20, 2017 · 28 revisions

How the data is collected and organised in the Catalogue

The Catalogue of Digital Editions is continuously updated with new digital editions and with the latest information on previously processed entries. All of the data is collected in a single comma-separated values (.csv) file. Updates to the .csv file are regularly synced with the Catalogue of Digital Editions Web App.

The .csv file contains a header row with numerous fields, each addressing a particular aspect or feature of digital editions. Each digital edition occupies one row in the .csv file and should provide information for all the fields listed in the header row. No cell should be left empty. That said, the Catalogue currently contains many empty cells. These are a legacy of earlier stages of development, and efforts are being made to fill them.
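
As a rough illustration of the "no empty cells" rule, the sketch below scans CSV text and reports every empty cell. The sample data and its three column names are hypothetical stand-ins; the real digEds_cat.csv has many more columns and its exact header names are not reproduced here.

```python
import csv
import io

# Hypothetical sample; the real digEds_cat.csv has many more columns.
SAMPLE = """Edition name,URL,Scholarly
Example Edition,http://example.org,1
Another Edition,,0
"""


def find_empty_cells(csv_text):
    """Return (row_number, column_name) pairs for every empty cell."""
    empty = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Data rows are numbered from 2 because row 1 is the header.
    for row_number, row in enumerate(reader, start=2):
        for column, value in row.items():
            if column is None:
                # Extra cells beyond the header row; not checked here.
                continue
            if value is None or value.strip() == "":
                empty.append((row_number, column))
    return empty


print(find_empty_cells(SAMPLE))
```

To check the downloaded file instead of the sample, read its contents into a string and pass that to find_empty_cells.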

This repository contains two .csv files:

  1. digEds_cat.csv: this is the complete Catalogue and the file you'll want to download, reuse or contribute to.
  2. institutions_places_enriched.csv: this file collects the coordinates of geographical locations mentioned in the Catalogue for their display in the web application.

How to contribute a digital edition to the Catalogue

There are three different ways to contribute a digital edition:

  1. If you're familiar with GitHub, you know what to do: fork this repository, edit the .csv file and create a pull request.
  2. If you're not familiar with GitHub, you can create a GitHub issue with as much information about the edition as possible (see "Data fields" section below). The more information you provide, the sooner the edition will appear in the Catalogue.
  3. If you'd rather not use GitHub at all, you can fill in a Google Form at this address: https://goo.gl/forms/4Ya3jwRCBi0VSexx2. Your entry will be moderated and added to the Catalogue by an administrator.

It's important you're consistent and follow these guidelines for the web application to display your contribution correctly.

If you want to suggest changes to existing editions, please add these as issues in this repository with the name of the edition in question.

Data fields in the Catalogue

The fields the Catalogue uses to classify information are listed below. Some take free text, others use predefined values. The words "not provided" are used to indicate that the website or project does not provide the relevant information. However you choose to contribute a digital edition to the Catalogue, please ensure you address as many of the following fields as possible.

You can print a PDF-version of these fields here.  

Historical Period

This field broadly categorises the source material of a digital edition by the following periods:

  • Antiquity [700 BC - 500 AD]
  • Middle Ages [500 - 1500]
  • Early Modern [1500 - 1789]
  • Long Nineteenth Century [1789 - 1914]
  • Modern [1914 - 1965]
  • Contemporary [1965 - Today]
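
The period boundaries above can be expressed as a small lookup. This is only a sketch: the list shares boundary years between neighbouring periods (for example, 1500 closes the Middle Ages and opens the Early Modern period), and the choice to assign such a year to the later period is an assumption made here, not a rule stated by the Catalogue. BC years are written as negative numbers.

```python
# (label, first year, last year); BC years are negative, None means open-ended.
PERIODS = [
    ("Antiquity", -700, 500),
    ("Middle Ages", 500, 1500),
    ("Early Modern", 1500, 1789),
    ("Long Nineteenth Century", 1789, 1914),
    ("Modern", 1914, 1965),
    ("Contemporary", 1965, None),
]


def historical_period(year):
    """Return the period label for a year, or None if it predates 700 BC."""
    # Walk the list backwards so that a shared boundary year
    # is assigned to the later of the two periods (an assumption).
    for label, start, end in reversed(PERIODS):
        if year >= start and (end is None or year <= end):
            return label
    return None


print(historical_period(1200))
print(historical_period(1500))
```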

Time/Century

The specific year(s) or century of the digital edition's source material. Year ranges can also be added in the format YYYY-YYYY.
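
A minimal sketch of how a single year or a YYYY-YYYY range could be checked. It deliberately does not handle century values, because the expected notation for centuries is not specified on this page; such values simply come back as None. The function name is hypothetical.

```python
import re

# Four-digit year, optionally followed by a hyphen and a second four-digit year.
YEAR_OR_RANGE = re.compile(r"(\d{4})(?:-(\d{4}))?")


def parse_years(value):
    """Return (start, end) for 'YYYY' or 'YYYY-YYYY', otherwise None."""
    match = YEAR_OR_RANGE.fullmatch(value.strip())
    if match is None:
        return None
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) else start
    if end < start:
        # A range that runs backwards is treated as invalid.
        return None
    return (start, end)


print(parse_years("1789-1914"))
print(parse_years("1850"))
```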

Edition name

The name of the digital edition project.

URL

The URL of the digital edition project.

Scholarly

Here the values 0 [no] or 1 [yes] should be used to say whether the edition is scholarly in accordance with Patrick Sahle's definition of the term:

An edition must be critical, must have critical components - a pure facsimile is not an edition, a digital library is not an edition.

Digital vs. Digitised

Here the values 0 [no] or 1 [yes] should be used to say whether the digital edition is digital in accordance with Patrick Sahle's definition of the term:

"A digitized print edition is not a 'digital edition'. If the paradigm of an edition is limited to the two-dimensional space of the 'page' and to typographic means of information representation, then it's not a digital edition." (see: http://www.digitale-edition.de/vlet-about.html)

Edition

Here the values 0 [no] or 1 [yes] should be used to say whether the digital edition is an edition in accordance with Patrick Sahle's definition of the term:

An edition must represent its material (usually as transcribed/edited text) - a catalog, an index, a descriptive database is not an edition.

Language

The language(s) the source material is written in. Three-letter ISO codes are used.
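
The sketch below checks only the shape of language codes (three letters); it does not verify them against the official ISO list. How multiple languages are separated within one cell is not stated in this section, so the semicolon used here is an assumption borrowed from other multi-value fields on this page.

```python
import re

# Shape check only: exactly three letters. This does not confirm
# that the code is actually a registered ISO language code.
CODE_SHAPE = re.compile(r"[a-z]{3}")


def malformed_language_codes(cell, separator=";"):
    """Return the entries in a cell that are not three-letter codes."""
    codes = [code.strip() for code in cell.split(separator) if code.strip()]
    return [code for code in codes if not CODE_SHAPE.fullmatch(code.lower())]


print(malformed_language_codes("lat; grc; english"))
```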

Writing support

The nature of the source material (manuscript, letter, notebook, etc.). Use the singular form of the tag only (e.g. "Letter" even if the edition contains multiple) and capitalise the first letter.

Begin date

Year the project started. If not specified, use "not provided".

End date

Year the project ended. If the project is ongoing, type "present". If not specified, use "not provided".

Manager

Name and surname of principal investigator/manager/coordinator.

Participating institution(s)

Institution(s) involved in the project. If multiple, separate with a semicolon. If not specified, use "not provided".
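
For anyone reusing the data, a semicolon-separated cell can be split back into individual names as sketched below. The function name and the institution names in the example are hypothetical.

```python
def split_institutions(cell):
    """Split a semicolon-separated cell into a list of institution names."""
    cell = cell.strip()
    if cell == "not provided":
        # The project does not state its participating institutions.
        return []
    return [part.strip() for part in cell.split(";") if part.strip()]


print(split_institutions("University of A; Library of B"))
```

The same approach should apply to any other field that separates multiple values with a semicolon.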

Audience

The target audience of the digital edition project (scholars, students, general public, etc.). If not specified, use "not provided".

Philological statement

  • 0: No information on the editorial methods and practices nor on the source (digital or printed) of the text.
  • 0.5: No information on the source, but some information about the author, date and accuracy of the digital edition.
  • 1: Complete information on the source of the text, as well as on the author, date and accuracy of the digital edition. Digital Humanities standards implemented, including modelling, markup language, data structure and software.

Account of textual variance

  • 0: No account of textual variance is given. The digital edition is a reproduction of a given print edition without any account of variants.
  • 0.5: The digital edition is a reproduction of a given print scholarly edition and reproduces the selected textual variants extant in the apparatus criticus of that edition, or: the edition does not follow a digital paradigm, in that the variants are not automatically computable the way they are encoded.
  • 1: This edition is “based on full-text transcription of original texts into electronic form” (see Proposition 2 in this article by P. Robinson).

Value of witnesses

  • N/A: Not applicable, as no information about the source of the text is given, though it can reasonably be assumed that the source is another digital or printed edition (possibly even a scholarly edition).
  • 0: The only witness modelled digitally is a printed non-scholarly edition used as a source for the digital edition.
  • 0.5: Same as above, but the witness/source is a scholarly edition.
  • 1: The witnesses are traditional philological primary sources (including manuscripts, inscriptions or papyri).

XML-TEI transcription

Whether the source material is encoded in XML-TEI. Values:

  • 0: XML not used
  • 0.5: XML but not TEI
  • 1: XML-TEI is used

XML(TEI) transcription is available to download

Whether the XML(TEI)-encoded text is available for download.

  • 0: no
  • 0.5: partially
  • 1: yes
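
Several of the fields described so far accept only a small, fixed set of values. The sketch below shows one way to flag cells that fall outside those sets. The dictionary keys reuse the field headings from this page as hypothetical column names; the actual header names in digEds_cat.csv may differ.

```python
BINARY = {"0", "1"}
THREE_STEP = {"0", "0.5", "1"}

# Keys are the headings used on this page, taken as hypothetical column names.
ALLOWED = {
    "Scholarly": BINARY,
    "Digital vs. Digitised": BINARY,
    "Edition": BINARY,
    "Philological statement": THREE_STEP,
    "Account of textual variance": THREE_STEP,
    "Value of witnesses": THREE_STEP | {"N/A"},
    "XML-TEI transcription": THREE_STEP,
}


def invalid_values(row):
    """Return {field: value} for every checked field with a disallowed value."""
    problems = {}
    for field, allowed in ALLOWED.items():
        if field in row and row[field].strip() not in allowed:
            problems[field] = row[field]
    return problems


row = {"Scholarly": "1", "XML-TEI transcription": "2", "Value of witnesses": "N/A"}
print(invalid_values(row))
```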

Images

The values 0 [no], 0.5 [some] or 1 [yes] are used to specify if the edition comes with images.

Zoom images

The values 0 [no] or 1 [yes] are used to specify if the user can zoom in or out of images.

Image manipulation

The values 0 [no] or 1 [yes] are used to specify whether the user can manipulate the images (e.g. rotation, brightness, etc.).

Text-image linking

The values 0 [no] or 1 [yes] are used to tell us whether the transcription and the image are linked so that clicking on a word in the image brings up the corresponding word in the transcription and vice-versa.

Source text translation

Whether the project provides a translation of the source material (not necessarily into English). If so, the corresponding three-letter ISO codes should be used. If not, type 0.

Website language

Whether the project website is available in multiple languages. If so, the corresponding three-letter ISO codes should be used. If not, type 0.

Glossary

The values 0 [no] or 1 [yes] are used to specify if the digital edition provides a glossary.

Indices

The values 0 [no] or 1 [yes] are used to specify if the digital edition provides indices.

String matching search

The values 0 [no] or 1 [yes] are used to specify if the edition provides string matching (full text) search possibilities.

Advanced search

The values 0 [no] or 1 [yes] are used to specify if the digital edition provides advanced search functionality.

Creative Commons License

The values 0 [no], 0.5 [partially] or 1 [yes] are used to specify if the digital edition is protected by a Creative Commons License.

Open Source/Open Access

  • 0: Proprietary, all material is copyrighted. The source is closed and not reusable by other research projects. To access the material, users must pay a subscription fee.
  • 0.5: Same as above but the subscription is free of charge.
  • 1: Open Access. The texts may be accessed through specific software but the source is not accessible.
  • 1.5: Open Access and Open Source. Part of the data underlying the digital edition (e.g. text but not images) is freely available for access and reuse.
  • 2: Open Access and Open Source. All data underlying the digital edition is freely available for access and reuse.

Linked Open Data (LOD)

The values 0 [no] or 1 [yes] are used to specify if the digital edition makes use of LOD standards and if it is linked to other projects/data.

Application Programming Interface (API)

The values 0 [no] or 1 [yes] are used to specify if the digital edition comes with an API.

Crowdsourcing

The values 0 [no] or 1 [yes] are used to specify if the digital edition relies/relied on crowdsourced contributions.

Feedback

The values 0 [no] or 1 [yes] are used to specify if the digital edition provides a feedback space or contact information for users to make comments or suggestions.

Technological statement

This category assesses whether the digital edition provides complete information about technical aspects and practices.

  • 0: no information.
  • 0.5: partial information.
  • 1: complete information.

Links to external resources

The values 0 [no] or 1 [yes] are used to specify if the digital edition provides links to external relevant resources.

OCR'd or keyed

The source text was digitised with Optical Character Recognition (OCR) software or manually keyed in.

Mobile-friendly/application

The values 0 [no], 0.5 [partially] or 1 [yes] are used to specify whether the project is mobile-friendly according to Google's Search Console Mobile-Friendly Test. The value 0.5 means that some of the project's functionality is mobile-friendly but not all.

Print-friendly view

The values 0 [no] or 1 [yes] are used to specify if the digital edition provides a print-friendly view of the text (e.g. PDF) or if the browser produces a suitable, printable version of the content.

Print facsimile (complementary output)

The values 0 [no] or 1 [yes] are used to specify if the digital project is complemented by a printed facsimile.

Repository of source material

The institution(s) that house the source text(s).

Place of origin of source material

If known, the location from which the source text originated or where it was produced.

Sponsor/Funding body

The name of the funding agency. N/A is used if the project isn't supported by third-party funding.

Budget (rough)

How much the project cost. All currencies are supported and the numeric value should use commas as thousands separators (e.g. £10,000). The value "no information provided" is used to indicate that the project website does not make this information known; 0 is used to indicate that the project specifies that it does not rely on funding.
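
A rough sketch of how a budget cell could be read back into a currency marker and a number. It assumes whole amounts with comma thousands separators, as in the example above, and a currency symbol or code placed either before or after the number; other layouts raise an error. The function name is hypothetical.

```python
import re

# Optional currency text, a number with comma thousands separators,
# then optional trailing currency text (e.g. "£10,000" or "10,000 EUR").
BUDGET = re.compile(r"(\D*?)\s*(\d{1,3}(?:,\d{3})*)\s*(\D*)")


def parse_budget(cell):
    """Return (currency, amount), or None when no information is provided."""
    cell = cell.strip()
    if cell == "no information provided":
        return None
    match = BUDGET.fullmatch(cell)
    if match is None:
        raise ValueError(f"Unrecognised budget value: {cell!r}")
    prefix, digits, suffix = match.groups()
    currency = (prefix or suffix).strip()
    return (currency, int(digits.replace(",", "")))


print(parse_budget("£10,000"))
print(parse_budget("0"))
```

A plain 0 comes back with an empty currency string, which matches the case where a project states that it does not rely on funding.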

Infrastructure

The technologies used to build the digital edition (Drupal, Omeka, MySQL, etc.). If multiple, please separate with a semicolon.

Current availability

Whether the digital edition is still viewable online today, even if the project was completed in the past. The values 0 [no] or 1 [yes] are used.

Web application

For an interactive version of the Catalogue, use the Catalogue of Digital Editions web application, a collaboration with the Austrian Centre for Digital Humanities (ACDH).

Not applicable

A list of projects considered but not suitable for inclusion in the Catalogue.

Feedback

Feedback from users is always welcome!
