
Web entity codebook

jrault edited this page Dec 21, 2012 · 1 revision

The web entity codebook is a configuration file that defines, for a given corpus, how users want to describe web entities.

It is the reference used throughout the system to handle metadata.

It will first be used by the user interface to dynamically build the metadata interface.

It will also be used by the memory structure as the specification of the metadata fields of the web entity index.


Fields

The codebook is a set of fields.

Each field has the following parameters:

  • name String
  • multiple boolean
  • stock_type choice in ["string",]
  • default_value choice in ["string",]
  • interface_position int
  • interface_mandatory boolean
  • interface_vdex vocabIdentifier (see below)
  • interface_input_type choice in ["tag","]

example

(
  name : language
  multiple : false
  stock_type : string
  default_value : french
),
(
  name : "plateform type"
  multiple : false
  stock_type : string
  default_value : "website"
),
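
As an illustration, the parameter list and the two example fields could be modelled in Python roughly as follows. This is only a sketch: the class name `CodebookField` and every default value except those given in the example are assumptions, not part of the specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CodebookField:
    """One codebook field; attribute names follow the parameter list."""
    name: str
    multiple: bool = False                     # assumed default
    stock_type: str = "string"                 # only "string" is listed so far
    default_value: Optional[str] = None
    interface_position: Optional[int] = None   # assumed optional
    interface_mandatory: bool = False          # assumed default
    interface_vdex: Optional[str] = None       # vocabulary identifier
    interface_input_type: str = "tag"          # only "tag" is listed so far


# The two fields from the example.
codebook = [
    CodebookField(name="language", default_value="french"),
    CodebookField(name="plateform type", default_value="website"),
]
```

A codebook is then simply a list of such fields, which the interface and the memory structure can both read.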

about vdex

VDEX is a good candidate for the vocabulary definition exchange format:

http://www.imsglobal.org/vdex/
http://en.wikipedia.org/wiki/IMS_VDEX

crosswalk dictionaries

A crosswalk dictionary defines the translation that standardized web services need in order to pull data from the corpus. It will set the correspondence between:

These crosswalks will be used by web service plugins to allow centralized metadata harvesting, for example through OAI-PMH.

implementation

system-wide

The codebook should be written in a structured open format such as XML or JSON.
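
For instance, if JSON were chosen, the example codebook might be stored and loaded as sketched here. The JSON layout (a top-level array of field objects) is an assumption for illustration; no file layout has been decided.

```python
import json

# Hypothetical JSON serialization of the example codebook.
codebook_json = """
[
  {"name": "language", "multiple": false,
   "stock_type": "string", "default_value": "french"},
  {"name": "plateform type", "multiple": false,
   "stock_type": "string", "default_value": "website"}
]
"""

# Parse the codebook into a list of plain dictionaries.
fields = json.loads(codebook_json)
field_names = [field["name"] for field in fields]
```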

in the memory structure

web entity index

The memory structure will use the codebook to set and check the specification of the web entity metadata fields.
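
A check of this kind could look like the sketch below. The function name `check_metadata`, the dictionary representation of the codebook, and the error messages are all hypothetical; only the `name`, `multiple` and `stock_type` parameters come from the field definition.

```python
def check_metadata(codebook, metadata):
    """Return a list of error messages; empty if metadata fits the codebook."""
    errors = []
    known = {field["name"]: field for field in codebook}
    for name, value in metadata.items():
        field = known.get(name)
        if field is None:
            errors.append(f"unknown field: {name}")
            continue
        # Normalize to a list so single and multiple values share one check.
        values = value if isinstance(value, list) else [value]
        if isinstance(value, list) and not field["multiple"]:
            errors.append(f"field {name} does not accept multiple values")
        if field["stock_type"] == "string" and not all(
            isinstance(v, str) for v in values
        ):
            errors.append(f"field {name} expects string values")
    return errors


# Codebook from the example, as plain dictionaries.
codebook = [
    {"name": "language", "multiple": False, "stock_type": "string"},
    {"name": "plateform type", "multiple": False, "stock_type": "string"},
]
```

Calling `check_metadata(codebook, {"language": "french"})` returns an empty list, while metadata with an undeclared field or a list of values for a single-valued field returns one message per problem.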

metadata harvesting webservices

The crosswalk will be used by the metadata harvesting web services to produce the right output format from the user's codebook.
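
As a rough sketch, a crosswalk can be thought of as a mapping from codebook field names to the element names of a target schema. The example below maps the two example fields to Dublin Core elements, the baseline metadata format of OAI-PMH. The mapping itself and the function name `apply_crosswalk` are assumptions for illustration.

```python
# Hypothetical crosswalk: codebook field name -> Dublin Core element.
CROSSWALK = {
    "language": "dc:language",
    "plateform type": "dc:type",
}


def apply_crosswalk(metadata, crosswalk):
    """Rename metadata keys using the crosswalk; unmapped fields are dropped."""
    return {
        crosswalk[name]: value
        for name, value in metadata.items()
        if name in crosswalk
    }


record = apply_crosswalk(
    {"language": "french", "plateform type": "website", "note": "internal"},
    CROSSWALK,
)
```

Fields without a crosswalk entry (here `note`) are left out of the harvested record, so each web service exposes only what its target format can express.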
