Support for Google Dataset Search
Introduced in #916 & #924, based on https://developers.google.com/search/docs/data-types/dataset (28.02.2020); the code injects a <script type="application/ld+json"> element into item-view pages, based on a predefined mapping and under certain conditions.
Test with https://search.google.com/test/rich-results when deployed.
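To make the injected markup concrete, here is a minimal Python sketch of wrapping a schema.org Dataset description in a JSON-LD script element. It is only an illustration, not the DSpace implementation: the function name `render_jsonld_script` and the example values are hypothetical.

```python
import json


def render_jsonld_script(dataset: dict) -> str:
    """Wrap a schema.org Dataset dict in a JSON-LD script element (hypothetical helper)."""
    payload = json.dumps(dataset, ensure_ascii=False, indent=2)
    # Escape "</" so that a metadata value cannot close the script element early.
    # "\/" is a valid JSON escape for "/", so the payload still parses to the same data.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Placeholder values, not taken from a real item.
example = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example corpus",
    "description": "A placeholder description that is long enough to be accepted by the validator.",
}
print(render_jsonld_script(example))
```

The printed element is the kind of snippet that can be pasted into the rich-results test to check the markup.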
This feature can be disabled in dspace.cfg by setting google-dataset.enable to false.
The default mapping (which can be overridden/extended in dspace/config/crosswalks/google-metadata.properties):
name = dc.title
description = dc.description
keywords = dc.subject
license = dc.rights.uri
url = dc.identifier.uri
citation = dc.relation.isreferencedby
identifier = dc.identifier.uri
creator = dc.contributor.author
name and description receive special treatment because they are mandatory; the description must be between 50 and 5000 characters long. creator (if present) receives special treatment too, because it must be converted to an object. To extend the mapping with a property that does not need to be converted to an object, just add another mapping line.
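The default mapping and the special cases for name, description and creator can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual DSpace code: the function `build_dataset` is hypothetical, creators are assumed to become schema.org Person objects, and an item whose description is missing or outside the 50 to 5000 character range is assumed to produce no output at all.

```python
from typing import Optional

# Default mapping: schema.org property -> DSpace metadata field.
DEFAULT_MAPPING = {
    "name": "dc.title",
    "description": "dc.description",
    "keywords": "dc.subject",
    "license": "dc.rights.uri",
    "url": "dc.identifier.uri",
    "citation": "dc.relation.isreferencedby",
    "identifier": "dc.identifier.uri",
    "creator": "dc.contributor.author",
}

MIN_DESCRIPTION, MAX_DESCRIPTION = 50, 5000


def build_dataset(
    metadata: dict[str, list[str]],
    mapping: dict[str, str] = DEFAULT_MAPPING,
) -> Optional[dict]:
    """Build a schema.org Dataset dict from item metadata (hypothetical helper)."""
    names = metadata.get(mapping["name"], [])
    descriptions = metadata.get(mapping["description"], [])
    # name and description are mandatory.
    if not names or not descriptions:
        return None
    description = descriptions[0]
    # Assumption: an out-of-range description means nothing is emitted.
    if not MIN_DESCRIPTION <= len(description) <= MAX_DESCRIPTION:
        return None

    dataset = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": names[0],
        "description": description,
    }
    for prop, field in mapping.items():
        if prop in ("name", "description"):
            continue
        values = metadata.get(field, [])
        if not values:
            continue
        if prop == "creator":
            # Assumption: each author string becomes a Person object.
            dataset[prop] = [{"@type": "Person", "name": v} for v in values]
        else:
            # Plain properties are copied as-is; one value stays a scalar.
            dataset[prop] = values[0] if len(values) == 1 else values
    return dataset
```

Extending the mapping with a plain property then amounts to adding one more key to the mapping dict, mirroring the extra line in the properties file.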
DataDownload is left out on purpose, so everyone has to go through the landing page. Moreover, many of our datasets are split into multiple files, and the documentation seems unclear on how to handle that.
- The code looks for google-dataset.blacklistedTypes (a comma-separated list of type values) in dspace.cfg. If an item has a blacklisted dc.type, the Google dataset metadata are not injected. E.g. we blacklist the toolService type.
- By default it only injects the metadata when the item has bitstreams; this can be overridden in dspace.cfg by setting google-dataset.onlyItemsWithBitstreams to false.
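The conditions above can be combined into a single check, sketched here in Python. This is a hedged illustration rather than the real implementation: `should_inject` and `parse_list` are hypothetical names, the configuration is modelled as a plain dict of strings, google-dataset.enable is assumed to default to true, and type matching is assumed to be exact and case-sensitive.

```python
def parse_list(value: str) -> set[str]:
    """Split a comma-separated config value into a set of trimmed entries."""
    return {part.strip() for part in value.split(",") if part.strip()}


def is_false(value: str) -> bool:
    """True only when the config value is the literal 'false' (case-insensitive)."""
    return value.strip().lower() == "false"


def should_inject(config: dict[str, str], dc_types: list[str], bitstream_count: int) -> bool:
    """Decide whether dataset metadata is injected for an item (hypothetical helper)."""
    # Assumption: the feature is on unless explicitly disabled.
    if is_false(config.get("google-dataset.enable", "true")):
        return False

    # Items with any blacklisted dc.type are skipped.
    blacklisted = parse_list(config.get("google-dataset.blacklistedTypes", ""))
    if any(t in blacklisted for t in dc_types):
        return False

    # By default only items with bitstreams get the metadata.
    only_with_bitstreams = not is_false(
        config.get("google-dataset.onlyItemsWithBitstreams", "true")
    )
    if only_with_bitstreams and bitstream_count == 0:
        return False
    return True


cfg = {"google-dataset.blacklistedTypes": "toolService"}
print(should_inject(cfg, ["toolService"], 3))  # blacklisted type
print(should_inject(cfg, ["corpus"], 3))       # allowed type with bitstreams
```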