USE CASE BERD BY Mannheim University Library (UBM)

Context

Motivation. German company data are spread across many providers, registers and time periods. German company identifiers are notorious for their lack of uniqueness, inconsistent representations and multiple registrations per legal entity (see OpenCorporates blog). The modern data were scraped and processed by OpenCorporates; the main historical datasets were digitized and processed by Mannheim University Library.

Goal. Create a knowledge graph-based research infrastructure for German company datasets in order to improve access to German Business, Economic and Related Data (BERD).

Software. We chose Wikibase for creating and maintaining the knowledge graph.

Challenges

- Data integration
- Data quality: non-unique identifiers & OCR-ed data
- Scaling a Wikibase-based knowledge graph
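
To illustrate the identifier problem, here is a minimal sketch of how inconsistently written register numbers might be normalized into a composite key. It is not the project's actual pipeline: the function `normalize_register_id`, the `RegisterKey` type and the list of register types are assumptions made for illustration. The sketch relies on the fact that a German register number is only meaningful together with its register court, so the court is part of the key.

```python
import re
from typing import NamedTuple, Optional


class RegisterKey(NamedTuple):
    """Hypothetical composite key: a register number alone is not unique."""
    court: str
    register_type: str
    number: str


# Assumed set of register types; extend as needed for the data at hand.
_PATTERN = re.compile(r"\b(HRA|HRB|GNR|PR|VR)\s*0*(\d+)\b", re.IGNORECASE)


def normalize_register_id(court: str, raw_id: str) -> Optional[RegisterKey]:
    """Map variants such as 'hrb12345' or 'HRB 012345' to one canonical key.

    Returns None when no register number can be recognized, which is
    likely for noisy OCR-ed input.
    """
    match = _PATTERN.search(raw_id)
    if match is None:
        return None
    register_type, number = match.groups()
    # Collapse whitespace and ignore case in the court name.
    court_norm = " ".join(court.split()).casefold()
    return RegisterKey(court_norm, register_type.upper(), number)


if __name__ == "__main__":
    print(normalize_register_id("Mannheim", "HRB 012345"))
    print(normalize_register_id(" mannheim ", "hrb12345"))
```

Both calls in the usage example yield the same key, while the same number registered at a different court yields a different one. Real data would additionally need OCR error handling and court-name disambiguation, which this sketch does not attempt.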

Resources

Websites:
- BERD@BW
- BERD@NFDI
Data:
- Historical - unstructured and semi-structured datasets
- Modern - OpenCorporates dataset
Tools:
- bbw - semantic annotator for tabular data
- RaiseWikibase - a tool for speeding up data integration and knowledge graph construction using Wikibase
