Sebastian Kruse edited this page Feb 13, 2016 · 2 revisions

Use cases

This page lists use cases that benefit from having an MDMS.

Database reverse engineering

Database reverse engineering describes the process of recovering the schema and semantics of a dataset. According to the literature, this process draws in particular on application code (e.g., by analyzing existing queries), domain experts (e.g., by interviewing them), and other external artifacts. Arguably, these resources are often unavailable, for instance because of attrition or because the dataset was obtained from the internet. In those cases, database reverse engineering can resort only to the raw data and, consequently, to the metadata and data profiles derived from it. It is therefore important to let users gain as much knowledge from the metadata as possible. In the following, we take a closer look at two phases of database reverse engineering, namely schema reconstruction and comprehension.

Schema reconstruction. Database reverse engineering typically entails reconstructing the schema of a given dataset. Among other things, this involves the following tasks:

  • Datatype discovery: For all columns of a given dataset, propose a suitable (SQL) datatype. One can take this one step further and also annotate each column with the kind of information it contains, such as JSON documents or monetary amounts.
  • Constraint discovery: Propose suitable (SQL) constraints for a dataset. Primary keys and foreign keys are of paramount importance. Also of interest are not-null constraints, value ranges, and text patterns.
  • Naming: To work with and maintain a dataset, it is important to name its tables and columns.
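
As a rough illustration of datatype and constraint discovery, the following Python sketch works on columns given as lists of raw strings. It is not taken from any particular MDMS; the function names, the set of null tokens, and the three proposed SQL types are assumptions made for this example only. Unique, non-null columns are reported as primary key candidates, and a simple inclusion check stands in for foreign key discovery.

```python
import re
from typing import Dict, Optional, Sequence

# Assumption: these tokens encode missing values in the raw data.
NULL_TOKENS = {"", "NULL", "null"}

INT_RE = re.compile(r"[+-]?[0-9]+")
FLOAT_RE = re.compile(r"[+-]?(?:[0-9]+\.[0-9]*|\.[0-9]+|[0-9]+)(?:[eE][+-]?[0-9]+)?")


def is_null(value: Optional[str]) -> bool:
    """Treat None and the assumed null tokens as missing values."""
    return value is None or value.strip() in NULL_TOKENS


def propose_sql_type(values: Sequence[Optional[str]]) -> str:
    """Propose the most specific SQL type that fits every non-null value."""
    non_null = [v.strip() for v in values if not is_null(v)]
    if not non_null:
        return "TEXT"  # no evidence available, fall back to a generic type
    if all(INT_RE.fullmatch(v) for v in non_null):
        return "INTEGER"
    if all(FLOAT_RE.fullmatch(v) for v in non_null):
        return "DOUBLE PRECISION"
    return f"VARCHAR({max(len(v) for v in non_null)})"


def propose_constraints(values: Sequence[Optional[str]]) -> Dict[str, bool]:
    """Check which simple constraints hold on the observed data."""
    non_null = [v.strip() for v in values if not is_null(v)]
    not_null = len(non_null) == len(values)
    unique = len(set(non_null)) == len(non_null)
    return {
        "not_null": not_null,
        "unique": unique,
        # A column that is both unique and non-null may serve as a primary key.
        "primary_key_candidate": not_null and unique,
    }


def is_foreign_key_candidate(
    dependent: Sequence[Optional[str]], referenced: Sequence[Optional[str]]
) -> bool:
    """True if every non-null dependent value also occurs in the referenced column."""
    dep = {v.strip() for v in dependent if not is_null(v)}
    ref = {v.strip() for v in referenced if not is_null(v)}
    return bool(dep) and dep <= ref
```

Calling `propose_sql_type` and `propose_constraints` on each column, and `is_foreign_key_candidate` on pairs of columns, yields proposals only: a constraint that holds on the observed data is not necessarily intended by the schema designer, so the results still need review by the user.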