RACK Data Model

John Interrante edited this page Mar 5, 2021 · 9 revisions

The nature of time in RACK

RACK represents instances of things that have happened, rather than things that might happen or the ways in which things happen in general. That's why activities or entities in RACK typically contain timestamps: they say when things happened or were created (or deleted), for example. When you create data to ingest into RACK, or when you query RACK data to construct an assurance case, remember that it's all about the past.

Logical Layer Data Model

The logical layer of a data model describes data in terms of the domain, in the way that users would describe their data. RACK uses a specialization of a semantic data model called the entity-relationship (E-R) model as its conceptual model framework. In this type of data model, there are entities that often have relationships with other entities, and both the entities and relationships may have attributes.

RACK leverages two distinct sub-classes of logical model: provenance models and structural models.

Understanding the diagrams

The diagrams below give a visual representation of the data model using a graph representation. In addition to the diagrams below, interactive browsing of the ontology is available through the folder RACK-Ontology/Graphs and the starting point is Project.svg.

  • Nodes in the graph represent classes of data that can be stored in the database.
  • The attributes that can be associated with instances of those classes are listed on those nodes.
  • Solid edges in the graph represent named relationships between instances of these classes.
  • Dashed edges capture the sub-classing relationships between classes. In diagrams that present larger portions of the ontology, the sub-classing relationships are omitted to keep the graph readable.
  • Blue nodes are classes being defined in the current diagram.
  • Yellow nodes are classes that are already defined and are imported into the current graph. Attributes and relationships specific to an imported node are not displayed.

DiagramExample

In this isolated example we can see that:

  • HAZARD is a class of data that can be stored
  • HAZARD is a subclass of ENTITY
  • HAZARD has a source relationship to other ENTITY data
  • HAZARD has a definition attribute
  • HAZARD is being defined in this section
  • ENTITY is already defined and is imported from an earlier section
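
As a rough analogy only, the example diagram can be read as a small class hierarchy. The Python sketch below is not RACK's ontology language or API; the `identifier` field and the sample values are hypothetical and exist only to make the example runnable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entity:
    """Imported (yellow) class: its own attributes are not shown in the diagram."""
    identifier: str  # hypothetical field, used only to tell instances apart


@dataclass
class Hazard(Entity):
    """Defined (blue) class: the dashed edge says HAZARD is a subclass of ENTITY."""
    definition: str = ""             # attribute listed on the HAZARD node
    source: Optional[Entity] = None  # solid edge: HAZARD has a source ENTITY


# Made-up instances for illustration.
assessment = Entity("SafetyAssessment-1")
hazard = Hazard("H-1", definition="Gate opens without payment", source=assessment)
```

Because `Hazard` inherits from `Entity`, a hazard can itself be the `source` of another hazard, which matches the diagram's relationship to "other ENTITY data".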

Provenance Modeling in RACK

One part of the RACK data model deals with provenance: how and from where things documented in the RACK database came to be. This part of the RACK data model follows the W3C-PROV provenance model:

PROVSRelations

As shown in the figure above, the basic PROV model includes

  • Entities, that are physical, digital, or other things. RACK can describe the provenance of these things, as well as associations between things.
  • Activities, that explain how entities come into existence, or how the attributes of those entities change over time. Activities include actions by individuals or computer systems, as well as processes that combine multiple actions to achieve creation or alteration of entities. Activities not only create or alter entities: they may also use entities as inputs to control or affect that creation or alteration.
  • Agents, that take roles in Activities or have responsibility for those Activities. An agent may be physical (a person, for example), digital (a piece of software), an organization, or anything else that may be held accountable for an Activity.
  • Derivation, that describes how one entity's existence or characteristics are due to or derived in some measure from another entity. A specialized form of derivation is when one entity is a REVISION OF another entity.
  • Traceability, which is a variant of derivation that describes how an entity satisfies a need expressed by another entity. For example, a piece of software may SATISFY a requirement.

In this model, every object added to the system is described by a domain-specific unique identifier string called uniqueIdentifier -- effectively a primary key. Every object in the model is an instance of ENTITY, AGENT, ACTIVITY, or one of the sub-classes of these three core classes as described below. Note that some relationship classes in the ontology relate sub-class instances to class instances. This approach offers some generality (we hope not too much) in data modeling, which we hope will make it easier for TA1 users to ingest data successfully into RACK.
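
The role of uniqueIdentifier as a primary key can be sketched in a few lines of Python. This is a minimal illustration, not RACK's ingestion API: the `Identified` base class and the `Store` class are hypothetical, and the sketch assumes identifiers are unique across the whole store, which this page does not state explicitly.

```python
from typing import Dict


class Identified:
    """Hypothetical base: anything that carries a uniqueIdentifier."""

    def __init__(self, uniqueIdentifier: str) -> None:
        self.uniqueIdentifier = uniqueIdentifier


class Entity(Identified):
    pass


class Agent(Identified):
    pass


class Activity(Identified):
    pass


class Requirement(Entity):
    """Example sub-class of one of the three core classes."""


class Store:
    """Hypothetical in-memory store keyed by uniqueIdentifier."""

    def __init__(self) -> None:
        self._objects: Dict[str, Identified] = {}

    def add(self, obj: Identified) -> None:
        # Only ENTITY, AGENT, ACTIVITY, or their sub-classes are accepted.
        if not isinstance(obj, (Entity, Agent, Activity)):
            raise TypeError("object must be an Entity, Agent, or Activity")
        # Treat uniqueIdentifier like a primary key: no duplicates.
        if obj.uniqueIdentifier in self._objects:
            raise ValueError("duplicate uniqueIdentifier: " + obj.uniqueIdentifier)
        self._objects[obj.uniqueIdentifier] = obj

    def get(self, uniqueIdentifier: str) -> Identified:
        return self._objects[uniqueIdentifier]
```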

The full PROV model is more expressive than we currently need for ARCOS. However, as the program progresses, we may leverage other PROV expressions. We might for example use Roles that describe the function that an entity played in an Activity; and Plans that describe a process used to guide or control an Activity.

PROVSSubclasses

Structural Modeling in RACK

Another part of the RACK data model deals with structural relationships among entities. Systems for which ARCOS may build assurance cases are complex structures with many interoperable components. Unlike with provenance, we know of no de facto standard for describing the structures we see in the ARCOS domain. Instead, we fall back on intuitive compositional relationships such as isPartOf.

Physical Layer Data Model

The physical layer of a data model describes how data is represented. RACK aims to spare users from needing insight into our physical layer model by providing APIs that operate at the logical layer. At present, RACK aims to use the Resource Description Framework (RDF) triple store model, and may in future extend to a polystore model that mixes RDF with a graph database.
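
To give a feel for the triple model, the sketch below stores facts as (subject, predicate, object) tuples and answers simple pattern queries. It is plain Python for illustration only: it does not show how RACK stores or queries data, and the `ex:` names and sample values are made up.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# A tiny, made-up set of triples in the spirit of an RDF graph.
triples: Set[Triple] = {
    ("ex:H-1", "rdf:type", "ex:HAZARD"),
    ("ex:H-1", "ex:definition", "Gate opens without payment"),
    ("ex:H-1", "ex:source", "ex:SafetyAssessment-1"),
    ("ex:SafetyAssessment-1", "rdf:type", "ex:ENTITY"),
}


def match(store: Set[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Everything known about ex:H-1, then only its source.
about_hazard = match(triples, s="ex:H-1")
hazard_source = match(triples, s="ex:H-1", p="ex:source")
```

A real triple store would answer such patterns through a query language rather than a Python loop, but the shape of the data is the same.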

Details of the RACK Logical Data Model

The RACK data model is subdivided into modules. Interconnectivity within each module is intended to be rich, but interconnectivity between modules also exists. In the following sections, we describe each module's current definition.

In the diagrams below, elements shown in tan are base PROV classes, while elements in blue are our specializations for ARCOS. In our diagrams, and in some of our narrative descriptions of them, we use informal wording to describe our data model. While intended to be more intuitive than a formal meta-model description, it is also somewhat less precise. For example, we may say that a Requirement is an ENTITY, when in a formal description, we would instead say that each instance of the Requirement class is an instance of the ENTITY super-class.

Agents

AgentsModel

Analysis

AnalysisModel

Analysis is a sub-class of ACTIVITY. Each instance of Analysis is performed by an instance of the AGENT class. Each Analysis instance produces an instance of Analysis Report. Analysis Report is a sub-class of ENTITY. Each Analysis Report instance analyzes an instance of ENTITY and has as its result an instance of Analysis Result.

TODO Add screen shot of ingest templates for this portion of the data model.

See the details here.

Confidence

ConfidenceModel

File

FileModel

Hazards

HazardModel

Hazard is a sub-class of ENTITY. A Hazard instance has as its source an instance of ENTITY. Hazard Identification is a sub-class of ACTIVITY. Each instance of Hazard Identification is authored by an instance of AGENT. Not shown yet in this model is the relationship class that connects the Hazard class to the Hazard Identification class.

TODO Add screen shot of ingest templates for this portion of the data model.

See the details here.

Requirements

RequirementsModel

Requirement is a sub-class of ENTITY, and may represent a requirement at any level of abstraction. For example, a low-level requirement may be satisfied by code files, while a higher-level requirement may be satisfied by lower-level requirements. A requirement instance has a human-readable representation of the requirement it models, stored in the text attribute.

The governs relation connects requirements to entities which are the subject of the requirement. These will typically be SYSTEM entities.

The satisfies relation connects requirements to higher-level requirements. These will typically be REQUIREMENT entities.

The mitigates relation connects requirements to the hazards being mitigated by the requirement. These will typically be HAZARD entities.

A DATA_DICTIONARY_TERM captures input and output elements of requirements. A human-readable representation of the term is stored in the text property. These terms are linked to the requirements that use them as inputs with consumedBy and to those that use them as outputs with providedBy. These terms allow requirements to be linked at a finer granularity than a simple satisfies relation.

The createdBy relation connects requirements and data dictionary terms to the REQUIREMENT_DEVELOPMENT activity that generated them.

A REQUIREMENT_DEVELOPMENT activity tracks all of the data associated with creating one or more requirements.

The author relation connects a requirement development activity to its author agents.

The referenced relation connects a requirement development activity to the source materials used in development. There currently isn't a detailed class for these materials so they would need to be represented as a generic ENTITY.

The governedBy relation connects a requirement development activity to the process documents guiding the development of these requirements. There currently isn't a detailed class for these materials so they would need to be represented as a generic ENTITY.
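
The requirement relations described above can be sketched as follows. This is an illustrative Python model, not the RACK schema: the relation names (`governs`, `satisfies`, `mitigates`) and the `text` attribute come from this page, while the class layout, the helper `higher_level`, and all sample identifiers are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Requirement:
    uniqueIdentifier: str
    text: str                                          # human-readable requirement
    governs: List[str] = field(default_factory=list)   # typically SYSTEM entities
    satisfies: List["Requirement"] = field(default_factory=list)
    mitigates: List[str] = field(default_factory=list)  # typically HAZARD entities


def higher_level(req: Requirement) -> List[str]:
    """Identifiers of every requirement reachable by following satisfies."""
    found: List[str] = []
    pending = list(req.satisfies)
    while pending:
        parent = pending.pop(0)
        if parent.uniqueIdentifier not in found:
            found.append(parent.uniqueIdentifier)
            pending.extend(parent.satisfies)
    return found


# Made-up chain: LLR-1 satisfies HLR-1, which satisfies SYS-1.
sys_req = Requirement("SYS-1", "Count people passing the gate", governs=["Turnstile"])
hlr = Requirement("HLR-1", "Increment the count on each entry",
                  satisfies=[sys_req], mitigates=["H-1"])
llr = Requirement("LLR-1", "Counter register is incremented by one", satisfies=[hlr])

trace = higher_level(llr)
```

Following `satisfies` transitively like this is one way to answer "which higher-level requirements does this low-level requirement ultimately support?".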

See the details here.

RequirementsSubclasses

Reviews

ReviewModel

A Review is an ACTIVITY conducted by an Author and possibly Reviewers, governed by a review Process. The subject of the Review is the thing Reviewed.

A Review Log documents the Review of the thing Reviewed, and describes the Review Result, which is one of several possible outcomes. Each Review Log is created by a Review.

TODO Add screen shot of ingest templates for this portion of the data model.

See details here.

Assurance Arguments

An Argument Package is comprised of Argumentation Elements. This part of our ontology is still in development.

Structure of Software

This part of the ontology describes the structure of the software entities for which we store evidence. This structure information is meant as a framework off which we can hang that evidence, so we can easily find relevant evidence later, during assurance case construction.

Software comes in both executable and source flavors. A Package File is deployable as an executable and is composed of one or more Object Files, while a Code File is something from which a Package File is built. A Code File may satisfy Requirements, and is written in one or more Languages (probably one, but you never know who might embed ASM code in C, for example), each possibly with a particular Version. A Code File is created by an Author AGENT.

The Software ontology object properties are shown in this diagram. Due to the higher complexity of this ontology the subclassing relations are hidden from the diagram and are listed below the diagram.

SoftwareRelations

Software Entities

  • CODE_FILE represents a single file-system object containing uncompiled source code. These entities can exist to satisfy various REQUIREMENTs. They can be implemented in one or more programming languages. Their authorship can be attributed to one or more AGENTs. These files are typically used by COMPILE and CODE_GEN activities.
  • OBJECT_FILE represents the output targets of various compilation activities. These files are typically machine-readable and targeted at a specific architecture. A COMPILE activity is the most likely source of this entity.
  • LIBRARY represents a collection of object code ready to be linked into an EXECUTABLE. These entities will typically be composed of many OBJECT_FILEs, depending on the language and environment being used to generate them.
  • EXECUTABLE represents an output of the software system development process that could be invoked on the target system. An executable might be something generated by a BUILD activity from OBJECT_FILE and LIBRARY entities, or it might be an executable CODE_FILE in the case of interpreted languages.
  • LANGUAGE represents the programming languages used to implement CODE_FILE entities. Languages can be defined at different levels of abstraction. For example, there might be a single high-level C language, or a finer distinction could be made between versions of the language such as C99 and GNU89.
  • PACKAGE_FILE represents bundles of software, documentation, and configuration files suitable for distribution to end-users.
  • CONFIG_FILE represents both the human and machine-readable configuration information that can be used to influence any of the activities captured in the software structure. It can also represent configuration data used for testing or deployment of the system.

Software Agents

Software-specific agents will be programs that are run in the course of building, testing, or developing a software system. These agents are distinct from the system that is actually being developed; they exist as part of the development process itself.

  • COMPILER agents will typically be associated with COMPILE activities. These agents correspond to compiler tools like gcc, clang, rustc, icc, etc.
  • PACKAGER agents will typically be associated with PACKAGE activities. These agents are responsible for assembling all of the artifacts needed to create a software package as found on various operating systems and software distribution tools.
  • CODE_GENERATOR agents will typically be associated with CODE_GEN activities. These agents correspond to code-generation tools that might be responsible for generating object model interfaces, parser generators, project skeleton generators, source-file preprocessors, etc.

Software Activities

  • COMPILE is the activity that uses CODE_FILEs in order to create OBJECT_FILEs. This will typically be associated with a COMPILER.
  • CODE_DEVELOPMENT is the activity of generating CODE_FILEs and other entities. It can be informed by processes, requirements, tests, or any other entity in the ontology. This activity will be associated with one or more developers.
  • CODE_GEN is the activity of running a CODE_GENERATOR to produce concrete CODE_FILEs.
  • BUILD is the activity that connects many other software activities into a single high-level action. Many other activities are likely to be informed by a build activity. This could include the whole process of code-generation, compilation, linking executables, packaging, etc.
  • PACKAGE is the activity that assembles any number of software executables, source files, configuration files, documentation into a distributable archive.
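
One use of these activity records is tracing an executable back to the files it was built from. The Python sketch below assumes each activity lists what it used and what it produced; the field names `used` and `produced`, the helper `inputs_of`, and all file names are hypothetical and not taken from the RACK ontology.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class SoftwareActivity:
    kind: str                    # e.g. "CODE_GEN", "COMPILE", "BUILD"
    used: List[str]              # artifacts consumed (hypothetical field)
    produced: List[str]          # artifacts created (hypothetical field)
    agent: Optional[str] = None  # e.g. a COMPILER such as gcc


# Made-up build history for a single executable named "app".
activities = [
    SoftwareActivity("CODE_GEN", ["grammar.y"], ["parser.c"], agent="parser-generator"),
    SoftwareActivity("COMPILE", ["parser.c", "main.c"], ["parser.o", "main.o"], agent="gcc"),
    SoftwareActivity("BUILD", ["parser.o", "main.o"], ["app"]),
]


def inputs_of(artifact: str, history: List[SoftwareActivity]) -> Set[str]:
    """Every artifact that the given artifact transitively depends on."""
    result: Set[str] = set()
    pending = [artifact]
    while pending:
        current = pending.pop()
        for activity in history:
            if current in activity.produced:
                for source in activity.used:
                    if source not in result:
                        result.add(source)
                        pending.append(source)
    return result


app_inputs = inputs_of("app", activities)
```

Here `app_inputs` contains the object files, the source files, and the grammar that fed the code generator, which is the kind of chain an assurance case may need to cite as evidence.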

See details here.

System Structure

SystemModel

A System is a composable ENTITY that may be part of a larger ENTITY (also a System). Each System is produced by a System Development ACTIVITY. A System may require other entities to be present in order to work, and may provide entities as output. Each System has Interfaces, each of which has sources and destinations.
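
The composition described above can be pictured as a chain of part-of links. The following Python sketch is illustrative only: the dictionary layout, the helper `enclosing_systems`, and the component names are hypothetical, and it assumes each system is part of at most one larger system.

```python
from typing import Dict, List

# Hypothetical part-of links: each key is part of the system named by its value.
part_of: Dict[str, str] = {
    "CounterRegister": "CounterApplication",
    "CounterApplication": "Turnstile",
}


def enclosing_systems(system: str, links: Dict[str, str]) -> List[str]:
    """Walk the part-of links upward, nearest enclosing system first."""
    chain: List[str] = []
    current = system
    while current in links and links[current] not in chain:
        current = links[current]
        chain.append(current)
    return chain


ancestors = enclosing_systems("CounterRegister", part_of)
```

The guard `links[current] not in chain` stops the walk if the data accidentally contains a cycle.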

SystemSubclasses

See details here.

Testing

TestingModel

A Test is an ENTITY that Verifies another ENTITY (usually a Requirement). Tests are generated by Test Development ACTIVITIES that are conducted by AGENTs. Tests are Executed on ENTITIES (usually Systems) and generate Test Results that include a Status (Pass, Fail, Indeterminate) that Confirms an ENTITY (usually a Requirement).
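
A query over test results might group statuses by the entity they confirm. The sketch below is a minimal Python illustration under assumptions: the three status values come from this page, while the `ResultRecord` class, its field names, the helper `summarize`, and the sample data are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import List


class Status(Enum):
    PASS = "Pass"
    FAIL = "Fail"
    INDETERMINATE = "Indeterminate"


@dataclass
class ResultRecord:
    test: str        # identifier of the Test that was executed
    confirms: str    # identifier of the ENTITY the result confirms
    status: Status


def summarize(results: List[ResultRecord], entity: str) -> Counter:
    """Count result statuses recorded against one confirmed entity."""
    return Counter(r.status for r in results if r.confirms == entity)


# Made-up results for two requirements.
results = [
    ResultRecord("T-1", "HLR-1", Status.PASS),
    ResultRecord("T-2", "HLR-1", Status.FAIL),
    ResultRecord("T-3", "HLR-2", Status.PASS),
]

hlr1_summary = summarize(results, "HLR-1")
```

How such counts translate into confidence in a requirement is a modeling decision that this sketch deliberately leaves open.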

TODO Add screen shot of ingest templates for this portion of the data model.

See details here.

Documents

DocumentModel

Documents are collections of ENTITYs originating from the same published artifact.

  • DESCRIPTION is an ENTITY that represents a planned or actual concept, function, design, or object.
  • PLAN is an ENTITY that presents a systematic course of action for achieving a declared purpose, including when, how, and by whom specified activities are to be performed.
  • PROCEDURE is an ENTITY that presents an ordered series of steps to perform a process, activity, or task.
  • REPORT is an ENTITY that describes the results of activities such as investigations, observations, assessments, or tests.
  • REQUEST is an ENTITY that initiates a defined course of action or change to fulfill a need.
  • SPECIFICATION is an ENTITY that identifies, in a complete, precise, and verifiable manner, the requirements, design, behavior, or other expected characteristics of a system, service, or process.
  • SECTION is an ENTITY that provides a generic grouping of ENTITYs within a source document.

Process

ProcessModel

Process objectives describe the activities/services/strategies that will be delivered as part of implementing the program.

  • OBJECTIVE is an ENTITY that identifies tasks from a process for which evidence must be provided to show that the task has been completed.

System Component Dependencies

The ontology structure, together with the data for the Turnstile model, allows system structure to be captured, including the interfaces and dependencies between system components. To illustrate this, the visualization below shows the Turnstile system structure, along with the requirements and the interfaces between system components.

TurnstileVis