The demo/ directory of the cogs repository holds a demonstration of
one way to make use of cogs. It includes:
- A mocked up framework, application and components.
- Schema to generate configuration classes.
- An example cogs configuration stream file.
- Simple tooling to rerun code generation.
- Integration into the cogs build system.
This document describes how to build and run the demo. It then gives
a tour of the mocked framework to show one possible way to allow
cogs to be used. Details on the code generation steps come next, and
it ends with a section that uses the demo’s schema to illustrate how
one may develop schema for applications.
This section describes how to build the demo code.
In addition to what the cogs library requires, the demo requires:
- moo: Python program from moo (only to re-codegen)
With prerequisites satisfied, the demo builds with the cogs
library. For example:
waf configure --prefix=$(pwd)/install \
    --with-nljs=$HOME/opt/nljs \
    --with-ers=$HOME/opt/ers
waf install
You should be rewarded with:
./install/bin/cogs-demo || /bin/true
2020-Jun-25 12:17:22,617 INFO [main(...) at unknown/demo/cogs-demo.cpp:12] usage: ./install/bin/cogs-demo <uri>
The demo relies on generated code which is committed to the repository
to reduce the build-time dependency of cogs. If the additional
prerequisites are satisfied, it may be regenerated:
./demo/generate.sh
Below we will look more at what this script does.
The demo provides a ready made cogs configuration stream file:
./install/bin/cogs-demo file://demo/demo-config.json | sed -e 's/^[^]]*\]//'
The sed command simply removes ERS output augmentation that is more
appropriate to log files. The output shows the configuration driving the
construction of a “node” and a “component” (the “source”) followed by
their configuration. When the node is configured it makes a (dummy)
“port” and hands that C++ object to the source. The next section on
the demo framework describes these terms. They are not inherent to
cogs itself, only to this demo, but they represent typical code patterns.
It is possible to use cogs in a variety of patterns. This demo
illustrates one particular pattern such as may be used in an
“application framework”. This pattern might be named something like
“factory configuration”. It provides for a highly flexible,
configuration-driven method for “aggregating” an application instance
from a set of factory-instantiated components.
The construction of the demo application and its configuration is
driven by a cogs configuration stream. The stream is composed of a
sequence of pairs of configuration objects. The first object in each
pair corresponds to a fixed type of demo::ConfigurableBase. It
provides information required to locate a component instance. The
second object in a pair corresponds to the configuration of that
component instance.
The stream is illustrated as:
| component 1: democfg::ConfigHeader |
|---|
| component 1: corresponding cfg object |
| … |
| component N: democfg::ConfigHeader |
| component N: corresponding cfg object |
The ConfigHeader provides two attributes:
- implementation identifier: this is some name associated with a construction method for an implementation of ConfigurableBase. This identifier is some simple name, likely derived from the component’s C++ class name.
- instance identifier: multiple instances of one component may be constructed and this identifier keeps them distinct.
The main application walks the cogs::Stream using the ConfigHeader to
retrieve an instance from the demo factory. It then reads the next
object from the cogs::Stream and passes it to the component’s
configure() method. When the stream is exhausted the demo app simply
exits. A real app would of course go on to some other phase of
execution.
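The walk over header and configuration pairs can be sketched in a few lines. The following is a minimal Python illustration and not the demo's C++ code: the names Source, FACTORY, INSTANCES, get_instance and run, and the keys impl, inst and ntosend, are hypothetical stand-ins chosen for this sketch.

```python
import json


class Source:
    """Hypothetical stand-in for a configurable component."""

    def __init__(self):
        self.cfg = None

    def configure(self, cfg):
        # Receive the component-specific configuration object.
        self.cfg = cfg


# Hypothetical factory: implementation identifier -> constructor.
FACTORY = {"Source": Source}
# Instances keyed by (implementation identifier, instance identifier).
INSTANCES = {}


def get_instance(impl, inst):
    """Return the named instance, constructing it on first request."""
    key = (impl, inst)
    if key not in INSTANCES:
        INSTANCES[key] = FACTORY[impl]()
    return INSTANCES[key]


def run(stream_text):
    """Walk a stream of (header, config) pairs and configure each component."""
    objects = json.loads(stream_text)
    if len(objects) % 2 != 0:
        raise ValueError("stream must hold header/config pairs")
    configured = []
    for header, cfg in zip(objects[0::2], objects[1::2]):
        component = get_instance(header["impl"], header["inst"])
        component.configure(cfg)
        configured.append(component)
    return configured


# Assumed key names; the real header and config fields may differ.
EXAMPLE = '[{"impl": "Source", "inst": "src1"}, {"ntosend": 3}]'
components = run(EXAMPLE)
```

Keying the instance cache on both identifiers is what lets a later lookup with the same pair return the already configured object instead of building a new one.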
The demo adds some non-trivial complexity by considering two types of configurable objects:
- node: an object which has a collection of ports such as may be associated with sockets. The demo keeps ports as dummies but they represent some shared resource that is non-trivial to construct.
- component: an object which is configurable and may also want to use ports. There is only a single component in the demo, called a “source”. It represents some arbitrary “code execution unit”, aka “user module”.
The node is really just another component but it is called out
specially here as it uses the factory to locate instances of other
components, as directed by its configuration, in order to deliver
fully formed “ports”. A component must inherit from
demo::PortuserBase and be listed in the node’s configuration in order
to receive its ports.
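The hand-off of ports from the node to a component might look roughly like the Python sketch below. It assumes the component instances were already constructed and are reachable by instance identifier; the class layout, the set_port method and the configuration keys ports, comps, ident and portlist are hypothetical and only mirror the roles described above.

```python
class PortuserBase:
    """Hypothetical analogue of demo::PortuserBase."""

    def __init__(self):
        self.ports = {}

    def set_port(self, name, port):
        # Called back by the node to deliver a fully formed port.
        self.ports[name] = port


class Port:
    """Dummy port standing in for a shared resource that is costly to build."""

    def __init__(self, name):
        self.name = name


class Node:
    def __init__(self, instances):
        # Maps instance identifier -> already constructed component.
        self.instances = instances
        self.ports = {}

    def configure(self, cfg):
        # Build every port that the configuration declares.
        for p in cfg["ports"]:
            self.ports[p["ident"]] = Port(p["ident"])
        # Hand each listed component the ports it asked for.
        for comp in cfg["comps"]:
            user = self.instances[comp["ident"]]
            if not isinstance(user, PortuserBase):
                raise TypeError(comp["ident"] + " cannot receive ports")
            for pname in comp["portlist"]:
                user.set_port(pname, self.ports[pname])


source = PortuserBase()
node = Node({"src1": source})
node.configure({
    "ports": [{"ident": "out"}],
    "comps": [{"ident": "src1", "portlist": ["out"]}],
})
```

The component never constructs a port itself; it only receives the object the node built, which is why the component has to exist before the node is configured.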
This pattern is a mock of a real implementation found in ZIO which uses a zio::Node to create and link zio::Port instances either directly or automatically with the help of a zio::Peer performing distributed network discovery.
This section provides a tour of the code generation part of the demo. The tour focuses on the short generate.sh script. This script runs its commands from the demo/ directory, and that should be taken into consideration when reading the excerpts of the script shown below.
User code should not be burdened with validating and interpreting a
configuration byte stream or even interpreting a dynamic C++ object like
nlohmann::json. Instead, with cogs the user code receives a fully
typed C++ struct, thus guaranteeing at least valid object structure.
The code for a struct and its serialization methods is generated from
an application schema realized with the Avro domain schema and a few
extra bits of information. This information for the demo’s “node”,
“comp” and “head” application schema is brought together in the short
file demo-codegen.jsonnet included here:
To produce the set of six header files (one for structs and one for their serialization for each application schema) one runs:
We finally generate an example cogs configuration stream in the form
of a JSON file holding an array. This file is created from Jsonnet by
moo:
The demo-config.json file is what was used above to run the demo. It is not long and so is included here:
You can see the paired objects, each preceded by what will become a
demo::ConfigHeader, followed by an object of a specific type
corresponding to the component named in the preceding header.
Note, the choice of ordering is intentional. It leads to the
construction and configuration of the demoSource prior to the use of
this component inside the node. That use calls back to the component
in order to pass in the requested “port” objects.
This section describes how to develop schema. It first describes the layers of “application schema” and “abstract base schema”. It then illustrates the elements of the latter and walks through an example of the former.
The demo assumes two layers of schema. The lowest is called an “abstract base schema”. Strictly speaking it is a specification of a set of function names and their arguments. The demo then provides a number of implementations of this base schema. An implementation of a base schema function returns a corresponding data structure that adheres to the schema vocabulary of a particular domain.
For example, one base schema implementation provides structures
suitable for directly producing Avro schema JSON. Another provides
structures which adhere to JSON Schema vocabulary. Another example,
given above, produces structures that may be applied to a
message.proto.j2 template to produce a Protobuf .proto file that can
then be compiled into C++ classes via protoc.
Using these primitive base functions, an application developer writes the next layer of functions, which emit schema describing the specific data types required by the developer’s components.
The next section describes the functions provided by a base schema
followed by a tour of the application level schema for the
configuration used by the “node” component in the cogs demo.
The base schema in its abstract form is a set of Jsonnet function prototypes which are summarized here. An implementation of an abstract function is expected to return a description of the type named by the function in some domain vocabulary. For example the demo provides one base implementation for the Avro schema domain and one for that of JSON Schema.
Domains will differ in what they can meaningfully accept. This means
that some domains may ignore some arguments to their functions.
Furthermore, some arguments are optional, which is indicated by
setting their default value to Jsonnet null. A domain may either
provide a default inside the function body or ignore the argument (no
null values should “leak out” from the functions).
The abstract base schema functions are:
- boolean(): a Boolean type.
- number(dtype, extra={}): a numeric type. The dtype argument should provide specific type information using Numpy codes (eg i4 for C++ int, u2 for C++ uint16_t). The extra may specify JSON Schema constraints.
- bytes(encoding=null, media_type=null): a sequence of byte values.
- string(pattern=null, format=null): a string type. The pattern and format are JSON Schema arguments specifying a regular expression or a named format that a valid string must match.
- field(name, type, default=null, doc=null): a named and typed element in the context of a record. If the type is not scalar (eg, is a record) then type should be given as the name of the type. The default may provide a default value of this field. The doc provides a brief English description of the meaning of the field.
- record(name, fields=[], doc=null): a type which aggregates fields. This corresponds to a JSON object or a C++ struct or class, etc. The fields array is a sequence of objects returned from the field() function (from the same domain).
- sequence(type): an ordered sequence holding elements of type type.
- enum(name, symbols, default=null, doc=null): an enumerated type. The symbols is an array of string literals naming the enumerated values. The default may specify an enumerated value to be used if otherwise not specified.
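To make the idea of a domain implementation concrete, here is a small Python sketch of what a JSON Schema flavored implementation of a few of these prototypes might return. It is an illustration only and is not the demo's Jsonnet: the exact output keys are assumptions, Python None plays the role of Jsonnet null, and field() here takes a type description directly rather than a type name, which is a simplification.

```python
def boolean():
    return {"type": "boolean"}


def number(dtype, extra=None):
    # Numpy-style codes: i/u mean integer, anything else is treated as floating point.
    kind = "integer" if dtype[0] in "iu" else "number"
    out = {"type": kind}
    out.update(extra or {})
    return out


def string(pattern=None, format=None):
    out = {"type": "string"}
    # Omit unset arguments so that no null value leaks into the result.
    if pattern is not None:
        out["pattern"] = pattern
    if format is not None:
        out["format"] = format
    return out


def sequence(type):
    return {"type": "array", "items": type}


def field(name, type, default=None, doc=None):
    out = {"name": name, "type": type}
    if default is not None:
        out["default"] = default
    if doc is not None:
        out["description"] = doc
    return out


def record(name, fields=(), doc=None):
    props = {}
    for f in fields:
        prop = dict(f["type"])  # copy so the shared type description is not mutated
        for key in ("default", "description"):
            if key in f:
                prop[key] = f[key]
        props[f["name"]] = prop
    out = {"type": "object", "title": name, "properties": props}
    if doc is not None:
        out["description"] = doc
    return out


ident = string(pattern="^[a-zA-Z][a-zA-Z0-9_]*$")
port = record("Port", [
    field("ident", ident, doc="Port name"),
    field("count", number("i4"), default=0),
])
```

An Avro flavored implementation would keep the same function names and arguments but return Avro vocabulary instead, which is what lets one application schema target several domains.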
The concept of a “node” in this demo has been described above. Here we examine the node-schema.jsonnet file as an example of an application-level schema.
First we look at the high-level structure of the file:
function(schema) {
// defines types
types: [ typeA, typeB, ...]
}
This Jsonnet compiles down to a single function object which takes the
argument schema, which provides a set of base schema functions such as
described in the previous sections. The primary result of this
function is to return a Jsonnet object ({...}) which contains an
attribute types holding an array of objects describing types
constructed through calls to functions provided by schema.
Looking at the first few lines:
Here, the Jsonnet file re.jsonnet, which is provided by moo, is imported.
It contains a set of regular expressions that are to be used to
constrain the validity of strings in the schema. For example it
begins with:
{
// Basic identifier (restrict to legal C variable nam)
ident: '[a-zA-Z][a-zA-Z0-9_]*',
ident_only: '^' + self.ident + '$',
Thus a string with pattern set to ident may be validated to hold only
a limited alphanumeric content.
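The effect of these two patterns can be tried out with Python's re module, as a rough sketch. The names IDENT, IDENT_ONLY and is_ident are introduced here for illustration; Python's regular expression dialect is close to, but not identical with, the one JSON Schema validators use.

```python
import re

# The two expressions from the excerpt above.
IDENT = "[a-zA-Z][a-zA-Z0-9_]*"
IDENT_ONLY = "^" + IDENT + "$"


def is_ident(text):
    """True if the whole string is an identifier under IDENT_ONLY."""
    return re.match(IDENT_ONLY, text) is not None


candidates = ["demoSource", "port_1", "1port", "has space", ""]
results = {s: is_ident(s) for s in candidates}

# Without the anchors the pattern only has to match somewhere in the string,
# so "1port" is accepted through its "port" substring.
unanchored_hit = re.search(IDENT, "1port") is not None
```

JSON Schema does not anchor a pattern implicitly, so the anchored form is the one that constrains the whole string.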
Back in node-schema.jsonnet, we see ident defined as a string with a
pattern re.ident_only as a Jsonnet local. This means the variable is
temporary and known only in the scope of the object. This lets it be
referred to simply by the name ident later.
Next we find an example of an enum and a record which describe a “link”:
A “link” is intended to generalize the concept of the relationship of a
socket and an address. The LinkType represents one of the two allowed
link mechanisms (a bind() or a connect()). Note how the Jsonnet
representation of this type is used to further define the Link along
with the representation of a string type Address. In a real system
like ZIO, an address is in the form of a URL like tcp://127.0.0.1:5678
for direct ZeroMQ addressing or zyre://nodename/portname for automated
network peer discovery.
Next we come to a “port”:
A port is another record with an identifier (name) of a type ident
which we defined above. That is, a string which may be validated
against a regular expression. The second field is links which is a
sequence. In ZIO a port corresponds to a ZeroMQ socket which may have
a multitude of both bind() and connect() links.
Next we define the part of the node configuration which describes what a node needs to know in order to interact with a component:
The first two fields are identifiers used to look up the component
using the factory (ie, matching what is also provided to main() in the
header object). The portlist is a sequence of identifiers which must
match those used in defining a Port above. This required consistency
can be enforced by Jsonnet when generating actual configuration
objects as described in the next section. And, finally, an arbitrary
extra string is provided which the demo does not actually use for
anything. It may be used by the node to interpret some special action
on the component (eg, “ignore” or something).
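The demo leaves the portlist consistency rule to Jsonnet at generation time. The same rule can also be sketched as a Python check over an already generated node configuration. The function name check_portlists and the keys ports, comps and ident are assumptions made for this sketch; only portlist is named in the text above.

```python
def check_portlists(node_cfg):
    """Return (component, port) pairs whose port is not defined by the node."""
    defined = {p["ident"] for p in node_cfg["ports"]}
    missing = []
    for comp in node_cfg["comps"]:
        for pname in comp["portlist"]:
            if pname not in defined:
                missing.append((comp["ident"], pname))
    return missing


good = {
    "ports": [{"ident": "out"}],
    "comps": [{"ident": "src1", "portlist": ["out"]}],
}
bad = {
    "ports": [{"ident": "out"}],
    "comps": [{"ident": "src1", "portlist": ["out", "nope"]}],
}

good_missing = check_portlists(good)
bad_missing = check_portlists(bad)
```

An empty result means every requested port is defined; each entry otherwise names the component and the undefined port it asked for.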
Penultimately, we get to the top level of the “node” schema:
This defines a record type Node with fields meant to hold the port and
component definitions.
And, finally, the “return” value collects all the types:
The demo generates a configuration stream (JSON array) with
demo-config.jsonnet. It defines two top level attributes: model, which
provides the configuration sequence, and schema, which is a sequence of
objects in JSON Schema vocabulary.
The configuration stream can first be validated with moo:
moo -S schema -D model \
    validate --sequence \
    -s demo/demo-config.jsonnet \
    demo/demo-config.jsonnet
This tells moo to assume that the named model and schema are each a
sequence of models or schemas, respectively, and then to test each
pair one by one. Success is marked by a null for each. Failure will
be greeted with information indicating where the model is invalid.
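The pairwise idea can be illustrated with a short Python sketch. This is not how moo performs validation: the validate function below is a hand-written check covering only a tiny subset of JSON Schema (object, string with pattern, integer), and the example schemas and models are invented for the illustration.

```python
import re


def validate(model, schema):
    """Check a tiny subset of JSON Schema; return None on success, else a message."""
    kind = schema.get("type")
    if kind == "object":
        if not isinstance(model, dict):
            return "expected object"
        for key, sub in schema.get("properties", {}).items():
            if key in model:
                err = validate(model[key], sub)
                if err is not None:
                    return key + ": " + err
        for key in schema.get("required", []):
            if key not in model:
                return "missing required property " + key
    elif kind == "string":
        if not isinstance(model, str):
            return "expected string"
        pattern = schema.get("pattern")
        if pattern is not None and re.search(pattern, model) is None:
            return "string does not match " + pattern
    elif kind == "integer":
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(model, bool) or not isinstance(model, int):
            return "expected integer"
    return None


def validate_sequence(models, schemas):
    """Validate models[i] against schemas[i], one pair at a time."""
    if len(models) != len(schemas):
        raise ValueError("need one schema per model")
    return [validate(m, s) for m, s in zip(models, schemas)]


schemas = [
    {"type": "object", "required": ["inst"],
     "properties": {"inst": {"type": "string",
                             "pattern": "^[a-zA-Z][a-zA-Z0-9_]*$"}}},
    {"type": "object", "properties": {"ntosend": {"type": "integer"}}},
]
results_good = validate_sequence([{"inst": "src1"}, {"ntosend": 3}], schemas)
results_bad = validate_sequence([{"inst": "1bad"}, {"ntosend": "three"}], schemas)
```

Returning one entry per pair, with None for success, mirrors the report format described above and points at which object in the stream is at fault.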
When valid, the model may be compiled into a form that can be consumed
as a cogs stream:
moo -D model compile demo/demo-config.jsonnet > demo/demo-config.json
One may now go back to the Run section above, where we ran this configuration through the cogs-demo application.