Possible Cylc Futures

Oliver Sanders edited this page Jun 19, 2018 · 16 revisions

This wiki page outlines some of the topics discussed in June 2018 in Exeter. These are futuristic descriptions of directions the Cylc scheduler could potentially take.

System Architecture

Separate the current code base into a kernel - shell model.

Motivation

The Cylc kernel is currently fairly heavyweight, with configuration logic mixed in with runtime logic (e.g. cylc.config). Separating the two paves the way to a cleaner, leaner kernel which scales better and can be configured and run in different ways.

Kernel Structure

The Cylc kernel is currently monolithic. We could design it as a pool of micro-services, but that is probably a difficult idea to realise (cf. GNU Hurd). It is perhaps best to use a hybrid architecture where the scheduler is monolithic but with some functionality factored out into dynamically loaded modules, enabling the kernel to be run in different configurations (see "Frames").

Kernel Shell Divide

Shell:

  • Suite configuration
  • Suite reload states
  • Suite control commands
  • Log retrieval
  • ...

Kernel:

  • Scheduling algorithm

  • Message broker

  • Modules dynamically loaded based on suite configuration:

    • Job submission
    • File creation (i.e. job output files, directory structure, etc)
    • Cycling?
    • ...

The shell should create the suite configuration object, which is then loaded by the kernel. This way the configuration code is left behind in the shell process, making the kernel much leaner.
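The divide described above could be sketched as follows. All of the names here (Shell, Kernel, SuiteConfig) are hypothetical illustrations, not the current Cylc API:

```python
# A minimal sketch of the proposed kernel-shell divide, assuming hypothetical
# Shell, Kernel and SuiteConfig classes (not the current Cylc API).

class SuiteConfig:
    """Parsed suite configuration: built by the shell, consumed by the kernel."""
    def __init__(self, graph, runtime):
        self.graph = graph
        self.runtime = runtime


class Shell:
    """Heavyweight process: parsing, validation, control commands live here."""
    def build_config(self, raw):
        # All parsing/validation logic stays in the shell process.
        return SuiteConfig(graph=raw['graph'], runtime=raw.get('runtime', {}))


class Kernel:
    """Lean process: receives a ready-made config object and just schedules."""
    def __init__(self, config):
        self.config = config  # no configuration-parsing code loaded here

    def tasks(self):
        return sorted(self.config.runtime)


shell = Shell()
config = shell.build_config(
    {'graph': 'foo => bar', 'runtime': {'foo': {}, 'bar': {}}})
kernel = Kernel(config)
print(kernel.tasks())  # ['bar', 'foo']
```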

Data Objects

The kernel would be stateless and work with two data objects: configuration and suite state. These objects would be Python interfaces mapping onto the data storage model (i.e. a database for the suite state object, a Python object store for the suite configuration?).

This would allow us to change the data storage method easily; for example, to support a different database solution, a single plugin could be written.

If all modifications to suite state were written to the database then every change would be transactional, which would allow us to easily roll back the suite state on demand.
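As a rough sketch of the idea, a hypothetical suite-state interface could sit in front of a swappable storage backend; using sqlite3 as the backend shows how a transactional store gives rollback essentially for free (the SuiteState class and table schema are invented for illustration):

```python
# Hypothetical sketch: a stateless kernel talks to suite state only through
# an interface mapping onto a swappable storage backend (sqlite3 here).
import sqlite3


class SuiteState:
    """Interface onto the storage model; the backend could be a plugin."""
    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            'CREATE TABLE IF NOT EXISTS task_states'
            ' (name TEXT PRIMARY KEY, state TEXT)')

    def set_state(self, name, state):
        self.conn.execute(
            'INSERT OR REPLACE INTO task_states VALUES (?, ?)', (name, state))

    def get_state(self, name):
        row = self.conn.execute(
            'SELECT state FROM task_states WHERE name = ?', (name,)).fetchone()
        return row[0] if row else None


state = SuiteState(sqlite3.connect(':memory:'))
state.set_state('foo', 'running')
state.conn.commit()            # persist the change
state.set_state('foo', 'failed')
state.conn.rollback()          # roll back the uncommitted change on demand
print(state.get_state('foo'))  # 'running'
```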

Hierarchical Suites

An advanced (though fairly simple) model for providing modular, hierarchical suites.

Motivation

Cylc suites are monolithic beasts. Cylc/Jinja2 imports/includes allow us to spread a suite over multiple files but fail to provide a modular architecture.

As suites continue to become more complicated the ability to divide them into components and develop each in some form of isolation becomes desirable.

File Structure

In this model a Cylc suite becomes a Python module and tasks become executables. Sub suites are Python modules. Nesting can go as many levels deep as desired.

mysuite/
   __init__.py
   foo
   bar
   baz/
      __init__.py
      pub

Sub Suites

Sub suites (e.g. baz in the example above) can be composed into the parent suite in the same manner in which tasks are:

foo >> bar >> baz

This enables us to run sub-workflows without requiring prior knowledge of their structure. Here baz appears as a single task, but it could be an independently cycling workflow consisting of many tasks.

For deeper integration (e.g. writing a dependency to or from tasks in the sub suite) components of the sub suite can be imported:

import baz

bar >> baz.pub

As sub suites are Python modules, parts of the implementation can be imported from the sub suite into the parent suite. The parent can inspect the child but not the other way around.
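One way this composition could work is sketched below with a toy implementation of ">>" (not the real Cylc API): anything exposing an entry node, whether a task or a whole sub suite, chains uniformly. The Task and SubSuite classes are invented for illustration:

```python
# Toy sketch (hypothetical, not the Cylc API): ">>" composition treats a
# whole sub suite like a single task via its entry node.

class Task:
    def __init__(self, name):
        self.name = name
        self.downstream = []

    def __rshift__(self, other):
        # Make "other" depend on self; return it so chains read left-to-right.
        self.downstream.append(other.entry())
        return other

    def entry(self):
        return self


class SubSuite:
    """A sub suite composes like a task by exposing an entry node."""
    def __init__(self, name, entry_task):
        self.name = name
        self._entry = entry_task

    def entry(self):
        return self._entry


foo, bar, pub = Task('foo'), Task('bar'), Task('pub')
baz = SubSuite('baz', pub)       # baz wraps the task "pub"
foo >> bar >> baz                # baz behaves as a single node here
print([t.name for t in bar.downstream])  # ['pub']
```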

Frames

Each suite is written in a "frame", this determines the way in which it is run and the way in which it may compose sub suites or be composed by a parent suite.

Cylc currently offers one "frame", the CLI frame in which all tasks are bash commands. Possible frames:

  • CLI Frame (necessary)
  • Python Frame (sensible)
  • Bash Frame (possible)

Sub Suite Execution/Composition

The simplest way to handle sub suites is to compose them into the parent suite. This is probably the main use case. We could also permit running a sub suite as an independent [standalone] suite if scalability constraints require it.

Particularly for workflows written using the Python frame, it may also be desirable to submit a sub suite as a job. This is especially useful for regular(ish) shaped Python job arrays. To do this the kernel would need to be run in a different configuration:

  • The job submission module need not be loaded.
  • A different file creation module would be required.
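The kernel-module mechanism this implies could be sketched with Python's importlib; the configuration key and module names below are invented for illustration:

```python
# Hypothetical sketch: the kernel loads only the modules the suite
# configuration asks for, so it can run in different configurations
# (e.g. without job submission when a sub suite is itself submitted as a job).
import importlib


def load_kernel_modules(config):
    """Dynamically import the kernel modules named in the configuration."""
    modules = {}
    for name in config.get('kernel_modules', []):
        modules[name] = importlib.import_module(name)
    return modules


# Load only the stdlib "json" module as a stand-in for a real kernel module.
loaded = load_kernel_modules({'kernel_modules': ['json']})
print(sorted(loaded))  # ['json']
```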

Config API And Python Config API

See https://github.com/cylc/cylc/issues/1962

Motivation

The Python API should:

  • Make writing programmatic suites easier.
  • Configure cylc logic with Python objects rather than with strings (e.g. foo >> bar rather than 'foo => bar').
  • Provide "on the fly" validation meaning that (for the first time since Jinja2) we can provide sensible error messages with line numbers.

Approaches

Python API could either:

  1. Open up the OrderedDictWithDefaults data structure.

    import cylc.config
    
    cfg = cylc.config.Config()
    
    cfg['scheduling']['dependencies']['graph'] = 'hello_world'
    cfg['runtime']['hello_world']['script'] = 'echo "Hello World!"'

    This option allows us to replace Jinja2 with Python but otherwise offers only what we currently have.

  2. Provide functions for constructing a configuration.

    from cylc import Task, Graph
    
    hello_world = Task('hello_world')  # would run the task mysuite/hello_world
    Graph(hello_world)

    This offers a much more elegant solution but requires an interface which can build a cylc configuration object.

Basics

Assuming option 2 (above), some possible syntax:

# Tasks
foo = Task('foo')
bar = Task('bar',
   env={'BAR': 'pub'}
)

# Dependencies

foo >> bar

# States

foo.state['fail'] >> bar

# Parameters
num = Param('num', range(10))

baz = TaskArray('baz', num)
pub = TaskArray('pub', num)

baz >> pub[0:5]

# Useful Parameters
num = Param('num', {
   1: {'foo': 'a', 'bar': 'b', 'baz': 'c'},
   2: {'foo': 'd', 'bar': 'e', 'baz': 'f'}
})

bol = TaskArray('bol', num,
   env={'variable': num['foo']}
)

foo >> bol

# Cycling (somewhat trickier, a simple option)

graph = Graph(
   foo >> bar,
   bar >> baz
)
graph.cycle('P1D')

# Inter-cycle offsets

graph = Graph(
   foo >> bar,
   bar('-PT1H') >> baz
)

Event Driven Scheduler

Presently the scheduler is self-assembling based on a "pool" of self-spawning tasks.

Spawn-on-demand requires the suite to have knowledge of the pattern of its graph.

This would allow us to spin up task proxies only when required, cleaning up the distinction between tasks and jobs (TaskDefs become Tasks, nodes in the pattern; TaskProxies become Jobs, nodes in the graph).

Graphs

The new Graph object would need to be walkable in both directions, i.e. tasks know their prerequisites but also their postrequisites.

Perhaps a more extreme implementation of this would see multiple graphs as distinct entities. Graphs could then be "driven" by a cycler or an external trigger, opening up the opportunity for suites to respond to irregular events which do not fit into any current cycling model, e.g. certain observation-type workflows.
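A bidirectionally walkable graph of this kind could be as simple as recording each edge twice, once in each direction (a minimal sketch with invented names, not a proposed implementation):

```python
# Minimal sketch of a graph walkable in both directions: each edge is
# recorded as both a prerequisite and a postrequisite.
from collections import defaultdict


class Graph:
    def __init__(self):
        self.prereqs = defaultdict(set)   # task -> tasks it depends on
        self.postreqs = defaultdict(set)  # task -> tasks that depend on it

    def add_edge(self, upstream, downstream):
        self.prereqs[downstream].add(upstream)
        self.postreqs[upstream].add(downstream)


g = Graph()
g.add_edge('foo', 'bar')
g.add_edge('bar', 'baz')
print(sorted(g.prereqs['baz']))   # ['bar']  (walk upstream)
print(sorted(g.postreqs['foo']))  # ['bar']  (walk downstream)
```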

Events

The scheduler should be able to associate events with nodes in the graph and propagate them through it. When an event occurs it triggers actions in the graph, which may spawn new jobs in response. We should not need to iterate over the task pool to evaluate prerequisites, etc.
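The propagation idea can be sketched as follows: a completion event is delivered only to the finished task's postrequisites, and a child spawns as soon as all of its prerequisites are satisfied, with no pool iteration (a toy sketch, assuming the bidirectional prerequisite/postrequisite maps described above):

```python
# Sketch of event-driven spawning: a success event propagates only to the
# finished task's postrequisites, so no iteration over a task pool is needed.
from collections import defaultdict

postreqs = defaultdict(set, {'foo': {'bar'}, 'bar': {'baz'}})
prereqs = defaultdict(set, {'bar': {'foo'}, 'baz': {'bar'}})
done = set()
spawned = []


def on_succeeded(task):
    """Handle a task-succeeded event by visiting only the affected nodes."""
    done.add(task)
    for child in postreqs[task]:
        if prereqs[child] <= done:  # all of the child's prerequisites met?
            spawned.append(child)


on_succeeded('foo')
on_succeeded('bar')
print(spawned)  # ['bar', 'baz']
```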