xblock and course addressing

Problem statement

edX code throughout the stack currently treats xblock and course ids as transparent, introspectable, and frequently manipulated addresses with a known set of fields. The applications serialize and deserialize these ids as parsable urls with one of two fixed syntaxes: org, partial course id, category, block id for xblocks; or org, partial course id, course unique id for courses. This treatment makes it almost impossible to add semantics such as versioning, subordinate organizations, lifecycle snapshots (e.g., draft, review, alpha, beta, published), or course encapsulation to the applications.

The db rearchitecture task known as "Split Mongo" uses all of the additional semantics listed above and also separates the identity of courses and reusable definitions from the identity of xblocks within courses (usages). Xblock also distinguishes usages from definitions, so this change is not unilateral. LMS has always informally separated course identity from xblock usage identity, but without treating the course id as an object. Split makes these distinctions formal.

We had a plan to have the two id representations exist simultaneously with continuous translation between them as necessary (Rollout Options); however, this plan raised significant performance questions as well as risks around fragility (for example, failing to find some hardcoded ids and translate them to the proper representation for the caller or receiver).

Discussion with Cale, Rob, Ned, DB, and others drove home the point that only the persistence layer should care about xblock usage and definition ids and only the persistence, auth, and registration layers should care about course ids. The application layers (LMS, CMS, analytics, ORA, Forums) should treat these ids as opaque tokens used to perform data CRUD and reference. The application layers should not parse the ids and should not request semantic information from the ids (e.g., category, org, course_id).

All edX apps currently serialize and deserialize the ids as subfields from urls; thus, to make the apps able to handle any persistence layer's id representations, we will need to change all of the urls and the url parsing. Nevertheless, we must provide deprecated backward compatibility which can still interpret hardcoded and bookmarked urls using the existing syntax.

Proposed solution

The crux of the proposed solution is that all ids (course, xblock, definition) will be opaque keys to the applications. The persistence, auth, and enrollment modules may introspect these keys as necessary for additional information. If an application needs additional information such as org, course_id, category and just has an id, the application must ask the owning service.

No application should assume such requests are free: the service may perform db operations or other non-trivial lookups to answer the query. Sometimes context will make it clear which behaviors the service will likely support for the id (e.g., course_id for an xblock usage but not for an xblock definition, org for a course_id but not necessarily for a definition, category for an xblock usage or definition but not for a course id). Some services, however, may support behaviors which other services don't (e.g., version and version history for an xblock usage, or the list of available lifecycle snapshots such as draft vs. preview vs. live). Xblocks will themselves support some of these behaviors, so in many cases it may make more sense for the app to retrieve the xblock given the address and query it.

A corollary of this change is that no application should assume the serialized (url, id, etc) form of an id has any particular parsable fields but only that the id is serialized and deserialized as a single field. THUS, the serialized form will not have any slashes in it so that url parsing can digest the whole id as a single field.
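The single-field rule can be illustrated with a small routing sketch. Everything named here is hypothetical: the `/xblock/<key>/<action>` path layout, the `build_path` and `parse_path` helpers, and the sample key string are assumptions made for illustration, not existing edX urls. The point is only that the route captures the id as one segment and hands it back unparsed.

```python
import re
from urllib.parse import quote, unquote

# Hypothetical route: the whole id is one path segment and is never split.
XBLOCK_ROUTE = re.compile(r"^/xblock/(?P<key>[^/?&]+)/(?P<action>\w+)$")


def build_path(serialized_key, action):
    """Embed an already serialized key as a single path segment."""
    # safe="" also percent-encodes "/", so the key cannot add path segments.
    return "/xblock/{}/{}".format(quote(serialized_key, safe=""), action)


def parse_path(path):
    """Return (serialized_key, action); the key comes back as one opaque string."""
    match = XBLOCK_ROUTE.match(path)
    if match is None:
        raise ValueError("unrecognized path: {!r}".format(path))
    return unquote(match.group("key")), match.group("action")


# The sample key is made up; the application never looks inside it.
path = build_path("loc:some opaque payload", "render")
key, action = parse_path(path)
```

Because the route pattern has no knowledge of the key's internals, a persistence layer could change its id representation without any change to the url configuration.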

Proposed classes and behaviors

Key common superclass representing an address or id of something.

unicode(Key) produces a string representation which will have no slashes, question marks, or ampersands but may have spaces and other non url safe characters for the given Key. NOTE: rather than storing a much more restricted stringified version of the key in an html id attr, store this string or the url below in data-id or any other data-xxx attribute; these allow any string chars, unlike html id attrs, which gravely restrict the char set.
Key.url() produces a url safe string for the key which will have no url tag (that is, no i4x://), slashes, ampersands, or question marks and is safe to use as a field in a url.
Key(unicode) constructs a Key from the unicode. The following must be true: Key(unicode(key)) == key.
Key.from_url(url) constructs a Key from the url. The following must be true: Key.from_url(key.url()) == key.

Key is an abstract class and cannot be instantiated. Keys will have namespaces (type indicators) which Key uses to figure out which concrete subclass to instantiate; however, no application should ever try to interpret the namespace indicator nor the payload.

Key concrete classes register themselves and their unique namespace id with the Key class. The namespaces, of course, cannot collide. If they do, then Key may arbitrarily choose which concrete class it uses. (or should it raise an Exception?)
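A minimal sketch of the Key contract described above, written for Python 3 (so `str(key)` plays the role of `unicode(key)`). Several details are assumptions rather than part of the proposal: the `namespace:payload` string layout, the `register` decorator, the `from_string` classmethod standing in for `Key(unicode)`, the `DemoKey` class, and the choice to raise on a namespace collision, which is only one possible answer to the open question above.

```python
from urllib.parse import quote, unquote


class Key:
    """Abstract address; applications only serialize and deserialize it."""

    namespace = None  # set by each concrete subclass
    _registry = {}    # namespace -> concrete class, shared by all subclasses

    def __init__(self, payload):
        if type(self) is Key:
            raise TypeError("Key is abstract; use Key.from_string()")
        # The string form must not contain slashes, question marks, or ampersands.
        if any(ch in payload for ch in "/?&"):
            raise ValueError("payload contains a forbidden character")
        self.payload = payload

    @classmethod
    def register(cls, key_class):
        """Class decorator through which concrete classes claim a namespace."""
        ns = key_class.namespace
        if ns in cls._registry:
            # Assumption: fail loudly instead of choosing a class arbitrarily.
            raise ValueError("namespace already registered: " + ns)
        cls._registry[ns] = key_class
        return key_class

    @classmethod
    def from_string(cls, serialized):
        """Stand-in for Key(unicode): dispatch on the namespace prefix."""
        ns, _, payload = serialized.partition(":")
        return cls._registry[ns](payload)

    @classmethod
    def from_url(cls, url_field):
        return cls.from_string(unquote(url_field))

    def __str__(self):
        return "{}:{}".format(self.namespace, self.payload)

    def url(self):
        # safe="" percent-encodes every reserved character, including ":".
        return quote(str(self), safe="")

    def __eq__(self, other):
        return type(self) is type(other) and self.payload == other.payload

    def __hash__(self):
        return hash((self.namespace, self.payload))


@Key.register
class DemoKey(Key):
    """Hypothetical concrete key used only for this illustration."""
    namespace = "demo"


key = Key.from_string("demo:some payload")
```

With this shape, both round-trip requirements hold for `DemoKey`: `Key.from_string(str(key)) == key` and `Key.from_url(key.url()) == key`.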

Key Services

To do more than just serialize and deserialize a key, apps will need to send keys to services which will answer reasonable queries about the keys. Such queries may include org, category, course_id (for keys which are not CourseKeys), version, branch, etc. Key Services define concrete Key classes which implement the Key class interface and register their namespaces with Key.

We will define two Key services; however, any persistence layer may add any other Key services.

LocationService

LocationService is a Key service roughly corresponding to our existing Location type. It defines CourseId and XblockLocation classes and accepts instances of these to answer queries such as category and org.
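The "ask the owning service" pattern might look roughly like the sketch below. It is deliberately simplified and rests on assumptions: keys are represented as plain strings rather than Key instances, the `records` dict stands in for whatever store or parsing logic the real service would use, and the method names and sample values are invented for illustration.

```python
class LocationService:
    """Answers queries about keys so that applications never parse them."""

    def __init__(self, records):
        # Stand-in for a database table, keyed by the serialized key.
        self._records = records

    def category(self, key):
        return self._lookup(key)["category"]

    def org(self, key):
        return self._lookup(key)["org"]

    def _lookup(self, key):
        # A real service might hit the database here, so callers
        # should not assume that a query is free.
        try:
            return self._records[key]
        except KeyError:
            raise LookupError("key not owned by this service: {!r}".format(key))


# Hypothetical sample data; the application only passes the key around.
service = LocationService({"loc:abc123": {"org": "SampleOrg", "category": "problem"}})
category = service.category("loc:abc123")
```

The application code on the last line never inspects `"loc:abc123"`; it only hands the token to the service that owns it.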

LocatorService

LocatorService is a Key service which handles the Locator concrete classes.

Course identity key classes

CourseKey is an abstract subclass of Key to represent keys which are course ids. These should add support for the following:

course_id is a property of a CourseKey instance which returns the unique id for a given course offering. Note, this is not a course xblock but instead the offering to which students register, instructors add staff, and SplitMongo provides indices. A course_id is a unicode string which uniquely identifies the course. It must obey the same syntax rules as Key.url().
CourseKey.from_course_id(course_id) is a CourseKey constructor which inverts the course_id property with the concomitant equality requirements.
org? I'm concerned that org is actually ambiguous. harvard is an org, but so are harvard.humanities and harvard.humanities.political_science.

CourseId is a concrete implementation of CourseKey which represents edX's traditional org, partial course id, and run id triple. Applications should never assume that a CourseKey is or will be a CourseId and thus should not depend on these 3 fields having any meaning with respect to a CourseKey. The LocationService defines and supports the CourseId class. NOTE, no application should treat course ids as strings of triples any more.
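The relationship between the abstract CourseKey and a concrete class such as CourseId could be sketched as follows. The `+` separator used to build `course_id`, the constructor signature, and the sample values are assumptions for illustration only; the proposal requires just that `course_id` contains no slashes and that `from_course_id` inverts it.

```python
from abc import ABC, abstractmethod


class CourseKey(ABC):
    """Abstract key for a course offering."""

    @property
    @abstractmethod
    def course_id(self):
        """Unique string for the offering, following the Key.url() syntax rules."""

    @classmethod
    @abstractmethod
    def from_course_id(cls, course_id):
        """Inverse of the course_id property."""


class CourseId(CourseKey):
    """Traditional triple of org, partial course id, and run id."""

    def __init__(self, org, course, run):
        self._parts = (org, course, run)

    @property
    def course_id(self):
        # Assumption: "+" joins the parts so the result has no slashes.
        return "+".join(self._parts)

    @classmethod
    def from_course_id(cls, course_id):
        org, course, run = course_id.split("+")
        return cls(org, course, run)

    def __eq__(self, other):
        return isinstance(other, CourseId) and self._parts == other._parts

    def __hash__(self):
        return hash(self._parts)


# Hypothetical sample values.
course_key = CourseId("SampleOrg", "Demo101", "2014_T1")
round_tripped = CourseId.from_course_id(course_key.course_id)
```

Application code written against `CourseKey` would use only `course_id` and `from_course_id`, and would keep working if the key it receives were some other concrete class.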

CourseLocator is the existing CourseLocator class. LocatorService defines and supports this class.

Usage id key class

UsageKey is a key which an xblock can use as the usage id: it identifies a particular xblock in a particular xblock tree (which may or may not have a course as its root). The xblocks which usages identify have not only Scope.content fields but also Scope.children and Scope.settings fields. This class adds support for the following:

block_id: a string id for the xblock which is guaranteed to be unique within the context of its course. It is not invertible by itself.
from_course_block_ids(course_id, block_id): returns a UsageKey such that UsageKey.from_course_block_ids(key.course_id, key.block_id) == key
category? Does it make sense to require the services to answer category queries given usage keys or to make category something apps should get from the xblock instance?
course_id? If the usage key was retrieved in the context of a CourseKey, it would seem reasonable to be able to retrieve the course_id or CourseKey from the usage id. Some usage keys will refer to usages not in the context of a course (e.g., from an orphaned xblock tree fragment), so it won't make sense for those. This property is None if the usage was not retrieved as part of a course.
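The properties listed above can be sketched with a small value class. This is an illustration under assumptions: `course_id` is modeled as an optional string, the frozen dataclass is just a convenient way to get equality and hashing, and the sample ids are made up. The open questions about category and course_id are not settled by it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UsageKey:
    """Identifies one xblock usage; block_id is unique only within its course."""

    block_id: str
    course_id: Optional[str] = None  # None when not retrieved as part of a course

    @classmethod
    def from_course_block_ids(cls, course_id, block_id):
        return cls(block_id=block_id, course_id=course_id)


# Hypothetical sample ids.
in_course = UsageKey.from_course_block_ids("SampleOrg+Demo101+2014_T1", "week1_quiz")
orphan = UsageKey(block_id="week1_quiz")
```

Here `in_course` and `orphan` share a `block_id` but compare unequal, which reflects the statement that `block_id` is not invertible by itself.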

Location: the LocationService defines the Location class as the concrete implementation of this key class. Note that its implementation of block_id combines the information from the old category and name fields.

BlockUsageLocator: the LocatorService defines the existing BlockUsageLocator concrete implementation of usage key ids. In addition to the above properties and functions, this supports the following which only the persistence layer should count on.

version: returns a unique id for the version such that any other BlockUsageLocator with the same version is guaranteed to exist in the same snapshot at the same time.
branch: if the Usage was retrieved in the context of a course, it may have a branch which indicates its lifecycle snapshot (e.g., draft, beta, alpha, stage, preview, live, archive).

Definition id key class

DefinitionKey is the id of the context independent definition of the xblock's content. That is, it points to the xblock's Scope.content fields. Courseware development apps will want to use this to reuse content among courses or even within a course.

The LocationService has no separate definition key. When asked for the definition key for an xblock, it will give back the Location which is really a UsageKey and is not reusable.

DefinitionLocator: The LocatorService defines DefinitionLocator to represent the unique id of context independent definitions. It provides no additional properties over the implementation of the Key properties and functions.

Stories

Create Key class and its abstract subclasses

must have mechanism for services to register namespaces mapping to concrete classes (or the services?)
define the Key class methods and properties
define UsageKey and DefinitionKey classes w/ their interfaces too

Create `LocationService` and refactor `Location` to this api

Create `LocatorService` and refactor `Locator` (and its subclasses) to this api

Change all LMS code to no longer assume the ids are Locations and no longer access the subfields

use LocationService where necessary, but preferably remove any introspection or move to introspection on xblock

Change all LMS urls to no longer parse id components but just take the key as a whole

leave old urls in place as deprecated for backward compatibility
ensure both can work at the same time
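One way the old and new url styles might coexist is a resolver that tries the single-field route first and falls back to the deprecated multi-field route. The path layouts, the `resolve` helper, and the `+` separator used to assemble a key from the old fields are all hypothetical; a real translation would be done by the service that owns the key.

```python
import re
import warnings

# Hypothetical routes: the new style takes the key whole, the old style parses fields.
NEW_ROUTE = re.compile(r"^/courses/(?P<key>[^/]+)/info$")
OLD_ROUTE = re.compile(
    r"^/courses/(?P<org>[^/]+)/(?P<course>[^/]+)/(?P<run>[^/]+)/info$"
)


def resolve(path):
    """Return the serialized course key for either url style."""
    match = NEW_ROUTE.match(path)
    if match:
        return match.group("key")
    match = OLD_ROUTE.match(path)
    if match:
        warnings.warn("deprecated url syntax: " + path, DeprecationWarning)
        # Assumption: the old fields are joined with "+" to form the key.
        return "+".join(match.group("org", "course", "run"))
    raise LookupError("no route for " + path)


new_style = resolve("/courses/SampleOrg+Demo101+2014_T1/info")
```

Since the new route admits no slashes inside the key, the two patterns cannot match the same path, so bookmarked urls in the old syntax keep resolving while new code only ever emits the single-field form.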

Change all Studio code to no longer assume ids are either Locators or Locations and no longer access the subfields.

Change all Studio urls to no longer parse id components

no need to leave deprecated urls in place

Change any other edx platform accessors or manipulators of id fields or url patterns