Skip to content

Query Builder

snej edited this page Dec 12, 2014 · 16 revisions

The CBLQueryBuilder class provides a high-level interface for defining database queries, somewhat like Core Data's NSFetchRequest. Instead of thinking about map functions and key ranges, you provide "select", "where" and "order by" expressions; the query builder uses these to define a view and configure the query.

CBLQueryBuilder does not make CBLView obsolete! It doesn't support everything you can do with map/reduce, so there are cases where you'll run into its limits and need to define your own view and queries. But it does take care of most common types of queries.

Compatibility: CBLQueryBuilder is a new class in Couchbase Lite 1.1. (An earlier version was formerly available in version 1.0.3 as an unsupported 'extra', but not built into the library.) It is iOS/Mac specific, since it's based on Cocoa classes like NSPredicate and NSSortDescriptor. Comparable functionality will be available on other platforms too, with APIs that are idiomatic for those platforms.

1. Overview

A CBLQueryBuilder defines a query or a family of related queries (since it can contain variables whose values can be substituted later.) It has four attributes:

  1. The database to query.
  2. A "where" predicate (boolean-valued function) that chooses which documents match the query.
  3. An array of values to select from the matching documents. Each value is an expression based on properties of the document.
  4. An optional array of sort descriptors that determine the order the results appear.

(This may look something like a SQL SELECT statement, but the details are different.)

Based on these attributes, the query builder will create a view and register its map function. You don't need to worry about this. The query builder will create a pre-configured CBLQuery object for you, and all you need to do is run it and use the result.

Often a query's predicate will depend on specific values that aren't known until runtime, such as a minimum or maximum property value. You can specify these as as placeholder variables; in a predicate string they're $-prefixed identifiers. When creating a query, you provide specific values for these variables.

The best practice is to create query builders ahead of time, keeping references to them in global variables or long-lived instance variables. This reduces the overhead of parsing the attributes and creating the views. If you're creating a query builder each time you run a query, you're probably doing it wrong.

2. Example

Data model

Let's assume a simple blog database that we want to search for posts tagged with a given keyword. The posts will look like this:

{
    "type": "post",
    "title": "Introducing CBLQueryBuilder",
    "author": "Jens Alfke",
    "date": "2014-12-11",
    "body": "Yo, 'sup. Check out this new class...",
    "tags": ["couchbase lite", "iOS", "queries", "views", "API"]
}

Creating the query builder

When the application's search controller object is initialized, it creates a query builder:

    self.tagQueryBuilder = [[CBLQueryBuilder alloc]
            initWithDatabase: self.database
                      select: @[@"title", @"body", @"author", @"date"]
                       where: @"type == 'post' and tags contains $TAG"
                     orderBy: @[@"-date"]
                       error: &error];

We are limiting the search to documents whose type property is "post", since there are likely to be other types of documents (comments, authors...) Also, we don't know the specific tag to search for, so we leave it as a variable $TAG.

The language used in the where: parameter is Apple's predicate syntax, not SQL or N1QL or any other query language.

The orderBy: array is using the string @"-date" as a shortcut meaning "in descending order of date property". This could also have been specified as an NSSortDescriptor object.

Querying

Then when a tag is given, the controller can perform a query:

- (void) queryForTag: (NSString*)tag {
    NSError* error;
    CBLQueryEnumerator* e = [self.tagQueryBuilder runQueryWithContext: @{@"TAG": tag}
                                                                error: &error];
    if (e) {
        [self displayRows: e.allObjects];
    } else {
        [self handleError: error];
    }
}

To create a query, we have to give a context dictionary that provides a value for each variable, in this case TAG.

Important: The $ prefix used in the predicate is not part of the variable name itself, and isn't used in the dictionary key. If you put it in by accident, you'll get a runtime exception complaining that the variable TAG doesn't have a value.

Interpreting the query results

Since we asked the query builder to return four results (@"title", @"body", @"author", @"date"), each CBLQueryRow's value property will be an array of four objects corresponding to those four values. (If we'd asked for only a single result, the value would just be the result, not wrapped in an array.)

3. Configuring Query Builders

As stated before, a query builder has a predicate, an array of values to select, and an optional array of sort descriptors.

Predicate ("where")

The predicate is a boolean-valued function implemented as an NSPredicate object. It takes a document's properties as input and returns true if the document should appear in the query results, or false if it shouldn't.

NSPredicate is a complex class. An NSPredicate is actually the root of a tree of NSExpression objects and nested NSPredicates, representing a parse tree or Abstract Syntax Tree (AST) of a predicate in Apple's syntax. Fortunately you usually don't have to worry about this: although you can construct predicates by hand from trees of objects, you usually just give them in a string form that gets parsed into the tree automatically.

The basics of creating the predicate are pretty simple:

  • Refer to document properties by name; for example tags in the predicate refers to the value of a document JSON property named "tags".
  • Standard relational operators (<, ==, etc.) are available and work as you'd expect on numbers and strings.
  • You can combine multiple conditions with AND or &&.
  • String operators like BEGINSWITH and CONTAINS are available.
  • You can look for a value in an array property using CONTAINS (as in the example).
  • String comparisons are, by default, case and diacritic sensitive. You can modify an operator using the key characters c and d within square braces to specify case and diacritic insensitivity respectively, for example firstName BEGINSWITH[cd] $FIRST_NAME.
  • You can see if a value matches multiple candidates using IN (for example, dept IN $departments, where $departments is an array supplied at query time.)

See the Apple documentation for details.

Unsupported predicate features

A number of NSPredicate features aren't supported, either because they can't be performed efficiently or because we just haven't had time to implement them yet.

  • You can't combine sub-predicates using OR or NOT (only AND.)
  • You can't use multiple inequality tests against variables, for example date >= $SINCE and price < $MAXPRICE.
  • You can't use the string operators CONTAINS, ENDSWITH, LIKE or MATCHES with a variable.

Values ("select")

The select: parameter is an array describing the values that you want from a query: it directly determines what each CBLQueryRow's value property will contain.

Each item in the array can be an NSString or an NSExpression.

An NSString simply represents a property name or key path, relative to the document. So @"price" denotes the value of the price property, and @"address.city" denotes the city sub-property of the JSON object in the address property. Key paths also support some aggregation operators that can be used with array-valued document properties.

An NSExpression is an object similar to an NSPredicate (in fact, expressions appear as child nodes in a predicate's tree.) It represents an expression that can be evaluated at map time from the properties of the document. Like a predicate, it can be parsed from a string or constructed as an object tree.

When you iterate over the result rows, each one's value property will correspond to the builder's select: parameter -- it will be an array whose elements correspond one-to-one. The exception is if you only specified a single value; in that case the row's value property will simply be that value, not an array.

Sort Descriptors

You can optionally specify an ordering for the query result rows. If you don't care about the order, leave the orderBy: parameter nil; this can make the query faster.

Sorting is specified in Cocoa fashion, using an array of NSSortDescriptor objects. The first one represents the primary sort order, the second one would be a secondary order, etc. As above, the key paths in the sort descriptors are interpreted relative to the document.

4. Troubleshooting

If the CBLQueryBuilder initializer returns nil, that means it can't support the specific query you asked for. In this case check the localizedDescription in the NSError returned via the final parameter; this should hopefully identify what it wasn't able to handle. (See above for some discussion of unsupported predicate features.)

If the resulting query returns the wrong results, or runs too slowly, it can be helpful to look at the query builder's explanation property: this returns a string containing a human-readable multi-line description of how the query is implemented. It gives a pseudo-code listing of the underlying view's map function, and the values of some of the query properties like startKey.

Here's the explanation of the query builder used in the previous example section:

view.map = {
    if (type == "post")
        for (i in tags)
            emit([i, date], [title, body, author]);
};
query.startKey = [$TAG];
query.endKey = [$TAG];
query.prefixMatchLevel = 1;

From this you can tell that the view index's key consists of an array containing a tag and a date, and the corresponding value contains the title, body and author. The query will return the rows from the index that start with the requested tag.

The explanation can give you some clues for performance problems:

  • If the map function has no if test, it's going to emit a row or rows for every document in the database. This is usually not what you want, unless the query genuinely applies to all documents. It's usually a side effect of specifying a predicate that can't be resolved during indexing and has to be deferred to a postFilter.
  • A query.postFilter predicate runs at query time; it's called on every row and can slow down the query if the result set is large.
  • Similarly, a query.sortDescriptor is also applied after the query completes (and after any postFilter), and can be slow if there are a lot of rows returned.

The explanation string is also useful if you need to enhance an existing query builder but find that you've exceeded the limits of what CBLQueryBuilder can handle. You can use the explanation of the existing query to re-implement it using CBLView and CBLQuery, and then make modifications to the map function or the query setup to reach your goal.