From ad9676f1642b96341a5fc1e95b3fb13a9fa52075 Mon Sep 17 00:00:00 2001 From: Eric Pugh Date: Wed, 18 Sep 2024 14:25:14 -0400 Subject: [PATCH] LTR documentation Signed-off-by: Eric Pugh --- _search-plugins/ltr/building-features.md | 311 ++++++++++++ _search-plugins/ltr/core-concepts.md | 255 ++++++++++ _search-plugins/ltr/faq.md | 35 ++ _search-plugins/ltr/feature-engineering.md | 125 +++++ _search-plugins/ltr/fits-in.md | 63 +++ _search-plugins/ltr/index.md | 51 ++ _search-plugins/ltr/logging-features.md | 452 ++++++++++++++++++ .../ltr/searching-with-your-model.md | 128 +++++ _search-plugins/ltr/training-models.md | 348 ++++++++++++++ 9 files changed, 1768 insertions(+) create mode 100644 _search-plugins/ltr/building-features.md create mode 100644 _search-plugins/ltr/core-concepts.md create mode 100644 _search-plugins/ltr/faq.md create mode 100644 _search-plugins/ltr/feature-engineering.md create mode 100644 _search-plugins/ltr/fits-in.md create mode 100644 _search-plugins/ltr/index.md create mode 100644 _search-plugins/ltr/logging-features.md create mode 100644 _search-plugins/ltr/searching-with-your-model.md create mode 100644 _search-plugins/ltr/training-models.md diff --git a/_search-plugins/ltr/building-features.md b/_search-plugins/ltr/building-features.md new file mode 100644 index 0000000000..c6eb625251 --- /dev/null +++ b/_search-plugins/ltr/building-features.md @@ -0,0 +1,311 @@ +--- +layout: default +title: Working with Features +nav_order: 30 +parent: LTR search +has_children: false +--- + +# Working with Features + +In [core concepts]({{site.url}}{{site.baseurl}}/search-plugins/ltr/core-concepts/), we mentioned the main +roles you undertake building a learning to rank system. In +[fits in]({{site.url}}{{site.baseurl}}/search-plugins/ltr/fits-in/) we discussed at a high level +what this plugin does to help you use OpenSearch as a learning to +rank system. + +This section covers the functionality built into the OpenSearch LTR +plugin to build and upload features with the plugin. + +## What is a feature in OpenSearch LTR + +OpenSearch LTR features correspond to OpenSearch queries. The +score of an OpenSearch query, when run using the user's search terms +(and other parameters), are the values you use in your training set. + +Obvious features might include traditional search queries, like a simple +"match" query on title: + +```json +{ + "query": { + "match": { + "title": "{% raw %}{{keywords}}{% endraw %}" + } + } +} +``` + +Of course, properties of documents such as popularity can also be a +feature. Function score queries can help access these values. For +example, to access the average user rating of a movie: + +```json +{ + "query": { + "function_score": { + "functions": { + "field": "vote_average" + }, + "query": { + "match_all": {} + } + } + } +} +``` + +One could also imagine a query based on the user's location: + +```json +{ + "query": { + "bool" : { + "must" : { + "match_all" : {} + }, + "filter" : { + "geo_distance" : { + "distance" : "200km", + "pin.location" : { + "lat" : "{% raw %}{{users_lat}}{% endraw %}", + "lon" : "{% raw %}{{users_lon}}{% endraw %}" + } + } + } + } + } +} +``` + +Similar to how you would develop queries like these to manually improve +search relevance, the ranking function `f` you're training also +combines these queries mathematically to arrive at a relevance score. 
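
For intuition, here is a sketch of what a learned ranking function might look like if it were a simple weighted sum of the feature scores above. The feature names and weights are purely hypothetical; real models (such as the gradient boosted trees discussed later) combine features in much more complex, non-linear ways:

    f(title_score, rating_score, location_score) =
        0.6 * title_score
      + 0.3 * rating_score
      + 0.1 * location_score

Training is the process of discovering the combination of feature values that best reproduces the ideal ordering expressed in your judgement lists.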

## Features are Mustache Templated OpenSearch Queries

You'll notice the `{% raw %}{{keywords}}{% endraw %}`, `{% raw %}{{users_lat}}{% endraw %}`, and `{% raw %}{{users_lon}}{% endraw %}`
above. This is the Mustache templating syntax used in other parts of
[OpenSearch]({{site.url}}{{site.baseurl}}/api-reference/search-template/).
It lets you inject query-specific or user-specific variables into the
search template. Perhaps information about the user for personalization?
Or the location of the searcher's phone?

For now, we'll focus on typical keyword searches.

## Uploading and Naming Features

OpenSearch LTR gives you an interface for creating and manipulating
features. Once created, you have access to a set of features for
logging. Logged features, combined with your judgement list, can be
used to train a model. Finally, that model can be uploaded to
OpenSearch LTR and executed during a search.

Let's look at how to work with sets of features.

## Initialize the default feature store

A *feature store* corresponds to an OpenSearch index used to store
metadata about the features and models. Typically, one feature store
corresponds to a major search site/implementation, for example,
[wikipedia](http://wikipedia.org) as opposed to [wikitravel](http://wikitravel.org).

For most use cases, you can get by with the single, default
feature store and never think about feature stores again. The default
feature store needs to be initialized the first time you use OpenSearch Learning to
Rank:

    PUT _ltr

You can restart from scratch by deleting the default feature store:

    DELETE _ltr

(WARNING: this deletes all of your features and models, so use with caution!)

In the rest of this guide, we'll work with the default feature store.

## Features and feature sets

Feature sets are where the action really happens in OpenSearch LTR.

A *feature set* is a set of features that has been grouped together for
logging and model evaluation. You'll refer to feature sets when you want
to log multiple feature values for offline training. You'll also create
a model from a feature set, copying the feature set into the model.

## Create a feature set

You can create a feature set with a simple POST request.
To create it, you +give a feature set a name and optionally a list of features: + +```json +POST _ltr/_featureset/more_movie_features +{ + "featureset": { + "features": [ + { + "name": "title_query", + "params": [ + "keywords" + ], + "template_language": "mustache", + "template": { + "match": { + "title": "{% raw %}{{keywords}}{% endraw %}" + } + } + }, + { + "name": "title_query_boost", + "params": [ + "some_multiplier" + ], + "template_language": "derived_expression", + "template": "title_query * some_multiplier" + }, + { + "name": "custom_title_query_boost", + "params": [ + "some_multiplier" + ], + "template_language": "script_feature", + "template": { + "lang": "painless", + "source": "params.feature_vector.get('title_query') * (long)params.some_multiplier", + "params": { + "some_multiplier": "some_multiplier" + } + } + } + ] + } +} +``` + +## Feature set CRUD + +Fetching a feature set works as you'd expect: + + GET _ltr/_featureset/more_movie_features + +You can list all your feature sets: + + GET _ltr/_featureset + +Or filter by prefix in case you have many feature sets: + + GET _ltr/_featureset?prefix=mor + +You can also delete a featureset to start over: + + DELETE _ltr/_featureset/more_movie_features + +## Validating features + +When adding features, we recommend sanity checking that the features +work as expected. Adding a "validation" block to your feature creation +let's OpenSearch LTR run the query before adding it. If you don't +run this validation, you may find out only much later that the query, +while valid JSON, was a malformed OpenSearch query. You can imagine, +batching dozens of features to log, only to have one of them fail in +production can be quite annoying! + +To run validation, you simply specify test parameters and a test index +to run: + +```json +"validation": { + "params": { + "keywords": "rambo" + }, + "index": "tmdb" +}, +``` +Place this alongside the feature set. You'll see below we have a +malformed `match` query. The example below should return an error that +validation failed. An indicator you should take a closer look at the +query: + +```json +{ + "validation": { + "params": { + "keywords": "rambo" + }, + "index": "tmdb" + }, + "featureset": { + "features": [ + { + "name": "title_query", + "params": [ + "keywords" + ], + "template_language": "mustache", + "template": { + "match": { + "title": "{% raw %}{{keywords}}{% endraw %}" + } + } + } + ] + } +} +``` + +## Adding to an existing feature set + +Of course you may not know upfront what features could be useful. You +may wish to append a new feature later for logging and model evaluation. +For example, creating the *user_rating* feature, we could +create it using the feature set append API, like below: + +```json +POST /_ltr/_featureset/my_featureset/_addfeatures +{ + "features": [{ + "name": "user_rating", + "params": [], + "template_language": "mustache", + "template" : { + "function_score": { + "functions": { + "field": "vote_average" + }, + "query": { + "match_all": {} + } + } + } + }] +} +``` + +## Feature Names are Unique + +Because some model training libraries refer to features by name, +OpenSearch LTR enforces unique names for each features. In the +example above, we could not add a new *user_rating* feature +without creating an error. + +## Feature Sets are Lists + +You'll notice we *appended* to the feature set. Feature sets perhaps +ought to be really called "lists". Each feature has an ordinal (its +place in the list) in addition to a name. 
Some LTR training +applications, such as Ranklib, refer to a feature by ordinal (the +"1st" feature, the "2nd" feature). Others more conveniently refer to +the name. So you may need both/either. You'll see that when features +are logged, they give you a list of features back to preserve the +ordinal. + +## But wait there's more + +Feature engineering is a complex part of OpenSearch Learning to Rank, +and additional features (such as features that can be derived from other +features) are listed in `advanced-functionality`{.interpreted-text +role="doc"}. + +Next-up, we'll talk about some specific use cases you\'ll run into when +[Feature Engineering]({{site.url}}{{site.baseurl}}/search-plugins/ltr/feature-engineering/). diff --git a/_search-plugins/ltr/core-concepts.md b/_search-plugins/ltr/core-concepts.md new file mode 100644 index 0000000000..adfc4b1f79 --- /dev/null +++ b/_search-plugins/ltr/core-concepts.md @@ -0,0 +1,255 @@ +--- +layout: default +title: Core Concepts +nav_order: 10 +parent: LTR search +has_children: false +--- + +# Core concepts + +Welcome. You're here if you're interested in adding machine learning +ranking capabilities to your OpenSearch system. This guidebook is +intended for OpenSearch developers and data scientists. + +## What is learning to rank + +*Learning to Rank* (LTR) applies machine learning to search relevance +ranking. How does relevance ranking differ from other machine learning +problems? Regression is one classic machine learning problem. In +*regression*, you're attempting to predict a variable (such as a stock +price) as a function of known information (such as number of company +employees or the company's revenue). In these cases, you're +building a function, say *f*, that can take what's known +(*number of Employees*, *revenue*), and have +*f* output an approximate stock price. + +Classification is another machine learning problem. With classification, +our function *f*, would classify our company into several +categories. For example, profitable or not profitable. Or perhaps +whether or not the company is evading taxes. + +In Learning to Rank, the function *f* we want to learn does +not make a direct prediction. Rather it's used for ranking documents. +We want a function *f* that comes as close as possible to +our user's sense of the ideal ordering of documents dependent on a +query. The value output by *f* itself has no meaning (it's +not a stock price or a category). It's more a prediction of a users' +sense of the relative usefulness of a document given a query. + +Here, we'll briefly walk through the 10,000 meter view of Learning to +Rank. For more information, we recommend blog articles [How is Search +Different From Other Machine Learning +Problems?](http://opensourceconnections.com/blog/2017/08/03/search-as-machine-learning-prob/) +and [What is Learning to +Rank?](http://opensourceconnections.com/blog/2017/02/24/what-is-learning-to-rank/). + +## Judgements: expression of the ideal ordering + +Judgement lists, sometimes referred to as "golden sets", grade +individual search results for a keyword search. For example, our +[demo](http://github.com/o19s/elasticsearch-learning-to-rank/tree/master/demo/) +uses [TheMovieDB](http://themoviedb.org). When users search for +"Rambo" we can indicate which movies ought to come back for "Rambo" +based on our user's expectations of search. 
+ +For example, we know these movies are very relevant: + +- First Blood +- Rambo + +We know these sequels are fairly relevant, but not exactly relevant: + +- Rambo III +- Rambo First Blood, Part II + +Some movies that star Sylvester Stallone are only tangentially relevant: + +- Rocky +- Cobra + +And of course many movies are not even close: + +- Bambi +- First Daughter + +Judgement lists apply "grades" to documents for a keyword, this helps +establish the ideal ordering for a given keyword. For example, if we +grade documents from 0-4, where 4 is exactly relevant. The preceding would +turn into the judgement list: + + grade,keywords,movie + 4,Rambo,First Blood # Exactly Relevant + 4,Rambo,Rambo + 3,Rambo,Rambo III # Fairly Relevant + 3,Rambo,Rambo First Blood Part II + 2,Rambo,Rocky # Tangentially Relevant + 2,Rambo,Cobra + 0,Rambo,Bambi # Not even close... + 0,Rambo,First Daughter + +A search system that approximates this ordering for the search query +"Rambo", and all our other test queries, can said to be performing +well. Metrics such as +[NDCG](https://en.wikipedia.org/wiki/Discounted_cumulative_gain) and +[ERR](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.157.4509&rep=rep1&type=pdf) +evaluate a query's actual ordering compared to the ideal judgement list. + +Our ranking function *f* needs to rank search results as +close as possible to our judgement lists. We want to maximize quality +metrics such as ERR or NDCG over the broadest number of queries in our +training set. When we do this, with accurate judgements, we work to +return results listings that will be maximally useful to users. + +## Features: the raw material of relevance + +Previously in the example of a stock market predictor, our ranking function +*f* used variables such as the number of employees, revenue, +and so on to arrive at a predicted stock price. These are *features* of the +company. Here our ranking function must do the same: using features that +describe the document, the query, or some relationship between the +document and the query (such as query keyword's [term frequency/inverse document frequency (TF/IDF)](https://en.wikipedia.org/wiki/Tf%E2%80%93idf) score in a +field). + +Features for movies, for example, might include: + +- Whether/how much the search keywords match the title field (let's + call this *titleScore*) +- Whether/how much the search keywords match the description field + (*descScore*) +- The popularity of the movie (*popularity*) +- The rating of the movie (*rating*) +- How many keywords are used during search? + (*numKeywords*) + +Our ranking function then becomes +`f(titleScore, descScore, popularity, rating, numKeywords)`. We hope +whatever method we use to create a ranking function can utilize these +features to maximize the likelihood of search results being useful for +users. For example, it seems intuitive in the "Rambo" use case that +*titleScore* matters quite a bit. But one top movie "First +Blood" probably only mentions the keyword Rambo in the description. So +in this case *descScore* comes into play. Also +*popularity/rating* might help determine which movies +are "sequels" and which are the originals. We might learn this feature +doesn't work well in this regard, and introduce a new feature +*isSequel* that our ranking function could use to make +better ranking decisions. + +Selecting and experimenting with features is a core piece of learning to +rank. 
Good judgements paired with poor features that don't help predict
patterns in the grades won't create a good search
experience. Like any other machine learning problem: garbage
in, garbage out!

For more on the art of creating features for search, check out the book
[Relevant Search](http://manning.com/books/relevant-search) by Doug
Turnbull and John Berryman.

## Logging features: completing the training set

With a set of features we want to use, we need to annotate the judgement
list with the value of each feature. This data will be used once
training commences.

In other words, we need to transform:

    grade,keywords,movie
    4,Rambo,First Blood
    4,Rambo,Rambo
    3,Rambo,Rambo III
    ...

into:

    grade,keywords,movie,titleScore,descScore,popularity,...
    4,Rambo,First Blood,0.0,21.5,100,...
    4,Rambo,Rambo,42.5,21.5,95,...
    3,Rambo,Rambo III,53.1,40.1,50,...

(here titleScore is the relevance score of "Rambo" for the title field of
the document "First Blood", and so on)

Many learning to rank tools use a file format introduced
by SVM Rank, an early learning to rank method. Queries are given IDs,
and the actual document identifier can be removed for the training
process. Features in this file format are labeled with ordinals starting
at 1. For the previous example, we'd have the file format:

    4 qid:1 1:0.0 2:21.5 3:100,...
    4 qid:1 1:42.5 2:21.5 3:95,...
    3 qid:1 1:53.1 2:40.1 3:50,...
    ...

In some systems, you might log these values after the fact, gathering
them to annotate a judgement list with feature values. In others, the
judgement list might come from user analytics, so features may be logged as the
user interacts with the search application. More on this when we cover
logging in [Logging Feature Scores]({{site.url}}{{site.baseurl}}/search-plugins/ltr/logging-features/).

## Training a ranking function

With judgements and features in place, the next step is to arrive
at the ranking function. There are a number of models available for
ranking, each with its own intricate pros and cons. Each one attempts
to use the features to minimize the error in the ranking function.
Each has its own notion of what "error" means in a ranking system.
For more information, read [this blog article](http://opensourceconnections.com/blog/2017/08/03/search-as-machine-learning-prob/).

Generally speaking, there are a couple of families of models:

- Tree-based models (LambdaMART, MART, Random Forests): These
  models tend to be the most accurate in general. They're large and complex models that can be fairly expensive to train.
  [RankLib](https://sourceforge.net/p/lemur/wiki/RankLib/) and [XGBoost](https://github.com/dmlc/xgboost) both focus on tree-based models.

- SVM-based models (SVMRank): Less accurate, but cheap to train. See [SVM Rank](https://www.cs.cornell.edu/people/tj/svm_light/svm_rank.html).

- Linear models: Perform a basic linear regression over the judgement
list. These tend not to be useful outside of toy examples. See [this blog article](http://opensourceconnections.com/blog/2017/04/01/learning-to-rank-linear-models/).

As with any technology, model selection can be as much about what a team
has experience with as about what performs best.

## Testing: is our model any good?

Our judgement lists can't cover every user query our model will
encounter out in the wild. So it's important to throw our model
curveballs, to see how well it can "think for itself." Or as machine
learning folks say: can the model generalize beyond the training data?
A +model that cannot generalize beyond training data is *overfit* to the +training data, and not as useful. + +To avoid overfitting, you hide some of your judgement lists from the +training process. You then use these to test your model. This side data +set is known as the "test set." When evaluating models you'll hear +about statistics such as "test NDCG" compared to "training NDCG." The former +reflects how your model will perform against scenarios it hasn't seen +before. You hope as you train, your test search quality metrics continue +to reflect high quality search. Further: after you deploy a model, +you'll want to try out newer/more recent judgement lists to see if your +model might be overfit to seasonal/temporal situations. + +## Real World Concerns + +Now that you're oriented, the rest of this guide builds on this context +to point out how to use the Learning to Rank plugin. But before we move +on, we want to point out some crucial decisions everyone encounters in +building learning to rank systems. We invite you to watch a talk with +[Doug Turnbull and Jason +Kowalewski](https://www.youtube.com/watch?v=JqqtWfZQUTU&list=PLq-odUc2x7i-9Nijx-WfoRMoAfHC9XzTt&index=5) +where the painful lessons of real learning to rank systems are brought +out. + +- How do you get accurate judgement lists that reflect your users real + sense of search quality? +- What metrics best measure whether search results are useful to + users? +- What infrastructure do you need to collect and log user behavior and + features? +- How will you detect when/whether your model needs to be retrained? +- How will you A/B test your model compared to your current solution? What KPIs + will determine success in your search system. + +Next up, see how exactly this plugin's functionality fits into a +learning to rank system: [How does the plugin fit in?]({{site.url}}{{site.baseurl}}/search-plugins/ltr/fits-in/) diff --git a/_search-plugins/ltr/faq.md b/_search-plugins/ltr/faq.md new file mode 100644 index 0000000000..78b28f2fd4 --- /dev/null +++ b/_search-plugins/ltr/faq.md @@ -0,0 +1,35 @@ +--- +layout: default +title: FAQ +nav_order: 1000 +parent: LTR search +has_children: false +--- + +# FAQ + +This section contains answers to common issues that may trip up users. + +## Negative Scores + +Lucene does not allow queries to have negative scores. This can be +problematic if you have a raw feature that has a negative value. +Unfortunately there is no easy quick fix for this. If you are working +with such features, you need to make them non-negative *BEFORE* you +train your model. This can be accomplished by creating normalized fields +with values shifted by the mininum value or you can run the score thru a +function that produces a value >= 0. + +## I found a bug + +If you've been fighting with the plugin it's entirely possible you've +encountered a bug. Please open an issue on the Github project and we +will do our best to get it sorted. If you need general support, please +see the section below as we will typically close issues that are only +looking for support. + +## I'm still stuck! + +We'd love to hear from you! Consider joining the [Relevance Slack +Community](https://opensourceconnections.com/slack) and join the +#opensearch-learn-to-rank channel. 
diff --git a/_search-plugins/ltr/feature-engineering.md b/_search-plugins/ltr/feature-engineering.md new file mode 100644 index 0000000000..5e9d1c51e9 --- /dev/null +++ b/_search-plugins/ltr/feature-engineering.md @@ -0,0 +1,125 @@ +--- +layout: default +title: Feature Engineering +nav_order: 40 +parent: LTR search +has_children: false +--- + +# Feature Engineering + +You've seen how to add features to feature sets. We want to show you +how to address common feature engineering tasks that come up when +developing a learning to rank solution. + +## Getting Raw Term Statistics + +Many learning to rank solutions use raw term statistics in training. +For example, the total term frequency for a term, the document +frequency, and other statistics. Luckily, OpenSearch LTR comes +with a query primitive, `match_explorer`, that extracts these +statistics for you for a set of terms. In its simplest form, +`match_explorer` lets you specify a statistic you\'re interested in +and a match you'd like to explore. For example: + +```json +POST tmdb/_search +{ + "query": { + "match_explorer": { + "type": "max_raw_df", + "query": { + "match": { + "title": "rambo rocky" + } + } + } + } +} +``` + +This query returns the highest document frequency between the two terms. + +A large number of statistics are available. The `type` parameter can +be prepended with the operation to be performed across terms for the +statistic `max`, `min`, `sum`, and `stddev`. + +The statistics available include: + - `raw_df` -- the direct document frequency for a term. So if +rambo occurs in 3 movie titles, this is 3. + - `classic_idf` -- the IDF calculation of the classic similarity +`log((NUM_DOCS+1)/(raw_df+1)) + 1`. + - `raw_ttf` -- the total term frequency for the term across the +index. So if rambo is mentioned a total of 100 times in the overview +field, this would be 100. + - `raw_tf` -- the term frequency for a document. So if rambo + occurs in 3 in movie synopsis in same document, this is 3. + +Putting the operation and the statistic together, you can see some +examples. To get stddev of classic_idf, you would write +`stddev_classic_idf`. To get the minimum total term frequency, +you'd write `min_raw_ttf`. + +### Term position statistics + +The `type` parameter can be prepended with the operation to be +performed across term position for the statistic `min`, `max` and +`avg`. For any of the cases, 0 will be returned if there isn’t any occurrence of the terms in the document. + +The statistics available include, e.g. using the query “dance monkey” we have: + + - `min_raw_tp` -- return the minimum occurrence, i.e. the first +one, of any term on the query. So if dance occurs at positions [2,5 ,9], +and monkey occurs at positions [1, 4] in a text in the same document, the minimum is 1. | + - `max_raw_tp` -- return the maximum occurrence, i.e. the last +one, of any term on the query. So if dance occurs at positions [2, 5 ,9] +and monkey occurs at positions [1, 4\] in a text in the same +document, the maximum is 9. + - `avg_raw_tp` -- return the average of all occurrence of the +terms on the query. So if dance occurs at positions [2, 5 ,9] +its average is `5.33`, and monkey has average `2.5` for +positions [1, 4]. So the returned average is `3.91`, computed +by `(5.33 + 2.5)/2`. + +Finally a special stat exists for only counting the number of search +terms. That stat is `unique_terms_count`. 
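
Like any other query, `match_explorer` can be registered as a feature in a feature set so its value can be logged alongside your other features. The following sketch assumes the `more_movie_features` feature set from earlier already exists; the feature name `title_sum_classic_idf` is only an example:

```json
POST _ltr/_featureset/more_movie_features/_addfeatures
{
  "features": [{
    "name": "title_sum_classic_idf",
    "params": ["keywords"],
    "template_language": "mustache",
    "template": {
      "match_explorer": {
        "type": "sum_classic_idf",
        "query": {
          "match": {
            "title": "{% raw %}{{keywords}}{% endraw %}"
          }
        }
      }
    }
  }]
}
```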
+ +## Document-specific features + +Another common case in learning to rank is features such as popularity +or recency, tied only to the document. OpenSearch's `function_score` +query has the functionality you need to pull this data out. You already +saw an example when adding features in the last section: + +```json +{ + "query": { + "function_score": { + "functions": [{ + "field_value_factor": { + "field": "vote_average", + "missing": 0 + } + }], + "query": { + "match_all": {} + } + } + } +} +``` + +The score for this query corresponds to the value of the `vote_average` +field. + +## Your index may drift + +If you have an index that updates regularly, trends that held true +today, may not hold true tomorrow! On an e-commerce store, sandals might +be very popular in the summer, but impossible to find in the winter. +Features that drive purchases for one time period, may not hold true for +another. It's always a good idea to monitor your model's performance +regularly, retrain as needed. + +Next up, we discuss the all-important task of logging features in +[Logging Feature Scores]({{site.url}}{{site.baseurl}}/search-plugins/ltr/logging-features/) diff --git a/_search-plugins/ltr/fits-in.md b/_search-plugins/ltr/fits-in.md new file mode 100644 index 0000000000..3f2dadff37 --- /dev/null +++ b/_search-plugins/ltr/fits-in.md @@ -0,0 +1,63 @@ +--- +layout: default +title: How does the plugin fit in? +nav_order: 20 +parent: LTR search +has_children: false +--- + +# How does the plugin fit in + +In [core concepts]({{site.url}}{{site.baseurl}}/search-plugins/ltr/core-concepts/) we mentioned a couple +of activities you undertake when implementing learning to rank: + +1. Judgement List Development +2. Feature Engineering +3. Logging features into the Judgement list to create a training set +4. Training and testing models +5. Deploying and using models when searching + +How does OpenSearch LTR fit into this process? + +## What the plugin does + +This plugin gives you building blocks to develop and use learning to +rank models. It lets you develop query-dependent features and store them +in OpenSearch. After storing a set of features, you can log them for +documents returned in search results to aid in offline model +development. + +Then other tools take over. With a logged set of features for documents, +you join data with your Judgement lists you've developed on your own. +You've now got a training set you can use to test/train ranking models. +Using of a tool like Ranklib or XGBoost, you'll hopefully arrive at a +satisfactory model. + +With a ranking model, you turn back to the plugin. You upload the model +and give it a name. The model is associated with the set of features +used to generate the training data. You can then search with the model, +using a custom OpenSearch Query DSL primitive that executes the +model. Hopefully this lets you deliver better search to users. + +## What the plugin is NOT + +The plugin does not help with Judgement list creation. This is work you +must do and can be very domain specific. The Wikimedia Foundation wrote a +[great +article](https://blog.wikimedia.org/2017/09/19/search-relevance-survey/) +on how they arrive at Judgement lists for people searching articles. +Other domains such as e-commerce might be more conversion focused. Yet +others might involve human relevance judges \-- either experts at your +company or mechanical turk. + +The plugin does not train or test models. This also happens offline in +tools appropriate to the task. 
Instead, the plugin uses models generated by the XGBoost and RankLib libraries.
Training and testing models is a CPU-intensive task that involves offline
testing, and most organizations want some data science supervision of model
development. You would not want this work running in your production
OpenSearch cluster!

The rest of this guide is dedicated to walking you through how the
plugin works to get you there. Continue on to
[Working with Features]({{site.url}}{{site.baseurl}}/search-plugins/ltr/building-features/).
diff --git a/_search-plugins/ltr/index.md b/_search-plugins/ltr/index.md
new file mode 100644
index 0000000000..2b6f0f2ecb
--- /dev/null
+++ b/_search-plugins/ltr/index.md
@@ -0,0 +1,51 @@
---
layout: default
title: LTR search
nav_order: 20
has_children: true
has_toc: false
redirect_from:
  - /search-plugins/ltr/
---

# LTR search

Short for *Learning to Rank*, the LTR plugin enables you to use machine learning and behavioral data to tune the relevance of documents.
It uses models from the XGBoost and RankLib libraries to rescore search results, taking into account query-dependent features such as click-through data or field matches, which can further improve relevance.

[Learning to
Rank](http://opensourceconnections.com/blog/2017/02/24/what-is-learning-to-rank/)
applies machine learning to relevance ranking. The [OpenSearch
Learning to Rank
plugin](https://github.com/opensearch-project/opensearch-learning-to-rank-base)
(OpenSearch LTR) gives you tools to train and use ranking models in
OpenSearch.

## Get started

- Want a quickstart? Check out the demo in
  [hello-ltr](https://github.com/o19s/hello-ltr).
- Brand new to learning to rank? Head to
  [Core Concepts]({{site.url}}{{site.baseurl}}/search-plugins/ltr/core-concepts/).
- Otherwise, start with [How does the plugin fit in?]({{site.url}}{{site.baseurl}}/search-plugins/ltr/fits-in/).

## Installing

Pre-built versions can be found
[here](https://github.com/opensearch-project/opensearch-learning-to-rank-base/releases).
Want a build for a specific OpenSearch version? Follow the instructions in the [README
for
building](https://github.com/opensearch-project/opensearch-learning-to-rank-base#development)
or [create an
issue](https://github.com/opensearch-project/opensearch-learning-to-rank-base/issues).
Once you've found a version compatible with your OpenSearch cluster,
run a command such as:

    ./bin/opensearch-plugin install https://github.com/opensearch-project/opensearch-learning-to-rank-base/releases/download/ltr-plugin-v2.11.1-RC1/ltr-plugin-v2.11.1-RC1.zip

## History

The Elasticsearch LTR plugin was initially developed by [OpenSource Connections](http://opensourceconnections.com), with significant contributions by the [Wikimedia Foundation](https://diff.wikimedia.org/2017/10/17/elasticsearch-learning-to-rank-plugin/), Snagajob Engineering, Bonsai, and Yelp Engineering.
The OpenSearch version of the plugin is derived from the Elasticsearch LTR plugin.
diff --git a/_search-plugins/ltr/logging-features.md b/_search-plugins/ltr/logging-features.md
new file mode 100644
index 0000000000..9beab7010e
--- /dev/null
+++ b/_search-plugins/ltr/logging-features.md
@@ -0,0 +1,452 @@
---
layout: default
title: Logging Feature Scores
nav_order: 50
parent: LTR search
has_children: false
---

# Logging Feature Scores

To train a model, you need to log feature values. This is a major
component of the learning to rank plugin: as users search, we log
feature values from our feature sets so we can then train.
Then we can +discover models that work well to predict relevance with that set of +features. + +## Sltr Query + +The `sltr` query is the primary way features are run and models are +evaluated. When logging, we'll just use an `sltr` query for executing +every feature-query to retrieve the scores of features. + +For the sake of discussing logging, let's say we created a feature set +like so that works with the TMDB data set from the +[demo](https://github.com/o19s/OpenSearch-learning-to-rank/tree/master/demo): + +```json +PUT _ltr/_featureset/more_movie_features +{ + "name": "more_movie_features", + "features": [ + { + "name": "body_query", + "params": [ + "keywords" + ], + "template": { + "match": { + "overview": "{% raw %}{{keywords}}{% endraw %}" + } + } + }, + { + "name": "title_query", + "params": [ + "keywords" + ], + "template": { + "match": { + "title": "{% raw %}{{keywords}}{% endraw %}" + } + } + } + ] +} +``` + +Next, let's see how to log this feature set in a couple common use +cases. + +## Joining feature values with a judgement list + +Let’s assume, in the simplest case, we have a judgement list already. +We simply want to join feature values for each keyword/document pair to form a complete training set. +For example, assume we have experts in our company, and they’ve arrived at this judgement list: + +``` +grade,keywords,docId +4,rambo,7555 +3,rambo,1370 +3,rambo,1369 +4,rocky,4241 +``` + +We want to get feature values for all documents that have judgement for each search term, one search term at a time. +If we start with “rambo”, we can create a filter for the ids associated with the “rambo” search: + +```json +{ + "filter": [ + {"terms": { + "_id": ["7555", "1370", "1369"] + }} + ] +} +``` + +We also need to point OpenSearch LTR at the features to log. +To do this we use the sltr OpenSearch query, included with OpenSearch LTR. +We construct this query such that it: + + - Has a `_name` (the OpenSearch named queries feature) to refer to it + - Refers to the featureset we created above `more_movie_features` + - Passes our search keywords "rambo" and whatever other parameters our features need + +```json +{ + "sltr": { + "_name": "logged_featureset", + "featureset": "more_movie_features", + "params": { + "keywords": "rambo" + } + } +} +``` + +In [searching with LTR]({{site.url}}{{site.baseurl}}/search-plugins/ltr/searching-with-your-model/) you'll see us use *sltr* for executing a model. +Here we're just using it as a hook to point OpenSearch LTR at the feature set we want to log. +{: .note} + +You might be thinking, wait if we inject `sltr` query into the OpenSearch query, won’t it influence the score? +The sneaky trick is to inject it as a filter. +As a filter that doesn’t actually filter anything, but injects our feature-logging only `sltr` query into our OpenSearch query: + +```json +{ + "query": { + "bool": { + "filter": [ + { + "terms": { + "_id": [ + "7555", + "1370", + "1369" + ] + } + }, + { + "sltr": { + "_name": "logged_featureset", + "featureset": "more_movie_features", + "params": { + "keywords": "rambo" + } + } + } + ] + } + } +} +``` + +Running this, you’ll see the three hits you’d expect. The next step is to turn on feature logging, referring to the `sltr` query we want to log. + +This is what the logging extension gives you. 
It finds an OpenSearch _sltr_ query, runs the feature set’s queries, scores each document, then returns those as computed fields on each document: + +```json +"ext": { + "ltr_log": { + "log_specs": { + "name": "log_entry1", + "named_query": "logged_featureset" + } + } +} +``` + + +This log extension comes with several arguments: + - `name`: The name of this log entry to fetch from each document + - `named_query` the named query which corresponds to an *sltr* query + - `rescore_index`: if `sltr` is in a rescore phase, this is the index of the query in the rescore list + - `missing_as_zero`: produce a 0 for missing features (when the feature does not match) (defaults to \`false\`) + +Either `named_query` or `rescore_index` must be set so that logging can locate an *sltr* query for logging either in the normal query phase or during rescoring. +{: .note} + + + +Finally the full request: + +```json +POST tmdb/_search +{ + "query": { + "bool": { + "filter": [ + { + "terms": { + "_id": ["7555", "1370", "1369"] + } + }, + { + "sltr": { + "_name": "logged_featureset", + "featureset": "more_movie_features", + "params": { + "keywords": "rambo" + } + }} + ] + } + }, + "ext": { + "ltr_log": { + "log_specs": { + "name": "log_entry1", + "named_query": "logged_featureset" + } + } + } +} +``` + +And now each document contains a log entry: + +```json +{ + "_index": "tmdb", + "_type": "movie", + "_id": "1370", + "_score": 20.291, + "_source": { + ... + }, + "fields": { + "_ltrlog": [ + { + "log_entry1": [ + {"name": "title_query" + "value": 9.510193}, + {"name": "body_query + "value": 10.7808075} + ] + } + ] + }, + "matched_queries": [ + "logged_featureset" + ] +} +``` + +Now you can join your judgement list with feature values to produce a training set! For the line that corresponds to document 1370 for keywords “Rambo” we can now add: + +``` +> 4 qid:1 1:9.510193 2:10.7808075 +``` + +Rinse and repeat for all your queries. + +For large judgement lists, batch up logging for multiple queries, +use OpenSearch’s [multi search]({{site.url}}{{site.baseurl}}/api-reference/multi-search/) capabilities. +{: .note} + +## Logging values for a live feature set + +Let's say you're running in production with a model being executed in +an `sltr` query. We'll get more into model execution in +[searching with LTR]({{site.url}}{{site.baseurl}}/search-plugins/ltr/searching-with-your-model/). But for our +purposes, a sneak peak, a live model might look something like: + +```json +POST tmdb/_search +{ + "query": { + "match": { + "_all": "rambo" + } + }, + "rescore": { + "query": { + "rescore_query": { + "sltr": { + "params": { + "keywords": "rambo" + }, + "model": "my_model" + } + } + } + } +} +``` + +Simply applying the correct logging spec to refer to the `sltr` query +does the trick to let us log feature values for our query: + +```json +"ext": { + "ltr_log": { + "log_specs": { + "name": "log_entry1", + "rescore_index": 0 + } + } +} +``` + +This will log features to the OpenSearch response, giving you an +ability to retrain a model with the same featureset later. + +## Modifying an existing feature set and logging + +Feature sets can be appended to. 
As mentioned in
[Working with Features]({{site.url}}{{site.baseurl}}/search-plugins/ltr/building-features/), if you want
to incorporate a new feature, such as `user_rating`, you can append that
query to the `more_movie_features` feature set:

```json
POST _ltr/_featureset/more_movie_features/_addfeatures
{
    "features": [{
        "name": "user_rating",
        "params": [],
        "template_language": "mustache",
        "template" : {
            "function_score": {
                "functions": [{
                    "field_value_factor": {
                        "field": "vote_average",
                        "missing": 0
                    }
                }],
                "query": {
                    "match_all": {}
                }
            }
        }
    }]
}
```

Then, when we log as in the preceding examples, the new
feature appears in the output:

```json
{
    "log_entry1": [
        {
            "name": "title_query",
            "value": 9.510193
        },
        {
            "name": "body_query",
            "value": 10.7808075
        },
        {
            "name": "user_rating",
            "value": 7.8
        }
    ]
}
```

## Logging values for a proposed feature set

You might create a completely new feature set for experimental purposes.
For example, let's say you create a brand new feature set,
`other_movie_features`:

```json
PUT _ltr/_featureset/other_movie_features
{
   "name": "other_movie_features",
   "features": [
       {
           "name": "cast_query",
           "params": [
               "keywords"
           ],
           "template": {
               "match": {
                   "cast.name": "{% raw %}{{keywords}}{% endraw %}"
               }
           }
       },
       {
           "name": "genre_query",
           "params": [
               "keywords"
           ],
           "template": {
               "match": {
                   "genres.name": "{% raw %}{{keywords}}{% endraw %}"
               }
           }
       }
   ]
}
```

We can log *other_movie_features* alongside a live
production *more_movie_features* by simply appending it as
another filter, just like the first example above:

```json
POST tmdb/_search
{
    "query": {
        "bool": {
            "filter": [
                { "sltr": {
                    "_name": "logged_featureset",
                    "featureset": "other_movie_features",
                    "params": {
                        "keywords": "rambo"
                    }
                }},
                {"match": {
                    "_all": "rambo"
                }}
            ]
        }
    },
    "rescore": {
        "query": {
            "rescore_query": {
                "sltr": {
                    "params": {
                        "keywords": "rambo"
                    },
                    "model": "my_model"
                }
            }
        }
    }
}
```

Continue with as many feature sets as you care to log!

## 'Logging' serves multiple purposes

With the tour done, it's worth pointing out real-life feature logging
scenarios to think through.

First, you might develop judgement lists from user analytics. You want to
have the exact value of a feature at the precise time a user interaction
happened. If they clicked, you want to know the recency, title score,
and every other value at that exact moment. This way you can study later
what correlated with relevance when training. To do this, you may build
a large, comprehensive feature set for later experimentation.

Second, you may simply want to keep your models up to date with a
shifting index. Trends come and go, and models lose their effectiveness.
You may have A/B testing in place, or monitor business metrics, and
you notice gradual degradation in model performance. In these cases,
"logging" is used to retrain a model you're already relatively
confident in.

Third, there's the "logging" that happens in model development. You
may have a judgement list, but want to iterate heavily with a local copy
of OpenSearch. You're heavily experimenting with new features,
scrapping and adding to feature sets. You are, of course, a bit out of
sync with the live index, but you do your best to keep up. Once you've
arrived at a set of model parameters that you're happy with, you can
train with production data and confirm the performance is still
satisfactory.
+ +Next up, let's briefly talk about training a model in +[Uploading A Trained Model]({{site.url}}{{site.baseurl}}/search-plugins/ltr/training-models/) in tools outside +OpenSearch LTR. diff --git a/_search-plugins/ltr/searching-with-your-model.md b/_search-plugins/ltr/searching-with-your-model.md new file mode 100644 index 0000000000..5812708e85 --- /dev/null +++ b/_search-plugins/ltr/searching-with-your-model.md @@ -0,0 +1,128 @@ +--- +layout: default +title: Searching with LTR +nav_order: 70 +parent: LTR search +has_children: false +--- + +# Searching with LTR + +Now that you have a model, what can you do with it? As you saw in +[Logging Feature Scores]({{site.url}}{{site.baseurl}}/search-plugins/ltr/logging-features/), the OpenSearch LTR +plugin comes with the *sltr* query. This query is also what +you use to execute models: + +```json + POST tmdb/_search + { + "query": { + "sltr": { + "params": { + "keywords": "rambo" + }, + "model": "my_model" + } + } + } +``` + +you almost certainly don't want to run *sltr* this way :) +{: .warning} + +## Rescore top N with *sltr* + +In reality you would never want to use the `sltr` query this way. Why? +This model executes on *every result in your index*. These models are +CPU intensive. You'll quickly make your OpenSearch cluster crawl +with the query above. + +More often, you'll execute your model on the top N of a baseline +relevance query. You can do this using OpenSearch's built in +[rescore +functionality](https://www.elastic.co/guide/en/OpenSearch/reference/current/search-request-rescore.html): + +```json + POST tmdb/_search + { + "query": { + "match": { + "_all": "rambo" + } + }, + "rescore": { + "window_size": 1000, + "query": { + "rescore_query": { + "sltr": { + "params": { + "keywords": "rambo" + }, + "model": "my_model" + } + } + } + } + } +``` + +Here we execute a query that limits the result set to documents that +match "rambo". All the documents are scored based on OpenSearch\'s +default similarity (BM25). On top of those already reasonably relevant +results we apply our model over the top 1000. + +Viola! + +## Scoring on a subset of features with *sltr* + +Sometimes you might want to execute your query on a subset of the +features rather than use all the ones specified in the model. In this +case the features not specified in `active_features` list will not be +scored upon. They will be marked as missing. You only need to specify +the `params` applicable to the `active_features`. If you request a +feature name that is not a part of the feature set assigned to that +model the query will throw an error. : + +```json + POST tmdb/_search + { + "query": { + "match": { + "_all": "rambo" + } + }, + "rescore": { + "window_size": 1000, + "query": { + "rescore_query": { + "sltr": { + "params": { + "keywords": "rambo" + }, + "model": "my_model", + "active_features": ["title_query"] + } + } + } + } + } +``` + +Here we apply our model over the top 1000 results but only for the +selected features which in this case is title_query + +## Models! Filters! Even more! + +One advantage of having `sltr` as just another OpenSearch query is +you can mix/match it with business logic and other. 
We won't dive into +these examples here, but we want to invite you to think creatively about +scenarios, such as + +- Filtering out results based on business rules, using OpenSearch + filters before applying the model +- Chaining multiple rescores, perhaps with increasingly sophisticated + models +- Rescoring once for relevance (with *sltr*), and a second + time for business concerns +- Forcing "bad" but relevant content out of the rescore window by + downboosting it in the baseline query diff --git a/_search-plugins/ltr/training-models.md b/_search-plugins/ltr/training-models.md new file mode 100644 index 0000000000..e5dc63c7b2 --- /dev/null +++ b/_search-plugins/ltr/training-models.md @@ -0,0 +1,348 @@ +--- +layout: default +title: Uploading A Trained Model +nav_order: 60 +parent: LTR search +has_children: false +--- + +# Uploading a trained model + +Training models occurs outside OpenSearch LTR. You use the plugin to +log features (as mentioned in [Logging Feature Scores]({{site.url}}{{site.baseurl}}/search-plugins/ltr/logging-features/)). Then with whichever technology you choose, you train a +ranking model. You upload a model to OpenSearch LTR in the available +serialization formats (RankLib, XGBoost, and others). Let's first +talk briefly about training in supported technologies (though not at all +an extensive overview) and then dig into uploading a model. + +## RankLib training + +We provide two demos for training a model. A fully-fledged [Ranklib +Demo](http://github.com/o19s/elasticsearch-learning-to-rank/tree/master/demo) +uses Ranklib to train a model from OpenSearch queries. You can see +how features are +[logged](http://github.com/o19s/elasticsearch-learning-to-rank-learning-to-rank/tree/master/demo/collectFeatures.py) +and how models are +[trained](http://github.com/o19s/elasticsearch-learning-to-rank-learning-to-rank/tree/master/demo/train.py) +. In particular, you\'ll note that logging create a RankLib consumable +judgment file that looks like: + + 4 qid:1 1:9.8376875 2:12.318446 # 7555 rambo + 3 qid:1 1:10.7808075 2:9.510193 # 1370 rambo + 3 qid:1 1:10.7808075 2:6.8449354 # 1369 rambo + 3 qid:1 1:10.7808075 2:0.0 # 1368 rambo + +Here for query id 1 (Rambo) we've logged features 1 (a title `TF*IDF` +score) and feature 2 (a description `TF*IDF` score) for a set of +documents. In +[train.py](http://github.com/o19s/elasticsearch-learning-to-rank/demo/train.py) +you'll see how we call Ranklib to train one of it's supported models +on this line: + + cmd = "java -jar RankLib-2.8.jar -ranker %s -train%rs -save %s -frate 1.0" % (whichModel, judgmentsWithFeaturesFile, modelOutput) + +Our "judgmentsWithFeatureFile" is the input to RankLib. Other +parameters are passed, which you can read about in [Ranklib's +documentation](https://sourceforge.net/p/lemur/wiki/RankLib/). + +Ranklib will output a model in it's own serialization format. For +example a LambdaMART model is an ensemble of regression trees. It looks +like: + + ## LambdaMART + ## No. of trees = 1000 + ## No. of leaves = 10 + ## No. of threshold candidates = 256 + ## Learning rate = 0.1 + ## Stop early = 100 + + + + + 2 + ... + +Notice how each tree examines the value of features, makes a decision +based on the value of a feature, then ultimately outputs the relevance +score. You'll note features are referred to by ordinal, starting by +"1" with Ranklib (this corresponds to the 0th feature in your feature +set). Ranklib does not use feature names when training. 
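
If you run RankLib by hand rather than through the demo scripts, the invocation looks roughly like the following. The file names are placeholders, `-ranker 6` selects LambdaMART, and `-metric2t` sets the metric to optimize; check Ranklib's documentation for the full list of rankers and parameters:

    java -jar RankLib-2.8.jar -ranker 6 -metric2t NDCG@10 -train judgments_with_features.txt -save model.txt

The saved model file is what you will later upload to OpenSearch LTR.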
+ +## XGBoost example + +There's also an example of how to train a model [using +XGBoost](http://github.com/o19s/elasticsearch-learning-to-rank/tree/master/demo/xgboost-demo). +Examining this demo, you'll see the difference in how RankLib is +executed compared to XGBoost. XGBoost will output a serialization format for +gradient boosted decision tree that looks like: + +```json + [ { "nodeid": 0, "depth": 0, "split": "tmdb_multi", "split_condition": 11.2009, "yes": 1, "no": 2, "missing": 1, "children": [ + { "nodeid": 1, "depth": 1, "split": "tmdb_title", "split_condition": 2.20631, "yes": 3, "no": 4, "missing": 3, "children": [ + { "nodeid": 3, "leaf": -0.03125 }, + ... +``` + +## XGBoost parameters + +Additional parameters can optionally be passed for an XGBoost model. +This can be done by specifying the definition as an object, with the +decision trees as the 'splits' field. See the following example. + +Currently supported parameters: + +**objective** - Defines the model learning objective as specified in the +[XGBoost +documentation](https://xgboost.readthedocs.io/en/latest/parameter.html#learning-task-parameters). +This parameter can transform the final model prediction. Using logistic +objectives applies a sigmoid normalization. + +Currently supported values: 'binary:logistic', 'binary:logitraw', +'rank:ndcg', 'rank:map', 'rank:pairwise', 'reg:linear', +'reg:logistic' + +## Simple linear models | + +Many types of models naively output linear weights of each feature such as linear SVM. The LTR model supports simple linear weights for each features, such as those learned from an SVM model or linear regression: + +```json +{ + "title_query" : 0.3, + "body_query" : 0.5, + "recency" : 0.1 +} +``` + +## Feature normalization + +[Feature +Normalization](https://www.google.com/search?client=safari&rls=en&q=wikipedia+feature+normalization&ie=UTF-8&oe=UTF-8) +transforms feature values to a more consistent range (like 0 to 1 or -1 +to 1) at training time to better understand their relative impact. Some +models, especially linear ones (like +[SVMRank](http://www.cs.cornell.edu/people/tj/svm_light/svm_rank.html)), +rely on normalization to work correctly. + +## Uploading a model + +Once you have a model, you'll want to use it for search. You'll need +to upload it to OpenSearch LTR. Models are uploaded specifying the +following arguments + +- The feature set that was trained against +- The type of model (such as ranklib or xgboost) +- The model contents + +Uploading a Ranklib model trained against `more_movie_features` looks +like: + +```json + POST _ltr/_featureset/more_movie_features/_createmodel + { + "model": { + "name": "my_ranklib_model", + "model": { + "type": "model/ranklib", + "definition": "## LambdaMART\n + ## No. of trees = 1000 + ## No. of leaves = 10 + ## No. of threshold candidates = 256 + ## Learning rate = 0.1 + ## Stop early = 100 + + + + + 2 + ... + " + } + } + } +``` + +Or an xgboost model: + +```json + POST _ltr/_featureset/more_movie_features/_createmodel + { + "model": { + "name": "my_xgboost_model", + "model": { + "type": "model/xgboost+json", + "definition": "[ { \"nodeid\": 0, \"depth\": 0, \"split\": \"tmdb_multi\", \"split_condition\": 11.2009, \"yes\": 1, \"no\": 2, \"missing\": 1, \"children\": [ + { \"nodeid\": 1, \"depth\": 1, \"split\": \"tmdb_title\", \"split_condition\": 2.20631, \"yes\": 3, \"no\": 4, \"missing\": 3, \"children\": [ + { \"nodeid\": 3, \"leaf\": -0.03125 }, + ..." 
+ } + } + } +``` + +Or an xgboost model with parameters: + +```json + POST _ltr/_featureset/more_movie_features/_createmodel + { + "model": { + "name": "my_xgboost_model", + "model": { + "type": "model/xgboost+json", + "definition": "{ + \"objective\": \"reg:logistic\", + \"splits\": [ { \"nodeid\": 0, \"depth\": 0, \"split\": \"tmdb_multi\", \"split_condition\": 11.2009, \"yes\": 1, \"no\": 2, \"missing\": 1, \"children\": [ + { \"nodeid\": 1, \"depth\": 1, \"split\": \"tmdb_title\", \"split_condition\": 2.20631, \"yes\": 3, \"no\": 4, \"missing\": 3, \"children\": [ + { \"nodeid\": 3, \"leaf\": -0.03125 }, + ... + ] + }" + } + } + } +```` + +Or a simple linear model: +```json + POST _ltr/_featureset/more_movie_features/_createmodel + { + "model": { + "name": "my_linear_model", + "model": { + "type": "model/linear", + "definition": """ + { + "title_query" : 0.3, + "body_query" : 0.5, + "recency" : 0.1 + } + """ + } + } + } +``` + +## Creating a model with Feature Normalization + +We can ask that features be normalized prior to evaluating the model. +OpenSearch Learning to Rank supports min max and standard feature +normalization. + +With standard feature normalization, values corresponding to the mean +will have a value of 0, one standard deviation above/below will have a +value of -1 and 1 respectively: + +```json + POST _ltr/_featureset/more_movie_features/_createmodel + { + "model": { + "name": "my_linear_model", + "model": { + "type": "model/linear", + "feature_normalizers": { + "release_year": { + "standard": { + "mean": 1970, + "standard_deviation": 30 + } + } + }, + "definition": """ + { + "release_year" : 0.3, + "body_query" : 0.5, + "recency" : 0.1 + } + """ + } + } + } +``` + +Also supported is min-max normalization. Where values at the specified +minimum receive 0, at the maximum turn into 1: + +```json + "feature_normalizers": { + "vote_average": { + "min_max": { + "minimum": 0, + "maximum": 10 + } + } + } +``` + +## Models aren't "owned by" featuresets + +Though models are created in reference to a feature set, it's important +to note after creation models are *top level* entities. For example, to +fetch a model back, you use GET: + + GET _ltr/_model/my_linear_model + +Similarly, to delete: + + DELETE _ltr/_model/my_linear_model + +This of course means model names are globally unique across all feature +sets. + +The associated features are *copied into* the model. This is for your +safety: modifying the feature set or deleting the feature set after +model creation doesn't have an impact on a model in production. For +example, if we delete the feature we previously created: + + DELETE _ltr/_featureset/more_movie_features + +We can still access and search with "my_linear_model". 
The following
still accesses the model and its associated features:

    GET _ltr/_model/my_linear_model

You can expect a response that includes the features used to create the
model (compare this with the *more_movie_features* feature set in
[Logging Feature Scores]({{site.url}}{{site.baseurl}}/search-plugins/ltr/logging-features/)):

```json
{
    "_index": ".ltrstore",
    "_type": "store",
    "_id": "model-my_linear_model",
    "_version": 1,
    "found": true,
    "_source": {
        "name": "my_linear_model",
        "type": "model",
        "model": {
            "name": "my_linear_model",
            "feature_set": {
                "name": "more_movie_features",
                "features": [
                    {
                        "name": "body_query",
                        "params": [
                            "keywords"
                        ],
                        "template": {
                            "match": {
                                "overview": "{% raw %}{{keywords}}{% endraw %}"
                            }
                        }
                    },
                    {
                        "name": "title_query",
                        "params": [
                            "keywords"
                        ],
                        "template": {
                            "match": {
                                "title": "{% raw %}{{keywords}}{% endraw %}"
                            }
                        }
                    }
                ]
            }
        }
    }
}
```

With a model uploaded to OpenSearch, you're ready to search! Head to
[Searching with LTR]({{site.url}}{{site.baseurl}}/search-plugins/ltr/searching-with-your-model/) to put the
model into action.