From 84f93955ef0e76da180675892bd1ce67484f8e72 Mon Sep 17 00:00:00 2001 From: Eric Pugh Date: Wed, 22 May 2024 14:57:20 -0400 Subject: [PATCH] more vale violations --- _search-plugins/ubi/data-structures.md | 4 +-- _search-plugins/ubi/documentation.md | 12 +++---- _search-plugins/ubi/schemas.md | 36 +++++++++---------- _search-plugins/ubi/ubi_dashboard_tutorial.md | 14 ++++---- 4 files changed, 33 insertions(+), 33 deletions(-) diff --git a/_search-plugins/ubi/data-structures.md b/_search-plugins/ubi/data-structures.md index 7f8701c46e..63e21a4a92 100644 --- a/_search-plugins/ubi/data-structures.md +++ b/_search-plugins/ubi/data-structures.md @@ -6,8 +6,8 @@ has_children: false nav_order: 7 --- -# Sample Client data structures -The client data structures can be used to create events that follow the [UBI event schema]({{site.url}}{{site.baseurl}}/search-plugins/ubi/schemas). +# Sample client data structures +The client data structures can be used to create events that follow the [UBI event schema]({{site.url}}{{site.baseurl}}/search-plugins/ubi/schemas/). The developer provides an implementation for the following functions: - `getClientId()` diff --git a/_search-plugins/ubi/documentation.md b/_search-plugins/ubi/documentation.md index 2660e9826b..578e46b9c0 100644 --- a/_search-plugins/ubi/documentation.md +++ b/_search-plugins/ubi/documentation.md @@ -90,10 +90,10 @@ The plugin has a concept of a "store", which is a logical collection of the even index is used to store events, and the other index is for storing queries. ### OpenSearch data mappings -Ubi has 2 primary indexes: +UBI has 2 primary indexes: - **UBI Queries** stores all queries and results. - **UBI Events** store that the Ubi client writes events to. 
-*Follow the [schema deep dive]({{site.url}}{{site.baseurl}}/search-plugins/ubi/schemas) to understand how these two indexes make UBI into a causal framework for search.* +*Follow the [schema deep dive]({{site.url}}{{site.baseurl}}/search-plugins/ubi/schemas/) to understand how these two indexes make UBI into a causal framework for search.* ## Plugin API @@ -125,9 +125,9 @@ DELETE http://localhost:9200/_plugins/ubi/mystore ``` {% include copy-curl.html %} -This will delete the UBI store and all contained events and queries. Please use this with caution. +This will delete the UBI store and all contained events and queries. Use this with caution. -### Get a List of UBI stores +### Get a list of UBI stores To get a list of stores, send a `GET` request: @@ -136,7 +136,7 @@ GET http://localhost:9200/_plugins/ubi ``` {% include copy-curl.html %} -### Persist a Client-Side Event into a UBI store +### Persist a client-side event into a UBI store To persist a client-side event into a store, send a `POST` request where the body of the request is the event: @@ -162,7 +162,7 @@ the plugin cannot associate a query with the client-side events associated with To make this association, queries need to have a header value that indicates the user ID. -### Example Queries +### Example queries The following query tells the plugin that the query being run should be persisted to the store `mystore` and be associated with user ID `john`: diff --git a/_search-plugins/ubi/schemas.md b/_search-plugins/ubi/schemas.md index ef57711a08..ba43fb801b 100644 --- a/_search-plugins/ubi/schemas.md +++ b/_search-plugins/ubi/schemas.md @@ -12,7 +12,7 @@ nav_order: 7 UBI is not functional unless the links between the following are consistently maintained within your UBI-enabled application: - [`user_id`](#user_id) represents a unique user. -- [`object_id`](#object_id) represents an id for whatever item the user is searching for, such as `epc`, `isbn`, `ssn`, `handle`, etc. 
+- [`object_id`](#object_id) represents an id for whatever item the user is searching for, such as `epc`, `isbn`, `ssn`, `handle`. - [`query_id`](#query_id) is a unique id for the raw query language executed and the resultant `object_id`'s that the query returned. \ - [`action_name`](#action_name), though not technically an *id*, the `action_name` tells us what exact action (such as `click` or `add_to_cart`) was taken (or not) with this `object_id`. @@ -20,16 +20,16 @@ To summarize: the `query_id` signals the beginning of a `user_id`'s *Search Jour ## UBI roles - **Search Client**: in charge of searching, and then recieving *objects* from some document index in OpenSearch. - (1, 2, *5* and 7, below) + (1, 2, *5* and 7, in the following sections) - **User Behavior Insights** module: once activated, manages the **UBI Queries** store in the background, indexing each underlying, technical, DSL, index query with a unique [`query_id`](#query_id) along with all returned resultant [`object_id`](#object_id)'s, and then passing the `query_id` back to the **Search Client** so that events can be linked to this query. - (3, 4 and *5*, below) -- **objects**: are whatever items the user is searching for with the queries. Activating UBI involves mapping your real-world objects (via its `isbn`, `ssn`) to the [`object_id`](#object_id) fields in the schemas below. + (3, 4 and *5*, in the following sections) +- **objects**: are whatever items the user is searching for with the queries. Activating UBI involves mapping your real-world objects (using their `isbn`, `ssn`) to the [`object_id`](#object_id) fields in the schemas. - The **Search Client**, if separate from the **UBI Client**, forwards the indexed [`query_id`](#query_id) to the **UBI Client**.   *Note:* We break out the roles of *search* and *UBI event indexing* here, but many implementations will likely use the same OpenSearch client instance for both roles of searching and index writing. 
-  (6, below) +  (6, following section) - The **UBI Client** then indexes all user events with this [`query_id`](#query_id) until a new search is performed, and a new `query_id` is generated by **User Behavior Insights** and passed back to the **UBI Client** -- If the **UBI Client** interacts with a result *object*, such as `onClick`, that [`object_id`](#object_id), *onClick* [`action_name`](#action_name) and `query_id` are all indexed together, signalling the causal link between the *search* and the *object*. - (8 and 9, below) +- If the **UBI Client** interacts with a result *object*, such as `onClick`, that [`object_id`](#object_id), `onClick` [`action_name`](#action_name) and `query_id` are all indexed together, signalling the causal link between the *search* and the *object*. + (8 and 9, following section) @@ -112,10 +112,10 @@ The only obvious difference will be in the `ubi` stanze of the json response, *w Since UBI manages the **UBI Queries** store, the developer should never have to write directly to this store (except for importing data). - `timestamp` -   A unix timestamp of when the query was received +   A UNIX timestamp of when the query was received - `query_id` -   A unique ID of the query provided by the client or generated automatically. The same query text issued multiple times would generate different `query_id`. +   A unique ID of the query provided by the client or generated automatically. The same query text issued multiple times would generate different `query_id`. - `user_id`   A user ID provided by the client @@ -128,17 +128,17 @@ Since UBI manages the **UBI Queries** store, the developer should never have to -### 2) **UBI Events** +### 2) **UBI events** This is the event store that the client side directly indexes events to, linking the event [`action_name`](#action_name), [`object_id`](#object_id)'s and [`query_id`](#query_id)'s together with any other important event information. 
-Since this schema is dynamic, the developer can add any new fields and structures (such as *user* information, *geo-location* information, etc.) at index time that are not in the current **UBI Events** [schema](../src/main/resources/events-mapping.json): +Since this schema is dynamic, the developer can add any new fields and structures (such as *user* information, *geo-location* information) at index time that are not in the current **UBI Events** [schema](../src/main/resources/events-mapping.json): - `application`

-   (size 100) - name of the application tracking UBI events (e.g. *amazon-shop*, *ABC-microservice*) +   (size 100) - name of the application tracking UBI events (e.g. `amazon-shop`, `ABC-microservice`) - `action_name`

-   (size 100) - any name you want to call your event. For example, with *javascript* events, you could include `on_click`, `logon`, `add_to_cart`, `page_scroll`.... _This should be formalized. A list of standard ones and then custom ones._ +   (size 100) - any name you want to call your event. For example, with *JavaScript* events, you could include `on_click`, `logon`, `add_to_cart`, `page_scroll`.... _This should be formalized. A list of standard ones and then custom ones._ - `query_id`

@@ -150,7 +150,7 @@ Since this schema is dynamic, the developer can add any new fields and structure The `user_id` must be consistent in both the **UBI Queries** and **UBI Events** stores. - `timestamp`: -   UTC-based, unix epoch time. +   UTC-based, UNIX epoch time. - `message_type` @@ -164,7 +164,7 @@ Since this schema is dynamic, the developer can add any new fields and structure - `event_attributes`'s structure is where any relevant information about the event can be stored. There are two primary structures in the `event_attributes`: - - **`event_attributes.position`** - structure that contains information on the location of the event origin, such as screen *x,y* coordinates, or the *n*th object out of 10 results, .... + - **`event_attributes.position`** - structure that contains information on the location of the event origin, such as screen *x,y* coordinates, or the *n-th* object out of 10 results, .... - `event_attributes.position.ordinal` @@ -188,12 +188,12 @@ Since this schema is dynamic, the developer can add any new fields and structure

- - **`event_attributes.object`**, which contains identifying information of the object returned from the query that the user interacts with (i.e.: a book, a product, a post, etc..). + - **`event_attributes.object`**, which contains identifying information of the object returned from the query that the user interacts with (for example, a book, a product, or a post). The `object` structure has two ways to refer to the object, with `object_id` being the id that links prior queries to this object: - - `event_attributes.object.internal_id` is a unique id that OpenSearch can use to internally to index the object, think the `_id` field in the indices. + - `event_attributes.object.internal_id` is a unique id that OpenSearch can use internally to index the object; think of the `_id` field in the indexes. - `event_attributes.object.object_id` -   is the id that a user could look up amd find the object instance within the **document corpus**. Examples include: *ssn*, *isbn*, *primary_ean*. Variants need to be incorporated in the `object_id`, so for a t-shirt that is red, you would need SKU level as the `object_id`. +   is the id that a user could look up and find the object instance within the **document corpus**. Examples include: `ssn`, `isbn`, `ean`. Variants need to be incorporated in the `object_id`, so for a t-shirt that is red, you would need SKU level as the `object_id`. 
Initializing UBI requires mapping from the **Document Index**'s primary key to this `object_id` - `event_attributes.object.object_type` diff --git a/_search-plugins/ubi/ubi_dashboard_tutorial.md b/_search-plugins/ubi/ubi_dashboard_tutorial.md index 44a7c8fe09..302f37e176 100644 --- a/_search-plugins/ubi/ubi_dashboard_tutorial.md +++ b/_search-plugins/ubi/ubi_dashboard_tutorial.md @@ -7,7 +7,7 @@ nav_order: 7 --- # Build an analytic dashboard for UBI -Whether you've been collecting user events and queries for a while, or [you uploaded some sample events](https://github.com/o19s/chorus-opensearch-edition/blob/main/katas/003_import_preexisting_event_data.md), now you're ready to visualize them in the dashboard! +Whether you've been collecting user events and queries for a while, or [you uploaded some sample events](https://github.com/o19s/chorus-opensearch-edition/blob/main/katas/003_import_preexisting_event_data.md), now you're ready to visualize them in the dashboard using User Behavior Insights. ## 1) Fire up the OpenSearch dashboards @@ -47,11 +47,11 @@ Most of the visualization require some sort of aggregate function on an bucket/f Save that visualization and it will be added to your new dashboard. Now that you have a visualization on your dashboard, you can save your dashboard. ## 4) Add a "Tag Cloud" vizualization to your dashboard -Let's add a word cloud for trending searches. Choose the Tag Cloud visualization of the terms in the `message` field where the javascript client logs the raw text that the user searches on. (Note: the true query, as processed by OpenSearch with filters, boosting, etc. will be in the `.{store}_queries` index, but what we are looking at is the `message` field of the `.{store}_events` index, where the javascript client captures what the user actually typed. ) +Let's add a word cloud for trending searches. 
Choose the Tag Cloud visualization of the terms in the `message` field where the JavaScript client logs the raw text that the user searches on. (Note: the true query, as processed by OpenSearch with filters, boosting, and so on, will be in the `.{store}_queries` index, but what we are looking at is the `message` field of the `.{store}_events` index, where the JavaScript client captures what the user actually typed.) ![Word Cloud]({{site.url}}{{site.baseurl}}/images/ubi/tag_cloud1.png "Word Cloud") -**But there's a problem!** The `message` field is on *every* event --not just query/search events-- and can be used in anyway the client developer decides to use it; so, it can contain error messages, debug messages, click information, etc. -We need to add a filter to only see search terms on query events. Since the developer gave a `message_type` of `QUERY` for each search event, we will filter on that message type to isolate just the users' searches. +**But there's a problem!** The `message` field is on *every* event --not only query/search events-- and can be used in any way the client developer decides to use it; so, it can contain error messages, debug messages, click information, and so on. +We need to add a filter to only see search terms on query events. Since the developer gave a `message_type` of `QUERY` for each search event, we will filter on that message type to isolate only the users' searches. ![Word Cloud]({{site.url}}{{site.baseurl}}/images/ubi/tag_cloud2.png "Word Cloud") You should now have two visualizations on your dashboard. @@ -62,15 +62,15 @@ To add a histogram, first, add a vertical bar chart. Vertical Bar Chart -The data field we want to examine is `event_attributes.position.ordinal`, meaning the user clicked on the *n*th item in a list. The y-axis will be the number of times that *n*th was clicked. The x-axis will be the ordinal number itself that was clicked, using the `Histogram` aggregation. 
+The data field we want to examine is `event_attributes.position.ordinal`, meaning the user clicked on the *n-th* item in a list. The y-axis will be the number of times the *n-th* item was clicked. The x-axis will be the ordinal number itself that was clicked, using the `Histogram` aggregation. ![Vertical Bar Chart]({{site.url}}{{site.baseurl}}/images/ubi/histogram.png "Vertical Bar Chart") -## 6) Have fun slicing and dicing! +## 6) Have fun slicing and dicing For example, let's see how the click position changes when there is a purchase, by adding this filter `action_name:product_purchase`. ![Product Purchase]({{site.url}}{{site.baseurl}}/images/ubi/product_purchase.png "Product Purchase") Or let's see what event messages include "\*UBI\*" somewhere between the wildcards. ![UBI]({{site.url}}{{site.baseurl}}/images/ubi/ubi.png "UBI") -You now have a basic dashboard that lets you look at the data. In the next Katas we'll focus on some typical ecommerce driven scenarios. +You now have a basic dashboard that lets you look at the data. In the next katas, we'll focus on some typical ecommerce-driven scenarios.