From 26c53abf4f1fa275eee1d20dbf6ec6d1c6b38a2a Mon Sep 17 00:00:00 2001
From: Jing Zhang <jngz@amazon.com>
Date: Thu, 28 Mar 2024 14:38:18 -0700
Subject: [PATCH] Add guardrails for remote model (#6750)

* guardrails for remote model

Signed-off-by: Jing Zhang <jngz@amazon.com>

* Doc review

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add guardrails dedicated page

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Reword and reformat

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add prerequisites

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Change example

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add a link to query string query

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add regex and responses

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Add a sentence about regex

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <nbower@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Apply suggestions from code review

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

* Add type to guardrails

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

---------

Signed-off-by: Jing Zhang <jngz@amazon.com>
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Fanit Kolchina <kolchfa@amazon.com>
Co-authored-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
---
 .../api/model-apis/register-model.md          |  69 +++-
 .../api/model-apis/update-model.md            |  33 +-
 .../remote-models/guardrails.md               | 298 ++++++++++++++++++
 _ml-commons-plugin/remote-models/index.md     |   1 +
 4 files changed, 398 insertions(+), 3 deletions(-)
 create mode 100644 _ml-commons-plugin/remote-models/guardrails.md

diff --git a/_ml-commons-plugin/api/model-apis/register-model.md b/_ml-commons-plugin/api/model-apis/register-model.md
index 880cbd68e5..dd157ed264 100644
--- a/_ml-commons-plugin/api/model-apis/register-model.md
+++ b/_ml-commons-plugin/api/model-apis/register-model.md
@@ -183,8 +183,9 @@ Field | Data type | Required/Optional | Description
 `description` | String | Optional| The model description. |
 `model_group_id` | String | Optional | The model group ID of the model group to register this model to. 
 `is_enabled`| Boolean | Specifies whether the model is enabled. Disabling the model makes it unavailable for Predict API requests, regardless of the model's deployment status. Default is `true`.
+`guardrails`| Object | Optional | The guardrails for the model input and output. For more information, see [Guardrails](#the-guardrails-parameter).|
 
-#### Example request: Remote model with a standalone connector
+#### Example request: Externally hosted model with a standalone connector
 
 ```json
 POST /_plugins/_ml/models/_register
@@ -198,7 +199,7 @@ POST /_plugins/_ml/models/_register
 ```
 {% include copy-curl.html %}
 
-#### Example request: Remote model with a connector specified as part of the model
+#### Example request: Externally hosted model with a connector specified as part of the model
 
 ```json
 POST /_plugins/_ml/models/_register
@@ -248,6 +249,70 @@ OpenSearch responds with the `task_id` and task `status`.
 }
 ```
 
+### The `guardrails` parameter
+
+Guardrails are safety measures for large language models (LLMs). They provide a set of rules and boundaries that control how an LLM behaves and what kind of output it generates. 
+
+To register an externally hosted model with guardrails, provide the `guardrails` parameter, which supports the following fields. All fields are optional.
+
+Field | Data type | Description
+:---  | :--- | :---
+`type` | String | The guardrail type. Currently, only `local_regex` is supported.
+`input_guardrail`| Object |  The guardrail for the model input. |
+`output_guardrail`| Object |  The guardrail for the model output. |
+`stop_words`| Array of objects | The list of indexes containing stopwords used for model input/output validation. If the model prompt/response contains a stopword found in any of the indexes, the predict request on this model is rejected. |
+`index_name`| String | The name of the index storing the stopwords. |
+`source_fields`| Array of strings | The names of the fields storing the stopwords. |
+`regex`| Array of strings | The regular expressions used for input/output validation. If the model prompt/response matches any of the regular expressions, the predict request on this model is rejected. |
+
+#### Example request: Externally hosted model with guardrails
+
+```json
+POST /_plugins/_ml/models/_register
+{
+  "name": "openAI-gpt-3.5-turbo",
+  "function_name": "remote",
+  "model_group_id": "1jriBYsBq7EKuKzZX131",
+  "description": "test model",
+  "connector_id": "a1eMb4kBJ1eYAeTMAljY",
+  "guardrails": {
+    "type": "local_regex",
+    "input_guardrail": {
+      "stop_words": [
+        {
+          "index_name": "stop_words_input",
+          "source_fields": ["title"]
+        }
+      ],
+      "regex": ["regex1", "regex2"]
+    },
+    "output_guardrail": {
+      "stop_words": [
+        {
+          "index_name": "stop_words_output",
+          "source_fields": ["title"]
+        }
+      ],
+      "regex": ["regex1", "regex2"]
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+For a complete example, see [Guardrails]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/guardrails/).
+
+#### Example response
+
+OpenSearch responds with the `task_id` and task `status`:
+
+```json
+{
+  "task_id" : "ew8I44MBhyWuIwnfvDIH",
+  "status" : "CREATED"
+}
+```
+
 ## Check the status of model registration
 
 To see the status of your model registration and retrieve the model ID created for the new model version, pass the `task_id` as a path parameter to the Tasks API:
diff --git a/_ml-commons-plugin/api/model-apis/update-model.md b/_ml-commons-plugin/api/model-apis/update-model.md
index 380f422272..877d0b5c51 100644
--- a/_ml-commons-plugin/api/model-apis/update-model.md
+++ b/_ml-commons-plugin/api/model-apis/update-model.md
@@ -36,6 +36,7 @@ Field | Data type |  Description
 `rate_limiter` | Object | Limits the number of times any user can call the Predict API on the model. For more information, see [Rate limiting inference calls]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#rate-limiting-inference-calls).
 `rate_limiter.limit` | Integer | The maximum number of times any user can call the Predict API on the model per `unit` of time. By default, there is no limit on the number of Predict API calls. Once you set a limit, you cannot reset it to no limit. As an alternative, you can specify a high limit value and a small time unit, for example, 1 request per nanosecond.
 `rate_limiter.unit` | String | The unit of time for the rate limiter. Valid values are `DAYS`, `HOURS`, `MICROSECONDS`, `MILLISECONDS`, `MINUTES`, `NANOSECONDS`, and `SECONDS`.
+`guardrails`| Object | The guardrails for the model.
 
 #### Example request: Disabling a model
 
@@ -62,6 +63,35 @@ PUT /_plugins/_ml/models/T_S-cY0BKCJ3ot9qr0aP
 ```
 {% include copy-curl.html %}
 
+#### Example request: Updating the guardrails
+
+```json
+PUT /_plugins/_ml/models/MzcIJX8BA7mbufL6DOwl
+{
+  "guardrails": {
+    "input_guardrail": {
+      "stop_words": [
+        {
+          "index_name": "updated_stop_words_input",
+          "source_fields": ["updated_title"]
+        }
+      ],
+      "regex": ["updated_regex1", "updated_regex2"]
+    },
+    "output_guardrail": {
+      "stop_words": [
+        {
+          "index_name": "updated_stop_words_output",
+          "source_fields": ["updated_title"]
+        }
+      ],
+      "regex": ["updated_regex1", "updated_regex2"]
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
 #### Example response
 
 ```json
@@ -78,4 +108,5 @@ PUT /_plugins/_ml/models/T_S-cY0BKCJ3ot9qr0aP
   "_seq_no": 48,
   "_primary_term": 4
 }
-```
\ No newline at end of file
+```
+
diff --git a/_ml-commons-plugin/remote-models/guardrails.md b/_ml-commons-plugin/remote-models/guardrails.md
new file mode 100644
index 0000000000..ca34eb335c
--- /dev/null
+++ b/_ml-commons-plugin/remote-models/guardrails.md
@@ -0,0 +1,298 @@
+---
+layout: default
+title: Guardrails
+has_children: false
+has_toc: false
+nav_order: 70
+parent: Connecting to externally hosted models 
+grand_parent: Integrating ML models
+---
+
+# Configuring model guardrails
+**Introduced 2.13**
+{: .label .label-purple }
+
+Guardrails can guide a large language model (LLM) toward desired behavior. They act as a filter, preventing the LLM from generating output that is harmful or violates ethical principles and facilitating safer use of AI. Guardrails can also help the LLM produce more focused and relevant output.
+
+To configure guardrails for your LLM, you can provide a list of words to be prohibited in the input or output of the model. Alternatively, you can provide a regular expression against which the model input or output will be matched.
+
+## Prerequisites
+
+Before you start, make sure you have fulfilled the [prerequisites]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/index/#prerequisites) for connecting to an externally hosted model.
+
+## Step 1: Create a guardrail index
+
+To start, create an index that will store the excluded words (_stopwords_). In the index mappings, specify a `title` field, which will contain excluded words, and a `query` field of the [percolator]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/percolator/) type. The percolator query will be used to match the LLM input or output:
+
+```json
+PUT /words0
+{
+  "mappings": {
+    "properties": {
+      "title": {
+        "type": "text"
+      },
+      "query": {
+        "type": "percolator"
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+## Step 2: Index excluded words or phrases
+
+Next, index a query string query for each excluded word or phrase. These queries will be used to match excluded words in the model input or output:
+
+```json
+PUT /words0/_doc/1?refresh
+{
+  "query": {
+    "query_string": {
+      "query": "title: blacklist"
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+```json
+PUT /words0/_doc/2?refresh
+{
+  "query": {
+    "query_string": {
+      "query": "title: \"Master slave architecture\""
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+For more query string options, see [Query string query]({{site.url}}{{site.baseurl}}/query-dsl/full-text/query-string/).
+
+## Step 3: Register a model group
+
+To register a model group, send the following request:
+
+```json
+POST /_plugins/_ml/model_groups/_register
+{
+    "name": "bedrock",
+    "description": "This is a public model group."
+}
+```
+{% include copy-curl.html %}
+
+The response contains the model group ID that you'll use to register a model to this model group:
+
+```json
+{
+ "model_group_id": "wlcnb4kBJ1eYAeTMHlV6",
+ "status": "CREATED"
+}
+```
+
+To learn more about model groups, see [Model access control]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-access-control/).
+
+## Step 4: Create a connector
+
+Now you can create a connector for the model. In this example, you'll create a connector to the Anthropic Claude model hosted on Amazon Bedrock:
+
+```json
+POST /_plugins/_ml/connectors/_create
+{
+  "name": "BedRock test claude Connector",
+  "description": "The connector to BedRock service for claude model",
+  "version": 1,
+  "protocol": "aws_sigv4",
+  "parameters": {
+      "region": "us-east-1",
+      "service_name": "bedrock",
+      "anthropic_version": "bedrock-2023-05-31",
+      "endpoint": "bedrock.us-east-1.amazonaws.com",
+      "auth": "Sig_V4",
+      "content_type": "application/json",
+      "max_tokens_to_sample": 8000,
+      "temperature": 0.0001,
+      "response_filter": "$.completion"
+  },
+  "credential": {
+      "access_key": "<YOUR_ACCESS_KEY>",
+      "secret_key": "<YOUR_SECRET_KEY>"
+  },
+  "actions": [
+    {
+      "action_type": "predict",
+      "method": "POST",
+      "url": "https://bedrock-runtime.us-east-1.amazonaws.com/model/anthropic.claude-v2/invoke",
+      "headers": { 
+        "content-type": "application/json",
+        "x-amz-content-sha256": "required"
+      },
+      "request_body": "{\"prompt\":\"${parameters.prompt}\", \"max_tokens_to_sample\":${parameters.max_tokens_to_sample}, \"temperature\":${parameters.temperature},  \"anthropic_version\":\"${parameters.anthropic_version}\" }"
+    }
+  ]
+}
+```
+{% include copy-curl.html %}
+
+The response contains the connector ID for the newly created connector:
+
+```json
+{
+  "connector_id": "a1eMb4kBJ1eYAeTMAljY"
+}
+```
+
+## Step 5: Register and deploy the model with guardrails
+
+To register an externally hosted model, provide the model group ID from step 3 and the connector ID from step 4 in the following request. To configure guardrails, include the `guardrails` object:
+
+```json
+POST /_plugins/_ml/models/_register?deploy=true
+{
+  "name": "Bedrock Claude V2 model",
+  "function_name": "remote",
+  "model_group_id": "wlcnb4kBJ1eYAeTMHlV6",
+  "description": "test model",
+  "connector_id": "a1eMb4kBJ1eYAeTMAljY",
+  "guardrails": {
+    "type": "local_regex",
+    "input_guardrail": {
+      "stop_words": [
+        {
+          "index_name": "words0",
+          "source_fields": [
+            "title"
+          ]
+        }
+      ],
+      "regex": [
+        ".*abort.*",
+        ".*kill.*"
+      ]
+    },
+    "output_guardrail": {
+      "stop_words": [
+        {
+          "index_name": "words0",
+          "source_fields": [
+            "title"
+          ]
+        }
+      ],
+      "regex": [
+        ".*abort.*",
+        ".*kill.*"
+      ]
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+For more information, see [The `guardrails` parameter]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/register-model/#the-guardrails-parameter).
+
+OpenSearch returns the task ID of the register operation:
+
+```json
+{
+  "task_id": "cVeMb4kBJ1eYAeTMFFgj",
+  "status": "CREATED"
+}
+```
+
+To check the status of the operation, provide the task ID to the [Tasks API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/tasks-apis/get-task/):
+
+```bash
+GET /_plugins/_ml/tasks/cVeMb4kBJ1eYAeTMFFgj
+```
+{% include copy-curl.html %}
+
+When the operation is complete, the state changes to `COMPLETED`:
+
+```json
+{
+  "model_id": "cleMb4kBJ1eYAeTMFFg4",
+  "task_type": "DEPLOY_MODEL",
+  "function_name": "REMOTE",
+  "state": "COMPLETED",
+  "worker_node": [
+    "n-72khvBTBi3bnIIR8FTTw"
+  ],
+  "create_time": 1689793851077,
+  "last_update_time": 1689793851101,
+  "is_async": true
+}
+```
+
+## Step 6 (Optional): Test the model
+
+To demonstrate how guardrails are applied, first run a predict operation with a prompt that does not contain any excluded words:
+
+```json
+POST /_plugins/_ml/models/cleMb4kBJ1eYAeTMFFg4/_predict
+{
+  "parameters": {
+    "prompt": "\n\nHuman:this is a test\n\nAssistant:"
+  }
+}
+```
+{% include copy-curl.html %}
+
+The response contains inference results:
+
+```json
+{
+  "inference_results": [
+    {
+      "output": [
+        {
+          "name": "response",
+          "dataAsMap": {
+            "response": " Thank you for the test, I appreciate you taking the time to interact with me. I'm an AI assistant created by Anthropic to be helpful, harmless, and honest."
+          }
+        }
+      ],
+      "status_code": 200
+    }
+  ]
+}
+```
+
+Then run a predict operation with a prompt that contains excluded words:
+
+```json
+POST /_plugins/_ml/models/cleMb4kBJ1eYAeTMFFg4/_predict
+{
+  "parameters": {
+    "prompt": "\n\nHuman:this is a test of Master slave architecture\n\nAssistant:"
+  }
+}
+```
+{% include copy-curl.html %}
+
+The response contains an error message because guardrails were triggered:
+
+```json
+{
+  "error": {
+    "root_cause": [
+      {
+        "type": "illegal_argument_exception",
+        "reason": "guardrails triggered for user input"
+      }
+    ],
+    "type": "illegal_argument_exception",
+    "reason": "guardrails triggered for user input"
+  },
+  "status": 400
+}
+```
+
+Guardrails are also triggered when a prompt matches one of the supplied regular expressions.
+
+## Next steps
+
+- For more information about configuring guardrails, see [The `guardrails` parameter]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/register-model/#the-guardrails-parameter).
\ No newline at end of file
diff --git a/_ml-commons-plugin/remote-models/index.md b/_ml-commons-plugin/remote-models/index.md
index 657d7254be..0b92adaab6 100644
--- a/_ml-commons-plugin/remote-models/index.md
+++ b/_ml-commons-plugin/remote-models/index.md
@@ -328,3 +328,4 @@ To learn how to use the model for vector search, see [Using an ML model for neur
 - For more information about connector parameters, see [Connector blueprints]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/blueprints/).
 - For more information about managing ML models in OpenSearch, see [Using ML models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/model-serving-framework/).
 - For more information about interacting with ML models in OpenSearch, see [Managing ML models in OpenSearch Dashboards]({{site.url}}{{site.baseurl}}/ml-commons-plugin/ml-dashboard/)
+- For instructions on how to configure model guardrails, see [Guardrails]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/guardrails/).