Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add flow framework documentation #6257

Merged
merged 28 commits into from
Feb 7, 2024
Merged

Add flow framework documentation #6257

merged 28 commits into from
Feb 7, 2024

Conversation

kolchfa-aws
Copy link
Collaborator

Closes #5386

Checklist

  • By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and subject to the Developers Certificate of Origin.
    For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Fanit Kolchina <[email protected]>
@kolchfa-aws kolchfa-aws self-assigned this Jan 25, 2024
@kolchfa-aws kolchfa-aws added v2.12.0 3 - Tech review PR: Tech review in progress release-notes PR: Include this PR in the automated release notes labels Jan 25, 2024
Copy link
Member

@dbwiddis dbwiddis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! A few comments!

Comment on lines 12 to 15
* Workflow steps requiring an OpenSearch plugin which is not installed
* Workflow steps relying on previous node input that is provided by those steps
* Workflow step fields with invalid values
* Workflow graph (node/edge) configurations containing cycles or having duplicate IDs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be a good place to reference the get workflow steps API.


Once a workflow is created, provide its `workflow_id` to other APIs.

The `POST` method creates a new workflow. The `PUT` method updates an existing workflow.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The workflow must not have been provisioned for PUT to be successful. We might want to add that here.


# Get workflow steps

OpenSearch validates workflows using the [JSON file](https://github.com/opensearch-project/automating-workflows/blob/main/src/main/resources/mappings/workflow-steps.json) that lists the required inputs and generated outputs for all steps. The Get Workflow Steps API retrieves this file.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Links to the github repo should have flow-framework rather than automating-workflows.

But I'm not sure we should refer to this file by link, particularly on the main branch:

  • We have an open issue to store it programmatically and eliminate the file, which almost certainly will be done before 2.13.
  • Any new steps we add will change this file on main for upcoming versions but wouldn't be reflected in the released API which includes this file as a classpath resource.


### Example response

OpenSearch responds with the [JSON file](https://github.com/opensearch-project/automating-workflows/blob/main/src/main/resources/mappings/workflow-steps.json) containing the steps. The order of fields in the returned steps may not exactly match the original JSON but will function identically.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar comment here about a direct file link. Do you think we should include the JSON from a sample workflow step showing what it contains (input, output, required plugins, and default step timeout)? Such as

{
    "register_remote_model": {
        "inputs": [
            "name",
            "connector_id"
        ],
        "outputs": [
            "model_id",
            "register_model_status"
        ],
        "required_plugins":[
            "opensearch-ml"
        ]
    }
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I will delete the links to the file and add an example. The users will see the template once they call the API.

Copy link
Member

@owaiskazi19 owaiskazi19 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall with few comments.

```

YAML templates permit comments.
{: .tip}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my understanding. What does .tip does?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure it's a highlighted extra tip to users, like this:

Screenshot 2024-01-25 at 12 53 07 PM

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It formats the text above it with a "callout" so it looks like this in rendered form:
Screenshot 2024-01-25 at 4 00 52 PM

_automating-workflows/api/create-workflow.md Outdated Show resolved Hide resolved
- source: create_connector_1
dest: register_model_2
- source: register_model_2
dest: deploy_model_3
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dbwiddis do you think we should also mention that if previous_node_inputs are defined then defining edges is optional?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is mentioned immediately above in the comment (line 142-144)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add a callout in the tutorial so this info is explicitly mentioned

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried to mention it in multiple places, such as where previous_node_inputs are defined as well.

_automating-workflows/api/create-workflow.md Outdated Show resolved Hide resolved
|`create_connector` |[Create Connector]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/connector-apis/create-connector/) |Creates a connector to a model hosted on a third-party platform. |
|`delete_connector` |[Delete Connector]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/connector-apis/delete-connector/) |Deletes a connector to a model hosted on a third-party platform. |
|`register_remote_model` |[Register Model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/register-model/) |Registers a model hosted on a third-party platform. If the `user_inputs` field contains a `deploy` key that is set to `true`, also deploys the model. |
|`register_model_group` |[Register Model Group]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-group-apis/register-model-group/) |Registers a model group. The model group will be deleted automatically once no model is present in the group. |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joshpalis should we also add local models workflow steps here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes we have RegisterLocalPretrainedModel, RegisterLocalCustomModel, and RegisterLocalSparseEncodingModel steps

Copy link
Member

@owaiskazi19 owaiskazi19 Jan 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joshpalis Can you provide a one liner for each step here so that @kolchfa-aws can update the doc?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joshpalis I updated the doc with these 3 steps. I assume they also have an option of deploying when user_inputs contains deploy=true. Please review. Thanks!

Copy link
Collaborator

@vagimeli vagimeli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kolchfa-aws Exceptional writing--clear, simple language for a technically complex topic. Well done!

_automating-workflows/api/create-workflow.md Outdated Show resolved Hide resolved
_automating-workflows/api/create-workflow.md Outdated Show resolved Hide resolved
_automating-workflows/api/create-workflow.md Outdated Show resolved Hide resolved
_automating-workflows/api/create-workflow.md Outdated Show resolved Hide resolved
_automating-workflows/api/create-workflow.md Outdated Show resolved Hide resolved

## 2. Use the deployed model for inference

A CoT agent can use the deployed model in a tool using the [ML Commons Agent Framework](Link TBD). This step doesn’t strictly correspond to an API but represents a component of the body that the Register Agent API requires. This simplifies the register request and allows reuse of the same tool in multiple agents.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See previous comments re: MLCommons

_automating-workflows/workflow-tutorial.md Outdated Show resolved Hide resolved
_automating-workflows/workflow-tutorial.md Outdated Show resolved Hide resolved
_automating-workflows/workflow-tutorial.md Outdated Show resolved Hide resolved
_automating-workflows/workflow-tutorial.md Outdated Show resolved Hide resolved
kolchfa-aws and others added 6 commits January 29, 2024 08:55
Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
Provisioning a workflow refers to a one-time setup process, usually performed by a cluster administrator to create resources which will be used by end users.

The `workflows` field of the template may contain multiple workflows. The workflow with the key `provision` can be executed with this API. This API is also executed when the [Create or Update Workflow API]({{site.url}}{{site.baseurl}}/automating-workflows/api/create-workflow/) is called with the `provision` parameter set to `true`.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am adding code now to prevent provisioning a workflow that has already been provisioned. This is very similar (and will use the same code) as the create workflow not allowing you to update after it's been provisioned:

You can only update a workflow if it has not yet been provisioned.
{: .note}

Can we add a similar note:

You can only provision a workflow if it has not yet been provisioned. Deprovision the workflow if you need to repeat provisioning.
{: .note}

@kolchfa-aws kolchfa-aws added 6 - Done but waiting to merge PR: The work is done and ready to merge and removed 3 - Tech review PR: Tech review in progress labels Jan 30, 2024
@minalsha
Copy link

Overall LGTM

Copy link
Member

@owaiskazi19 owaiskazi19 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @kolchfa-aws!

@minalsha
Copy link

minalsha commented Feb 7, 2024

Hi @kolchfa-aws , for 2.12, Flow framework would be launched as experimental. Is there a way to capture it in your docs?

@minalsha minalsha closed this Feb 7, 2024
@minalsha minalsha reopened this Feb 7, 2024
@minalsha
Copy link

minalsha commented Feb 7, 2024

sorry i accidentally closed the PR but have reopened it.

@kolchfa-aws
Copy link
Collaborator Author

@minalsha Yes, I'll add an experimental label to all pages. We also include a link for users to provide feedback for experimental issues. Is there an issue or a forum thread where you'd like to collect feedback?

@dbwiddis
Copy link
Member

dbwiddis commented Feb 7, 2024

Is there an issue or a forum thread where you'd like to collect feedback?

I would default to just asking them to create an Issue on Github?

{: .info}

OpenSearch use case templates provide a compact description of the setup process in a JSON or YAML document. These templates describe automated workflow configurations for conversational chat or query generation, AI connectors, tools, agents, and other components that prepare OpenSearch as a backend for generative models.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a link here to sample templates? I have PR open opensearch-project/flow-framework#498 which will add some sample templates in this folder. (I'll have a README there with more detailed instructions shortly.)

https://github.com/opensearch-project/flow-framework/tree/main/sample-templates

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's great, thanks! Added.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: issue on github. Normally, we want the users to follow the progress of the feature, which they can't do by opening an issue. Is there a meta issue or rfc that we can reference?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opensearch-project/flow-framework#475 is probably a good place!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Added. Let me know if there is anything else for this issue. Otherwise, we can merge.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's it!

@kolchfa-aws kolchfa-aws merged commit 6c92b54 into main Feb 7, 2024
3 of 4 checks passed
@hdhalter hdhalter added 3 - Done Issue is done/complete and removed 6 - Done but waiting to merge PR: The work is done and ready to merge labels Feb 13, 2024
@Naarcha-AWS Naarcha-AWS deleted the flow-framework branch March 28, 2024 23:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Done Issue is done/complete release-notes PR: Include this PR in the automated release notes v2.12.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[DOC] Add document for OpenSearch Flow Framework
8 participants