[PROPOSAL] Restructuring The Repo #183

nhtruong · 2024-01-22T16:55:48Z

One of the biggest hurdles contributors have to deal with when contributing to this repo is its complex structure and rules. The ideas above aim to address some of such issues.

1. Naming of operations in the same group.

OpenSearch has many operations (i.e. url and http-verb parings) that serve identical purposes. They are grouped into operation groups. To distinguish one operation from another in the same group, we use their HTTP verb and differences in path params:

For example, the 4 operations of the search group are currently named:

@http(method: "GET", uri: "/_search")
Search_Get

@http(method: "POST", uri: "/_search")
Search_Post

@http(method: "GET", uri: "/{index}/_search")
Search_Get_WithIndex

@http(method: "POST", uri: "/{index}/_search")
Search_Post_WithIndex

While the cat.health group, which only has 1 operation:

@http(method: "GET", uri: "/_cat/health")
CatHealth

But it can get very complicated with groups that have many operations or many path parameters.

@http(method: "GET", uri: "/_nodes/hot_threads")
NodesHotThreads

@http(method: "GET", uri: "/_cluster/nodes/hot_threads")
NodesHotThreads_DeprecatedDash 

@http(method: "GET", uri: "/_cluster/nodes/hotthreads")
NodesHotThreads_DeprecatedCluster

@http(method: "GET", uri: "/_nodes/hotthreads")
NodesHotThreads_Deprecated

@http(method: "GET", uri: "/_nodes/{node_id}/hot_threads")
NodesHotThreads_WithNodeId

@http(method: "GET", uri: "/_nodes/{node_id}/hotthreads")
NodesHotThreads_WithNodeId_Deprecated

@http(method: "GET", uri: "/_cluster/nodes/{node_id}/hot_threads")
NodesHotThreads_WithNodeId_DeprecatedDash

@http(method: "GET", uri: "/_cluster/nodes/{node_id}/hotthreads")
NodesHotThreads_WithNodeId_DeprecatedCluster

Another downside of this is it's not clear if not clear what kind of component the model represents. CatHealth could be the name of the CatHealth operation, or it could be the out of the operation.

Proposed Naming Format:

OP_<namespace>_<group>_<optional_number>

The search operations would be come

@http(method: "GET", uri: "/_search")
OP_Search_1

@http(method: "POST", uri: "/_search")
OP_Search_2

@http(method: "GET", uri: "/{index}/_search")
OP_Search_3

@http(method: "POST", uri: "/{index}/_search")
OP_Search_4

While the cat.health operation will be come OP_Cat_Health
This new naming format is a lot simpler and more clear on what type of structure is being defined. Note that the @http trait right above each operation definition already states what http-verb and path params accompanying the operation.

2. Naming of Operation Components

This is just a small change to make the namespace and operation name pop out more:
OP_Cat_Health_QueryParam --> OP_Cat_Health_QUERYPARAM
OP_Search_1_Input --> OP_Search_1_INPUT
We're simply putting all words used to describe component types in ALL CAPS.

3. Splitting structures.smithy into multiple files

Right now, each operation group has a file called structures.smithy that holds all components of every operation in the group: shared query string params, qperation inputs and outputs, response bodies, and request bodies. Some of these files are quite large without the request and response bodies.

Proposed files:

inputs.smithy
outputs.smithy
query_params.smithy
request_body.smithy

4. No shared default values

There are many path and query params that are repeated in different operations like index_name and node_id. While they have identical schemas, they often differ slightly in the descriptions and especially the default values. When defining the shared smithy models for these params, we should avoid adding the @default traits to avoid accidentally inheriting these values from the shared models.

5. Organize shared data models by namespace/function rather than by types

Data models shared across repo are being placed in files like

common_booleans.smithy
common_enum_lists.smithy
...

While it works fine right now, once we add very complex data structures from response and request bodies, we will encounter reference hell where bit and pieces of a large data structures will appear in a dozen files.

Proposed solution

I don't have an exact proposal on how we organize these models right now, but the general ideal is putting related models as close together as possible as we divide them into namespace and functionality. It will become clearer when I backfill the repo with request and response bodies.

Note that all of these changes can be done programmatically while I backfill the missing request and response bodies.

The text was updated successfully, but these errors were encountered:

dblock · 2024-01-23T19:16:01Z

While the cat.health operation will be come OP_Cat_Health

I like the intent, but OP feels strange. When are these names used, or are they purely internal? If they are internal, what's the benefit of adding OP and not calling these the same as the REST path, including variable parts, e.g. Get_Cat_Health and Post_Index_Search (everything starts with a verb, everything matches REST path)?

We're simply putting all words used to describe component types in ALL CAPS.

CAPS are usually CONSTANTS in other languages. What do these things represent in OpenSearch code? They should match those names IMO.

Splitting structures.smithy into multiple files

Seems reasonable.

No shared default values

Makes sense. Again I would say these need to match what's in OpenSearch core more than have their own variation.

Organize shared data models by namespace/function rather than by types

Same as above. Anything shared has to also be shared in OpenSearch core, then it makes sense to share these models here.

nhtruong · 2024-04-02T17:40:37Z

This is no longer needed as we've migrated over to native OpenAPI

github-actions bot added the untriaged label Jan 22, 2024

nhtruong removed the untriaged label Jan 22, 2024

dblock added the enhancement New feature or request label Jan 23, 2024

nhtruong closed this as completed Apr 2, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[PROPOSAL] Restructuring The Repo #183

[PROPOSAL] Restructuring The Repo #183

nhtruong commented Jan 22, 2024 •

edited

Loading

dblock commented Jan 23, 2024

nhtruong commented Apr 2, 2024

[PROPOSAL] Restructuring The Repo #183

[PROPOSAL] Restructuring The Repo #183

Comments

nhtruong commented Jan 22, 2024 • edited Loading

1. Naming of operations in the same group.

Proposed Naming Format:

2. Naming of Operation Components

3. Splitting structures.smithy into multiple files

Proposed files:

4. No shared default values

5. Organize shared data models by namespace/function rather than by types

Proposed solution

dblock commented Jan 23, 2024

nhtruong commented Apr 2, 2024

nhtruong commented Jan 22, 2024 •

edited

Loading