Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't return any data from the /bugzilla_webhook endpoint, move response logging to step or service methods #766

Open
grahamalama opened this issue Nov 16, 2023 · 2 comments

Comments

@grahamalama
Copy link
Contributor

Slightly related to #645:

In step functions, we add responses from various services to an action's context object:

context = context.append_responses(jira_response)

Then, after the step has been executed, we log the responses we received, if any:

step_responses = context.responses_by_step[step.__name__]
if step_responses:
has_produced_request = True
for response in step_responses:
logger.debug(
"Received %s",
response,
extra={
"response": response,
**context.model_dump(),
},
)

Finally, we flatten the responses

responses = list(
itertools.chain.from_iterable(context.responses_by_step.values())
)
return True, {"responses": responses}

and return them (or an error message) in the /bugzilla_webhook router handler:

try:
result = execute_action(webhook_request, actions)
return result
except IgnoreInvalidRequestError as exception:
return {"error": str(exception)}

Some thoughts here:

  • we only return this response to Bugzilla, and it only cares about the 200 status code we respond with. No one (machine nor human) reads the responses
  • we duplicate a lot of response information in the logs. When we run:
for response in step_responses:
    logger.debug(
        "Received %s",
        response,
        extra={
            "response": response,
            **context.model_dump(),
        },
    )

we'll log each response in a message line, but we'll also log each response len(step_responses) times in some structured log field, since we've already added the responses to the context object by this point.

I propose that we:

  • don't return anything from the /bugzilla_webhook endpoint (only a 200 response)
    • follow this chain all the way down, and don't return anything from functions where we don't meaningfully use the return value
  • push response logging "closer" to where we receive the response (Step function, service, or somewhere else?). We'll be able to link log lines for an action pipeline as long as we correlate them with our rid field
@grahamalama
Copy link
Contributor Author

push response logging "closer" to where we receive the response (Step function, service, or somewhere else?). We'll be able to link log lines for an action pipeline as long as we correlate them with our rid field

If we end up doing this, I don't know what the best pattern would be. Maybe both places? Like, log lines could look something like:

jbi.steps, DEBUG, "executing step `create_issue` for action abc" (extra has action_context, including rid)
jbi.services.jira, DEBUG, "creating issue for Bugzilla bug 123456" (extra has action_context, including rid, request body)
jbi.services.jira, DEBUG, "Issue JBI-123 created for Bugzilla bug 123456" (extra has action_context, including rid, response body)
jbi.steps, DEBUG, executing step `add_link_to_bugzilla` for action abc
...

@leplatrem
Copy link
Contributor

don't return anything from the /bugzilla_webhook endpoint (only a 200 response)

follow this chain all the way down, and don't return anything from functions where we don't meaningfully use the return value

+1, as long as testing remains straightforward, sort of functional style

push response logging "closer" to where we receive the response (Step function, service,)
[...]
as long as we correlate them with our rid field

+1
Make sure we document well how to track a request (eg. examples of log explorer queries)

If we end up doing this, I don't know what the best pattern would be. Maybe both places?

I think we can log in both places, and use a different log level for example. For example, steps as INFO and services as DEBUG, or logs announcing sending of request as DEBUG and logs with response as INFO.

When I see the amount of logs per minute that the gke clusters produces, I don't want to feel bad having too much log. Unless it becomes an issue when troubleshooting, but I don't we're there yet.

Note that I think that last time I looked no DEBUG log was published in prod

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants