Improve /micro/bad response format #74

Open
adatzer opened this issue Aug 25, 2021 · 2 comments
Labels
enhancement New feature or request

Comments

@adatzer
Contributor

adatzer commented Aug 25, 2021

This issue is about potential improvements to the bad row output in Snowplow Micro.

Is your feature request related to a problem? Please describe.

Currently, the response format of the /micro/bad endpoint is a JSON array of BadEvents.

The errors parameter of a BadEvent is a list of strings:

  1. The first "error" is just a message that describes the type of failure in general terms. For example:
"Error while extracting event(s) from collector payload and validating it/them."
  2. The second "error" is a JSON-escaped bad row. For example:
"{\"schema\":\"iglu:com.snowplowanalytics.snowplow.badrows/tracker_protocol_violations/jsonschema/1-0-0\",\"data\":{\"processor\":{\"artifact\":\"snowplow-micro\",\"version\":\"1.1.2\"},\"failure\":{\"timestamp\":\"2021-08-24T13:09:14.799119Z\",\"vendor\":\"com.snowplowanalytics.snowplow\",\"version\":\"tp2\",\"messages\":[{\"field\":\"body\",\"value\":\"\",\"error\":\"invalid json: exhausted input\"}]},\"payload\":{\"vendor\":\"com.snowplowanalytics.snowplow\",\"version\":\"tp2\",\"querystring\":[],\"contentType\":\"application/json\",\"body\":\"\",\"collector\":\"ssc-2.3.1-stdout$\",\"encoding\":\"UTF-8\",\"hostname\":\"0.0.0.0\",\"timestamp\":\"2021-08-24T13:09:14.797Z\",\"ipAddress\":\"172.17.0.1\",\"useragent\":\"curl/7.74.0\",\"refererUri\":null,\"headers\":[\"Timeout-Access: <function1>\",\"Host: 0.0.0.0:9090\",\"User-Agent: curl/7.74.0\",\"Accept: */*\",\"application/json\"],\"networkUserId\":\"283214ca-7868-465b-95eb-27418c8b872f\"}}}"

The problem is that users need to parse the JSON-escaped string to get to the actual error that resulted in a failed event.
In addition, the compact BadRow duplicates information that is already available in the BadEvent's other parameters (collectorPayload and rawEvent).
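
To illustrate the extra step this currently puts on users, here is a minimal Python sketch of the client-side parsing. It assumes a BadEvent has already been decoded into a dict with an errors list shaped like the example above; the function name extract_failure_messages is hypothetical, and the layout of data.failure.messages is taken from the tracker_protocol_violations example only, so other bad row types may differ.

```python
import json


def extract_failure_messages(bad_event):
    """Collect failure messages from JSON-escaped bad rows in `errors`.

    Hypothetical helper: assumes the BadEvent shape shown in this issue.
    """
    messages = []
    for entry in bad_event.get("errors", []):
        try:
            bad_row = json.loads(entry)
        except json.JSONDecodeError:
            # Plain-text entries (the general message) are not JSON; skip them.
            continue
        if not isinstance(bad_row, dict):
            continue
        # Path taken from the tracker_protocol_violations example above.
        failure = bad_row.get("data", {}).get("failure", {})
        messages.extend(failure.get("messages", []))
    return messages


# Trimmed-down version of the example bad row from this issue.
example_bad_event = {
    "errors": [
        "Error while extracting event(s) from collector payload and validating it/them.",
        json.dumps({
            "schema": "iglu:com.snowplowanalytics.snowplow.badrows/tracker_protocol_violations/jsonschema/1-0-0",
            "data": {
                "failure": {
                    "messages": [
                        {"field": "body", "value": "", "error": "invalid json: exhausted input"}
                    ]
                }
            },
        }),
    ]
}

for message in extract_failure_messages(example_bad_event):
    print(message["error"])
```

Every consumer of /micro/bad has to write something like this just to see "invalid json: exhausted input".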

Improving Micro's bad row output would improve the user experience while still providing all the necessary information, without loss or duplication.

Describe alternatives you've considered

Solutions we've considered so far in discussions (cc @paulboocock , @istreeter ) include:

  1. Make errors a list of messages rather than a list of JSON-escaped bad rows.
  2. Have /micro/bad respond with a JSON array of BadRows instead.
@adatzer adatzer added the enhancement New feature or request label Aug 25, 2021
@miike
Contributor

miike commented Aug 27, 2021

I think having a list of errors in addition to a JSON array of the bad rows would definitely be useful. Having the complete JSON of the bad rows would also open up the bad endpoint to filtering in the same way that you can filter on good, only in this case you would be able to filter on schema_violations, tracker_protocol_violations, etc.
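
As a rough idea of what filtering by bad row type could look like, here is a client-side Python sketch against the current response format. It is only an approximation under stated assumptions: the helper names bad_row_type and filter_bad_events are hypothetical, and the type is read from the schema URI of the JSON-escaped bad row in errors, as in the example in the issue description. It is not an existing Micro feature.

```python
import json


def bad_row_type(bad_event):
    """Return the bad row type (e.g. 'tracker_protocol_violations') or None.

    Hypothetical helper: reads the type from the Iglu schema URI of the
    first JSON-escaped bad row found in `errors`.
    """
    for entry in bad_event.get("errors", []):
        try:
            bad_row = json.loads(entry)
        except json.JSONDecodeError:
            continue  # plain-text message, not a bad row
        if not isinstance(bad_row, dict):
            continue
        schema = bad_row.get("schema")
        if isinstance(schema, str):
            # Expected form: iglu:<vendor>/<name>/<format>/<version>
            parts = schema.split("/")
            if len(parts) == 4:
                return parts[1]
    return None


def filter_bad_events(bad_events, wanted_type):
    """Keep only the bad events whose bad row is of the wanted type."""
    return [e for e in bad_events if bad_row_type(e) == wanted_type]
```

For example, filter_bad_events(bad_events, "schema_violations") would keep only the schema violations from a decoded /micro/bad response.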

I'd also be tempted to standardise parts of the JSON response that is returned.

e.g., for /micro/good, event.rawEvent is equivalent to event.collectorPayload in /micro/bad

@markst

markst commented Jul 10, 2024

Possibly a duplicate of #14?
