Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Resolve issue where read() would fail if it received unexpected/undeclared top-level properties in a stream #131

Merged
merged 55 commits into from
Mar 27, 2024

Conversation

aaronsteers
Copy link
Contributor

@aaronsteers aaronsteers commented Mar 19, 2024

The primary cause for failure in the case of PokeAPI source was a JSON field called cries was included in the delivered data, but not in the source's schema.

The core fix of this PR is to ensure that our base implementation does not attempt to write fields which are undeclared in the source schema.

Notes:

  • This also adds quoting of column names for bigquery.

@aaronsteers aaronsteers changed the title Add PokeAPI to pytest test suite Fix: Add PokeAPI to pytest test suite, resolve failures Mar 19, 2024
@aaronsteers aaronsteers changed the base branch from main to aj/add-windows-supprt March 19, 2024 05:13
Base automatically changed from aj/add-windows-supprt to main March 19, 2024 05:25
@aaronsteers
Copy link
Contributor Author

It looks like at least one root cause of failure is that source-pokeapi retrieves a stream property called cries which is not contained in the catalog file.

Error while reading data, error message: JSON parsing error in row starting at position 0: No such field: cries.

https://github.com/airbytehq/airbyte/blob/44f784e200fd66d43e0f948858aa2a2ee184fef7/docs/connector-development/tutorials/cdk-speedrun-assets/pokemon.json

@aaronsteers aaronsteers marked this pull request as ready for review March 22, 2024 05:25
@aaronsteers aaronsteers changed the title Fix: Add PokeAPI to pytest test suite, resolve failures Fix: Add PokeAPI to pytest test suite, resolve failures due to receiving properties not declared in the source catalog Mar 22, 2024
Copy link
Contributor

@bindipankhudi bindipankhudi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@aaronsteers
Copy link
Contributor Author

aaronsteers commented Mar 27, 2024

I did more digging and found that CI is failing the pokemon integration test, but inexplicably so.

In CI, it uses the incorrect endpoint and gets a 404:

https://pokeapi.co/api/v2/pikachu

Instead of the correct endpoint:

https://pokeapi.co/api/v2/pokemon/pikachu

image

It's not worth debugging at this point since the fix for the core issue is already contained in this PR. I'll open a low-pri issue to resolve. In the meanwhile, we'll just skip this particular test when run in CI. (And if it continues to act up, we can just remove that test entirely.)

cc @bindipankhudi

Update: Logged as: #146

@aaronsteers aaronsteers changed the title Fix: Add PokeAPI to pytest test suite, resolve failures due to receiving properties not declared in the source catalog Fix: Resolve issue where read() would fail if it received unexpected top-level properties in a stream Mar 27, 2024
@aaronsteers aaronsteers changed the title Fix: Resolve issue where read() would fail if it received unexpected top-level properties in a stream Fix: Resolve issue where read() would fail if it received unexpected/undeclared top-level properties in a stream Mar 27, 2024
@aaronsteers aaronsteers enabled auto-merge (squash) March 27, 2024 21:53
@aaronsteers aaronsteers merged commit 7652c33 into main Mar 27, 2024
16 checks passed
@aaronsteers aaronsteers deleted the aj/add-poke-api-integ-tests branch March 27, 2024 22:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

🐛 Bug: PostgreSQL Cache can't handle undeclared dict type Column.
2 participants