-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Chore: Bump to Sqlalchemy 2.0 #396
Conversation
Note Currently processing new changes in this PR. This may take a few minutes, please wait... Files ignored due to path filters (1)
Files selected for processing (1)
WalkthroughWalkthroughThe changes involve updates to SQL command handling and execution across multiple files in the Airbyte codebase. Key modifications include the adoption of Changes
Possibly related PRs
Wdyt? Tip You can validate your CodeRabbit configuration file in your editor.If your editor has YAML language server, you can enable auto-completion and validation by adding Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Outside diff range and nitpick comments (9)
airbyte/_processors/sql/snowflake.py (2)
Line range hint
150-163
: Great job simplifying the SQL statement construction!The use of an f-string for the
copy_statement
andtext()
for SQL execution is a nice improvement. It's more readable and aligns well with SQLAlchemy 2.0 practices.One small suggestion: wdyt about using a multi-line string for better readability? Something like:
copy_statement = f""" COPY INTO {temp_table_name} ({columns_list_str}) FROM ( SELECT {variant_cols_str} FROM {internal_sf_stage_name} ) FILES = ({files_list}) FILE_FORMAT = (TYPE = JSON, COMPRESSION = GZIP) """This could make the SQL structure even clearer. What do you think?
177-183
: Solid update to usetext()
here as well!This change is consistent with the earlier updates and follows SQLAlchemy 2.0 best practices. Nice work!
A small suggestion for consistency: wdyt about using a multi-line string here too? It could look like:
connection.execute( text(""" ALTER SESSION SET QUOTED_IDENTIFIERS_IGNORE_CASE = TRUE MULTI_STATEMENT_COUNT = 0 """) )This would match the style used in the
copy_statement
. What do you think?tests/integration_tests/conftest.py (1)
124-124
: Nice touch usingtext()
for BigQuery schema dropping!Great job consistently applying
text()
for SQL queries across different database engines. It's a solid practice for SQL injection prevention and readability.Quick thought: I noticed we're using a general
Exception
in thesuppress
context. Wdyt about catching more specific exceptions here? It might help with debugging if unexpected errors occur. Maybe something like:from sqlalchemy.exc import SQLAlchemyError with suppress(SQLAlchemyError): # ... rest of the codeThis way, we're only suppressing SQLAlchemy-related errors. What do you think?
airbyte/caches/_catalog_backend.py (1)
16-16
: LGTM! How about a small tweak?The change looks good and aligns with the SQLAlchemy 2.0 migration. Nice job consolidating the imports! 👍
Just a tiny suggestion: wdyt about alphabetizing the imports from
sqlalchemy.orm
? It might make it easier to scan in the future. Something like:from sqlalchemy.orm import declarative_base, Sessiontests/integration_tests/test_all_cache_types.py (1)
250-256
: Nice update to SQLAlchemy 2.0 style! How about a small tweak?The changes look good and align well with SQLAlchemy 2.0 best practices. Using the context manager for the SQL connection and the
text()
function for the SQL command is spot on.One small suggestion: wdyt about extracting the SQL command into a separate variable for better readability? Something like this:
sql_command = text( f"ALTER TABLE {new_generic_cache.schema_name}.{table_name} " "DROP COLUMN _airbyte_raw_id" ) with new_generic_cache.processor.get_sql_connection() as conn: conn.execute(sql_command)This could make the code a bit more readable and easier to maintain. What do you think?
airbyte/cloud/sync_results.py (2)
280-280
: Explicit return looks good! Maybe we could add a type hint?The addition of the explicit
return
statement improves clarity. Nice job! 👍What do you think about adding a return type hint to make it even more explicit? Something like:
def get_sql_engine(self) -> sqlalchemy.engine.Engine: return self.get_sql_cache().get_sql_engine()This would make the method's contract crystal clear. WDYT?
291-291
: Explicit return looks great! How about adding type hints?The addition of the explicit
return
statement enhances clarity. Well done! 🎉To make it even more robust, what do you think about adding parameter and return type hints? Something like:
def get_sql_table(self, stream_name: str) -> sqlalchemy.Table: return self.get_sql_cache().processor.get_sql_table(stream_name)This would make the method's input and output types crystal clear. Does that sound good to you?
airbyte/shared/sql_processor.py (2)
135-135
: Good SQLAlchemy 2.0 preparation!Adding
future=True
is a great step towards SQLAlchemy 2.0 compatibility. To make the intention clear for future readers, what do you think about adding a brief comment explaining why we're usingfuture=True
? Something like:# Enable SQLAlchemy 2.0 future mode for forward compatibility future=True,This could help other developers understand the purpose of this parameter. wdyt?
1117-1119
: Updated query construction - looks good!The changes to the
select
statement in_emulated_merge_temp_table_to_final_table
seem to align with SQLAlchemy 2.0 query construction practices. To make this clear for future readers, what do you think about adding a brief comment? Something like:# Construct the select statement using SQLAlchemy 2.0 style select_new_records_stmt = ( select(temp_table).select_from(joined_table).where(where_not_exists_clause) )This could help other developers understand that this is intentionally using the new style. wdyt?
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files ignored due to path filters (1)
poetry.lock
is excluded by!**/*.lock
<details>
Files selected for processing (9)
- airbyte/_processors/sql/snowflake.py (4 hunks)
- airbyte/caches/_catalog_backend.py (1 hunks)
- airbyte/caches/_state_backend.py (1 hunks)
- airbyte/cloud/sync_results.py (2 hunks)
- airbyte/datasets/_sql.py (2 hunks)
- airbyte/shared/sql_processor.py (3 hunks)
- pyproject.toml (1 hunks)
- tests/integration_tests/conftest.py (3 hunks)
- tests/integration_tests/test_all_cache_types.py (2 hunks)
Files skipped from review due to trivial changes (1)
- airbyte/caches/_state_backend.py
Additional comments not posted (15)
airbyte/_processors/sql/snowflake.py (2)
7-7
: Nice job updating the imports!The removal of
dedent
and addition oftext
fromsqlalchemy
align well with the changes in SQL statement handling. This is a good step towards adopting SQLAlchemy 2.0 practices. LGTM!Also applies to: 15-15
Line range hint
1-183
: Great job on the SQLAlchemy 2.0 migration for this file!The changes you've made consistently apply the
text()
function for SQL execution and simplify SQL statement construction. These updates align perfectly with the PR objective of migrating to SQLAlchemy 2.0 and follow the recommendations in the migration guide.It's particularly commendable that you've maintained the existing functionality while making these updates. The code looks cleaner and more consistent now.
One thing to consider: have you run the test suite to ensure these changes don't introduce any unexpected behavior? If not, it might be worth doing so as a final check.
Overall, excellent work on this migration! 👍
tests/integration_tests/conftest.py (3)
19-19
: LGTM! Importing SQLAlchemy'stext
function.Nice job importing
text
from SQLAlchemy! This aligns perfectly with the migration to SQLAlchemy 2.0. Usingtext()
for SQL queries is a best practice for safety and clarity. Wdyt about adding a comment explaining why we're usingtext()
for future maintainers?
118-121
: Awesome job updating BigQuery engine creation!The addition of
future=True
in thecreate_engine
call for BigQuery is spot on. It's great to see consistency across different database engines in adopting SQLAlchemy 2.0 features. Keep up the good work!
93-98
: Great updates to Snowflake cache cleanup!Love the use of
future=True
increate_engine
andtext()
for the SQL query. These changes align nicely with SQLAlchemy 2.0 best practices.Quick question: I noticed we switched from
begin()
toconnect()
. Was this intentional? If we want to ensure the DROP SCHEMA is in a transaction, we might want to usewith engine.begin() as connection:
instead. Wdyt?Verification successful
Let's give it another shot!
It seems like
rg
doesn't recognize--type python
. How about we try using--type py
instead to search forengine.begin()
instances?Wdyt?
Verification Complete:
engine.begin()
Usage ConfirmedOnly one instance of
engine.begin()
was found intests/integration_tests/conftest.py
, indicating that the change is localized and everything looks good!Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Verify if there are any other instances of engine.begin() in the codebase rg --type python 'engine\.begin\(\)'Length of output: 69
Script:
#!/bin/bash # Verify if there are any other instances of engine.begin() in the codebase rg --type py 'engine\.begin\(\)'Length of output: 117
airbyte/datasets/_sql.py (3)
Line range hint
1-200
: Overall, great job on the SQLAlchemy 2.0 migration!The changes in this file look good and align well with the migration to SQLAlchemy 2.0. The import statement has been simplified, and the query construction has been updated to use more explicit methods.
Just a thought - would it be helpful to add a comment mentioning the SQLAlchemy 2.0 compatibility at the top of the file? It could help future maintainers understand the context of these changes. wdyt?
96-96
: Nice update to query construction!The change from
alias()
tosubquery()
looks good and aligns with SQLAlchemy 2.0 practices. It's a more explicit way of constructing the subquery.Have you had a chance to test this change with various types of queries to ensure it behaves as expected? Maybe we could add a quick unit test to verify the behavior? wdyt?
10-10
: LGTM! Simplified import statement.The import statement has been updated to directly import
Selectable
fromsqlalchemy
. This change aligns well with the SQLAlchemy 2.0 update. Looks good to me!Just to double-check, have you verified that this import works correctly with SQLAlchemy 2.0 in your local environment? wdyt?
pyproject.toml (3)
41-41
: Heads up on potential dependency conflicts 🕵️♂️I noticed we're updating SQLAlchemy, which is awesome! Just wanted to make sure we've checked for any potential conflicts with other dependencies. For example, I see we're using
snowflake-sqlalchemy
(version ^1.5.1). Have we confirmed it's compatible with SQLAlchemy 2.0?Maybe we could add a quick note in the PR description about any dependency checks we've done? What do you think about that idea?
#!/bin/bash # Check for any SQLAlchemy version specifications in other dependencies rg "sqlalchemy.*=.*" pyproject.toml
41-41
: Embracing the future! 🚀I noticed we removed the warning filter for
sqlalchemy.exc.MovedIn20Warning
. That's a bold move! Are we confident we've addressed all the warnings? It might be helpful to run the project and check for any new warnings that pop up.What do you think about adding a TODO comment to review and remove any deprecated SQLAlchemy 1.x code in a follow-up PR? This could help us keep track of any lingering migration tasks. Wdyt?
#!/bin/bash # Check for any remaining MovedIn20Warning suppressions rg "MovedIn20Warning" -g '!pyproject.toml'
41-41
: Exciting update to SQLAlchemy 2.0! 🎉I see we're making the leap to SQLAlchemy 2.0. That's a big move! Have we thoroughly tested this across the project? It might be worth adding a note in the PR description about any major changes or gotchas we encountered during the upgrade process. What do you think about that idea? 🤔
Also, just to double-check, are we sure all our code is compatible with 2.0? Sometimes these major version jumps can be tricky. Wdyt about running a full test suite if we haven't already?
airbyte/cloud/sync_results.py (1)
Line range hint
280-291
: Overall, these changes look great!The explicit
return
statements in bothget_sql_engine
andget_sql_table
methods improve code clarity and align well with SQLAlchemy 2.0 migration best practices. These changes make the code more readable without altering functionality.If you decide to add the suggested type hints, it would be the cherry on top for this clean-up. Great work on making the codebase more explicit and maintainable! 🚀
airbyte/shared/sql_processor.py (3)
65-65
: Nice type checking optimization!Moving
TextClause
to the TYPE_CHECKING block is a good way to improve startup time and avoid potential circular imports. Just wanted to double-check: are we sure thatTextClause
is not used anywhere in the runtime code? If so, this change looks great! wdyt?
1117-1119
: Simplified SQL execution logic - nice!I like the simplification of the
_execute_sql
method. Removing the special handling forTextClause
makes the code more straightforward. However, I wanted to check: are we sure that removing theautocommit=True
forTextClause
instances won't cause any issues? If this change is intentional and aligns with SQLAlchemy 2.0 practices, then it looks great! wdyt?
Line range hint
1-1119
: Great work on SQLAlchemy 2.0 migration!Overall, the changes in this file look really good and align well with the goal of updating to SQLAlchemy 2.0. The code has been simplified in places, and the query construction has been updated to the new style. The suggestions I've made are mostly about adding some comments to clarify the intentions behind the changes, which could be helpful for future maintenance.
One thing to keep in mind is the removal of the special handling for
TextClause
in the_execute_sql
method. It might be worth double-checking that this doesn't cause any unexpected behavior changes in other parts of the codebase.Great job on this update! The code looks cleaner and more future-proof now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Outside diff range and nitpick comments (1)
airbyte/_processors/sql/bigquery.py (1)
102-102
: Looks good, but have you considered setting an explicit max length?The change to
sqlalchemy.types.String(length=None)
is great for the SQLAlchemy 2.0 migration. However, settingNone
for length might lead to unexpected behavior in some databases. What do you think about using an explicit max length instead? For example,sqlalchemy.types.String(length=MAX_LENGTH)
whereMAX_LENGTH
is a constant defined based on BigQuery's limitations. WDYT?
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files ignored due to path filters (1)
poetry.lock
is excluded by!**/*.lock
<details>
Files selected for processing (6)
- airbyte/_processors/sql/bigquery.py (3 hunks)
- airbyte/caches/_catalog_backend.py (3 hunks)
- airbyte/caches/_state_backend.py (5 hunks)
- airbyte/datasets/_sql.py (5 hunks)
- airbyte/shared/sql_processor.py (4 hunks)
- pyproject.toml (2 hunks)
Files skipped from review as they are similar to previous changes (5)
- airbyte/caches/_catalog_backend.py
- airbyte/caches/_state_backend.py
- airbyte/datasets/_sql.py
- airbyte/shared/sql_processor.py
- pyproject.toml
Additional comments not posted (1)
airbyte/_processors/sql/bigquery.py (1)
17-17
: Nice import addition!The specific import of
sqlalchemy.types
assqlalchemy_types
looks good. It's a clean way to bring in just what we need. LGTM!
/poetry-lock
|
Resolving dependencies... changes
Following this guide: https://docs.sqlalchemy.org/en/20/changelog/migration_20.html
Summary by CodeRabbit
Release Notes
New Features
sqlalchemy.text()
for improved SQL execution.Bug Fixes
Chores
^2.0.35
to leverage new features and improvements.sqlalchemy2-stubs
to support type checking.