fix database existence check #463

bernt-matthias · 2020-07-09T10:04:44Z

follow up to #372
fixes #462

TODO:

always dispose engines (also for other dialects)
just use the else branch for postgres? other drivers starting with "postgres" used the else branch before anyway
docs

follow up to kvesteri#372

CaselIT · 2020-07-09T11:04:20Z

Other than the disposal of the engines, I think this should work fine 👍

sqlalchemy_utils/functions/database.py

bernt-matthias · 2020-07-09T13:32:42Z

Wondering if one could extend a test case such that a postgres DB without CONNECT privilege for the postgres database is tested.

sqlalchemy-utils/tests/functions/test_database.py

Line 63 in 3090944

class TestDatabasePostgres(DatabaseTest):

jtbeach · 2020-07-15T12:40:27Z

sqlalchemy_utils/functions/database.py

+            except (ProgrammingError, OperationalError):
+                pass
+            finally:
+                engine.dispose()


Do we need to set engine to None here -- is it safe to call dispose() twice on an engine? A second dispose() will be called on line 510

true. how about this e481582

I think now we aren't disposing engines between iterations of the loop if there are no exceptions. I'm not the author of this code, so I don't want to offer too many opinions, but it looks a little brittle in terms of the manual calls to Engine.dispose(). I would just use a context manager here: one that calls dispose() in __exit__, and then use with everywhere an Engine is created.

If I understand https://docs.sqlalchemy.org/en/14/core/connections.html?highlight=dispose#connectionless-execution-implicit-execution correctly, the current code uses implicit connections (i.e. calling execute on the engine). This may be changed to

with engine.connect() as connection:
    connection.execute(...)

then we might not need to use dispose at all https://docs.sqlalchemy.org/en/14/core/connections.html?highlight=dispose#engine-disposal ..

but I'm not an sqlalchemy expert at all .. but I'm fine with implementing the change if you agree

also dispose is called in get_scalar_result .. i.e. in the case of a successful existence check dispose is currently called twice... but according to the docs it just closes connections.

I would add a context manager similar to contextlib.closing (https://docs.python.org/3/library/contextlib.html#contextlib.closing) that does this:

from contextlib import contextmanager

@contextmanager
def disposing_engine(engine):
    try:
        yield engine
    finally:
        engine.dispose()

That way you can just write:

for pdb in postgres_db:
    url.database = pdb
    with disposing_engine(sa.create_engine(url)) as engine:
        if get_scalar_result(engine, text):
            return True
return False

You can follow this pattern for other dialects as well so you don't need an engine var or 'ret' var.

elif dialect_name == 'mysql':
    with disposing_engine(sa.create_engine(url)) as engine:
        text = ...
        return bool(get_scalar_result(engine, text))

It seems a little strange that get_scalar_result would call Engine.dispose() but seems like it is safe to call twice then
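
A minimal, runnable sketch of the disposing context manager suggested above. `FakeEngine` is a hypothetical stand-in for a SQLAlchemy engine (it only counts `dispose()` calls), so the snippet runs without a database; it is meant to illustrate the pattern, not the code that was merged.

```python
from contextlib import contextmanager


class FakeEngine:
    """Hypothetical stand-in for a SQLAlchemy Engine; only counts dispose() calls."""

    def __init__(self):
        self.dispose_calls = 0

    def dispose(self):
        self.dispose_calls += 1


@contextmanager
def disposing_engine(engine):
    """Yield the engine and dispose it on exit, even if the body raises."""
    try:
        yield engine
    finally:
        engine.dispose()


def run_and_fail(engine):
    """Simulate an existence check that fails inside the managed block."""
    with disposing_engine(engine):
        raise RuntimeError("simulated connection failure")


# Successful path: the engine is disposed exactly once when the block ends.
ok_engine = FakeEngine()
with disposing_engine(ok_engine) as eng:
    pass  # a real check would run a query here

# Failing path: the exception propagates, but dispose() has still been called.
failed_engine = FakeEngine()
try:
    run_and_fail(failed_engine)
except RuntimeError:
    pass
```

Because the cleanup lives in the `finally` clause, no branch of the calling code has to remember to call `dispose()` itself.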

Thanks for the detailed explanations. I tried a slightly different way in 6b5cdb2

use a connection for the execution and close the connection (via with). Already with this there should be no need to dispose the engine (since there should be no open connections in the pool that need to be closed).

explicitly use the Null connection pool .. just to be sure.

Let's see if this passes tests.

- use a connection (which is closed automatically) for the database existence check
- explicitly use the Null connection pool

Already with the 1st change, disposal of the engine (which closes all open connections) is not necessary anymore. With the second change we are completely sure.

bernt-matthias · 2020-07-16T14:32:17Z

Tests are passing. Will test now the postgres part with Galaxy.

bernt-matthias · 2020-07-16T15:06:45Z

Works.

kvesteri · 2020-07-27T08:18:09Z

sqlalchemy_utils/functions/database.py

    """Check if a database exists.

    :param url: A SQLAlchemy engine URL.
+    :postgres_db: Only applies to postgres. List of databases to try to connect


Does this parameter apply to any other databases? In other words, do for example Oracle or MSSQL have similar requirements, where a default database must be provided in order to check the existence of another database? If so, we should use some different naming for this.

Furthermore the description of the parameter suggests this parameter is a list of values but the name does not. Either the parameter should be named postgres_databases (or something similar) OR this parameter could be a string.

I'm leaning towards the latter option as the end user who is using this function should know which default database to use and just provide that as a string parameter for this function.

Does this parameter apply to any other databases?

I have no idea. But it seems to be postgres specific (I also found no issue that indicates such a problem for other databases). Still - if desired - I could change the parameter name to databases and indicate in the docs that it currently only applies to postgres.

Furthermore the description of the parameter suggests this parameter is a list of values but the name does not.

I would go for postgres_databases. The list has the advantage that it will work for many settings automatically.
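
To illustrate why a list of candidates can work in more settings than a single name, here is a small sketch of the fallback loop. The tuple mirrors the default shown in the diff later in this thread; `exists_via_candidates`, `fake_probe` and the use of the built-in `ConnectionError` are assumptions made for illustration (the actual code in this PR catches SQLAlchemy's ProgrammingError and OperationalError).

```python
def exists_via_candidates(candidates, probe):
    """Return True as soon as probe() finds the target via any candidate.

    `probe` is a hypothetical callable: it receives the name of the database
    to connect to and returns a truthy value if the target database exists.
    It may raise ConnectionError when that candidate cannot be used, for
    example because the CONNECT privilege is missing.
    """
    for name in candidates:
        try:
            if probe(name):
                return True
        except ConnectionError:
            continue  # this candidate is unusable; try the next one
    return False


# Candidate tuple as it appears in the diff quoted later in this thread.
DEFAULT_CANDIDATES = ('postgres', 'template0', 'template1', None)

tried = []


def fake_probe(name):
    """Simulated server where only 'template1' accepts connections."""
    tried.append(name)
    if name != 'template1':
        raise ConnectionError(f"cannot connect to {name!r}")
    return True


found = exists_via_candidates(DEFAULT_CANDIDATES, fake_probe)
```

With a single string parameter the caller would have to know in advance that only `template1` is reachable; with the list, the first two failures are simply skipped.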

ziima · 2020-07-29T10:02:23Z

sqlalchemy_utils/functions/database.py

-        engine.dispose()
-        return result
+        with engine.connect() as conn:
+            return conn.scalar(sql)


Since it's only a two-liner, is the inline function still needed?

I would say so. Still avoids some code duplication. Do you have some alternative in mind?

That's true, but personally I don't really like dynamically defined functions. Maybe move it to the module level?

personally I don't really like dynamically defined functions

Can do. Out of interest: is there a downside, apart from violating style preferences?

The function is created every time and that most likely eats up a bit of performance.
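
A small sketch of the point about nested functions: a function defined inside another function is built again on every call of the outer function, whereas a module-level function is created once at import time. The names below are hypothetical, and the per-call cost is typically small.

```python
def check_with_nested_helper():
    def get_scalar_result():  # a new function object is built on every call
        return 1
    return get_scalar_result


def _get_scalar_result():  # module level: created once at import time
    return 1


def check_with_module_helper():
    return _get_scalar_result


nested_a = check_with_nested_helper()
nested_b = check_with_nested_helper()
module_a = check_with_module_helper()
module_b = check_with_module_helper()

print(nested_a is nested_b)  # False: two distinct function objects
print(module_a is module_b)  # True: the same module-level function
```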

kvesteri · 2020-08-03T20:02:33Z

The more I think about this the more I'm convinced that we should change the function signatures of create_database(url), drop_database(url) and database_exists(url) to the following:

create_database(engine_or_connection, database_name)
drop_database(engine_or_connection, database_name)
database_exists(engine_or_connection, database_name)

This would solve the issue raised in this PR. It would also solve #467 and it would have the benefit of being able to use existing connection / engine. What do you guys think?

bernt-matthias · 2020-08-03T20:22:25Z

Thanks for the feedback @kvesteri .

I guess I agree .. given also the code duplication for the engine creation in the current implementation of the three functions.

But I think one would also like to add a function to establish a connection (or engine) -- that could be passed to the three functions. Otherwise one would just delegate the "complexity" to establish a connection which is currently hidden in the three functions to the calling code.

Alternatively one could also add such a function and call it in each of the three functions -- just to remove the code duplication. This would have the advantage that the API is unchanged.

ziima · 2020-08-04T07:50:27Z

I would definitely keep the current API, since the most common use case, in my opinion, is most likely to check/create/drop a database using the same connection string as is used in create_engine. I find that extremely helpful.

On the other hand, I agree that a new set of functions *_database_internal(engine_or_connection, database_name) could help to solve some of the issues and reduce some of the complexity of the current implementation.
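
One way to picture this split is a connection-based function plus a thin URL-based wrapper that keeps the simple call style. This is only a sketch under stated assumptions: `FakeConnection`, `database_exists_internal`, `database_exists_from_url` and the `connect` factory are hypothetical names, and the URL is parsed with the standard library's `urlsplit` instead of SQLAlchemy's `make_url` so that the snippet runs on its own.

```python
from urllib.parse import urlsplit


class FakeConnection:
    """Hypothetical stand-in for an engine/connection; holds known database names."""

    def __init__(self, databases):
        self.databases = set(databases)


def database_exists_internal(connection, database_name):
    """Connection-based variant: no URL parsing and no engine creation."""
    return database_name in connection.databases


def database_exists_from_url(url, connect):
    """URL-based wrapper that keeps the one-argument call style for callers.

    `connect` is a hypothetical factory that turns a URL into a connection;
    in real code this is where the backend-specific setup would live.
    """
    database_name = urlsplit(url).path.lstrip('/')
    return database_exists_internal(connect(url), database_name)


server = FakeConnection({'test_db'})

present = database_exists_from_url('postgresql:///test_db', lambda url: server)
absent = database_exists_from_url('postgresql:///missing_db', lambda url: server)
```

Callers that already hold a connection could call the internal function directly, while existing callers would keep passing a URL.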

kvesteri · 2020-08-04T17:11:56Z

@ziima I'd like to hear more of your reasoning since I don't find it cumbersome at all to first create an engine / connection object and then re-use that in subsequent check / create / drop calls.

I do find the proposed solution quite cumbersome though with a postgresql specific keyword argument. The solution proposed in this PR does not solve #467 . It can't reuse an existing connection (= creates unnecessary connections). Furthermore I want the function signature to avoid database specific parameters unless absolutely necessary.

ziima · 2020-08-05T15:06:05Z

In my use case I create a test database when running unittests, and as such I try to keep it as simple as possible.

Minimal working example

import unittest

from sqlalchemy import create_engine
from sqlalchemy_utils.functions import (create_database, database_exists, drop_database)

TEST_DATABASE = 'postgresql:///test_db'

class MyTest(unittest.TestCase):
    def setUp(self):
        if not database_exists(TEST_DATABASE):
            create_database(TEST_DATABASE)

    def tearDown(self):
        drop_database(TEST_DATABASE)

    def test_foo(self):
        engine = create_engine(TEST_DATABASE)
        self.assertTrue(False)

I don't know much about databases other than sqlite3 and postgres, but using your proposal, the example would expand significantly:

import unittest
from copy import copy

from sqlalchemy import create_engine
from sqlalchemy.engine.url import make_url
from sqlalchemy_utils.functions import (create_database, database_exists, drop_database)

TEST_DATABASE = 'postgresql:///test_db'

class MyTest(unittest.TestCase):
    def setUp(self):
        test_database_url = make_url(TEST_DATABASE)
        self.test_db_name = test_database_url.database
        # All of the backend-specific code would have to go here
        admin_database = copy(test_database_url)
        if test_database_url.drivername == 'postgresql':
            admin_database.database = 'template1'  # Still simple version, I ignore other possible database names and a case for #467
        # elif...
        self.admin_connection = create_engine(admin_database)
        if not database_exists(self.admin_connection, self.test_db_name):
            create_database(self.admin_connection, self.test_db_name)

    def tearDown(self):
        drop_database(self.admin_connection, self.test_db_name)

    def test_foo(self):
        engine = create_engine(TEST_DATABASE)
        self.assertTrue(False)

As @bernt-matthias noted above, the complexity would just have to be handled before the functions are called. That's not very handy.

Regarding the problems you mentioned, they might be easier to manage if we create some sort of DatabaseManager with methods check, create and drop. It could maintain a connection between the subsequent operations. The current API could be kept for simple cases, where you don't care about connection reusability.
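
A rough sketch of what such a manager could look like. Everything here is hypothetical: `FakeConnection` stands in for a real database connection and merely tracks names in memory, and the method bodies would issue backend-specific SQL in a real implementation.

```python
class FakeConnection:
    """Hypothetical stand-in for a database connection; tracks database names."""

    def __init__(self):
        self.databases = set()
        self.closed = False

    def close(self):
        self.closed = True


class DatabaseManager:
    """Sketch: one connection reused across check, create and drop."""

    def __init__(self, connection):
        self.connection = connection

    def check(self, name):
        return name in self.connection.databases

    def create(self, name):
        self.connection.databases.add(name)

    def drop(self, name):
        self.connection.databases.discard(name)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Close the shared connection once, when the manager is done.
        self.connection.close()
        return False


conn = FakeConnection()
with DatabaseManager(conn) as manager:
    existed_before = manager.check('test_db')
    if not existed_before:
        manager.create('test_db')
    existed_after = manager.check('test_db')
    manager.drop('test_db')
    existed_finally = manager.check('test_db')
```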

jakabk · 2020-10-08T13:27:53Z

Waiting for release!

kvesteri · 2020-12-01T15:28:01Z

@ziima the problems you referred to in your comment seem to be limitations of unittest. With pytest, for example, one could just use fixtures and reuse a connection fixture in subsequent function calls. Thus I'd like to see this changed so that the functions take a connection / engine as the first parameter. Introducing DatabaseManager seems counter-intuitive and a bit clumsy.

bernt-matthias · 2020-12-01T17:24:46Z

Thanks for coming back to this @kvesteri

Sounds like a larger change and I'm not sure if I will be able to implement this since actually I have never used the sqlalchemy library as a programmer (it's just used in a project that I'm involved in). So I have for instance no idea of the concepts behind an engine.

With a more precise plan I might give it a go. Maybe something like pseudo code for at least one of the functions...

ziima · 2020-12-02T15:58:31Z

@kvesteri I haven't quite befriended pytest, but it seems it would require the same setup, just using other tools. Do you have any particular use case in mind?

nsoranzo · 2021-01-08T18:57:05Z

tests/types/test_encrypted.py

    import random
+    import string


Shouldn't these standard library imports just go to the top of the file?

bsquizz · 2021-01-21T19:53:34Z

sqlalchemy_utils/functions/database.py

-
-    elif engine.dialect.name == 'mysql':
+        if databases is None:
+            databases = ('postgres', 'template0', 'template1', None)


If the url passed into this function provides a db name, for example:

postgresql://user:[email protected]/dbname

Wouldn't we want to check dbname first here?

Ok nevermind, I see that the SQL command provided by text will handle that.

nsoranzo · 2021-03-17T20:58:20Z

FYI, in the https://github.com/nsoranzo/sqlalchemy-utils/tree/sqlalchemy14 branch I'm trying to combine this PR with #487 (and more).

fix database existence check

6e11f3f

follow up to kvesteri#372

bernt-matthias mentioned this pull request Jul 9, 2020

database_exists: don't connect to 'postgres' data base for database existence check #372

Merged

bernt-matthias added 2 commits July 9, 2020 13:15

always dispose engine after db existence check

36e765b

add docs for postgres_db parameter

743aa21

nsoranzo reviewed Jul 9, 2020

bernt-matthias force-pushed the topic/372-followup branch 2 times, most recently from 3243079 to cc53be0 Compare July 9, 2020 13:27

fix dialect_name

15dc668

Co-authored-by: Nicola Soranzo <[email protected]>

bernt-matthias force-pushed the topic/372-followup branch from cc53be0 to 15dc668 Compare July 9, 2020 13:59

nsoranzo reviewed Jul 9, 2020

optimize execution order

527b885

Co-authored-by: Nicola Soranzo <[email protected]>

k4r1 mentioned this pull request Jul 15, 2020

0.36.8 breaks database_exists() call #462

Closed

jtbeach reviewed Jul 15, 2020

jtbeach reviewed Jul 15, 2020

bernt-matthias added 2 commits July 15, 2020 11:10

database_exists fix return

37bb37b

- postgres: return for the first positive test - use immutable for default argument

Merge branch 'topic/372-followup' of https://github.com/bernt-matthia…

4b396d8

…s/sqlalchemy-utils into topic/372-followup

nsoranzo reviewed Jul 15, 2020

use None as default

6dd16e7

Co-authored-by: Nicola Soranzo <[email protected]>

nsoranzo reviewed Jul 15, 2020

break if successful

7e41e85

Co-authored-by: Nicola Soranzo <[email protected]>

jtbeach reviewed Jul 15, 2020

dispose only for exception

e481582

bernt-matthias force-pushed the topic/372-followup branch from 6b5cdb2 to fd6b773 Compare July 16, 2020 09:39

bernt-matthias force-pushed the topic/372-followup branch from fd6b773 to 74b3513 Compare July 16, 2020 12:01

fix isort call in tox.ini and import order

acb681c

kvesteri requested changes Jul 27, 2020

rename parameter to databases

7908606

ziima reviewed Jul 29, 2020

move functions to module level

f509f38

fmigneault mentioned this pull request Sep 10, 2020

database_exists throws error if username and db name are different for postgresql #472

Open

nsoranzo mentioned this pull request Nov 9, 2020

Full Python dependencies update galaxyproject/galaxy#10660

Merged

sfc-gh-pkommini mentioned this pull request Dec 17, 2020

Errors while doing upgrade Netflix/dispatch-docker#95

Closed

nsoranzo reviewed Jan 8, 2021

bsquizz reviewed Jan 21, 2021

This was referenced Mar 17, 2021

Add support for SQLAlchemy 1.4 #487

Closed

SQLAlchemy 1.4 support + Move CI to GitHub workflows #506

Merged

jdavcs mentioned this pull request Mar 22, 2021

Replace sqlalchemy-utilities with local implementation galaxyproject/galaxy#11696

Merged
