[proposal] Default database collation to collate=C #100

Open
guewen opened this issue Feb 25, 2019 · 3 comments

Comments

@guewen
Contributor

guewen commented Feb 25, 2019

When a database is created by the container's entrypoint, it uses the default collation, which is en_US.utf8.

As discussed here: odoo/odoo#25196 (comment), this is suboptimal for LIKE queries with a trailing wildcard (LIKE 'foo%'): with a non-C collation, PostgreSQL cannot use a plain B-tree index for such prefix matches.

There are several possible ways to improve this:

  • add text_pattern_ops indexes case by case, where necessary
  • add trigram indexes using pg_trgm, which also benefit LIKE '%foo%' queries, but again only case by case
  • create the databases with the C collation while keeping the en_US.utf8 locale otherwise (collate=C)
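
To make the first two options concrete, here is a minimal sketch of the SQL they would involve, generated from Python. The table and column names (res_partner, name) and the index naming scheme are only examples chosen for illustration, and the first helper assumes a text column.

```python
def prefix_index_sql(table: str, column: str) -> str:
    """B-tree index that a LIKE 'foo%' query can use under a non-C collation."""
    # The index name is a hypothetical naming scheme, not an existing convention.
    name = f"{table}_{column}_pattern_idx"
    return f"CREATE INDEX {name} ON {table} ({column} text_pattern_ops);"


def trigram_index_sql(table: str, column: str) -> list[str]:
    """GIN trigram index, which also helps LIKE '%foo%' queries."""
    name = f"{table}_{column}_trgm_idx"
    return [
        # pg_trgm ships with PostgreSQL contrib but must be enabled per database.
        "CREATE EXTENSION IF NOT EXISTS pg_trgm;",
        f"CREATE INDEX {name} ON {table} USING gin ({column} gin_trgm_ops);",
    ]


if __name__ == "__main__":
    print(prefix_index_sql("res_partner", "name"))
    for statement in trigram_index_sql("res_partner", "name"):
        print(statement)
```

Both kinds of index have to be added column by column, which is why they are "case by case" options.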

The last point is the main one to discuss here.

Pros:

  • consistent sorting
  • expected general improvement of performance

Cons:

  • sorting of accented characters looks wrong for French: with collate=C, the names sort as Blanche, Béatrice, Claude instead of Béatrice, Blanche, Claude. This can be resolved with unaccent
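
The sorting difference can be reproduced outside the database. The sketch below compares plain byte order, which is what a C collation uses, with ordering on an accent-stripped key. The unaccent function here is a Python stand-in written for this example, not PostgreSQL's unaccent extension.

```python
import unicodedata


def unaccent(text: str) -> str:
    """Strip combining accents (illustrative stand-in for PostgreSQL's unaccent)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# "\u00e9" is the precomposed "é", written as an escape so the bytes are unambiguous.
names = ["B\u00e9atrice", "Claude", "Blanche"]

# C collation compares the encoded bytes: "l" (0x6C) sorts before "é" (0xC3 0xA9).
byte_order = sorted(names, key=lambda s: s.encode("utf-8"))

# Sorting on the unaccented key restores the order a French reader expects.
unaccented_order = sorted(names, key=unaccent)

if __name__ == "__main__":
    print(byte_order)
    print(unaccented_order)
```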

My proposal is to change the calls to createdb in the image so that databases are always created with collate=C.
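
As a rough sketch of what such a call could look like, the helper below assembles a createdb command line. The image's actual createdb calls are not shown here, so the function name, the defaults and the database name are assumptions. Using template0 is needed because PostgreSQL refuses a collation that differs from the one of the template database unless template0 is used.

```python
import shlex


def createdb_command(dbname: str, collate: str = "C", ctype: str = "en_US.utf8") -> list[str]:
    """Build a createdb invocation with an explicit collation (hypothetical helper)."""
    return [
        "createdb",
        "--template=template0",  # required when the collation differs from the template's
        "--encoding=UTF8",
        f"--lc-collate={collate}",
        f"--lc-ctype={ctype}",
        dbname,
    ]


if __name__ == "__main__":
    # "odoodb" is a placeholder database name.
    print(shlex.join(createdb_command("odoodb")))
```

Keeping the collation as a parameter would also cover the alternative of making it configurable.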

@guewen
Contributor Author

guewen commented Feb 25, 2019

@sebastienbeau @rvalyi would you agree with this change?
An alternative would be to make it configurable through a variable, but if the change is fine for everybody, let's keep it simple :)

@simahawk
Member

simahawk commented Apr 9, 2019

@sebastienbeau ping

@guewen
Contributor Author

guewen commented May 19, 2020

Version 13.0 is particularly affected by this, as child_of uses LIKE '1/8/%' for hierarchies.
We must do this.
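
To illustrate why byte-wise ordering matters for a pattern like LIKE '1/8/%': when values are ordered byte by byte, every string sharing a prefix sits in one contiguous range, so the match becomes a range lookup in a sorted structure. The sketch below shows the idea with a sorted Python list. It is not PostgreSQL's implementation, and the path values are made up for the example.

```python
from bisect import bisect_left


def prefix_upper_bound(prefix: str) -> str:
    """Smallest string that sorts after every string starting with prefix.

    Assumes a non-empty prefix whose last character is not the maximum code point.
    """
    return prefix[:-1] + chr(ord(prefix[-1]) + 1)


def children_by_prefix(sorted_paths: list[str], prefix: str) -> list[str]:
    """Equivalent of LIKE 'prefix%' as a range lookup over byte-ordered values."""
    lo = bisect_left(sorted_paths, prefix)
    hi = bisect_left(sorted_paths, prefix_upper_bound(prefix))
    return sorted_paths[lo:hi]


# Hypothetical hierarchy paths; Python compares str by code point, which matches
# UTF-8 byte order and therefore the ordering of a C collation.
paths = sorted(["1/", "1/8/", "1/8/12/", "1/8/12/40/", "1/80/", "1/9/", "2/"])

if __name__ == "__main__":
    print(children_by_prefix(paths, "1/8/"))
```

With a linguistic collation such as en_US.utf8 the sort order is not byte-wise, so this prefix-to-range rewrite is not valid on a default B-tree index.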
