Showing posts with label Collation. Show all posts
Showing posts with label Collation. Show all posts

Friday, April 17, 2026

What is an OS collation often referred to when dealing with PostgreSQL servers?

We have seen the following error in on of our postgres databases:
WARNING: database "testdb" has a collation version mismatch DETAIL: 
The database was created using collation version 2.17, but the operating system provides version 2.34. 
HINT: Rebuild all objects in this database that use the default collation and run ALTER DATABASE testdb REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.
So what exactly is a "Collation"?

An OS collation is the set of language‑dependent rules supplied by the operating system that define how strings are compared and sorted (alphabetical order, case rules, accents, etc.). On Linux, these rules come from glibc locales.

What the error means

PostgreSQL databases rely on the operating system’s locale / collation implementation (typically glibc on Linux) for:
  • string ordering (ORDER BY)
  • comparisons (<, >, =)
  • indexes on text, varchar, char using the default collation The database testdb was created when the OS provided collation version 2.17, but the server is now running glibc 2.34.

  • In other words:

    • Database collation version: 2.17 
    • OS collation version: 2.34

    This usually happens after:

    • an OS upgrade
    • moving a data directory to a newer system
    • restoring a dump created on an older system
    Collation rules can change between glibc versions.

    Implications for the database

    • The sort order of strings may differ
    • Existing indexes may no longer match the current collation rules
    • Query results involving ordering or comparisons could be wrong
    • Index corruption does not occur, but index semantics may be invalid


    PostgreSQL cannot automatically fix this, so it warns you.

    Only objects that use the default collation, typically:
    • btree indexes on text / varchar
    • constraints using string comparison
    • materialized views
    Objects using COLLATE "C" / POSIX or ICU collations are not affected.

    Solution:

    For each database, and every schema:

    Rebuild all indexes and other dependet objects that rely on the default collation

    Then refresh the collation version:
    ALTER DATABASE prod REFRESH COLLATION VERSION;
    
    The alternative to this is stated in the messages itself:
    or build PostgreSQL with the right library version
    
    which means downgrade glibc or run PostgreSQL on a system with glibc 2.17 — usually not desirable.

    For references, how does this work on the oracle platform?

    For Oracle
    • Collation (NLS sorting/comparison rules) is implemented inside the database engine
    • Controlled via NLS_SORT, NLS_COMP, character set, etc.
    • Independent of the OS
    • OS upgrades do not change string ordering semantics
    For PostgreSQL (default setup)
    • Delegates collation behavior to the operating system locale
    • Uses glibc on Linux for ORDER BY, comparisons, and btree indexes on text
    • If glibc changes, collation rules may change → PostgreSQL detects and warns
    So PostgreSQL is not missing a feature; it is designed to rely on the OS by default. It is a design choice: PostgreSQL usually delegates collation to the OS; Oracle does not