PostgreSQL 14 breaks the .NET and Java drivers for PostgreSQL

Under certain circumstances, the new syntax in PostgreSQL 14 will break the official .NET and Java database driver. Especially if you use either of them to create a SQL function with BEGIN ATOMIC … END. If you don’t over change your database schema npgsql or PgJDBCno reason to worry.

Java’s JDBC and .NET’s ADO.NET database driver frameworks have one feature in common: they both support stacking of SQL statements with semicolons. This was deemed necessary for performance reasons. If you send one command at a time, you have to pay the latency cost for each command. Conversely, when you ship a batch, you only have to pay the cost once.

With some databases like SQL Server, you literally send the entire batch as one massive SQL string. But PostgreSQL’s wire format doesn’t work that way. The client must split the batch into individual commands, although they are still sent as a set.

The naive implementation would be to just assume that each semicolon means the end of the stack. Of course, it’s possible that a semicolon doesn’t represent the end of a statement, but is just part of a string literal. The Npgsql and PgJDBC parsers take this into account.

So far, so good. But what if you define a new SQL function that consists of multiple statements? This still wasn’t a problem since the body of the function would be masked with dollar quoting. Any semicolons within the $$ token pair would be treated like any other string literal.

Then PostgreSQL 14 came along and was added START ATOMICALLY… END, also known as “Standard SQL Syntax”. The release notes say

When writing a function or procedure in standard SQL syntax, the body is immediately parsed and stored as a parse tree. This allows for better feature dependency tracking and may have security benefits.

Because semicolons can appear anywhere within a BEGIN ATOMIC … END block without being quoted in a string, the parsers cannot use the current approach to determine where to split the stack into statements. To fully support this would require either an API change or building a new, far more complex parser.

Already worried about the overhead caused by the current parser, npgsql decided to change the API. They added what they call a Raw SQL mode to the library. This mode also requires the use of positional parameters instead of named parameters.

The PgJDBC team has not yet decided which approach to take. You can track their progress in the error report with the title New PG14 SQL standard function bodies break our SQL parser.

Comments are closed.