Davide Muzzarelli » cqrs /blog L'informatica a valore aggiunto Mon, 25 May 2015 22:41:52 +0000 en hourly 1 http://wordpress.org/?v=3.3.2 Dynamic pure SQL WHERE clauses and optional JOINs /blog/2015/01/dynamic-pure-sql-where-clauses-and-optional-joins/ /blog/2015/01/dynamic-pure-sql-where-clauses-and-optional-joins/#comments Tue, 06 Jan 2015 02:44:54 +0000 Davide Muzzarelli /blog/?p=332 Update 2015-05-25: see the video tutorial at the end of this article.

A dynamic SQL query is a query that operates accordingly through parameters.

Sometimes it is necessary to query the database filtering the rows in base at the user input. Like in a search, an user would choose several filters and not others.

The examples are in PostgreSQL and Python, but can be easily adapted to any database and language.

Imagine a table of customers:

CREATE TABLE "customers" (
    "id" serial PRIMARY KEY,
    "name" varchar(64) NOT NULL,
    "active" bool NOT NULL,
    "country" varchar(32)
);

The classic approach

The classic approach is to compose the SQL joining strings and conditional constructs, for example:

# Function that performs the query
# The parameters have null defaults
def customers_filtered(active=None, country=None):
    # This is the beginning of the query
    query = '
        SELECT *
        FROM customers
        WHERE'

    filters = []

    # Add the active filter, if not null
    if active is not None:
        filters.append('active=%(active)s')

    # Add the country filter, if not null
    if country is not None:
        filters.append('country=%(country)s')

    # Join all the filter queries chunks with an AND
    query += ' AND '.join(filters)

    # Prepare the parameters
    parameters = {
        'active': active,
        'country': country
    }

    # Execute the query and return the results
    return execute(query, parameters)

The resulting code is long, the query takes times to write and is unclear to read.

The database have to make an execution plan before executing the query. The execution plan is re-used when the SQL query is static, but this query is dynamic so the execution plan will be not re-used a second time and making the query a bit slower.

Optional WHERE clause

There is a solution with a negligible impact on perfomances, much easier to write and to read. The trick is to use NULL conditionals, so the query will be:

# Function that performs the query
# The parameters have the null defaults
def customers_filtered(active=None, country=None):
    # This is the full query!
    query = '
        SELECT *
        FROM customers
        WHERE (%(active)s IS NULL OR active=%(active)s)
              AND (%(country)s IS NULL OR country=%(country)s)'

    # Prepare the parameters
    parameters = {
        'active': active,
        'country': country
    }

    # Execute the query and return the results
    return execute(query, parameters)

The trick is to validate the right part of the OR clause only if the parameter is not NULL.

Optional JOINs

A similar approach can be applied for the JOIN clauses. First, create the “orders” table:

CREATE TABLE "orders" (
    "id" serial PRIMARY KEY,
    "customer_id" int NOT NULL,
    "date" timestamp NOT NULL,
    "value" int NOT NULL
);

It is possible to use a dynamic JOIN in order to get or not expensive data to calculate. The following SUM will be performed only if needed, setting the “flag” as true:

SELECT u.id, u.name, u.active, SUM(o.value) AS total
FROM customers AS u
LEFT JOIN orders AS o ON (true=%(flag)s AND o.customer_id=u.id)
GROUP BY u.id

Also in this case the performance impact is risible and the query remains very clear to read.

Conclusions

The classic method of composing the queries can be a source of bugs and headaches, this should be avoided. These solutions can reduce the code and increase the readability, especially when the ORM cannot be used or with the CQRS pattern (Command Query Responsibility Separation).

A video tutorial

Nat Dunn has created a nice video tutorial inspired by this article, it is part of his Python course:

Discussion on
Hacker News: https://news.ycombinator.com/item?id=8842677
Reddit: http://www.reddit.com/r/Database/comments/2rh0ak/dynamic_pure_sql_where_clauses_and_optional_joins/
Twitter: https://twitter.com/davmuz_en/status/552295916385087489

]]>
/blog/2015/01/dynamic-pure-sql-where-clauses-and-optional-joins/feed/ 0