The condition “IS NULL” and “OR” can inhibit you

Security
20/3/2024
Tomasz Chwasewicz
Table of contents

Where the “IS NULL” or “OR” condition rages unchecked, queries often fall into a performance rabbit hole — a place where CPU cycles burn faster than a comet's tail and I/O operations multiply with rabbit fervor in the spring.

The purpose of this text is clear - to guide database professionals towards faster and more efficient query execution. We'll roll up our sleeves to delve into the reasons why these particular patterns reduce performance and outline action strategies to mitigate those bumps along the way.

“OR” == BAD, why?

The “OR” condition, when overused in queries, can lead to several inefficiencies. One of the main problems is that it can cause the query to be evaluated repeatedly by Oracle, once for each condition separated by “OR”. This is because “OR” effectively expands your search criteria. This can prevent the database from using indexes effectively. As a result, instead of a single, efficient path to retrieve the required data, Oracle can scan larger parts of a table or index. This leads to increased I/O and CPU usage, and consequently slower query performance.

Moreover, queries “enriched” with multiple “OR” terms can become more difficult to read and maintain. With the addition of more conditions, transparency is lost, which makes it difficult to understand the intent of the query at a glance.

Query Optimization: Let's move away from “OR”

The “IN” condition provides a concise and efficient way to find multiple values in a column. Compared to combining several “OR” conditions together, “IN” can greatly reduce the complexity of the query and improve its execution plan. For example, consider the difference between these two approaches:

Use of “OR”:
SELECT* FROM EMPLOYEES
WHERE department_id = 10
OR department_id = 20
OR department_id = 30;

Use of “IN”:
SELECT* FROM EMPLOYEES
WHERE department_id IN (10, 20, 30);

Both queries are designed to retrieve employee data from departments 10, 20, and 30. However, the “IN” version is not only more readable, but also allows Oracle Query Optimizer to use indexes more efficiently and potentially reduce the number of scans required to retrieve the requested data.

Problems with “IS NULL”

wartości Null

The “IS NULL” condition presents a different set of challenges. This is problematic primarily because of the way NULL values are treated by Oracle indexing mechanisms, in particular B-TREE indexes. In Oracle, a value of NULL means no data. While this may be semantically appropriate for some scenarios, it introduces inefficiencies in query processing.

B-TREE indexes do not store entries for keys represented as NULL, which means that the query condition checking IS NULL will not be able to use these indexes effectively. As a result, Oracle can resort to full table scanning to satisfy such queries, leading to a drop in performance, especially for large data sets.

So let's move away from the values “Null” and “IS NULL” - How?

The basics seem simple. If you know the value, let me enter it. If you do not know the value, specify that it is unknown.

  • Defaults: By assigning a default value, you ensure that each record has a predictable value that can be efficiently indexed and searched. For example, if a column records the date the action was performed and the action is always to occur, setting a default date (such as the date the row was created) instead of allowing NULL values can improve index performance.
  • Placeholders for unknown data: In cases where a column may not have a meaningful default value - for example, optional user information - it is useful to use a placeholder value to represent unknown or non-applicable data. This avoids NULL values and ensures that the column remains index-friendly. For example, using a specified date far in the past or a designated string value such as “UNKNOWN” or “N/A” may be more efficient than NULL.
  • Feature-Based Indexes: How it Highlights Babette Turner, the use of function-based indexes can be an effective strategy for optimizing queries containing NULL values. By creating an index on a function that converts NULL values into a searchable key, you can preserve the performance of the index. Take, for example, the index created on NVL (mycolumn, 'ISNULL'). It converts NULL values into a constant 'ISNULL' value that can be efficiently indexed and searched.

Related articles