Recursion in SQL
We are used to easily writing recursive calls and loops in procedural languages, but this doesn’t hold true in SQL. Indeed, it depends on the DBMS used.
This article will focus on the ability to write recursive queries using Common Table Expressions (CTE). CTE are an addition from the SQL 1999 standard and are not yet available in all DBMS. As of today, CTE are at least compatible with:
- Microsoft SQL Server 2000 / 2005 / 2008
- IBM DB2
- Oracle 9.2
- PostgreSQL
Definition and syntax
A CTE is a temporary view with a very short life. It is only available in the SELECT following the declaration of the CTE, not longer!
The definition and the declaration is done in the same block, introduced by the WITH statement.
WITH expression_name [ ( column_name [,...n] ) ] AS ( CTE_query_definition ) /* You can use the CTE in the following select */ SELECT <column_list> FROM expression_name; /* And this is it. The following select won't work */ SELECT <column_list> FROM expression_name;
As stated in the MSDN:
Using a CTE offers the advantages of improved readability and ease of maintenance of complex queries. The query can be divided into separate, simple, logical building blocks. These simple blocks can then be used to build more complex, interim CTEs until the final result set is generated.
Beware: the only situation in which CTE improve the speed of an SQL request is when they are used more than once per request. In this case, the CTE are evaluated only once. Otherwise they only improve its readability and maintainability. And even if it is a key point for any developper, the CTE feature I want to introduce here is the recursivity.
WITH expression_name [ ( column_name [,...n] ) ] AS ( CTE_query_initialization UNION ALL CTE_query_recursion )
A recursive function needs a base case, an inductive step and a stopping criterion. The initialization is done in the first SELECT while the inductive step is defined in the second one. Finally, the stopping criterion will be part of the WHERE statement of the second condition.
With such a feature it is now possible to dynamically build datasets and avoid table creation for example (as shown in the next part of this article).
Basic example
Before going further, a simple example may help to understand the power of CTE in SQL. The following code retrieves the value of the factorial function
Reminder:
- factorial (0) = 1
- factorial (1) = 1
- factorial (n) = n! = n * factorial (n -1)
- factorial (4) = 4! = 4 * factorial (3) = 4 * 3 * 2 * 1
The following code returns the value of factorial (4). Amazing, isn’t it?
/* Code tested with MS SQL Server 2005/2008 */
with cteFacto
as
(
select 1 as I_FACTO,
1 as FACTO
union all
select I_FACTO + 1 as I_FACTO,
(I_FACTO + 1) * FACTO as FACTO
from cteFacto
where I_FACTO < 9
)
/*
I_FACTO indicates the factoriel we want
FACTO retrieves its value
*/
select FACTO
from cteFacto
where I_FACTO = 4 /* = 24 */
Generate datasets with dates
Dates manipulation is another, closer to concrete use cases, but still simple example of CTE in SQL.
Using CTE and recursivity, it is now very easy to generate this kind of dataset:
AGENDA year month day 2011-01-01 00:00:00.000 2011 1 1 2011-01-02 00:00:00.000 2011 1 2 2011-01-03 00:00:00.000 2011 1 3 ... 2011-01-27 00:00:00.000 2011 1 27 2011-01-28 00:00:00.000 2011 1 28 2011-01-29 00:00:00.000 2011 1 29 2011-01-30 00:00:00.000 2011 1 30 2011-01-31 00:00:00.000 2011 1 31
And, it is as simple as this:
/* Code tested with MS SQL Server 2005/2008 */ declare @startingDate as datetime set @startingDate = convert (datetime, '2011-01-01', 121) /* Semicolons are mandatory when the with statement is not immediately after the go keyword */ ;with cteAgenda as ( select @startingDate as AGENDA union all select DATEADD (day, 1, AGENDA) as AGENDA from cteAgenda where agenda < convert (datetime, '2011-02-01', 121) ) select AGENDA, YEAR (AGENDA) as year, MONTH (AGENDA) as month, DAY (AGENDA) as day from cteAgenda where AGENDA < convert (datetime, '2011-02-01', 121)
This kind of code may avoid the creation of a one-shot table by pushing values in a temporary table. Which, in my opinion, is a real benefit.
Final points
CTE are more than simple temporary views: they also allow the use of recursivity in SQL. Using CTE with temporary tables helps to create datasets that can be used in testing environment. They can also be used to explore datasets, or to recursivly apply operations on tree-structured datasets.
Read more about CTE
- MSDN: CTE (EN)
- MSDN: Recursive CTE (EN)
- PostgreSQL: CTE (EN)
- Oracle 9i 2: CTE (EN)
- IBM DB2: CTE (FR)
- CTE presentation (EN)
- CTE Recursives (FR)
