Generate a series of records with CTE
Published on: 27th Sep 2012
Updated on: 18th May 2021
Overview
It's quite common to return the records without any gaps in between, especially when you are developing stored procedures for reporting purposes.
For example, you are developing a daily sales report for January 2021.
select
trans_date
, total_sales = sum(qty * price)
from tb_sales
where
trans_date between '2021-01-01' and '2021-01-13'
group by
trans_date
order by
trans_date
Let's say only 1st, 3rd and 5th have sales. As a result, the above query will return 3 records. Other dates are missing from the output.
Solving the missing data with CTE
In MSSQL, you may use common Table Expression (CTE) to generate a series of records. For example, get a series of number from 1 to 10,
;
with tb (ii)
as (
select
1 ii
union all
select ii + 1
from tb
where ii + 1 <= 10
)
select *
from tb
The above statement was started with a strange symbol, i.e, a semi-colon. In fact, it is the terminator for the previous statement. Starting from the very old version of MSSQL, statement terminator is optional. With CTE, it is compulsory for the previous statement to be terminated before starting the CTE.
Test table and test data
Here is the sales table with three days of sales.
create table tb_sales (
trans_date datetime
, qty int
, price money
)
go
insert into tb_sales (trans_date, qty , price )
values ('2021-01-01', 10, 1.5)
, ('2021-01-03', 13, 4.5)
, ('2021-01-05', 15, 2.5)
go
Now, we are going to use CTE to generate 31 dates (records) for January.
; with tb_date
as (
select dt = cast('2021-01-01' as datetime)
union all
select dateadd(day, 1, dt)
from tb_date
where
month(dateadd(day, 1, dt)) = 1
)
select *
from tb_date
Finally, merging the sales summary query with the above CTE,
; with tb_date
as (
select dt = cast('2021-01-01' as datetime)
union all
select dateadd(day, 1, dt)
from tb_date
where
month(dateadd(day, 1, dt)) = 1
)
select
d.dt
, total_sales = isnull(sum(s.qty * s.price), 0)
from tb_date d
left outer join (
select s0.*
from tb_sales s0
where
s0.trans_date between '2021-01-01' and '2021-01-13'
) as s on s.trans_date = d.dt
group by
d.dt
order by
d.dt
And, here is the output,
dt | total_sales |
---|---|
2021-01-01 00:00:00.000 | 15.00 |
2021-01-02 00:00:00.000 | 0.00 |
2021-01-03 00:00:00.000 | 58.50 |
2021-01-04 00:00:00.000 | 0.00 |
2021-01-05 00:00:00.000 | 37.50 |
2021-01-06 00:00:00.000 | 0.00 |
.... | ... |
2021-01-31 00:00:00.000 | 0.00 |
Jump to #MSSQL blog
Author
Lau Hon Wan, software developer.