Curriculum
In this tutorial, you will learn how to use the SQL Server GROUPING SETS
to generate multiple grouping sets.
Let’s create a new table named sales.sales_summary
for the demonstration.
SELECT b.brand_name AS brand, c.category_name AS category, p.model_year, round( SUM ( quantity * i.list_price * (1 - discount) ), 0 ) sales INTO sales.sales_summary FROM sales.order_items i INNER JOIN production.products p ON p.product_id = i.product_id INNER JOIN production.brands b ON b.brand_id = p.brand_id INNER JOIN production.categories c ON c.category_id = p.category_id GROUP BY b.brand_name, c.category_name, p.model_year ORDER BY b.brand_name, c.category_name, p.model_year;
In this query, we retrieve the sales amount data by brand and category and populate it into the sales.sales_summary
table.
The following query returns data from the sales.sales_summary
table:
SELECT * FROM sales.sales_summary ORDER BY brand, category, model_year;
GROUPING SETS
By definition, a grouping set is a group of columns by which you group. Typically, a single query with an aggregate defines a single grouping set.
For example, the following query defines a grouping set that includes brand and category which is denoted as (brand, category)
. The query returns the sales amount grouped by brand and category:
SELECT brand, category, SUM (sales) sales FROM sales.sales_summary GROUP BY brand, category ORDER BY brand, category;
The following query returns the sales amount by brand. It defines a grouping set (brand)
:
SELECT brand, SUM (sales) sales FROM sales.sales_summary GROUP BY brand ORDER BY brand;
The following query returns the sales amount by category. It defines a grouping set (category)
:
SELECT category, SUM (sales) sales FROM sales.sales_summary GROUP BY category ORDER BY category;
The following query defines an empty grouping set ()
. It returns the sales amount for all brands and categories.
SELECT SUM (sales) sales FROM sales.sales_summary;
The four queries above return four result sets with four grouping sets:
(brand, category) (brand) (category) ()
To get a unified result set with the aggregated data for all grouping sets, you can use the UNION ALL operator.
Because UNION ALL
operator requires all result set to have the same number of columns, you need to add NULL to the select list to the queries like this:
SELECT brand, category, SUM (sales) sales FROM sales.sales_summary GROUP BY brand, category UNION ALL SELECT brand, NULL, SUM (sales) sales FROM sales.sales_summary GROUP BY brand UNION ALL SELECT NULL, category, SUM (sales) sales FROM sales.sales_summary GROUP BY category UNION ALL SELECT NULL, NULL, SUM (sales) FROM sales.sales_summary ORDER BY brand, category;
The query generated a single result with the aggregates for all grouping sets as we expected.
However, it has two major problems:
To fix these problems, SQL Server provides a subclause of the GROUP BY clause called
GROUPING SETS
.
The GROUPING SETS
defines multiple grouping sets in the same query. The following shows the general syntax of the GROUPING SETS
:
SELECT column1, column2, aggregate_function (column3) FROM table_name GROUP BY GROUPING SETS ( (column1, column2), (column1), (column2), () );
This query creates four grouping sets:
(column1,column2) (column1) (column2) ()
You can use this GROUPING SETS
to rewrite the query that gets the sales data as follows:
SELECT brand, category, SUM (sales) sales FROM sales.sales_summary GROUP BY GROUPING SETS ( (brand, category), (brand), (category), () ) ORDER BY brand, category;
As you can see, the query produces the same result as the one that uses the UNION ALL
operator. However, this query is much more readable and of course more efficient.
GROUPING
functionThe GROUPING
function indicates whether a specified column in a GROUP BY
clause is aggregated or not. It returns 1 for aggregated or 0 for not aggregated in the result set.
See the following query example:
SELECT GROUPING(brand) grouping_brand, GROUPING(category) grouping_category, brand, category, SUM (sales) sales FROM sales.sales_summary GROUP BY GROUPING SETS ( (brand, category), (brand), (category), () ) ORDER BY brand, category;
The value in the grouping_brand
column indicates that the row is aggregated or not, 1 means that the sales amount is aggregated by brand, 0 means that the sales amount is not aggregated by brand. The same concept is applied to the grouping_category
column.