Curriculum
In this tutorial, you will learn how to use the SQL Server self join to join a table to itself.
A self join allows you to join a table to itself. It helps query hierarchical data or compare rows within the same table.
A self join uses the inner join or left join clause. Because the query that uses the self join references the same table, the table alias is used to assign different names to the same table within the query.
The following shows the syntax of joining the table T
 to itself:
SELECT select_list FROM T t1 [INNER | LEFT] JOIN T t2 ON join_predicate;
The query references the table T
 twice. The table aliases t1
 and t2
 are used to assign the T
 table different names in the query.
Let’s take some examples to understand how the self join works.
Consider the following  staffs
 table from the sample database:
The  staffs
 table stores the staff information such as id, first name, last name, and email. It also has a column named manager_id
 that specifies the direct manager. For example, Mireya
 reports to Fabiola
 because the value in the manager_id
 of  Mireya
 is Fabiola
.
Fabiola
 has no manager, so the manager id column has a NULL.
To get who reports to whom, you use the self join as shown in the following query:
SELECT e.first_name + ' ' + e.last_name employee, m.first_name + ' ' + m.last_name manager FROM sales.staffs e INNER JOIN sales.staffs m ON m.staff_id = e.manager_id ORDER BY manager;
In this example, we referenced to the  staffs
 table twice: one as e
 for the employees and the other as m
 for the managers. The join predicate matches employee and manager relationship using the values in the e.manager_id
 and m.staff_id
 columns.
The employee column does not have Fabiola Jackson
 because of the INNER JOIN effect. If you replace theÂ
INNER JOINÂ clause by theÂ
LEFT JOINÂ clause as shown in the following query, you will get the result set that includesÂ
Fabiola Jackson
 in the employee column:
SELECT e.first_name + ' ' + e.last_name employee, m.first_name + ' ' + m.last_name manager FROM sales.staffs e LEFT JOIN sales.staffs m ON m.staff_id = e.manager_id ORDER BY manager;
The following statement uses the self join to find the customers located in the same city.
SELECT c1.city, c1.first_name + ' ' + c1.last_name customer_1, c2.first_name + ' ' + c2.last_name customer_2 FROM sales.customers c1 INNER JOIN sales.customers c2 ON c1.customer_id > c2.customer_id AND c1.city = c2.city ORDER BY city, customer_1, customer_2;
The following condition makes sure that the statement doesn’t compare the same customer:
c1.customer_id > c2.customer_id
And the following condition matches the city of the two customers:
AND c1.city = c2.city
Note that if you change the greater than ( > ) operator by the not equal to (<>) operator, you will get more rows:
SELECT c1.city, c1.first_name + ' ' + c1.last_name customer_1, c2.first_name + ' ' + c2.last_name customer_2 FROM sales.customers c1 INNER JOIN sales.customers c2 ON c1.customer_id <> c2.customer_id AND c1.city = c2.city ORDER BY city, customer_1, customer_2;
Let’s see the difference between > and <> in the ON
 clause by limiting to one city to make it easier for comparison.
The following query returns the customers located in Albany:
SELECT customer_id, first_name + ' ' + last_name c, city FROM sales.customers WHERE city = 'Albany' ORDER BY c;
This query uses (Â >
) operator in the ON
 clause:
SELECT c1.city, c1.first_name + ' ' + c1.last_name customer_1, c2.first_name + ' ' + c2.last_name customer_2 FROM sales.customers c1 INNER JOIN sales.customers c2 ON c1.customer_id > c2.customer_id AND c1.city = c2.city WHERE c1.city = 'Albany' ORDER BY c1.city, customer_1, customer_2;
This query uses (Â <>
) operator in the ON
 clause:
SELECT c1.city, c1.first_name + ' ' + c1.last_name customer_1, c2.first_name + ' ' + c2.last_name customer_2 FROM sales.customers c1 INNER JOIN sales.customers c2 ON c1.customer_id <> c2.customer_id AND c1.city = c2.city WHERE c1.city = 'Albany' ORDER BY c1.city, customer_1, customer_2;