Which SQL clause is used to combine records from two or more tables in a database based on a common field between them?

Starting here? This lesson is part of a full-length tutorial in using SQL for Data Analysis. Check out the beginning.

In this lesson we'll cover:

  • The SQL UNION operator
  • Practice problems

The SQL UNION operator

SQL joins allow you to combine two datasets side-by-side, but UNION allows you to stack one dataset on top of the other. Put differently, UNION allows you to write two separate SELECT statements, and to have the results of one statement display in the same table as the results from the other statement.

Let's try it out with the Crunchbase investment data, which has been split into two tables for the purposes of this lesson. The following query will display all results from the first portion of the query, then all results from the second portion in the same table:

SELECT *
  FROM tutorial.crunchbase_investments_part1
 UNION
SELECT *
  FROM tutorial.crunchbase_investments_part2

Note that UNION only appends distinct values. More specifically, when you use UNION, the second dataset is appended to the first, and any duplicate rows (rows that are exactly identical in every column) are dropped from the combined result. If you'd like to keep every row from both tables, use UNION ALL. You'll likely use UNION ALL far more often than UNION. In this particular case, there are no duplicate rows, so UNION ALL will produce the same results:

SELECT *
  FROM tutorial.crunchbase_investments_part1
 UNION ALL
SELECT *
  FROM tutorial.crunchbase_investments_part2
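To see the difference between UNION and UNION ALL when duplicates do exist, here is a minimal sketch using Python's built-in sqlite3 module. The tables part1 and part2 and their rows are made up for illustration; they are not the Crunchbase data.

```python
import sqlite3

# Hypothetical two-part dataset with one row that appears in both parts.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part1 (company_name TEXT, investor_name TEXT);
    CREATE TABLE part2 (company_name TEXT, investor_name TEXT);
    INSERT INTO part1 VALUES ('Acme', 'Alpha Fund'), ('Bolt', 'Beta Capital');
    INSERT INTO part2 VALUES ('Bolt', 'Beta Capital'), ('Core', 'Gamma Partners');
""")

# UNION drops the duplicate ('Bolt', 'Beta Capital') row.
union_rows = conn.execute(
    "SELECT * FROM part1 UNION SELECT * FROM part2"
).fetchall()

# UNION ALL keeps every row from both tables.
union_all_rows = conn.execute(
    "SELECT * FROM part1 UNION ALL SELECT * FROM part2"
).fetchall()

print(len(union_rows))      # 3
print(len(union_all_rows))  # 4
```

The two results contain the same distinct rows; only the duplicate handling differs.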

SQL has strict rules for appending data:

  1. Both tables must have the same number of columns
  2. The columns must have the same data types in the same order as the first table

While the column names don't necessarily have to be the same, you will find that they typically are. This is because most of the instances in which you'd want to use UNION involve stitching together different parts of the same dataset (as is the case here).
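The column rules can be tried out with a small sketch in Python's built-in sqlite3 module. The tables first_half and second_half are hypothetical. Note that SQLite is lenient about data types, so this sketch only illustrates the column-count rule and the fact that the result takes its column names from the first SELECT; stricter databases would also reject mismatched types.

```python
import sqlite3

# Hypothetical tables whose columns have different names but the same shape.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE first_half (name TEXT, amount INTEGER);
    CREATE TABLE second_half (label TEXT, total INTEGER);
    INSERT INTO first_half VALUES ('a', 1);
    INSERT INTO second_half VALUES ('b', 2);
""")

# Same number of columns: the append works, and the result uses the
# column names of the first SELECT.
cursor = conn.execute(
    "SELECT name, amount FROM first_half "
    "UNION ALL SELECT label, total FROM second_half"
)
column_names = [col[0] for col in cursor.description]
rows = cursor.fetchall()
print(column_names)  # ['name', 'amount']

# Different number of columns: the database rejects the query.
try:
    conn.execute(
        "SELECT name, amount FROM first_half "
        "UNION ALL SELECT label FROM second_half"
    )
    mismatch_error = None
except sqlite3.OperationalError as exc:
    mismatch_error = str(exc)
print(mismatch_error)
```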

Since you are writing two separate SELECT statements, you can treat them differently before appending. For example, you can filter them differently using different WHERE clauses.
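As a rough illustration of filtering each half differently, the sketch below uses Python's built-in sqlite3 module with two made-up tables, investments_part1 and investments_part2, and a hypothetical raised_usd column. Each SELECT gets its own WHERE clause before the results are appended.

```python
import sqlite3

# Hypothetical split dataset; names and amounts are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE investments_part1 (company_name TEXT, raised_usd INTEGER);
    CREATE TABLE investments_part2 (company_name TEXT, raised_usd INTEGER);
    INSERT INTO investments_part1 VALUES ('Acme', 500), ('Bolt', 5000);
    INSERT INTO investments_part2 VALUES ('Core', 700), ('Dune', 9000);
""")

# The first SELECT keeps large rounds, the second keeps small ones.
rows = conn.execute("""
    SELECT company_name, raised_usd
      FROM investments_part1
     WHERE raised_usd > 1000
     UNION ALL
    SELECT company_name, raised_usd
      FROM investments_part2
     WHERE raised_usd < 1000
""").fetchall()

print(sorted(rows))  # [('Bolt', 5000), ('Core', 700)]
```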

Sharpen your SQL skills

Write a query that appends the two crunchbase_investments datasets above (including duplicate values). Filter the first dataset to only companies with names that start with the letter "T", and filter the second to companies with names starting with "M" (both not case-sensitive). Only include the company_permalink, company_name, and investor_name columns.


For a bit more of a challenge:

Write a query that shows 3 columns. The first indicates which dataset (part 1 or 2) the data comes from, the second shows company status, and the third is a count of the number of investors.

Hint: you will have to use the tutorial.crunchbase_companies table as well as the investments tables. And you'll want to group by status and dataset.

If you're working with databases, at some point in your work you will likely need to use SQL JOINs. This guide offers a quick overview of SQL JOINs and introduces you to some of the types of JOINs used most commonly.

SQL JOIN definition and uses 

Let's start with an overview of what a database is. A relational database is a collection of tables, each storing a different type of information. The JOIN clause retrieves data from related tables by combining their records based on a common field, such as an ID column that appears in both. A query with a JOIN is more complex than a query on a single table because it must state how the rows of the tables match up.

Types of SQL JOINs with examples

You can choose among four types of SQL JOINs depending on the results you want: Inner JOIN, Left Outer JOIN, Right Outer JOIN, and Full Outer JOIN. Take a look at how each works, along with some sample SQL JOIN clauses:

Inner 

Inner JOINs combine two tables based on a shared key and return only the rows that have a match in both tables. For example, if you had a table with a column called "user id" and each user id was unique to a user, you could join that table to another table with a "user id" column to find the information associated with each user. This example shows how to use an Inner JOIN clause to join two tables:

SELECT * FROM table1 INNER JOIN table2 ON table1.id = table2.id;

Left Outer

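Left JOINs return all rows from the first table and only the rows in the second table that match. Where a row in the first table has no match, the columns from the second table are filled with NULL. This example shows how to use a Left Outer JOIN clause to join two tables: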

SELECT * FROM table1 LEFT OUTER JOIN table2 ON table1.id = table2.user_id;
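To make the difference between an Inner JOIN and a Left Outer JOIN concrete, here is a small sketch using Python's built-in sqlite3 module. The name and city columns and all the rows are invented for illustration; only the table1.id = table2.user_id join condition comes from the example above.

```python
import sqlite3

# Hypothetical data: user 3 has no row in table2, and user_id 4 has no row in table1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (user_id INTEGER, city TEXT);
    INSERT INTO table1 VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO table2 VALUES (1, 'Lima'), (2, 'Oslo'), (4, 'Rome');
""")

# Inner JOIN: only rows with a match in both tables.
inner_rows = conn.execute("""
    SELECT table1.name, table2.city
      FROM table1
     INNER JOIN table2 ON table1.id = table2.user_id
""").fetchall()

# Left Outer JOIN: every row of table1; unmatched rows get NULL (None in Python).
left_rows = conn.execute("""
    SELECT table1.name, table2.city
      FROM table1
      LEFT OUTER JOIN table2 ON table1.id = table2.user_id
""").fetchall()

print(len(inner_rows))  # 2
print(len(left_rows))   # 3
```

Here the extra row in the left join is ('Cy', None): Cy exists in table1 but has no matching user_id in table2.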

Right Outer

Right JOINs are logically the opposite of Left JOINs: they return all rows from the second table and only the rows in the first table that match. This example shows how to use a Right Outer JOIN clause to join two tables:

SELECT * FROM table1 RIGHT OUTER JOIN table2 ON table1.id = table2.user_id;

Full Outer

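Full JOINs combine both left and right joins by returning all rows from both tables, whether or not they have a match. Matching rows are combined, and rows without a match in the other table are still returned, with NULL in the other table's columns. This example shows how to use a Full Outer JOIN clause to join two tables: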

SELECT * FROM table1 FULL OUTER JOIN table2 ON table1.id = table2.user_id;
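Not every database supports FULL OUTER JOIN directly (MySQL does not, and SQLite only added it in version 3.39). The sketch below therefore does not run the statement above; it emulates the same result with a LEFT JOIN plus a UNION ALL of the unmatched rows from the second table, which is a substitute technique. It uses Python's built-in sqlite3 module, and the name and city columns and all rows are hypothetical.

```python
import sqlite3

# Hypothetical data: user 3 is only in table1, user_id 4 is only in table2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (user_id INTEGER, city TEXT);
    INSERT INTO table1 VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO table2 VALUES (1, 'Lima'), (2, 'Oslo'), (4, 'Rome');
""")

# First half: all rows of table1, matched where possible.
# Second half: rows of table2 that found no partner in table1.
full_rows = conn.execute("""
    SELECT table1.name, table2.city
      FROM table1
      LEFT JOIN table2 ON table1.id = table2.user_id
     UNION ALL
    SELECT table1.name, table2.city
      FROM table2
      LEFT JOIN table1 ON table1.id = table2.user_id
     WHERE table1.id IS NULL
""").fetchall()

print(len(full_rows))  # 4
```

The result holds the two matched pairs plus ('Cy', None) and (None, 'Rome'), which is what a full outer join on this data would return.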

There are many use cases for SQL JOINs, and they are crucial when mapping out relationships between tables in your database.

Example of an application of the SQL JOIN clause

Imagine one table that stores personal information (name, address, phone number) and another table that stores information about employee job positions. An individual may appear more than once in the positions table (one row per position as they change roles), so it makes sense to keep the employees' personal data in its own table, where each row represents a single employee.

Let's say that you need to write an application that shows employee names and addresses along with their current position, previous positions, and hire date. To retrieve this data from the database, you need to join these two tables together using some attributes common between them (such as Employee ID).

An e-commerce example of using SQL JOIN

Imagine now that you have an online store and want to know which products were bought by your customers. You would have two tables: one containing information about your customers and another containing information about their orders. You can use an Inner JOIN to retrieve the customer records that have matching order records using the following syntax:

SELECT * FROM customers INNER JOIN orders ON customers.id = orders.customer_id;
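A quick way to try this join is with Python's built-in sqlite3 module. In the sketch below, the name and product columns and the sample rows are assumptions made for illustration; only customers.id and orders.customer_id come from the query above.

```python
import sqlite3

# Hypothetical store data: Ana has two orders, Ben has one, Cy has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, product TEXT);
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1, 'lamp'), (11, 1, 'desk'), (12, 2, 'chair');
""")

# One output row per matching order; ORDER BY makes the output order predictable.
rows = conn.execute("""
    SELECT customers.name, orders.product
      FROM customers
     INNER JOIN orders ON customers.id = orders.customer_id
     ORDER BY orders.id
""").fetchall()

print(rows)  # [('Ana', 'lamp'), ('Ana', 'desk'), ('Ben', 'chair')]
```

A customer with several orders appears once per order, and a customer with no orders (Cy here) does not appear at all.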

Example with code

Consider a situation where you have two database tables, one called “Students” and the other called “Grades.” The “Students” table contains one record for each student: their ID number, name, major, and so on. The “Grades” table contains one record for each grade a student received: their student ID number, the course they took, and their grade in the course.

In SQL, you would write a query to find the names of all students who have received a grade of 100 as follows:

SELECT Students.StudentName
FROM Students
JOIN Grades ON Students.StudentID = Grades.StudentID
WHERE Grades.Grade = 100;
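The Students and Grades query can be tried end to end with Python's built-in sqlite3 module. The sample rows and the Major and Course column names below are invented for illustration. The sketch also adds DISTINCT and ORDER BY, which are not part of the original query: without DISTINCT, a student with a grade of 100 in two courses would be listed twice.

```python
import sqlite3

# Hypothetical data: Ana has two grades of 100, Cy has one, Ben has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (StudentID INTEGER PRIMARY KEY, StudentName TEXT, Major TEXT);
    CREATE TABLE Grades (StudentID INTEGER, Course TEXT, Grade INTEGER);
    INSERT INTO Students VALUES
        (1, 'Ana', 'Math'), (2, 'Ben', 'History'), (3, 'Cy', 'Physics');
    INSERT INTO Grades VALUES
        (1, 'Algebra', 100), (1, 'Calculus', 100),
        (2, 'Civics', 88), (3, 'Optics', 100);
""")

# DISTINCT (an addition to the original query) collapses Ana's two matches.
names = [row[0] for row in conn.execute("""
    SELECT DISTINCT Students.StudentName
      FROM Students
      JOIN Grades ON Students.StudentID = Grades.StudentID
     WHERE Grades.Grade = 100
     ORDER BY Students.StudentName
""")]

print(names)  # ['Ana', 'Cy']
```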

Combining JOINs 

There are many ways to combine results from two or more queries. Here are the most common:

  • Use a JOIN statement to combine data from multiple tables in one SELECT statement.

  • Use a subquery to retrieve data from one table based on values from another table.

  • Use a UNION statement to stack the results of multiple queries (or tables) into one result.

  • In some database systems (for example, MySQL and SQL Server), a JOIN can also be used inside UPDATE and DELETE statements, although the syntax varies by system.
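To compare the first two approaches in the list above, the sketch below answers the same question twice, once with a JOIN and once with a subquery, using Python's built-in sqlite3 module. The Students and Grades rows are hypothetical, and the choice of an IN subquery is just one possible way to write it.

```python
import sqlite3

# Hypothetical data: Ana has two grades of 100, Cy has one, Ben has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (StudentID INTEGER PRIMARY KEY, StudentName TEXT);
    CREATE TABLE Grades (StudentID INTEGER, Course TEXT, Grade INTEGER);
    INSERT INTO Students VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO Grades VALUES
        (1, 'Algebra', 100), (1, 'Calculus', 100),
        (2, 'Civics', 88), (3, 'Optics', 100);
""")

# Approach 1: JOIN the tables, then filter. DISTINCT removes repeated names.
join_names = [row[0] for row in conn.execute("""
    SELECT DISTINCT Students.StudentName
      FROM Students
      JOIN Grades ON Students.StudentID = Grades.StudentID
     WHERE Grades.Grade = 100
     ORDER BY Students.StudentName
""")]

# Approach 2: a subquery finds the matching IDs; the outer query looks up names.
subquery_names = [row[0] for row in conn.execute("""
    SELECT StudentName
      FROM Students
     WHERE StudentID IN (SELECT StudentID FROM Grades WHERE Grade = 100)
     ORDER BY StudentName
""")]

print(join_names == subquery_names)  # True
```

The subquery version never produces repeated names, because it only reads each Students row once; the JOIN version makes it easier to also select columns from Grades.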

Tips for learning more about SQL JOINs

If you're looking to do SQL projects or to get a job using SQL, you may need to build your knowledge and skills. Make sure you learn from reliable materials. Check that the trainer or instructor has advanced competencies in SQL. Read reviews and analyze the coursework or learning structure.

Tutorials 

Many tutorials are available on the internet that can help you learn SQL. These tutorials are often free and provided by competent people in their field. Learning through tutorials requires some planning. If you choose this route, make sure you follow a logical learning structure to learn all the foundational building blocks for working with SQL. For example, you will need a solid understanding of databases.

Online courses

There are many online courses with which you can learn SQL. Some of these courses are free, and some charge a fee. Some of the paid courses are comprehensive and offer value for money. Courses provide you with a structured learning process and can be an excellent way to build knowledge. 

Certifications 

There are plenty of SQL certifications for you to choose from. Certificates allow you to demonstrate to employers that you have passed an examination testing your SQL knowledge and can be particularly helpful if your resume doesn't contain much SQL experience.

Next steps 

If you want to learn more about SQL, consider taking one of the courses on Coursera. The Introduction to Structured Query Language (SQL) course offered by the University of Michigan is a good place to start your journey. By taking this course, you can learn how to create a MySQL database step by step and learn more about the SQL language.