Open CheatSheet Back To List Next Tutorial

Joins (Video Tutorial)

Joins

Intro

  • This is the sixth in a series of tutorials on how to use a relational database.
  • We are using Postgres, and the PG Admin graphical interface.
  • We have a database named bookstore with two tables in it, authors and books.
  • In this tutorial we cover table joins.

Read it, watch it, do it, review it:
  • There is both a written and accompanying video version of this tutorial. There are timestamps under the headings that align with the video version of the topic.
    • Read it: For each topic heading, First read the topic. 
    • Watch it: Then watch the associated section of the video.
    • Do it: Then follow the instructions to replicate the steps in your own version of the bookstore database.
  • The Database CheatSheet has a major category that aligns with this tutorial.
    • Review it: When you are done with the whole tutorial, open the CheatSheet and review the Joins category from this tutorial. Make sure you understand everything, and can refer back to the CheatSheet when you are working on your own projects in the future.

What are joins?
[0:00 Video timestamp]

Joins are data queries on two (or more) tables. The tables are associated with each other through matches on a joining column.

Inner joins are the most common type of join. Inner joins only select rows that have associations in both tables.
There are two ways to create an inner join. Through the WHERE clause, or with a JOIN clause.

Outer joins can include rows that do not have matches on the associated column values.
There are three kinds of outer joins: left outer joinright outer join, and full outer join.

We will cover all of these topic in this tutorial.



Setup

In our bookstore database, let's temporarily drop the not null constraint on author_id, then set the first book's author_id to null.
We are doing this just to better demonstrate the different types of joins.

Open the PG Admin Query Tool and execute the below statements:
ALTER TABLE books ALTER COLUMN author_id DROP NOT NULL; 
UPDATE books SET author_id = null WHERE id = 1;

Then view the books data with:
SELECT * FROM books;
Book where id = 1 should have author_id of null.



Inner Join using a WHERE clause:

[3:33 Video timestamp]

  • You can join two associated tables using a WHERE clause condition.
  • Use the below format:
SELECT table1.column_name, table2.column_name
FROM table1, table2
WHERE table1.associating_column = table2.identifying_column;
  • Put both table names in the FROM clause.
  • In the SELECT clause, optionally prefix the table name to the column name. This is required if the same column name is used in both tables, such as the id column.
  • The WHERE clause is where you connect the tables by matching associated columns.

Inner Join the books and authors tables using a WHERE clause:
The books table has a column for author_id. If we want to get the author's name, we have to go to the authors table, which we can do with an inner join on the two tables.
In the PG Admin Query Tool, execute the below statement:
SELECT books.id AS book_id, books.title, authors.name AS author, books.quantity, books.price
FROM books, authors
WHERE books.author_id = authors.id
ORDER BY book_id;

This will return all the book rows that have a valid author_id.

  • It will display the following column names: book_id, title, author, quantity, and price.
    • The id field is present in both tables so to select it we had to preface it with the table name: books.id 
    • To make the results easier to interpret we aliased id from the books table as book_id and name from the authors table as author.
  • In the FROM clause we list both the books and authors tables.
  • We join the tables in the WHERE clause by connecting the books.author_id column with authors.id
    • books where author_id is null or where the author_id value is not found in the authors table will be excluded.



Inner Join using a JOIN clause:

[6:16 Video timestamp]

Instead of making the join in the WHERE clause, we can use the INNER JOIN clause.
  • INNER JOIN joins two associated tables ON the specified columns.
  • Only rows with a valid association between the two tables are returned.
  • The INNER keyword is optional because it is the default join type in SQL.
SELECT column1, column2, ...
FROM table1
[INNER] JOIN table2 
  ON table1.associating_column = table2.identifying_column;

Inner Join the books and authors tables using a JOIN clause:
Let's repeat our join of books and authors this time using a JOIN clause.

In the PG Admin Query Tool, execute the below statement:
SELECT books.id AS book_id, books.title, authors.name AS author, books.quantity, books.price
FROM books
INNER JOIN authors 
  ON books.author_id = authors.id
ORDER BY book_id;
  • This should return the same result as before. All the book rows that have a valid author_id.
  • INNER JOIN is the default type of join, so it can be replaced with just JOIN.



Outer Joins

[7:11 Video timestamp]

Unlike inner joins, outer joins can include rows that do not have matches on the associated column values.
There are three kinds of outer joins: left outer join, right outer join, and full outer join.

The format for outer joins is below:
SELECT column1, column2, ...
FROM left_table
LEFT [OUTER] JOIN right_table 
  ON left_table.associating_column = right_table.identifying_column;
  • You can replace LEFT with RIGHT or FULL to make it a right or full join.
  • The "OUTER" keyword is optional. It is implied when keywords LEFT, RIGHT, or FULL are used in front of JOIN.

We'll go through the three types of outer joins using our books and authors tables.

Left Outer Join:
  • A left outer join gets all the rows from the left table (authors). Authors that have no books will have null values for the book columns: title, quantity, and price. 
  • Rows from the right table (books) with no associated author (i.e., author_id is null or invalid) are ignored.

Execute the below statement in the Query Tool and review the results:
SELECT authors.name AS author, books.id AS book_id, books.title, books.quantity, books.price
FROM authors
LEFT OUTER JOIN books ON authors.id = books.author_id
ORDER BY author;

Right Outer Join:
[8:20 Video timestamp]

  • Gets all the rows from the right table (books). Books that have no author_id value will have null values for author_id and name.
  • Rows from left table (authors) with no associated books are ignored.

Execute the below statement and review the results:
SELECT authors.name AS author, books.id AS book_id, books.title, books.quantity, books.price
FROM authors
RIGHT OUTER JOIN books ON authors.id = books.author_id
ORDER BY author;

Full Outer Join:
[9:03 Video timestamp]

  • Gets all rows from left table (authors). Authors that have no books will have null values for book title, quantity, and price. 
  • Gets all rows from right table (books). Books that have no author_id value will have null values for author_id and name.

Execute the below statement and review the results:
SELECT authors.name as author, books.id AS book_id, books.title, books.quantity, books.price
FROM books
FULL OUTER JOIN authors ON books.author_id = authors.id
ORDER BY author;

Wrap up:
Set the author_id of book 1 from null back to 1.
Restore the not null constraint on author_id.
UPDATE books SET author_id = 1 WHERE id = 1;
ALTER TABLE books ALTER COLUMN author_id SET NOT NULL; 

Open CheatSheet Back To List Next Tutorial