Open CheatSheet Back To List Next Tutorial

Data Types (Video Tutorial)

Data Types

Intro

  • This is the second in a series of tutorials on how to use a relational database.
  • We are using Postgres, and the PG Admin graphical interface.
  • We have a database created named bookstore with tables for author and book.
  • In this tutorial we will discuss Data Types.

Read it, watch it, do it, review it:
  • There is both a written and accompanying video version of this tutorial. There are timestamps under the headings that align with the video version of the topic.
    • Read it: For each topic heading, First read the topic. 
    • Watch it: Then watch the associated section of the video.
    • Do it: Then follow the instructions to replicate the steps in your own version of the bookstore database.
  • The Database CheatSheet has a major category that aligns with this tutorial.
    • Review it: When you are done with the whole tutorial, open the CheatSheet and review the Postgres category Data Types section from this tutorial. Make sure you understand everything, and can refer back to the CheatSheet when you are working on your own projects in the future. 


Data Types

[0:00 Video timestamp]

The generic statement to create a table is:
CREATE TABLE tablename (fieldname datatype, fieldname datatype); 

The SQL statements we used to create the author and book tables are below:
CREATE TABLE authors (
  id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
name VARCHAR(255),
bio TEXT,
  date_of_birth DATE
);
CREATE TABLE books (
id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
isbn CHAR(13),
title VARCHAR(255),
author VARCHAR(100),
description TEXT,
quantity INTEGER,
price NUMERIC(5,2),
on_sale BOOLEAN,
published_date DATE,
created_at TIMESTAMP
);

In the create table statement, the column names are listed with the data type to the right of them.

Using the PG Admin Tool
[1:17 Video timestamp]
You can view the Columns and their Data Types in the PG Admin GUI tool:
Open PG Admin, in the Browser column click: Servers > PostgreSQL > Databases > bookstore > right click the table name > Properties > Columns tab.

Data type overview
[3:34 Video timestamp]
The data type constrains the type of data that can be saved to the database. It is an integrity check to make sure we only get the type of data we expect.

Each database vendor handles data types in their own way. 
There are SQL standards for data type names, which the major database vendors comply with. They have preferred data type names and have aliases that you can use as well.
  • For Postgres the full list of data types is at: postgresql.org/docs/current/datatype 
  • Capitalization: There is no consensus on whether you should put data type names in uppercase. To be consistent with SQL keywords, we are using uppercase.

We can group the most commonly used data types into the following categories: Character, Numeric, Date/time, and Boolean.

Character data types
[3:52 Video timestamp]

Ref: postgresql.org/docs/current/datatype-character
In our create table statements we used three different data types for characters.
  • CHARACTER VARYING(n) | alias: VARCHAR(n) Use this type for text fields where you want to put a limit on the number of characters allowed. Replace n with the character limit. 
    • We used this for author name, and the book title and author. We know these fields won't have a huge number of characters so we're limiting them.
  • CHARACTER(n) | alias: CHAR(n) Use this type for text that has a fixed-length. Replace n with the number of characters expected for this field. Values less than n length are padded with spaces.
    • Book isbn numbers are 13 characters long so we can use this data type to ensure 13 characters are entered. If the value is less than 13 characters, spaces are inserted at the end for the missing characters.
  • TEXT Use this type for text fields where we do not set a character limit.
    • We used this for the author bio, and book description fields.
  • VARCHAR and TEXT can store up to 1GB of data.
  • Per the Postgres documentation, there is no performance difference between VARCHAR(n), VARCHAR, and TEXT.

Numeric data types
[7:22 Video timestamp]

Ref: postgresql.org/docs/current/datatype-numeric
  • INTEGER | alias: INT Integers are numbers with no decimal points. The number must be within the range of plus or minus 2.1 billion. 
    • BIGINT If the number exceeds the Integer range you can use the BIGINT data type.
    • We used Integer data type for the id fields for both tables, and for the book quantity field. 
  • NUMERIC | alias: DECIMAL Allows decimal points. 
    • NUMERIC(percision,scale) Specify precision (total digits), and scale (number of digits after the decimal point).
    • We used the numeric data type for the book price field price NUMERIC(5,2)
It has a scale of 2 decimal points and a precision of 5 digits. So the price can be a number up to 999.99.
  • Floating point numbers (aka floats): are decimals where the number of decimal points is not specified. Instead, they can "float" to the appropriate position. Postgres has two float data types: REAL and DOUBLE PRECISION.
    • We did not use this data type in our tables. If we used the real data type for our price field then a price of 1.50 would be expressed as 1.5. If there was a 7% discount, then the new price (1.5 * .93) would be expressed as 1.395.   

Boolean data type (true or false)
[9:28 Video timestamp]

Ref: postgresql.org/docs/current/datatype-boolean
  • BOOLEAN Boolean has two possible values, true or false. 
    • True: Values Postgres will treat as true are true, yes, on, 1; 
    • False: Values Postgres will treat as false are false, no, off, 0.
    • We used this data type for the on_sale field.

Date/Time data types
[9:55 Video timestamp]

Ref: postgresql.org/docs/current/datatype-datetime
  • DATE 'YYYY-MM-DD'
    • We used this data type for the date_of_birth, published_date fields because time is not needed.
  • TIME 'hh:mm:ss'
  • TIMESTAMP 'YYYY-MM-DD hh:mm:ss[.fraction] [time zone]'
    • Timestamp values are converted from the current time zone to UTC for storage, and back to current time zone on retrieval.
    • Current time zone defaults to the server's time.
    • We used timestamp as the data type for the created_at field because precise time down to the second could be important. 



Auto-incrementing Row IDs

[10:47 Video timestamp]

In both tables, the first row was id with data type INT. The full lines were:
id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,

Generated always as identity makes the value auto-increment by 1 every time a new row is added.
The first row is automatically given an ID value of 1 by the database when it is created. The second row is automatically given an ID of 2. And so on.
We will discuss primary key in the next tutorial.

Open CheatSheet Back To List Next Tutorial