Create and View Tables

Now that we have created our sql_book database, we have the 'outer shell' of our building. Our building doesn't currently have any rooms however, so if we were to fill that building with stuff, that stuff wouldn't be any more organised inside the building than outside of it. We need to add some rooms so that we can organise our stuff properly, or in SQL terms we need to add tables.

Database tables, sometimes referred to as relations, and the relationships between those tables are what provide the structure we need to house our data. Tables can represent real-world entities that constitute the business logic of an application, like a customer or an order. The tables and the relationships between them are used to store data to represent our business logic. For example, consider representing a customer who has orders associated with them.

Table Creation Syntax

To create a table we can use the CREATE TABLE SQL statement. In its most simple form, it's very similar to the CREATE DATABASE SQL statement we looked at in the previous chapter:

CREATE TABLE some_table();

The only difference is the use of TABLE instead of DATABASE and the empty parentheses at the end of the table name. If we issued this command, it would create a table called some_table in the database that we're currently connected to; such a table wouldn't be of much use to us however, since it wouldn't have any columns. Without columns we wouldn't be able to use the table to store any data. Columns are what we use to give tables their structure.

To create a table with columns, we need to place column definitions in between the parentheses. Each column definition is generally written on a separate line, separated by a comma. The basic format of a CREATE TABLE statement is:

CREATE TABLE table_name (
    column_1_name column_1_data_type [constraints, ...],
    column_2_name column_2_data_type [constraints, ...],
    .
    .
    .
    constraints
);

The brackets ([]) in the above example indicate optional parts of the CREATE TABLE command, and the ellipses (...) indicate that there can be more than one constraint.

Column names and data types are a required part of each column definition; constraints are optional. We'll look at constraints in more detail later in the book, but one thing to note from the above format is that constraints can be defined either at the column level or at the table level.

Creating a users table

In the sql_book database we want to store a list of users; for each user we want to store an id for that user, their username, and whether their account is enabled or not. In order to have somewhere to contain these different pieces of user data, we need to create a table with an appropriate column for each piece of data.

Below is the SQL statement to create a table, named users, using the CREATE TABLE statement:

CREATE TABLE users (
       id serial UNIQUE NOT NULL,
       username char(25),
       enabled boolean DEFAULT TRUE
);

Let's break this down a little to understand exactly what's going on here:

  1. CREATE TABLE: Firstly, CREATE TABLE users is the primary command.
  2. users: The name of the table that will be created.
  3. (): The information in the parentheses is related to the columns in the table.
  4. id, username, enabled: These are the three columns of the table.
  5. serial, char(25), boolean: These are the data types of the columns. We will look at data types shortly.
  6. UNIQUE, NOT NULL: These are constraints. We'll talk about these later in this chapter.
  7. DEFAULT TRUE: Specifies a default value for the column. We'll revisit this later.
  8. Notice that each column definition is comma separated; this is the standard in any SQL database management system.

Issue the command while connected to the sql_book database via the psql console. You should receive the following response:

CREATE TABLE

In our sql_book database, we should now have a users table with three columns: id, username, and enabled. When defining those columns in our table creation statement, as well as specifying the column name, we specified a data type for each column, and a constraint for two of them. Let's explore what those two things mean a little more.

Data Types

The three columns we created, id, username, and enabled, were all created with different data types, serial, char(25), and boolean respectively. A data type classifies particular values that are allowed for that column. This can help protect our database from data of an invalid type being entered. We'll look at this in a bit more detail later, when we start adding data to our table.

The table below lists some common data types. We'll be using a number of these in the rest of the book and you will encounter others as you continue to use SQL.

Column Data Type Description
serial This data type is used to create identifier columns for a PostgreSQL database. These identifiers are integers, auto-incrementing, and cannot contain a null value.
char(N) This data type specifies that information stored in a column can contain strings of up to N characters in length. If a string less than length N is stored, then the remaining string length is filled with space characters.
varchar(N) This data type specifies that information stored in a column can contain strings of up to N characters in length. If a string less than length N is stored, then the remaining string length isn't used.
boolean This is a data type that can only contain two values "true" or "false". In PostgreSQL, boolean values are often displayed in a shorthand format, t or f
integer or INT An integer is simply a "whole number." An example might be 1 or 50, -50, or 792197 depending on what storage type is used.
decimal(precision, scale) The decimal type takes two arguments, one being the total number of digits in the entire number on both sides of the decimal point (the precision), the second is the number of the digits in the fractional part of the number to the right of the decimal point (the scale).
timestamp The timestamp type contains both a simple date and time in YYYY-MM-DD HH:MM:SS format.
date The date type contains a date but no time.

As of v. 10, PostgreSQL added the standard (ANSI/ISO SQL:2003) IDENTITY syntax to handle auto-incrementing key values. The use of serial is no longer recommended for new (production) applications. This is for some (fairly technical) reasons, which one of the developers of the newer syntax explains here.

For the purposes of working through this book however, and for learning PostgreSQL in general, this isn't something that you need to worry about. The code examples in this book were created under version 9 of PostgreSQL, and versions prior to v10 are still in widespread use (the default version on AWS Cloud9 is 9.6.1, for example). Furthermore, the way in which we use serial in the book isn't impacted by the compatibility and permission management issues which IDENTITY seeks to address.

Keys and Constraints

While data types are a mandatory part of a column definition, constraints are optional. They are extremely useful however, and more often than not you'll want to add some kind of constraint to your columns.

One of the key functions of a database is to maintain the integrity and quality of the data that it is storing. Keys and Constraints are rules that define what data values are allowed in certain columns. They are an important database concept and are part of a database's schema definition. Defining Keys and Constraints is part of the database design process and ensures that the data within a database is reliable and maintains its integrity. Constraints can apply to a specific column, an entire table, more than one table, or an entire schema.

The creation statement for the users table specified some properties, or constraints for each column. Let's go over those now:

  • UNIQUE: The id column has a UNIQUE constraint, which prevents any duplicate values from being entered into that column.

  • NOT NULL: The id column also has a NOT NULL constraint, which essentially means that when adding data to the table a value MUST be specified for this column; it cannot be left empty.

  • DEFAULT: The enabled column has an extra property DEFAULT with a value of TRUE. If no value is set in this field when a record is created then the value of TRUE is set in that field.

We talk more about UNIQUE, NOT NULL and DEFAULT when we look at adding data into this table later in the book.

We haven't run into keys just yet, but we will later on. They come into play when we have to set up relationships between different database tables. They're also useful for keeping track of unique rows within a database table. We'll explore keys and constraints in more detail later on.

View the Table

In a large database with lots of different tables, we might want to view a list of all the tables that exist in the database. We can use the \dt meta-command to show us a list of all the tables, or relations, in the database.

If we issue this meta-command when connected to our sql_book database, there is only one row with our table users since that's the only table we have so far created.

\dt
         List of relations
 Schema | Name  | Type  |   Owner
--------+-------+-------+-----------
 public | users | table | User
(1 row)

The \dt meta-command is useful for quickly viewing a list of all the tables in a database, but the information it provides is fairly limited. What if we want to see more detailed information about a particular table, such as the names of its columns, and the column data types and properties?

For this kind of more detailed information we can use the \d meta-command. This lets us describe a table. Thus, we can use \d users to see information on the users table. In the output below, each row is a column in the users table, along with that column's data type and properties.

\d users
        Table "public.users"
  Column  |     Type      |  Modifiers
----------+---------------+----------------------------------------------------
 id       | integer       | not null default nextval('users_id_seq'::regclass)
 username | character(25) |
 enabled  | boolean       | default true
Indexes:
   "users_id_key" UNIQUE CONSTRAINT, btree (id)

Note: the output of \d varies between versions of PostgreSQL. The look is basically the same, but the column names may be different, and there may be other columns present.

You might notice that the Type for our id column is integer. This might seem a bit strange given that our table creation statement specified a type of serial. serial is a special data type available in PostgreSQL. It uses the integer data type along with a DEFAULT constraint and a function called nextval which keeps a track of the current highest value and increments this by one to be used as the next value. We'll be looking at some other functions in more detail later in this book.

You might also have noticed that our table has an index users_id_key; this index was created when we added the UNIQUE constraint. Indexes are a deep topic, and not one that we'll be covering in this book. For the purposes of our sql_book database, it is useful to think of indexes as a way of storing a quick-reference to values in a particular column. We'll see a basic example of how they work in the context of a UNIQUE constraint when we look at inserting data to our table later in the book.

Schema and DCL

One thing to note about the information returned by the \dt and \d meta-commands is that, although the information is displayed in a tabular format, it relates only to the schema of the database not the data. Remember that a database's schema is what provides the structure we need to house our data.

When we've previously spoken about creating the schema for a database, we've explained that we use one of the SQL sub-languages, DDL, to do this. There are actually parts of a database's schema that are controlled and managed by another of SQL's sub-languages, DCL (Data Control Language).

DCL is concerned with controlling who is allowed to perform certain actions within a database, or in other words with the 'security' of a database. Although we won't look at DCL in detail in this book, it's worth understanding that the 'security settings' determined by DCL are also part of the database's schema. We can see a clue to this if we look at the information returned by the \dt meta command. When we use \dt, we see a table of information, Schema, Name, Type, and Owner.

All of these aspects of the database's schema could be used by DCL to allow or restrict access to certain parts of the database or specific tables. The value in the Owner column in particular is very much a DCL concern; for example you could add a restriction to a table so that other users can add, read, update and delete data from the table but only the owner can alter the structure of the table or delete the table entirely. You can perhaps think of this in terms of different users having different permissions in the database.

Summary

We've covered quite a lot of ground in this chapter; let's quickly recap some of the main points.

  • Tables are created using the CREATE TABLE SQL command
  • Table column definitions go between the parentheses of the table creation statement
  • Table column definitions consist of a column name, a data type, and optional constraints
  • There are many different data types, and these can be used to restrict the data that can be input to a column
  • Constraints can also be used on the data that is input to a particular column
  • We can view a list of tables or the structures of a particular table in the psql console using meta-commands
  • Although database schema is largely a DDL (Data Definition Language) concern, parts of it, such as access and permissions, are determined by DCL (Data Control Language)

Here are some of the commands we looked at:

Command Notes
CREATE TABLE users.. Creates a new table called users
\dt Shows the tables in the current database
\d users Shows the schema of the table users

In the next chapter, we'll continue learning about DDL, and try our hand at altering a database and its tables.