Jump to content

Structured Query Language/Data Types

From Wikibooks, open books for an open world



The SQL standard knows three kinds of data types

  • predefined data types
  • constructed types
  • user-defined types.

This page presents only the predefined data types. Constructed types are one of ARRAY, MULTISET, REF(erence) or ROW. User-defined types are comparable to classes in object-oriented language with their own constructors, observers, mutators, methods, inheritance, overloading, overwriting, interfaces, and so on.

Overview

[edit | edit source]

The standard groups predefined data types into types with similar characteristics.

  • Character Types
  • Character (CHAR)
  • Character Varying (VARCHAR)
  • Character Large Object (CLOB)
  • Binary Types
  • Binary (BINARY)
  • Binary Varying (VARBINARY)
  • Binary Large Object (BLOB)
  • Numeric Types
  • Exact Numeric Types (NUMERIC, DECIMAL, SMALLINT, INTEGER, BIGINT)
  • Approximate Numeric Types (FLOAT, REAL, DOUBLE PRECISION)
  • Datetime Types (DATE, TIME, TIMESTAMP)
  • Interval Type (INTERVAL)
  • Boolean
  • XML
  • JSON (since SQL:2016)

Character types hold printable characters, binary types any binary data. Both may have a fixed or variable size with an upper limit. If the upper limit exceeds a certain value, the type is a 'large object' with special methods and functions.

Exact numeric types hold numeric values without digits after the decimal or with a firm number of digits after the decimal. Please note that the standard does not define a separate data type 'auto-increment' for generating primary keys. Instead it defines the phrase 'GENERATED ALWAYS AS IDENTITY' as part of the CREATE TABLE statement, see CREATE TABLE statement or auto-increment-columns.

Approximate numeric types hold numeric values with an implementation-defined precision (after the decimal).

Temporal types hold values for INTERVAL (a certain range on the time bar), DATE (year, month, day), TIME with and without TIMEZONE (name of timezone, hour, minute, second including fraction) and TIMESTAMP with and without TIMEZONE (name of timezone, year to second including fraction).

The boolean data type holds the two values true and false.

Part 14 of the SQL standard extends the list of predefined data types by introducing the data type XML (Oracle calls it XMLType) together with a bunch of particular functions. Columns of this type hold XML instances.

In the outdated SQL-2 standard, there was a data type 'BIT'. This data type is no longer part of the standard.

Most DBMS implement the majority of predefined data types, but there are some exceptions. Also, the naming differs slightly. An overview of the major implementations is available in the Wikibook SQL_Dialects_Reference.

Data types are used within the CREATE TABLE statement as part of column definitions - or during CAST operations.

CREATE TABLE <tablename> (
  <column_name> <data_type> ... ,
  <column_name> <data_type> ... ,
  ...
);

Character

[edit | edit source]

A series of printable characters - which is a string - can be stored within character string types. If all rows of a table use the same fixed size for the strings, the data type is CHAR(<n>) where <n> is the size of the strings. If the size varies from row to row, the data type VARCHAR(<n>) defines that up to <n> characters can be stored in the column. So <n> defines the upper limit for this column. The maximum value for <n> depends on the used DBMS implementation. If applications need to store longer strings than it is allowed by this upper system limit, the data type CLOB must be used. Also CLOB has its own upper limit, but this is significantly greater than the upper limit of VARCHAR.

-- A table with columns of fixed and variable size strings and a CLOB string
CREATE TABLE datatypes_1 (
  id    DECIMAL PRIMARY KEY,
  col_1 CHAR(10),       -- exactly 10 characters
  col_2 VARCHAR(150),   -- up to 150 characters
  col_3 CLOB            -- very large strings (MySQL denotes this data type: 'LONGTEXT')
);

Hint: Unlike other programming languages, SQL does not distinguish between a character data type and a string data type. It knows only the character string data types CHAR, VARCHAR and CLOB.

Binary

[edit | edit source]

Binary data types are similar to character data types. They differ in that they accept a different range of bytes. Binary data types accept all values.

-- A table with columns of fixed and variable size binary data and a BLOB
CREATE TABLE datatypes_2 (
  id    DECIMAL PRIMARY KEY,
  col_1 BINARY(10),     -- exactly 10 byte
  col_2 VARBINARY(150), -- up to 150 byte
  col_3 BLOB            -- very large data: jpeg, mp3, ...
);

A hint to Oracle users: The data type BINARY is not supported, the data type VARBINARY is denoted as RAW and is deprecated. Oracle recommends the use of BLOB.

Exact Numeric

[edit | edit source]

Exact numeric types hold numeric values without digits after the decimal or with a firm number of digits after the decimal. All exact numeric types are signed.

NUMERIC(<p>,<s>) and DECIMAL(<p>,<s>) denotes two types which are nearly the same. <p> (precision) defines a fixed number of all digits within the type and <s> (scale) defines how many of those digits follow the decimal place. Numeric values with more than (p - s) digits before the decimal place cannot be stored and numeric values with more than s digits after the decimal place are truncated to s digits after the decimal place. p and s are optional. It must always be: p ≥ s ≥ 0 and p > 0.

SMALLINT, INTEGER and BIGINT denote data types without a decimal place. The SQL standard did not define their size, but the size of SMALLINT shall be smaller than the size of INTEGER and the size of INTEGER shall be smaller than the size of BIGINT.


-- A table using five exact numeric data types
CREATE TABLE datatypes_3 (
  id    DECIMAL PRIMARY KEY,
  col_1 DECIMAL(5,2),    -- three digits before the decimal and two behind
  col_2 SMALLINT,        -- no decimal point
  col_3 INTEGER,         -- no decimal point
  col_4 BIGINT           -- no decimal point. (Not supported by Oracle.)
);

Approximate Numeric

[edit | edit source]

Approximate numeric types hold numeric values with an implementation-defined precision (after the decimal). All approximate numeric types are signed. Their primary use cases are scientific computations.

There are three types: FLOAT (<p>), REAL and DOUBLE PRECISION, where p denotes the guaranteed precision of the FLOAT data type. The precision of REAL and DOUBLE PRECISION is implementation defined.

-- A table using the approximate numeric data types
CREATE TABLE datatypes_4 (
  id    DECIMAL PRIMARY KEY,
  col_1 FLOAT(2),     -- two or more digits after the decimal place
  col_2 REAL,
  col_3 DOUBLE PRECISION
);

Temporal

[edit | edit source]

Data types with respect to temporal aspects are: DATE, TIME, TIMESTAMP and INTERVAL.

DATE stores year, month and day. TIME stores hour, minute and second. TIMESTAMP stores year, month, day, hour, minute and second. Seconds can contain digits after the decimal. TIME and TIMESTAMP can contain the name of a TIME ZONE.

The SQL standard defines two kinds of INTERVALs. The first one is an interval with year and month, the second one is an interval with day, hour, minute, and second.

-- A table using temporal data types
CREATE TABLE datatypes_5 (
  id    DECIMAL PRIMARY KEY,
  col_1 DATE,                       -- store year, month and day (Oracle: plus hour, minute and seconds) 
  col_2 TIME,
  col_3 TIMESTAMP(9),               -- a timestamp with 9 digits after the decimal of seconds
  col_4 TIMESTAMP WITH TIME ZONE,   -- a timestamp including the name of a timezone
  col_5 INTERVAL YEAR TO MONTH,
  col_6 INTERVAL DAY TO SECOND(6)   -- an interval with 6 digits after the decimal of seconds
);

A hint to Oracle users: The data type TIME is not supported. Use DATE instead.

A hint to MySQL users: The use of TIME ZONE as part of data types is not supported. MySQL implements a different concept to handle time zones. Fractions of seconds are not supported. The data type INTERVAL is not supported, but there is a data value INTERVAL.

Boolean

[edit | edit source]

SQL has a 3-value-logic. It knows the boolean values true, false and unknown. Columns of the boolean data type can store one of the two values true or false. unknown is represented by storing no value, which is the NULL indicator.

-- A table with one column of boolean
CREATE TABLE datatypes_6 (
  id    DECIMAL PRIMARY KEY,
  col_1 BOOLEAN     -- not supported by Oracle
);

Part 14 of the SQL standard extends the list of predefined data types by introducing the data type XML. The standard also defines a wide range of particular functions for this data type.

-- A table with one column of data type XML 
CREATE TABLE datatypes_7 (
  id    DECIMAL PRIMARY KEY,
  col_1 XML
);

A hint to Oracle users: The data type XML is denoted as XMLType.

A hint to MySQL users: The data type XML is not supported.

Domains

[edit | edit source]

In the context of data types, the standard knows domains. The purpose of domains is to constrain the set of valid values that can be stored in a column. The domain-concept is a very early predecessor of user-defined types and may be outdated.

Clean Up

[edit | edit source]
DROP TABLE datatypes_1;
DROP TABLE datatypes_2;
DROP TABLE datatypes_3;
DROP TABLE datatypes_4;
DROP TABLE datatypes_5;
DROP TABLE datatypes_6;
DROP TABLE datatypes_7;

Exercises

[edit | edit source]

Create a table 'company' with columns 'id' (numeric, primary key), 'name' (strings of variable size up to 200), 'isin' (strings of length 12), 'stock_price' (numeric with 2 digits before and 2 after the decimal), 'description_text' (a very large string) and description_doc (any binary format).

Click to see solution
CREATE TABLE company (
  id               DECIMAL PRIMARY KEY,
  name             VARCHAR(200),
  isin             CHAR(12),
  stock_price      DECIMAL(4,2),
  description_text CLOB,
  description_doc  BLOB
);