Jump to content

Fundamentals of databases : Relational databases

From Wikibooks, open books for an open world


PAPER 2 - ⇑ Fundamentals of databases ⇑

← Conceptual data models and entity relationship modelling Relational databases Database design and normalisation techniques →


What is a relational database?

[edit | edit source]

A relational database is a method of storing data, where we divide the data to fit into specific structures which have relationships between them.

Imagine a relational database is being used by a vet. The data would be divided up into tables, each representing an entity or 'thing':

Entity - A category of item, e.g. Animal, Customer


All details about animals the vet cares for would be stored inside the animal table, details about the different medicines the vet uses would be stored in the medicine table and so on.

The database also models the relationships between the entities. In the vet example, we have tables for Customer and Animal. Each customer may own one (or more) animals, and each animal belongs to one customer, so there is a relationship between the records in these two tables.

Tables

[edit | edit source]
Table - The data structure which stores all of the data about a given entity, i.e. a table is how the entity is represented in the database


Tables consist of columns and rows which can be imagined in the same format as we may imagine a spreadsheet.


Exercise: Picking Tables

List the tables you might need to store a database on a football league

Answer:

Team
Player
Match
Stadium
Referee

List the tables you might need to record details for an online store

Answer:

Staff
Products
Receipts
Customers


Attributes

[edit | edit source]

Each column represents an attribute which is a labelled element of data we want to store about the entity represented by this particular table. Here are some examples of entities and possible attributes those entities might have:

Animal: Name, Weight, IsNeutered
Customer: First name, Last name, Address, Post code

When the table is created, the data type for each attribute is specified. This allows the database software to allocate sufficient memory to be able to store any potential data that is input, as well as making it able to enforce a type check on the data. For example in our table for Animal we may have the following attributes:

Attribute Data type
Name Text
Weight Real
IsNeutered Boolean

Records

[edit | edit source]

Each row in a table represents a single instance of an entity and is called a record. For example within the table Animal we may have rows for Chewbacca the dog, Tabitha the cat and Percy the fish, each being an instance of an Animal.

Name Weight TypeOfAnimal ...
Chewbacca 8 Dog
Tabitha 3 Cat
Percy 0.25 Fish
Exercise: Picking attributes

For a social networking site, what attributes and data types would you include for the table of people

Answer:

Name: string
Date of Birth: Date
Gender: string (character might also work here)
Public profile: boolean
Personal description: String

For a social networking site, what attributes and data types would you include for the table of relationships (the table that stores 'friend' 'sister' 'father' etc)

Answer:

Person1 ID: integer
Person2 ID: integer
Type of relationship: string
Date created: Date


Primary Key

[edit | edit source]

Every record in a relational database must have its own unique Primary Key (PK) attribute, which provides a way of identifying that specific record out of all of the records in the entire table.

Primary Key (PK) - An attribute which uniquely identifies a record in a database table.


The key attribute must be chosen carefully so that it is truly unique. For example we could not choose the attribute "Name" as the primary key for our Animal table because it is extremely likely that two animals brought to the vet will have the same name.

A common method of ensuring a unique Primary Key field is to allow the database software to automatically generate the next integer in sequence when a new record is added. The Primary Key attribute is often named <TableName>ID (in this example Animal_ID), although this is not compulsory.

Animal_ID Name Weight TypeOfAnimal ...
1 Chewbacca 8 Dog
2 Tabitha 3 Cat
3 Percy 0.25 Fish

When writing table designs in shorthand, the Primary Key field is usually written first and underlined:

TableName(PrimaryKey, Attribute, Attribute, Attribute, Attribute)
Animal(Animal_ID, Name, Weight, TypeOfAnimal)

Alternatively, in an Entity Relationship Diagram, the Primary Key may be denoted by a *

Animal_ID*
Name
Weight
TypeOfAnimal

Example: Picking a Primary Key

Let's look at an example in the Criminal Table:

Table: Criminal
Attributes:
NI Number: String
Name: String
Date of Birth: Date
Number of scars: Integer
Home town: String

Which of these data items are unique?

Attribute Unique Reason
Home town No you might have several criminals living in the same town
Number of scars No you might have two criminals with the same number of scars
Date of Birth No you might have two criminals born on the same day, or twins who are criminals
Name No you might have two criminals with the same name. e.g. John Smith and John Smith
NI Number Yes this is unique for each person

We can then write the tables as:

Criminal(NI Number, Name, Date of Birth, Number of scars, Home town)

Composite Primary Key

[edit | edit source]

Sometimes a combination of attributes are used together as a key instead of one single attribute. The uniqueness is only guaranteed when the attributes are combined together. For example, in a table concerning hotel bookings, a composite key could be created by combining the RoomNumber attribute and the attribute for the StartDate of the stay. It is not possible to have two different bookings for the same room starting on the same date, so this composite key uniquely identifies each record.

RoomNumber StartDate NumberOfGuests
8 24 April 2016 2
2 12 May 2016 1
8 6 July 2016 2

RoomNumber on its own is insufficient as a unique identifier as lots of bookings are made for the same room. StartDate on its own is insufficient as a unique identifier as lots of bookings are made which start on the same date.


Example: Picking a Composite Key

Take a look at this example for houses:

Table: House
Number: integer
Road: string
Colour: string
Post code: string
Attribute Unique Reason
Number No you might have a 61 on London Rd and a 61 on Manchester Rd
Road No you might have multiple houses on the same road
Colour No more than one house might be green
post code No multiple houses might have the same post code

We could make a primary key up, but if you look carefully, we can use a combination of attributes. The house number and the road name combined seem to be unique. You can't have two 45 Belmont Close can you?.

House(Number, Road, Colour, Post code)

Does that sound ok? What about if we were storing data on all the towns in the country and there was a 5 London Road in Manchester and a 5 London Road in Winchester. This would mean that the combination was not unique. We might try using the house number and post code instead, and this combination is always unqiue, this is our composite key:

House(Number, Road, Colour, Post code)


Exercise: Keys

Spot the primary key in the following table attributes, and write out the table design:

Table: Car
Registration number: string
Colour: string
Number of doors: integer
Convertable: boolean

Answer:

Car(Registration number, Colour, Number of doors, Convertible)

Spot the primary key in the following table attributes:

Table: Student record
Name: string
Unique Learner Number: integer
Date of Birth: date
Mobile number: integer

Answer:

Student(Unique Learner Number, Name, Date of Birth, Mobile number

NOT mobile number as they might share a phone with someone else.

Where applicable rewrite the table description with composite keys, primary keys, or add a primary key for the following tables:
Receipt(CustomerID, DateTime, Total, StaffID)

Answer:

Receipt(CustomerID, DateTime, Total, StaffID)

Match(TeamA, TeamB, Date, TeamAScore, TeamBScore, RefID)

Answer:

Match(TeamA, TeamB, Date, TeamAScore, TeamBScore, RefID)

Phone(ModelID, Colour, Weight, Internet)

Answer:

Phone(ModelID, Colour, Weight, Internet)

You might argue that if this was for a particular instance of a phone instead of for a model of phone you'd need to introduce a unique value through serial number or the like

Cat(Colour, Weight, NumberofLegs, Name)

Answer:

There aren't enough fields here for us to find a unique one or combination. We could have two ginger cats weighing 1 kg with three legs called Phil. We therefore need to introduce a new unique value, CatID Cat(CatID, Colour, Weight, NumberofLegs, Name)

Foreign Key

[edit | edit source]
Foreign Key (FK) - A field in a table which corresponds to the Primary Key field of a different table


Foreign keys are used to link tables together and cross reference between tables. For example here is the Animal table from our vet example with an added foreign key of Owner_ID:

Animal_ID Name Weight TypeOfAnimal Owner_ID ...
1 Chewbacca 8 Dog 2
2 Tabitha 3 Cat 2
3 Percy 0.25 Fish 6

Owner_ID is a foreign key in the table Animal because it is the primary key of the Owner table.

The foreign key can be used to look up any of the other attributes of the owner, because it corresponds to an Owner_ID in the Owner table, which is the primary key and thus is guaranteed to refer to exactly one of the records.