Jump to content

A-level Computing/OCR/Unit 2.3 Algorithms

From Wikibooks, open books for an open world

Types of Algorithms

[edit | edit source]

Search Algorithms

[edit | edit source]
[edit | edit source]
An example of a linear search

A linear search simply looks at each item in the list of values to be searched through until it finds the correct value. This is acceptably efficient for small amounts of data, however once the volume of data increases it becomes far less efficient. In Big O Notation, it can be defined as O(n), as the time taken for search increases proportionally to the number of data items being searched through.

[edit | edit source]
An example of a binary search

A binary search requires that the data is sorted prior to being searched through. It begins by splitting the list of data in half and determining which half contains the value required. The half which does not contain the required data is disregarded, and the process repeats until the value is found.

Sorting Algorithms

[edit | edit source]

Bubble Sort

[edit | edit source]
An example of a bubble sort.

A bubble sort is the simplest type of sort. It inspects each element individually, and determines if its neighbour is larger or smaller than it. If it is smaller, it remains in its current place and the algorithm moves to the next element. If it is larger than its neighbour, the algorithm swaps the two over. This then iterates over the whole sorting item, until no more shifts are made. At this point the item can be said to be sorted.

Insertion Sort

[edit | edit source]
An example of an insertion sort

This works by creating two separate lists, a sorted list and an unsorted list. Initially an item is pulled from the unsorted list into the sorted list. From here, another item is pulled, and if the item is greater than the current item in the sorted list, it is placed in front, else it is placed behind the sorted item. The next item is pulled in and compared, if it is the largest value it goes at the front, smallest goes to the back, and if it is between both it is placed in the middle. This process repeats until the whole list is sorted.

Quick Sort

[edit | edit source]
An example of a quick sort

The quick-sort algorithm uses a value in the list known as the pivot. Any values larger than the pivot are placed to the right, any smaller go to the left. This creates two sub-lists either side of the pivot. This process repeats with both lists, until all the sub-lists have become pivots. At this point the list can be considered sorted.

Path-Finding Algorithms

[edit | edit source]

Dijkstra's algorithm

[edit | edit source]

Dijkstra's algorithm acts upon a graph and finds the shortest path from node to another. A graph is a collection of nodes connected by arcs. Arcs can have values associated with them, which are referred to as the cost of the arc. Nodes and arcs could represent many different things because they are generally an abstraction of reality. For example, nodes could represent routers on a network and arcs could be the physical connections between them and their costs being the time taken for a signal to go down one route. Alternatively, nodes might represent towns and arcs might represent the roads between them, with their costs being how much petrol is needed to travel down the road.

The algorithm begins at the starting node, set by the user, and attempts to calculate minimum cost accumulated in traversing the graph for all nodes, attempting to improve the calculated cost bit by bit.

  1. Each node is given two values.
    • A temporary cost (integer or float), representing the smallest known cost accumulated in traversing arcs to get to that node. Initially, this is set to infinity for all nodes except the starting node, whose temporary cost is set to zero, as this is the smallest cost accumulating in traversing to it.
    • A visited flag (Boolean), representing whether the current node has been visited. Initially, this is set to false for all nodes except the starting node, which is set to true.
  2. The temporary cost accumulated for each unvisited node connected to the current node is calculated.
    • Each arc has a cost, so the temporary cost for each node is set to that value
    • For successive iterations of Dijkstra's algorithm, the temporary cost is only updated if the calculated cost is less than the node's current temporary cost.
  3. The node with the smallest temporary cost is traversed to.
    • It is now the current node.
    • It is marked as visited.
  4. Steps 2 and 3 are repeated until every node connected to the current node is visited.
    • At this point, the temporary cost accumulated for each node can be considered the minimum possible cost.
    • Once complete, the path taken can be traced back to find the shortest path to a desired node.

This is largely the same as Dijkstra's algorithm, only it makes use of a heuristic to make the search more efficient. A heuristic is effectively an educated guess of what the length of the path from a given node to the target node might be. For example, when finding paths between towns, a heuristic might be the straight line distance between the towns. Heuristics like this that are always an underestimate are known as admissible heuristics and have the characteristic that when used with the A* algorithm, will always result in an optimal path being found. However, heuristics that are not admissible can still be used as they will seriously cut down the number of paths that need to be looked at before a good one is found. In many cases, following Dijkstra's algorithm would involve looking at many paths that are unlikely to be the right one, such as paths that obviously go away from the target. Heuristics give us a way to use this knowledge to speed up the algorithm we're using to find a path.

How A* differs from Dijkstra is that it has three values for each node rather than two: temporary path cost, final path cost and heuristic cost. When selecting a node to expand, for each node without a final path cost, a total value is calculated which is the sum of the temporary path cost and the heuristic cost. The node with the lowest total will be expanded.