Pascal Programming/Pointers

The new data type presented in this chapter adds another layer of abstraction to your repertoire: Pointers are by far the most complicated data type. If you master them, you have got what it takes to tackle even the supreme discipline of assembly programming. So, let’s get started!

Indirection

In Pascal there are two kinds of variable types.

So far we have been using static variables. They exist during the entire execution of a block, e. g. while a program is running or just during execution of a routine.
There is another kind called dynamic variables.^{[fn 1]} They do not necessarily “exist” during the entire block. That means, there is no static memory allocated, but the used memory space varies each time the program runs.

While using static variables, the compiler^{[fn 2]} already knows which memory chunk will be used in advance.^{[fn 3]} Dynamic variables, however, are, hence their name, dynamic in that they will be occupying different, unpredictable memory segments.

Memory is referred to by addresses. An address is, in CS, simply a number, an integer value so to speak.^{[fn 4]} When you want to refer to a certain memory block, you use its address.

See also the section Pointers in Programming Concepts.

The pointer data type is a value that stores an address. This address can then be used to access the memory it is referring to. A pointer, however, is just that: It is pointing, but not making any statement as regards to “whom”, what variable this block of memory “belongs.”

Declaration

In Pascal, a pointer data type declaration starts with a ↑ (up arrow), or alternatively and more frequently the ^ (caret) character, followed by the name of a data type.

program pointerDemo(output);
type
	charReference = ^char;

A variable of this pointer data type can point to a single char value (and no other data type). In Pascal all pointer data types have to indicate the data type of the value the pointer is referring to. This is because a pointer alone is just an address: An address merely points to the start of a memory block. There is no statement with respect to this block’s size, its length. The domain restriction, the specification of targeted value’s data type, tells the compiler “how large” a memory block will be, and in consequence how to properly read and write, how to access it.

Unlike any other data type, a pointer data type is the only data type that can use data types not declared yet. Below you will see a usage scenario, but let’s continue in the script.

Allocating memory

When you are declaring a variable in the var‑section, you are declaring a static variable. In the following code fragment c is a static variable, thus its memory location is already known.

var
	c: charReference;
begin
	{ artificially stall the program without breakpoints }
	readLn;

At this point, there is no memory space alloted to a char value yet. There is already space to store a pointer value, the address of a char value, but we do not have any space available to put it, a char value, anywhere.

In Pascal you will first need to invoke the procedure new to get memory assigned to your program. New takes one pointer variable as an argument and will reserve enough memory space to hold one value of the pointer’s domain.

	new(c);

After this operation

you occupy additional memory for, in this case, one char value the program previously did not “own”, and
c, the pointer variable itself, will give us the address of this newly allocated memory.

As with all variables of any kind, the memory space we have acquired now is totally undefined (unintialized).

Dereferencing

To use the memory we just gained we will have to follow a pointer. This is done by appending ↑ (or usually ^) to the name of the pointer variable.

	c^ := 'X';
	writeLn(c^);

This action is called dereferencing. The pointer is a (kind of) reference to the underlying char value. This char value does not have a name, but you use the pointer to access it anyway.

On this dereferenced variable we can perform all operations permissible on the pointer’s domain data type. I. e. here we are allowed to assign a char value 'X' to it, and then use it, for instance, in a writeLn as demonstrated above.

Note that something like c := 'X' will not work, because in this case c simply refers to the pointer, the address storage.

The expression c has the data type charReference.
The expression c^ has the data type char.

In Pascal it is forbidden to directly assign addresses to pointers, other than by using new. For the special case of nil, see below.

Releasing memory

After invoking new the respective memory is exclusively reserved to your program. This memory management occurs outside of your program. It is a typical task of the respective OS.

To reverse the operation of new, there is a dedicated procedure “unreserving” memory: Dispose.

	readLn;
	dispose(c);
	
	readLn;
end.

Dispose takes the name of a pointer variable, and releases previously with new allocated memory. After a dispose you may not follow, dereference, the pointer anymore. Nevertheless the pointer itself still stores the address where, in this case, the referenced char value was. Meanwhile, the “freed” memory may be used again for something or by someone else.

Lifetime

In Pascal, memory of dynamic variables remains reserved

as long as it is accessible, that means at least one pointer must point to it, or
until you specifically request to “unreserve” memory.

See also the chapter Garbage Collection in Memory Management.

If a chunk of memory is rendered inaccessible by some operation, it is automatically released. This can happen implicitly: In the program above the pointer variable c is “gone” upon program termination. Because this variable is/was the only pointer (left) pointing to our previously reserved char value, there is an automatic “invisible” dispose. Insofar, the explicit dispose from our side was not necessary.

Wikipedia has related information at Memory leak

However, unfortunately not all compilers comply with this specification as laid out in the Pascal ISO standards. For instance, Delphi, as well as the FPC (even in its {$mode ISO} compatibility mode, as of version 3.2.0) will not issue an automatic dispose. There, an explicit dispose is necessary.^{[fn 5]} Rest assured, using the GPC it is not necessary though; the GPC fully complies with the ISO standard 7185 level 1.

Note, that memory accessibility is transitive: This means that, for instance, a pointer pointing to a pointer pointing to the memory still satisfies the accessibility requirement.

Indication

The additional housekeeping of allocating and releasing memory may seem like quite a hassle, so when does that make sense?

All variables declared in a var-section need to indicate their size in advance. For some applications, however, you do not know how much data you will need to store and process. Pointers are a means to overcome this limitation. Further below we will explore how.
Pointer values can be used to represent graphs, networks, of data, allowing you to put everything into relation with each other. This means you do not need to store the same datum multiple times. A pointer value is usually, with respect to its memory requirements, a comparatively small data type. Handling pointers trades lower memory space demand for increased complexity.

Furthermore, pointer values are frequently used to implement variable parameters of routines: Due to its smaller size passing a single pointer value can be faster than passing, that means copying, for instance an entire array. This kind of use of pointers is completely transparent. Pascal equips you with an adequate language construct; you will learn more about variable parameters in the chapter on scopes.

Links

Nil pointers

See also the chapter Nothing Data Type in Programming Fundamentals.

All pointers can be assigned a literal value nil. The nil pointer value represents the notion “not pointing anywhere in particular.”

Coincidentally, nil is the only pointer value that could be used for a pointer literal.

const
	nowhere = nil;

There is no other pointer value that you could possibly specify anywhere in your source code. This also means you cannot explicitely compare any specific pointer value except nil.

Note that nil is fundamentally different to an unintialized variable. You are allowed to read the value of a pointer that has been assigned the value nil, but you are still forbidden to attempt reading the value of a variable that has not been assigned any value at all.

Attempting to dereference a pointer that currently possesses the value nil constitutes a fatal error.

Permissible operators

In the introduction we used the analogy comparing pointers to integer values. However, this is really just that. Unlike integer values, pointers are by no means “ordered”; they do not belong the class of ordinal data types. There is no ord, succ, pred defined for a pointer, but also ordering comparison operators like < or >= do not apply to pointers, not to mention any arithmetic operator is invalid in combination with a pointer value.

The only operators applicable to pointers are^{[fn 6]}

=, do two pointer values refer to the same address,
<>, do two pointer values refer to different addresses, and
:=, the assignment of a pointer value, either nil or the value of an already defined pointer variable of the same data type, to a pointer variable.

It may seem at first like quite a restriction, but it prevents you from doing potentially harmful, or even just stupid stuff.

Chicken or egg

Pointers are the only data type that can be declared using a data type yet to be declared.^{[fn 7]} This circumstance makes it possible to declare data types containing pointers, possibly to the data type being declared at hand or other yet to be declared data types. This is possible because a pointer to foo has the same memory requirements as a pointer to bar or any other data type. The domain restriction of a pointer is not (necessarily/explicitly) stored in the program.

In the following code fragment numberListItem is not yet declared, but you are still allowed declare a new pointer data type with it anyway:

program listDemo(input, output);

type
	numberListItemReference = ^numberListItem;
	
	numberListItem = record
			value: real;
			nextItemLocation: numberListItemReference;
		end;

Yet you cannot reverse the order of the declarations of numberListItemReference and numberListItem; the compiler cannot magically conclude nextItemLocation is a pointer until it has actually seen/read the respective declaration.

Putting things together

Now we can use this data structure to dynamically store a series of numbers. Pay attention when to derefercene the pointer in the following code:

var
	numberListStart: numberListItemReference;
begin
	new(numberListStart);
	readLn(numberListStart^.value);
	
	new(numberListStart^.nextItemLocation);
	readLn(numberListStart^.nextItemLocation^.value);
	
	dispose(numberListStart^.nextItemLocation);
	dispose(numberListStart);
end.

The entire program contains one static variable. Only the variable numberListStart was declared by you. During run-time, however, while the program is running you will have at one point two additional real values at your disposal.

Take notice of this example’s order of dispose statements: The supplied pointer variable must be valid, so a reverse order would not be possible in this specific case here.

Concededly, this example could have been better implemented by simply declaring two real variables. The true power of pointers becomes apparent when you are, unlike the above code, use pointers as a means of abstraction. This chapter’s exercises will delve into that.

Routines

In particular, let’s first explore a special kind of pointers: Routine parameters, that is functional and procedural parameters, are parameters of routines that allow you to statically modify the routine’s behavior by virtually passing the address of another routine. Let’s see how this works.

Declaration and use

In the formal parameter list of a routine you can declare a parameter that looks just like a routine signature:

program routineParameter(output);

procedure fancyPrint(function f: integer);
begin
	writeLn('❧ ', f:1, ' ☙')
end;

Inside the definition of fancyPrint you can use the parameter f as if it was a regular function declared and defined before and outside of fancyPrint. However, at this point it is not known what function will be used. The actual parameter f is in fact a pointer.^{[fn 8]} We only know that this pointer’s “domain” is a, any function without parameters and returning an integer value, but this is already enough we need to know.

One routine fits it all

To call this kind of routine you will need to specify an appropriate routine designator that matches the signature as regards to order, number and data types of parameters and, if applicable, the returned value’s data type.

function getRandom: integer;
begin
	{ chosen by fair dice roll: guaranteed to be random }
	getRandom := 4
end;

function getAnswer: integer;
begin
	{ the answer to the ultimate question of life, the universe and everything }
	getAnswer := 42
end;

begin
	fancyPrint(getRandom);
	fancyPrint(getAnswer)
end.

To supply a routine parameter value to a routine, simply name a compatible routine. Note that in this case you never specify any parameters, because you are not making a call here, but the called routine will do so “on behalf” of you. Specifying the routine’s name, and thus passing its address, is sufficient to achieve that.

Standard routines such as writeStr (EP) or sin cannot be used that way,^{[fn 9]} because they are an integral part of the language. There is no (singular) routine definition for them.

Caveats

As a beginner, pointers are difficult to tame. Without experience, you will frequently observe (for you) “unexpected” behaviors. Some pitfalls are presented here.

`with`-clause

Special care must be taken when using pointers in conjunction with a with-clause. The expressions listed at the top of a with-clause are evaluated once before executing any following statement. During the entire with-statement the expressions using the “short” notation will actually use an invisible transient value. This speeds up execution, because the same value is not evaluated over and over again, but there is also a caveat in it.

Surprisingly, the long notation using an FQI can become invalid, while the short notation at first seems to be still valid. The following program demonstrates the issue:

program withDemo(output);
type
	foo = record
			magnitude: integer;
		end;
	fooReference = ^foo;
var
	bar: fooReference;
begin
	new(bar);
	bar^.magnitude := 42;
	
	with bar^ do
	begin
		dispose(bar);
		bar := nil;
		{ Here, bar^.magnitude would fail horribly, }
		{ but you can still do the following: }
		writeLn(magnitude);
	end;
end.

When you compile and run this program, you will

notice that it prints anything but 42, but
it should be rather astonishing that it still prints anything at all.

The writeLn(magnitude) does actually use a “hidden (pointer) variable” and not bar. This variable’s value was evaluated one time at the top of the with-clause. The compiler does not (and cannot) complain that bar meanwhile became invalid. You are not making any assignments to the actually utilized hidden variable (i. e. it is still considered bearing a valid value), thus there is no reason for complaints.

Limits

This section primarily concerns users of Delphi and the FPC, as well as possibly some other compilers. Users of the GPC could skip this section, but understanding the theory is encouraged.

Memory is not an infinite resource. This has some grave implications.

There is no means to check whether any subsequent new will exhaust the finite resource memory. On multi-tasking OSs it is feasible that between the time you have queried the amount of free memory space and actually requesting additional memory, another program running at the same time has acquired memory so there is none, or not enough left for you. This kind of situation is known as time-of-check to time-of-use. You need to simply in a make-or-break manner ask for more memory.

This issue is rather of theoretical concern for the scope of this textbook. A standard desktop computer manufactured in the 21^st century or later will not run out of memory for any programming exercise given here. This is not supposed to mean you can waste memory.

Do not hoard memory: To mitigate potential OOM conditions, it is generally sensible to dispose memory as soon as you are certain it will not be used anymore.

Tasks

Adapt the program listDemo so it accepts an unknown number of items. The program should print the number of total items first and then a list of items.

An acceptable solution could look like this:

function readNumber: numberListItemReference;
var
	result: numberListItemReference;
begin
	new(result);
	
	with result^ do
	begin
		readLn(value);
		nextItemLocation := nil;
	end;
	
	readNumber := result;
end;

{ === MAIN ============================================================= }
var
	numberListRoot: numberListItemReference;
	currentNumberListItem: numberListItemReference;
	numberListLength: integer;
begin
	writeLn('Enter numbers and finish by abandoning input:');
	
	{ input - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - }
	numberListRoot := readNumber;
	numberListLength := 1;
	currentNumberListItem := numberListRoot;
		
	while not EOF(input) do
	begin
		with currentNumberListItem^ do
		begin
			nextItemLocation := readNumber;
			currentNumberListItem := nextItemLocation;
		end;
		numberListLength := numberListLength + 1;
	end;
	
	{ output  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - }
	writeLn('You’ve entered ', numberListLength:1, ' numbers as follows:');
	currentNumberListItem := numberListRoot;
	while currentNumberListItem <> nil do
	begin
		with currentNumberListItem^ do
		begin
			writeLn(value);
			currentNumberListItem := nextItemLocation;
		end;
	end;
	
	{ release memory  - - - - - - - - - - - - - - - - - - - - - - - - - - - }
	currentNumberListItem := numberListRoot;
	while currentNumberListItem <> nil do
	begin
		with currentNumberListItem^ do
		begin
			dispose(currentNumberListItem);
			{ Note that at _this_ point, after dispose(…), writing
			  … := currentNumberListItem^.nextItemLocation
			  would be illegal! }
			currentNumberListItem := nextItemLocation;
		end;
	end;
end.

An acceptable solution could look like this:

function readNumber: numberListItemReference;
var
	result: numberListItemReference;
begin
	new(result);
	
	with result^ do
	begin
		readLn(value);
		nextItemLocation := nil;
	end;
	
	readNumber := result;
end;

{ === MAIN ============================================================= }
var
	numberListRoot: numberListItemReference;
	currentNumberListItem: numberListItemReference;
	numberListLength: integer;
begin
	writeLn('Enter numbers and finish by abandoning input:');
	
	{ input - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - }
	numberListRoot := readNumber;
	numberListLength := 1;
	currentNumberListItem := numberListRoot;
		
	while not EOF(input) do
	begin
		with currentNumberListItem^ do
		begin
			nextItemLocation := readNumber;
			currentNumberListItem := nextItemLocation;
		end;
		numberListLength := numberListLength + 1;
	end;
	
	{ output  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - }
	writeLn('You’ve entered ', numberListLength:1, ' numbers as follows:');
	currentNumberListItem := numberListRoot;
	while currentNumberListItem <> nil do
	begin
		with currentNumberListItem^ do
		begin
			writeLn(value);
			currentNumberListItem := nextItemLocation;
		end;
	end;
	
	{ release memory  - - - - - - - - - - - - - - - - - - - - - - - - - - - }
	currentNumberListItem := numberListRoot;
	while currentNumberListItem <> nil do
	begin
		with currentNumberListItem^ do
		begin
			dispose(currentNumberListItem);
			{ Note that at _this_ point, after dispose(…), writing
			  … := currentNumberListItem^.nextItemLocation
			  would be illegal! }
			currentNumberListItem := nextItemLocation;
		end;
	end;
end.

Write a procedure that accepts a real function and graphs its function values similar to this:

For that complete the following procedure:

program graphPlots(output);

const
	lineWidth = 80;

procedure plot(
		function f(x: real): real;
		xMinimum: real; xMaximum: real; xDelta: real;
		yMinimum: real; yMaximum: real
	);
{
	this is the part you are supposed to implement
}

function wave(x: real): real;
begin
	wave := sin(x);
end;

begin
	plot(wave, 0.0, 6.283, 0.196, -1.0, 1.0);
end.

An acceptable implementation of plot could look like this:

procedure plot(
		function f(x: real): real;
		xMinimum: real; xMaximum: real; xDelta: real;
		yMinimum: real; yMaximum: real
	);
var
	x: real;
	y: real;
	column: 0..lineWidth;
begin
	x := xMinimum;
	
	while x < xMaximum do
	begin
		y := f(x);
		{ always reset `column` in lieu of doing that in an `else` branch }
		column := 0;
		{ is function value within window? }
		if (y >= yMinimum) and (y <= yMaximum) then
		begin
			{ move everything toward zero }
			y := y - yMinimum;
			{ scale [yMinimum, yMaximum] range to [0..79] range }
			y := y * (lineWidth - 1) / (yMaximum - yMinimum);
			{ convert to integer }
			column := round(y) + 1;
		end;

The following use of write/writeLn is actually an EP extension. In Standard Pascal as laid out in ISO standard 7185 all format specifiers need to be positive integer values. Extended Pascal also allows a zero value. While for printing integer values the width specifier still indicates a minimum width, for char and string values it means the exact width. Thus the following can print a blank line when column is zero, i. e. when the function value is outside of the window.

		writeLn('*':column);

It should be an easy feat for you to adapt the writeLn line should your compiler not support this EP extension.

		x := x + xDelta;
	end;
end;

Appreciate the fact that once you have implemented plot in such a generic way, i. e. accepting a functional parameter, you can reuse it for any other function you wish to.

An acceptable implementation of plot could look like this:

procedure plot(
		function f(x: real): real;
		xMinimum: real; xMaximum: real; xDelta: real;
		yMinimum: real; yMaximum: real
	);
var
	x: real;
	y: real;
	column: 0..lineWidth;
begin
	x := xMinimum;
	
	while x < xMaximum do
	begin
		y := f(x);
		{ always reset `column` in lieu of doing that in an `else` branch }
		column := 0;
		{ is function value within window? }
		if (y >= yMinimum) and (y <= yMaximum) then
		begin
			{ move everything toward zero }
			y := y - yMinimum;
			{ scale [yMinimum, yMaximum] range to [0..79] range }
			y := y * (lineWidth - 1) / (yMaximum - yMinimum);
			{ convert to integer }
			column := round(y) + 1;
		end;

The following use of write/writeLn is actually an EP extension. In Standard Pascal as laid out in ISO standard 7185 all format specifiers need to be positive integer values. Extended Pascal also allows a zero value. While for printing integer values the width specifier still indicates a minimum width, for char and string values it means the exact width. Thus the following can print a blank line when column is zero, i. e. when the function value is outside of the window.

		writeLn('*':column);

It should be an easy feat for you to adapt the writeLn line should your compiler not support this EP extension.

		x := x + xDelta;
	end;
end;

Appreciate the fact that once you have implemented plot in such a generic way, i. e. accepting a functional parameter, you can reuse it for any other function you wish to.

Notes:

↑ The Pascal ISO standards call this idea identified variables.
↑ For the sake of simplicity we say this was the compiler’s task. Usually it is rather a task of a linker, the link editor, that determines and substitutes specific addresses.
↑ Actually the compiler does not know which (physical) memory will be used, but another abstraction layer called virtual memory administerd by the OS permits us to think that way.
↑ This is an analogy for explanation purposes. The range of integer values does not necessarily correspond to permissible pointer values (i. e. addresses). For instance on x32 targets pointers have 32 significant bits, but an integer value occupies 64 bits.
↑ Failing to release memory will probably go unnoticed. Your program will compile and run without a proper dispose. However, eventually the finite resource “memory” will be exhausted, a condition known as memory leak. If there is no sufficient memory available, any new will fail and terminate the program immediately.
↑ Some manuals call the ↑/^ an “operator”. This language, however, is imprecise. The ↑ does not alter the state of your program, do an operation, but merely instructs the compiler to treat an identifier differently than it would without the arrow’s presence.
↑ The declaration of the pointer data type and the referenced data type must occur in the same scope, in the same block, in other words in one and the same type-section.
↑ This is an implementation detail that is not specified by the ISO standards, although in fact most compilers will implement this as a pointer.
↑ Some compilers do not have this restriction, yet the ISO standards require an “activation”, which simply does not happen for standard routines.

Next Page: Files | Previous Page: Records
Home: Pascal Programming

[1] The Pascal ISO standards call this idea identified variables.

[2] For the sake of simplicity we say this was the compiler’s task. Usually it is rather a task of a linker, the link editor, that determines and substitutes specific addresses.

[3] Actually the compiler does not know which (physical) memory will be used, but another abstraction layer called virtual memory administerd by the OS permits us to think that way.

[4] This is an analogy for explanation purposes. The range of integer values does not necessarily correspond to permissible pointer values (i. e. addresses). For instance on x32 targets pointers have 32 significant bits, but an integer value occupies 64 bits.

[5] Failing to release memory will probably go unnoticed. Your program will compile and run without a proper dispose. However, eventually the finite resource “memory” will be exhausted, a condition known as memory leak. If there is no sufficient memory available, any new will fail and terminate the program immediately.

[6] Some manuals call the ↑/^ an “operator”. This language, however, is imprecise. The ↑ does not alter the state of your program, do an operation, but merely instructs the compiler to treat an identifier differently than it would without the arrow’s presence.

[7] The declaration of the pointer data type and the referenced data type must occur in the same scope, in the same block, in other words in one and the same type-section.

[8] This is an implementation detail that is not specified by the ISO standards, although in fact most compilers will implement this as a pointer.

[9] Some compilers do not have this restriction, yet the ISO standards require an “activation”, which simply does not happen for standard routines.

[fn 1]

[fn 2]

[fn 3]

[fn 4]

[fn 5]

[fn 6]

[fn 7]

[fn 8]

[fn 9]

Indirection

Declaration

Allocating memory

Dereferencing

Releasing memory

Lifetime

Indication

Links

Nil pointers

Permissible operators

Chicken or egg

Putting things together

Routines

Declaration and use

One routine fits it all

Caveats

with-clause

Limits

Tasks

`with`-clause