The current, editable version of this book is available in Wikibooks, the open-content textbooks collection, at
https://en.wikibooks.org/wiki/Pascal_Programming
Pascal is an influential computer programming language named after the mathematician Blaise Pascal. It was invented by Niklaus Wirth in 1968 as a research project into the nascent field of compiler theory. The backronym PASCAL, standing for “primary algorithmic scientific commercial application language”, highlights its suitability for computing tasks in science, though it is certainly usable for general programming as well.
About
- Target demographic
- Adolescent and adult programming novices
- Scope
- Standard Pascal (ISO standard 7185) and selected modern extensions.
- Description
- This book will teach high-level programming, using the programming language Pascal.
- Learning objectives
- You can analyze trivial to moderately difficult programming problems, take general software engineering principles into consideration, and write POC implementations in Pascal using its strengths and knowing its limits. This book will not make a senior-level programmer out of you, but you will definitely pass any college-level introductory CS classes.
- Not covered here, but (possibly) in other Wikibooks
- Computer architecture, low-level OS interactions, specific usage of high-level libraries such as nCurses.
- Guidelines for co‑authors
- American English spelling. Mathematical vocabulary, but explain words if they mean something special in mathematics.
- Use Unextended Pascal, ISO 7185, as your base, and go from there.
- Every example program has to be complete by itself.
- Responsible authors
- These authors ensure the book follows a more or less uniform style and can be read from stem to stern with an acceptable degree of repetition. You are welcome to contribute individual chapters or sections without feeling responsible for the entire book.
- Kai Burghardt (discuss • contribs) 00:00, 11 September 2021 (UTC)
- you??
- Structure
- See, think, do: Expose the reader to beautiful code and challenge them.
Contents
Standard Pascal
- Getting started
- Beginning Pascal
- Variables and Constants
- Input and Output
- Expressions and Branches
- Routines
- Enumerations
- Sets
- Arrays
- Strings
- Records
- Pointers
- Files
- Scopes
Extensions
- Schemata
- Parameters
- Complex numbers
- Units
- Object-oriented Programming
- Exporting to libraries
- Foreign Function Interfaces
- Generics
- Miscellaneous extensions
Appendix
Alternative resources
Tutorials, Textbooks, and the like:
- “Learn Pascal tutorial” by Tao Yue
- Doug Cooper: “Oh! Pascal!”
- (available in multiple languages) Marco Cantù: “Essential Pascal” (see § “Getting Essential Pascal”)
- (in German) Delphi Crashkurs
References, Articles on certain topics:
Part Ⅰ
Standard Pascal
Beginning Pascal
Welcome to the WikiBook Pascal Programming! This book will teach you to program in Pascal, a high-level, human-readable programming language. High-level means there are abstract concepts, such as data types or control structures, which the microprocessor does not know, but the programming language provides this abstraction level. Human-readable refers to the fact that a program written in Pascal can be read like (very simple, “Neanderthalian”) English phrases. This makes Pascal particularly suitable for beginners and we hope you will appreciate this.
Prerequisites
In order to successfully use this book you need to already know a few things:
- What files stored on a file system are, and how to access and use them.
- How to install software on your OS.
- How to edit plain text files using a text editor such as vi(1), MS Notepad or emacs(1). (Note: A LibreOffice or Word document is not a plain text file.)
- What a CLI is and how to use one, e. g. cmd.exe on MS Windows or the Linux terminal.
Covering these topics would be out of this book’s scope. Pascal only assumes there is some user interface (i. e. a console) and there are external entities (this usually refers to “files”). Every system, however, implements them differently, so we cannot explain them to you, nor can we say at what point you have learned enough to continue with this book.
Required software
Pascal is a compiled language. That means you need a tool, a computer program, that “translates” the human-readable Pascal source code into a sequence of bytes the microprocessor understands. This work is done by a compiler.
Prior to the 2000s there were many different compilers, but (as of 2020) there are primarily three Pascal compilers:
- Delphi,
- Free Pascal Compiler (FPC), and
- GNU Pascal Compiler (GPC).
The authors suggest FPC, due to its availability (on many platforms, and free of charge) and continuous progress in development. This table provides more information about each compiler:
compiler | homepage | platform | license | extra |
---|---|---|---|---|
Delphi | Embarcadero.com | Windows | proprietary | commercial product, with IDE |
Free Pascal | FreePascal.org | many | GPL | supports multiple dialects |
GNU Pascal | GNU-Pascal.de | All that GCC supports | GPL | considered abandoned since the 2010 |
Pascal-P | SourceForge | | Public domain | ISO 7185 Level 0 only, must be compiled manually |
[Another comparison of Free Pascal and GNU Pascal]
Furthermore, you will need a program you can edit source code files with. This can be any editor (that can edit and save plain text files), but there are also dedicated suites available for programming purposes. These are called integrated development environments, in short IDE. Such IDEs provide means to write, compile, and run programs, and possibly find programming mistakes, all in one single program. Some IDEs are:
- Delphi
- fp(1), a text-mode IDE that is shipped with the FPC
- Lazarus, which is related to the FPC, but more colorful
An IDE may be overwhelming if you are just starting to program.
In this case we suggest sticking to a simple editor, such as nano(1).
It has an easy-to-understand user guidance system, allowing you to delve into programming right away.
A temporary alternative for your first steps may also be websites:
- online GDB
- tutorials point: https://www.tutorialspoint.com/compile_pascal_online.php [no link, because this site is blacklisted]
- jDoodle
- RexTester
- IDE one
All of these are powered by the FPC. Be aware of what you enter on those sites.
Working with this book
We suggest creating a dedicated folder for your programming exercises.
Keep your source code files until you have finished with this book.
If your folder becomes cluttered with all kinds of files, the FPC comes with the tool delp(1) that can delete all (Pascal-related) files other than source code files.
Starting up
In this chapter you will learn:
- How a Pascal source code file is structured
- Basic terminology
Programs
Every programming task requires one source code file that, in Pascal, is called a program.
A program source code file is translated by the compiler into an executable application which you can run.
Let’s look at a minimal program source code file:
program nop;
begin
{ intentionally empty }
end.
- The first line program nop; indicates that this file is a Pascal source code file for a program.
- The begin and end mark a frame. We will explain this in detail as we move on.
- { intentionally empty } is a comment. Comments are ignored by the compiler, so they do not contribute in any way to how the executable program looks or behaves.
- The final dot . after the final end informs the compiler about the program source code file’s end.
If you feel overwhelmed by the pace of this book, the Wikibook Programming Basics might be more suitable for you.
Compilation
In order to start your program you need to compile it.
First, copy the program shown above. We advise you to actually type out the examples and not to copy and paste code.
Name the file nop.pas.
nop is the program’s name, and the filename extension .pas helps you to identify the source code file.
Once you are finished, tell the compiler you have chosen to compile the program:
- FPC: Type fpc followed by a (relative or absolute) file name path to the source code file:
fpc nop.pas
If there are no mistakes, the compiler reports something like this:
Target OS: Linux for x86-64
Compiling nop.pas
Linking nop
4 lines compiled, 0.1 sec
Afterwards there will be a new file called nop. This is the executable program you can start.
- GPC: Type gpc followed by a (relative or absolute) file name path to the source code file:
gpc nop.pas
If there are no mistakes, gpc will not report any errors, but there will be a new file (by default) called a.out.
Finally, you can then execute the program by one of the methods your OS provides.
For example, on a console you simply type out the file name of the executable file:
./nop
(where ./ refers to the current working directory in Unix-like environments)
As this program does (intentionally) nothing, you will not notice any (notable) changes.
After all, the program’s name nop is short for no operation.
The computer speaks
Congratulations on your first Pascal program! To be fair, though, the program is not of much use, right? As a small step forward, let’s make the computer speak (metaphorically) and introduce itself to the world:
program helloWorld(output);
begin
writeLn('Hello world!');
end.
Program header
The first difference you will notice is in the first line.
Not only has the program name changed, but there is also (output).
This is a program parameter.
In fact, it is a list.
Here, it only contains one item, but the general form is (a, b, c, d, e, …) and so on.
A program parameter designates an external entity the OS needs to supply the program with, so it can run as expected.
We will go into detail later on, but for now we need to know there are two special program parameters: input and output.
These parameters symbolize the default means of interacting with the OS.
Usually, if you run a program on a console, output is the console’s display.
Writing to the console
The next difference is writeLn('Hello world!').
This is a statement.
The statement is a routine invocation.
The routine is called writeLn.
WriteLn has (optional) parameters.
The parameters are, again, a comma-separated list surrounded by parentheses.
Routines
Routines are reusable pieces of code: once defined, they can be used over and over again.
The routine writeLn, short for write line, writes all supplied parameters to the destination followed by a “newline character” (some magic that will move the cursor to the next line).
Here, however, the destination is invisible.
That is because it is optional and can be left out.
If it is left out, the destination becomes output, that is, our console output.
If we want to name the destination explicitly, we have to write writeLn(output, 'Hello world!').
WriteLn(output, 'Hello world!') and writeLn('Hello world!') are identical.
The missing optional parameter is inserted automatically, which relieves the programmer from typing it out.
In order to use a routine, we write its name, as a statement, followed by the list of parameters. We did that in the third line of the program above.
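The equivalence of the two spellings can be tried out with a short program. This is only a sketch; the program name destinationDemo is made up for illustration:

```pascal
program destinationDemo(output);
begin
    { destination left out: output is assumed }
    writeLn('Hello world!');
    { destination named explicitly: same effect }
    writeLn(output, 'Hello world!');
end.
```

Both statements write the same line to the console, so the text appears twice.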
String literals
The parameter 'Hello world!' is a so-called string literal.
Literal means your program will take this sequence of characters as it is, not interpret it in any way, and pass it to the routine.
A string literal is delimited by typewriter (straight) apostrophes.
Reserved words
In contrast to that, the words program, begin and end (and many more you see in a bold face in the code examples) are so-called reserved words.
They convey special meaning regarding how to interpret and construct the executable program.
You are only allowed to write them at particular places.
Nevertheless, you can write the string literal 'program'. The string delimiters “disable” interpretation.
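As a small illustration, the following sketch (the program name reservedWordDemo is hypothetical) uses reserved words both in their special role and as plain text inside string literals:

```pascal
program reservedWordDemo(output);
begin
    { inside the apostrophes these are just characters }
    writeLn('program');
    writeLn('begin and end are ordinary text in here');
end.
```

The compiler accepts this, because everything between the apostrophes is passed on as it is.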
Behavior
Create a file named helloWorld.pas, copy the source code (by typing it manually), compile and run it:
program helloWorld(output);
begin
writeLn('Hello world!');
end.
Hello world!
The program prints Hello world!, without the straight quotation marks, on an individual line to the console. Isn’t that great?
If your terminal window closes before you can read the output, try this variant:
program helloWorld(input, output);
begin
writeLn('Hello world!');
readLn();
end.
The statement readLn() makes your program stall and wait for input, so the program is not considered done yet. After you hit ↵ Enter the terminal window should close again.
This type of program, by the way, is an example of a class of “Hello world” programs. They serve the purpose of demonstrating the minimal requirements a source code file in any programming language needs to fulfill. For more examples see Hello world in the WikiBook “Computer Programming” (and appreciate Pascal’s simplicity compared to other programming languages).
Comments
We already saw the option to write comments. The purpose of comments is to serve the programmer as a reminder.
Comment syntax
Pascal defines curly braces as comment delimiting characters: { comment } (spaces are for visual guidance and have no significance).
The left brace opens or starts a comment, and the right brace closes a comment.
“Inside” a comment you cannot use the comment closing character as part of your text. The first occurrence of the proper closing character(s) will be the end of the comment.
However, when Pascal was developed not all computer systems had curly braces on their keyboards.
Therefore bigrams (pairs of characters) using parentheses and asterisks were made legal, too: (* comment *).
Such comments are called block comments.
They can span multiple lines.
Delphi introduced yet another style of comment, line comments.
They start with two slashes // and comprise everything until the end of the current line.
Delphi, the FPC as well as GPC support all three styles of comments.
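The three styles can be seen side by side in the following sketch. The program name commentStyles is made up, and since line comments are not part of ISO 7185, a strictly standard compiler may reject the third comment:

```pascal
program commentStyles(output);
begin
    { a block comment delimited by curly braces }
    (* a block comment delimited by bigrams;
       it spans two lines *)
    // a line comment, ends at the end of this line
    writeLn('Comments do not affect the program.');
end.
```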
Helpful comments
There is an “art” of writing good comments.
Comments should not repeat what can be deduced from the source code itself.
program helloWorld(output);
begin { This is where the program begins }
writeLn('Hello world!');
end. (* This is where the program ends. *)
Comments should explain information that is not apparent:
program nop;
begin
{ intentionally empty }
end.
When writing a comment, stick to one natural language.
In the chapters to come you will read many “good” comments (unless they clearly demonstrate something like below).
Terminology
Familiarize yourself with the following terminology (that means the terms on the right printed as comments):
program demo(input, output);       // program header
                                   // ───────────────────────────────────┐
const                              // ────────────────────┐              │
    answer = 42;                   // constant definition ┝ const-section│
                                   // ────────────────────┘              │
type                               // ────────────────────┐              │
    employee = record              // ─┐                  │              │
        number: integer;           //  │                  │              │
        firstName: string;         //  ┝ type definition  │              │
        lastName: string;          //  │                  ┝ type-section │
    end;                           // ─┘                  │              │
                                   //                     │              │
    employeeReference = ^employee; // another type def.   │              │
                                   // ────────────────────┘              ┝ block
                                   //                                    │
var                                // ────────────────────┐              │
    boss: employeeReference;       // variable declaration┝ var-section  │
                                   // ────────────────────┘              │
                                   //                                    │
begin                              // ────────────────────┐              │
    boss := nil;                   // statement           │              │
    writeLn('No boss yet.');       // another statement   ┝ sequence     │
    readLn();                      // another statement   │              │
end.                               // ────────────────────┘              │
                                   // ───────────────────────────────────┘
Note how every constant and type definition, as well as every variable declaration, goes into a dedicated section.
The reserved words const, type, and var serve as headings.
A sequence is also called a compound statement. The combination of definitions, declarations and a sequence is called a block. Definitions and declarations are optional, but a sequence is required. The sequence may be empty, as we already demonstrated above, but this is usually not the case.
Do not worry, the difference between definition and declaration will be explained later. For now you should know and recognize sections and blocks.
Tasks
program commentDemo;
begin
{ (* Hello { { { }
(* (* { (* Foo }
{ (* Bar *)
The first comment-ending character(s) demarcate the end of the entire comment, regardless of whether it started with { or (*. That means, here the compiler will complain:
{ start (* again? } *)
Line comments are immune to this, since they do not have an explicit end delimiter. This will compile without errors:
// *) } { (*
end.
What does writeLn (note the lack of a parameter list) do?
WriteLn without any supplied parameters prints an empty line to the default destination, i. e. output.
Write a program that shows this (or similar):
    ####     ####
  ######## ########
 ##     #####     ##
 ##       #       ##
 ##      ILY      ##
  ##   sweetie   ##
   ###         ###
     ###     ###
       ### ###
         ###
          #
program valentine(output);
begin
    writeLn('    ####     ####');
    writeLn('  ######## ########');
    writeLn(' ##     #####     ##');
    writeLn(' ##       #       ##');
    writeLn(' ##      ILY      ##');
    writeLn('  ##   sweetie   ##');
    writeLn('   ###         ###');
    writeLn('     ###     ###');
    writeLn('       ### ###');
    writeLn('         ###');
    writeLn('          #');
end.
Note, the program parameter list (first line) only lists output.
Beware, while the exact number of spaces does not matter in your code, it does matter in string literals.
Variables and Constants
Like all programming languages, Pascal provides some means to modify memory. This concept is known as variables. Variables are named chunks of memory. You can use them to store data you cannot predict.
Constants, on the other hand, are named pieces of data. You cannot alter them during run-time, but they are hard-coded into the compiled executable program. Constants do not necessarily occupy any dedicated unique space in memory, but facilitate writing clean and understandable source code.
Declaration
In Pascal, before you are even allowed to use any variable or constant you have to declare it, like virtually any symbol in Pascal. A declaration makes a certain symbol known to the compiler and possibly instructs it to make the necessary provisions for its effective usage, that means – in the context of variables – earmark some piece of memory.
A declaration is always a two-tuple (identifier, entity). To be more specific, variables are declared as (identifier, data type) tuples and constants as (identifier, value) tuples. A tuple is an ordered collection: you may not reverse or rearrange its items without rendering the tuple a different one.
After you have declared an identifier to refer to one thing, you may not re-declare the same identifier to refer to another (or same) thing (“shadowing” may apply, but more on that later).
Identifiers
Structure
Identifiers are names denoting constants, types, bounds, variables, procedures, and functions. They must begin with a letter, which may be followed by any combination and number of letters and digits. The spelling of an identifier is significant over its whole length. Corresponding upper-case and lower-case letters are considered equivalent.[1]
Letters refers to the modern Latin alphabet, that is all letters you use in writing English words, and digits are Western Arabic digits.
Usage
As you can infer from the quote’s last sentence, the casing of letters does not matter: Foo and fOO are both the same identifier, just different representations.
Identifiers are used simply by writing them out at a suitable position.
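To see the equivalence of upper-case and lower-case letters in action, consider this sketch (the program name caseDemo is made up; the assignment statement := used here stores a value in a variable):

```pascal
program caseDemo(output);
var
    Foo: integer;
begin
    { all three spellings denote the same variable }
    fOO := 7;
    writeLn(FOO);
end.
```

The program prints the value 7, although the variable is spelled differently each time.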
Significant characters
In the age Pascal was developed in, computer memory was a precious resource. In order to build a working compiler under this constraint, the notion of significant characters was introduced. A significant character of an identifier is a character that contributes to distinguishing two identifiers from one another.
Some programming languages had a limit of 8 (eight) characters. This led to very cryptic identifiers. Today, however, the limit of significant characters is primarily governed by usability: The programmer eventually has to type them out if no IDE supports some auto-completion mechanism. The FPC, for example, has a limit of 127 characters:
Identifiers consist of between 1 and 127 significant characters (letters, digits and the underscore character), of which the first must be a letter (a‑z or A‑Z), or an underscore (_).[2]
You are still allowed to write identifiers longer than 127 characters, however, the compiler only looks at the first 127 characters and discards the remaining characters as irrelevant.
Note, allowing _, too, is an ISO 10206 (“Extended Pascal”) extension, but – unlike the FPC – it imposes the restriction that an underscore may neither begin nor end an identifier, nor may two underscores appear next to one another.
Variables
Variable section
Variables are declared in a dedicated section, the var-section.
program varDemo(input, output);
var
number: integer;
begin
write('Enter a number: ');
readLn(number);
writeLn('Great choice! ', number, ' is awesome.');
end.
When the compiler processes the var-section it will set aside as much memory as is required by each variable’s associated data type.
Here, we instruct the compiler to reserve space for an integer.
An integer is a data type that is part of the programming language, thus it is guaranteed to be present regardless of the used compiler.
It stores a subset of ℤ, the set of integers, like for example 42, 1337 or -1.
Data type
Data type refers to the combination of a permissible range of values and permissible operations on this range of values.
Pascal defines some basic data types as part of the language.
Apart from integer there are also:
- char: A character, like a Latin letter or Western Arabic digit, but also spaces and other characters.
- real: A subset of ℚ, the set of rational numbers (only a subset due to the computer’s binary nature). Examples are 0.015625 (2⁻⁶) or 73728.5 (2¹⁶ + 2¹³ + 2⁻¹).
- Boolean: A Boolean value, that is false or true.
Each data type defines how data are laid out in memory. In a high-level language, such as Pascal, it is not the programmer’s concern how exactly the data are stored, but the processor (i. e. in most cases a compiler) has to define it.
We will revisit all data types later on.
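The following sketch declares one variable of each of these data types and prints it. The program and variable names are made up for illustration, and the assignment statement := simply stores a value in a variable. How a real or Boolean value looks on the screen depends on the compiler:

```pascal
program dataTypeDemo(output);
var
    letter: char;
    ratio: real;
    sunny: Boolean;
begin
    letter := 'P';        { a single character }
    ratio := 0.015625;    { exactly representable in binary }
    sunny := true;
    writeLn(letter);
    writeLn(ratio);
    writeLn(sunny);
end.
```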
Reading from the console
As you may have noticed, the example above contains readLn(number) and the program header also lists input.
ReadLn will (try to) read data from the (optionally named) source and store the (interpreted) values into the supplied parameters, discarding any line-end characters.
If the source is not specified, as is the case here, input is assumed, thus readLn(number) is equivalent to readLn(input, number), but shorter.
When the program is run, it will stop and wait for the user to input a number, that is a literal that can be converted into the argument’s data type.
Enter a number: I want cookies!
./a.out: sign or digit expected (error #552 at 402ac3)
The program aborted, so the final writeLn was not executed. Obviously, I want cookies! is not a literal that can be converted into an integer value (i. e. the data type of number). For reference, this error message was generated with the program compiled using the GPC. Programs compiled with different compilers may emit different error messages.
You have to indicate in your program’s accompanying documents – the user manual – how and when the user needs to input data. Later we will learn how to treat erroneous input, but this is too complex for now.
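For comparison, here is a sketch of the same kind of program with the source and the destination spelled out explicitly. The program name explicitSource is made up for illustration; the behavior is the same as with the shorter forms:

```pascal
program explicitSource(input, output);
var
    number: integer;
begin
    write(output, 'Enter a number: ');
    { same as readLn(number) }
    readLn(input, number);
    writeLn(output, 'You entered ', number, '.');
end.
```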
More variables
There can be as many var-sections as necessary, but they may not be empty. (Strictly speaking, ISO 7185 permits only one var-section per block; multiple sections are an extension accepted by Extended Pascal and the FPC.)
There is also a shorthand syntax for declaring many variables of the same type:
var
    foo, bar, x: integer;
This will declare three independent variables, all of the integer data type.
Nonetheless, different types have to appear in different declarations:
var
    x: integer;
    itIsSunnyInPhiladelphia: Boolean;
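Put into a complete program, both kinds of declarations might look like the following sketch (the program name moreVariables and the assigned values are arbitrary):

```pascal
program moreVariables(output);
var
    { three independent integer variables }
    foo, bar, x: integer;
    { a different type needs its own declaration }
    itIsSunnyInPhiladelphia: Boolean;
begin
    foo := 1;
    bar := 2;
    x := 3;
    itIsSunnyInPhiladelphia := false;
    writeLn(foo, ' ', bar, ' ', x);
    writeLn(itIsSunnyInPhiladelphia);
end.
```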
Constants
Constant section
program constDemo(output);
const
answer = 42;
begin
writeLn('The answer to the Ultimate Question of ',
'Life, the Universe, and Everything, is: ',
answer);
end.
Usage
As already mentioned in the introduction, a constant can never change its value at run-time; to change it you have to modify the source code. Consequently, the name of a constant cannot appear on the left-hand side of an assignment.
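A short sketch of what is and is not allowed (the program name constantUsage and the variable guess are made up for illustration):

```pascal
program constantUsage(output);
const
    answer = 42;
var
    guess: integer;
begin
    { reading a constant is fine }
    guess := answer;
    { answer := 43; would be rejected by the compiler }
    writeLn(guess);
end.
```

If you remove the comment delimiters around the assignment to answer, the compiler refuses to compile the program.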
Pre‑defined constants
There are some already predefined constants:
- maxInt: This is the maximum integer value an integer variable could assume. There is no minimum integer constant, but it is guaranteed that an integer variable can at least store the value -maxInt.
- maxChar: Likewise, this is the maximum char value a char variable could assume, where maximum refers to the ordinal value using the built-in ord function.
- maxReal, minReal and epsReal: Are defined by the “Extended Pascal” standard.
- false and true: Refer to Boolean values.
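You can inspect the integer limits of your compiler with a sketch like this one. The program name limits is made up, and the printed values depend on the compiler and its settings:

```pascal
program limits(output);
begin
    writeLn('largest integer value:             ', maxInt);
    writeLn('smallest guaranteed integer value: ', -maxInt);
end.
```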
Rationale
Pascal was designed so that – among other considerations – it could be compiled in one pass, from top to bottom, the reason being to make compiling fast and simple. Distinguishing between variables and constants allows the processor to simply replace any occurrence of a constant identifier by its value. Thus, a constant does not need any special treatment like a variable, yet allows the programmer to reuse reappearing data.
Tasks
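Does Zähler (meaning “counter” / “enumerator”) constitute a valid identifier?
No. Letters refers to the letters used in writing English words, and ä is not one of them.
Is 1direction (1D) a permissible identifier?
No. An identifier must begin with a letter.
What is the difference between write and writeLn?
WriteLn puts the cursor into the next line after it has printed all its parameters; write does not.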
References:
- ↑ Jensen, Kathleen; Wirth, Niklaus. Pascal – user manual and report (4th revised ed.). doi:10.1007/978-1-4612-4450-9. ISBN 978-0-387-97649-5.
- ↑ Michaël Van Canneyt (September 2017). "§1.4". Free Pascal Reference guide. version 3.0.4. p. 15. ftp://ftp.freepascal.org/pub/fpc/docs-pdf/ref.pdf. Retrieved 2019-12-14.
Input and Output
We have already been using I/O since the first chapter, but only to get going. It is time to dig a little bit deeper, so we can write nicer programs.
Interface
In its heyday, Pascal was smart enough to define a minimal, yet convenient common interface for I/O. Despite various standardization efforts, I/O operations differ between operating systems, yet – as part of the language – Pascal defines a set of operations to be present, regardless of the utilized compiler or OS.
Special files
In the first chapter it was already mentioned that input and output are special program parameters.
If you list them in the program parameter list, you can use these identifiers to read from and write to the terminal, the CLI you are using.
Text files
In fact, input and output are variables.
Their data type is text.
We call a variable that has the data type text a text file.
The data of a text file are composed of lines. A line is a (possibly empty) sequence of characters (e. g. letters, digits, spaces or punctuation) until and including a terminating “newline character”.
Files
A file – in general – has the following properties:
- It can be associated with an external entity. External means “outside” of your program. A suitable entity can be, for instance, your console window, a device such as your keyboard, or a file that resides in your file system.
- If a file is associated with an external entity, it is considered bound.
- A file has a mode. Every file can be in generation or inspection mode, none or both. If a file is in generation and inspection mode at the same time, this can also be called update mode.[fn 1]
- Every file has a buffer. This buffer is a temporary storage for writing or reading data, so virtually another variable. This buffer variable exists due to reasons how I/O on computers works.
All this information is implicitly available to you, you do not need to take care of it. You can query and alter some information in predefined ways.
All you have to keep in mind in order to successfully use files is that a file has a mode.
The text files input and output are, once they are listed in the program parameter list, in inspection and generation mode respectively.
You can only read data from files that are in inspection mode.
And it is only possible to write data to files that are in generation mode.
Note, due to their special nature the mode of input and output cannot be changed.
Routines
Pascal defines the following routines to read from and write to files:
- get,
- put,
- read/readLn, and
- write/writeLn.
The routines readLn and writeLn can only be used in conjunction with text files, whereas all other routines work with any kind of file.
In the following sections we will focus on read and write.
These routines build upon the “low-level” get and put.
We will take a look at those in the chapter “Files”, though.
Writing data
Let’s look at a simple program:
program writeDemo(output);
var
x: integer;
begin
x := 10;
writeLn(output, x:20);
end.
Copy the program and see what it does.
Assignment
First, we will learn a new statement, the assignment.
Colon equals (:=) is read as “becomes”.
In the line x := 10 the variable’s value becomes ten.
On the left-hand side you write a variable name.
On the right-hand side you put a value.
The value has to be valid for the variable’s data type.
For instance, you could not assign 'Hello world!' to the variable x, because it is not a valid integer, i. e. the data type x has.
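A sketch with a few more assignments (the program name assignmentDemo and the variable initial are made up for illustration):

```pascal
program assignmentDemo(output);
var
    x: integer;
    initial: char;
begin
    x := 10;           { x becomes ten }
    initial := 'P';    { a char variable takes a character value }
    x := 20;           { the previous value of x is replaced }
    writeLn(x);
    writeLn(initial);
end.
```

The first writeLn prints the value 20, because the second assignment to x has replaced the value ten.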
Converting output
The power of write/writeLn is that – for text files – it converts the parameters into a human-readable form.
On modern computers the integer value ten is stored in a particular binary form.
00001010 is a visual representation of the bits set (1) and unset (0) for storing “ten”.
Yet, despite the binary storage the characters you see on the screen are 10.
This conversion, from zeroes and ones into a human-readable representation, the character sequence “10”, is done automatically.
If the destination of write/writeLn is a text file, all parameters are converted into a human-readable form provided such conversion is necessary and makes sense.
Formatting output
Furthermore, after the parameter x comes :20.
As you might have noticed, when you run the program the value ten is printed right-aligned, making the 0 in 10 appear at the 20th column (position from the left margin).
The :20 is a format specifier.
It ensures that the given parameter has a minimum width of that many characters, filling any missing “width” with spaces to the left.
Format specifiers in a write/writeLn call can only be specified where a human-readable representation is necessary, in other words if the destination is a text file.
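The effect of a minimum width is easier to see with visible markers around the value. The following is a sketch; the program name widthDemo and the square brackets are only there for illustration:

```pascal
program widthDemo(output);
var
    x: integer;
begin
    x := 10;
    { width smaller than needed: the value is printed in full }
    writeLn('[', x:1, ']');
    { width larger than needed: padded with spaces on the left }
    writeLn('[', x:5, ']');
    { string literals can be padded the same way }
    writeLn('[', 'ab':5, ']');
end.
```

The three lines printed are [10], then [   10] with three padding spaces, then [   ab] with three padding spaces.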
Reading data
Look at this program:
program iceCream(input, output);
var
response: char;
begin
writeLn('Do you like ice cream?');
writeLn('Type “y” for “yes” (“Yummy!”) and “n” for “no”.');
writeLn('Confirm your selection by hitting Enter.');
readLn(input, response);
if response = 'y' then
begin
writeLn('Awesome!');
end;
end.
Requirements
All parameters given to read/readLn have to be variables.
The first parameter, the source, has to be a file variable which is currently in inspection mode.
We ensure that by putting input into the program parameter list.
If the source parameter is input, you are allowed to omit it, thus readLn(response) is equivalent to readLn(input, response).
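ReadLn accepts several variables at once, possibly of different data types. The following sketch (program and variable names are made up) reads a character followed by a number from one line:

```pascal
program readVariables(input, output);
var
    initial: char;
    age: integer;
begin
    writeLn('Type your initial and your age, e. g. K 30');
    { every parameter after the source is a variable }
    readLn(input, initial, age);
    writeLn('Initial: ', initial, ', age: ', age);
end.
```

As before, input that cannot be converted into the expected data type makes the program abort.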
Branching
A new language construct which we will cover in detail in the next chapter is the if-then-branch.
The code after then that is surrounded by begin and end; is only executed if response equals the character value 'y'.
Otherwise, we are polite and do not express our strong disagreement.
Tasks
Can you write to input? Why does / should it work, or not?
You cannot write to input. The text file input is, provided it is listed in the program parameter list, in inspection mode. That means you can only read data from this text file, never write.
Can you read to a constant?
No. All parameters given to read/readLn have to be variables. A constant, by definition, does not change its value during run-time. That means the user cannot assign a value to a constant either.
Take the program valentine from the first chapter and improve it with the knowledge you have gained in this chapter: Make the heart ideogram appear (sort of) centered. Assume a console window width of 80 characters, or any reasonable width.
program valentine(output);
const
width = 49;
begin
writeLn(' #### #### ':width);
writeLn(' ######## ######## ':width);
writeLn('## #### ##':width);
writeLn('## # ##':width);
writeLn('## ILY ##':width);
writeLn(' ## sweetie ## ':width);
writeLn(' ### ### ':width);
writeLn(' ### ### ':width);
writeLn(' ### ### ':width);
writeLn(' ### ':width);
writeLn(' # ':width);
end.
The leading spaces can be left out of the literals, provided every string literal is shorter than width (otherwise it does not resemble a heart ideogram anymore):
program valentine(output);
const
width = 49;
begin
writeLn( '#### #### ':width);
writeLn( '######## ######## ':width);
writeLn('## #### ##':width);
writeLn('## # ##':width);
writeLn('## ILY ##':width);
writeLn( '## sweetie ## ':width);
writeLn( '### ### ':width);
writeLn( '### ### ':width);
writeLn( '### ### ':width);
writeLn( '### ':width);
writeLn( '# ':width);
end.
o--------------------------------------o
| |
| |
| |
| |
o--------------------------------------o
An empty string is written as ''. It is two straight typewriter’s apostrophes back-to-back. You can use it in your solution.
program box(output);
const
space = 38;
begin
writeLn('o--------------------------------------o');
writeLn('|', '':space, '|');
writeLn('|', '':space, '|');
writeLn('|', '':space, '|');
writeLn('|', '':space, '|');
writeLn('o--------------------------------------o');
end.
'':space will generate 38 spaces (that is the value of the constant space). If you are really smart, you have noticed that the top and bottom edges of the box are the same literal twice. We can shorten our program even further:
program box(output);
const
space = 38;
topAndBottomEdge = 'o--------------------------------------o';
begin
writeLn(topAndBottomEdge);
writeLn('|', '':space, '|');
writeLn('|', '':space, '|');
writeLn('|', '':space, '|');
writeLn('|', '':space, '|');
writeLn(topAndBottomEdge);
end.
Notes:
- ↑ “Update” mode is only available in Extended Pascal (ISO standard 10206). In Standard (unextended) Pascal (laid out in ISO standard 7185) a file can be either in inspection or generation mode, or none.
Expressions and Branches
In this chapter you will learn
- to distinguish between statements and expressions, and
- how to program branches.
Statement
Before we get to know “expressions”, let’s define “statements” more precisely, shall we? A statement tells the computer to change something. All statements in some way or other change the program state. Program state refers to a whole conglomerate of individual states, including but not limited to:
- the values variables have, or
- in general the program’s designated memory contents, but also
- (implicitly) which statement is currently processed.
The last metric is stored in an invisible variable, the program counter. The PC always points to the currently processed statement. Imagine pointing with your finger to one source code line (or, more precisely, statement): “Here we are!” After a statement has successfully been executed, the PC advances to the effect that it points to the next statement.[fn 1] The PC cannot be altered directly, but only implicitly. In this chapter we will learn how.
Classification
Statements can be categorized into two groups: Elementary and complex statements. Elementary statements are the minimal building blocks of high-level programming languages. In Pascal they are:[fn 2][fn 3]
- Assignments (:=), and
- Routine[fn 4] invocations (such as readLn(x) and writeLn('Hi!')).
“Complex” statements are:
- Sequences (surrounded by begin and end),
- branches, and
- loops.
Semicolon
Unlike many other programming languages, in Pascal the semicolon ; separates two statements.
Lots of programming languages use some symbol to terminate a statement, e. g. the semicolon.
Pascal, however, recognized that an extra symbol should not be part of a statement in order to make it an actual statement.
The helloWorld program from the second chapter could be written without a semicolon after the writeLn(…), because there is no following statement:
program helloWorld(output);
begin
  writeLn('Hello world!')
end.
We, however, recommend that you put a semicolon there anyway, even though it is not required. Later in this chapter you will learn one place you most probably (that means not necessarily always) do not want to put a semicolon.
Although a semicolon does not terminate a statement, the program header, constant definitions, variable declarations and some other language constructs are terminated by this symbol. You cannot omit a semicolon at these locations.
Expressions
Expressions, in contrast to statements, do not change the program state. They are transient values that can be used as part of statements. Examples of expressions are:
- 42,
- 'D’oh!', or
- x (where x is the name of a previously declared variable).
Every expression has a type:
When an expression is evaluated it results in a value of a certain data type.
The expression 42 has the data type integer, 'D’oh!' is a “string type” and the expression merely consisting of a variable’s name, such as x, evaluates to the data type of that variable.
Because the data type of an expression is so important, expressions are named after their type.
The expression true is a Boolean expression, as is false.
Using expressions
Expressions appear at many places:
- In the assignment statement (:=) you write an expression on the RHS. This expression has to have the data type of the variable on the LHS.[fn 5] An assignment makes the transient value of an expression “permanent” by storing it into the variable’s memory block.
- The parameter lists of routine invocations consist of expressions. In order to invoke a routine all the parameters have to be stored in memory. Think of a routine invocation as a sequence of assignments to invisible variables before the routine is actually called. Thus writeLn(output, 'Hi!') can be understood as
  - destination becomes output
  - “first parameter” becomes 'Hi!'
  - call the routine writeLn with the invisible “variables” destination and “first parameter”
- In a constant definition the RHS is also an expression, although – hence their name – it has to be constant. You could not use, for instance, a variable as part of that expression.
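The three places listed above can be seen side by side in a short sketch. The program name expressionPlaces and the identifiers answer and x are invented for this illustration.

```pascal
program expressionPlaces(output);
const
  { RHS of a constant definition: a constant expression }
  answer = 42;
var
  x: integer;
begin
  { RHS of an assignment: the value is stored in x }
  x := answer;
  { parameters of a routine invocation: output, a literal, and x }
  writeLn(output, 'x is ', x:1);
end.
```

Running it prints the line "x is 42".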
Linking expressions
The power of expressions lies in their capability to link with other expressions.
This is done by using special symbols called operators.
In the previous chapter we already saw one operator, the equals operator =.
Now we can break up such an expression:
     response         =           'y'       {
│ └──────┰─────┘      ┃     └──────┰──────┘ │
│ sub-expression   operator sub-expression  │
│                                           │
└─────────────────────┰─────────────────────┘
                 expression                 }
As you can see in the diagram, an expression can be part of a larger expression.
The sub-expressions are linked using the operator symbol =.
Sub-expressions that are linked via, or associated with an operator symbol are also called operands.
Comparisons
Linking expressions via an operator “creates” a new expression which has a data type of its own.
While response and 'y' in the example above were both char-expressions, the overall data type of the whole expression is Boolean, because the linking operator is the equal comparison.
An equal comparison yields a Boolean expression.
Here is a table of relational operators which we can already use with our knowledge:
name | source code symbol |
---|---|
=, equals | = |
≠, unequal | <> |
<, less than | < |
>, greater than | > |
≤, less than or equal to | <= |
≥, greater than or equal to | >= |
Using these symbols yields Boolean expressions.
The value of the expression will be either true or false depending on the operator’s definition.
All those relational operators require operands on both sides to be of the same data type.[fn 6]
Although we can say '$' = 1337 is wrong, meaning it should evaluate to the value false, it is nevertheless illegal, because '$' is a char‑expression and 1337 is an integer‑expression.
Pascal forbids you to compare things/objects that differ in their data type.
So, I guess, y’can’t compare apples ’n’ oranges after all.
(Note, a few conversion routines will allow you to do some comparisons that are not allowed directly, but by taking a detour. In the next chapter we will see some of them.)
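A brief sketch of relational operators in use follows. The program name comparisonDemo and the variable isSmaller are made up for this illustration; note that each comparison has operands of the same data type on both sides.

```pascal
program comparisonDemo(output);
var
  isSmaller: Boolean;
begin
  { both operands are integer expressions; the result is Boolean }
  isSmaller := 3 < 5;
  if isSmaller then
  begin
    writeLn('3 is less than 5');
  end;
  { both operands are char expressions }
  if 'y' <> 'n' then
  begin
    writeLn('y and n are different characters');
  end;
end.
```

Storing the result of a comparison in a Boolean variable, as done with isSmaller, shows that a comparison really is an expression with a value of its own.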
Calculations
Expressions are also used for calculations; after all, the machine you are using is not called “computer” for no reason.
In Standard Pascal you can add, subtract, multiply and divide two numbers, i. e. integer‑ and real‑expressions and any combination thereof.
The symbols that work for all combinations are:
name | source code symbol |
---|---|
+, plus | + |
−, minus | - |
×, times | * |
The division operation has been omitted as it is tricky, and will be explained in a following chapter.
If at least one of the operands is a real‑expression, the entire expression is of type real, even if the exact value could be represented by an integer.
Note, unlike in mathematics, there is no invisible times assumed between two “operands”:
You always need to write the “times”, meaning the asterisk *, explicitly.
The operator symbols + and - can also appear with one number expression only.
They then indicate the positive or negative sign, or – more formally – sign identity or sign inversion respectively.
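The rules of this section can be sketched in a few lines. The program name calculationDemo and the variables i and r are invented for this illustration; the specifier :1:2 is a second format parameter that requests two decimal places for a real value.

```pascal
program calculationDemo(output);
var
  i: integer;
  r: real;
begin
  { only integer operands: the result is an integer }
  i := 2 * 3 + 4;
  writeLn(i:1);        { prints 10 }
  { one real operand makes the whole expression real }
  r := i * 1.5;
  writeLn(r:1:2);      { prints 15.00 }
  { a sign in front of a single operand }
  i := -i;
  writeLn(i:1);        { prints -10 }
end.
```

Assigning i * 1.5 to an integer variable instead of r would be rejected, because the expression is of type real.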
Operator precedence
Just like in mathematics, operators have a certain “force” associated with them, in CS we call this operator precedence. You may recall from your primary or secondary education, school or homeschooling, the acronym PEMDAS: It is a mnemonic standing for the initial letters of
- parentheses
- exponents
- multiplication / division
- addition / subtraction
giving us the correct order to evaluate an arithmetic expression in mathematics. Luckily, Pascal’s operator precedence is just the same, although – to be fair – technically not defined by the word “PEMDAS”.[fn 7]
As you might have guessed, operator precedence can be overridden on a per-expression basis by using parentheses:
In order to evaluate 5 * (x + 7), the sub-expression x + 7 is evaluated first and that value is then multiplied by 5, even though multiplication is generally evaluated prior to sums or differences.
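To see the effect of parentheses, consider this small sketch. The program name precedenceDemo and the concrete value 3 for x are assumptions made for this illustration.

```pascal
program precedenceDemo(output);
var
  x, y: integer;
begin
  x := 3;
  { multiplication first: 15 + 7 }
  y := 5 * x + 7;
  writeLn(y:1);        { prints 22 }
  { parentheses first: 5 * 10 }
  y := 5 * (x + 7);
  writeLn(y:1);        { prints 50 }
end.
```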
Branches
Branches are complex statements.
Up to this point all programs we wrote were linear:
They started at the top and the computer (ideally) executed them line-by-line until the final end.
Branches allow you to choose alternative paths, like at a T‑bone intersection: “Do I turn left or do I turn right?”
The general tendency to process the program “downward” remains, but there is (in principle) a choice.
Conditional statement
Let’s review the program iceCream from the previous chapter.
The conditional statement is highlighted:
program iceCream(input, output);
var
  response: char;
begin
  writeLn('Do you like ice cream?');
  writeLn('Type “y” for “yes” (“Yummy!”) and “n” for “no”.');
  writeLn('Confirm your selection by hitting Enter.');
  readLn(input, response);
  if response = 'y' then
  begin
    writeLn('Awesome!');
  end;
end.
Now we can say that response = 'y' is a Boolean expression.
The words if and then are part of the language construct we call conditional statement.
After then comes a statement, in this case a complex statement:
begin … end is a sequence and considered to be one statement.
As you may remember or can infer from the source code, the statement between begin … end, the writeLn('Awesome!'), is only executed if the expression response = 'y' evaluates to true.
Otherwise, it is skipped as if there was nothing.
Due to this binary nature – yes / no, execute the code or skip it – the expression between if and then has to be a Boolean expression.
You cannot write if 1 * 1 then …, since 1 * 1 is an integer-expression.
The computer cannot decide based on an integer-expression, whether it shall take a route or not.
Alternative statement
Let’s expand the program iceCream by giving an alternative response if the user says not to like ice cream.
We could do this with another if‑statement, yet there is a smarter solution for this frequently occurring situation:
if response = 'y' then
begin
  writeLn('Awesome!');
end
else
begin
  writeLn('That’s a pity!');
end;
The highlighted alternative, the else‑branch, will only be executed if the supplied Boolean expression evaluated to false.
In either case, regardless of whether the then‑branch or the else‑branch was taken, program execution resumes after the else‑statement (in this case after the end; in the last line).
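For reference, here is a complete program built around the alternative statement, as a sketch under the assumption that a single character is read as before; the program name iceCreamAnswer is made up. It also shows the one place where a semicolon is usually unwanted: directly before else, because the semicolon separates statements and the if‑statement is not finished at that point.

```pascal
program iceCreamAnswer(input, output);
var
  response: char;
begin
  writeLn('Do you like ice cream? (y/n)');
  readLn(response);
  if response = 'y' then
  begin
    writeLn('Awesome!');
  end { no semicolon here: the statement continues with else }
  else
  begin
    writeLn('That is a pity!');
  end;
end.
```

Putting a semicolon after the first end would make the compiler reject the program, since else cannot start a new statement.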
Relevance
Branches and (soon explained) loops are the only method of modifying the PC, “your finger” pointing to the currently executed statement, based on data, an expression, and thus a means of responding to user input. Without them, your programs would be static and do the same over and over again, so pretty boring. Utilizing branches and loops will make your program way more responsive to the given input.
Tasks
Which relational operators can be used with char-expressions? All? None?
The characters '0' through '9', 'A' through 'Z', and 'a' through 'z' are sorted as you are familiar with from the English alphabet, or – with respect to the digits – by their numeral value in ascending order.
Because of that you are allowed to make a comparison such as 'A' <= 'F' (which will evaluate to true).
Notes:
- ↑ This paragraph intentionally uses imprecise terminology to keep things simple. The PC is in fact a processor register (e. g. %eip, extended instruction pointer) and points to the following instruction (not current statement). See Subject: Assembly languages for more details.
- ↑ Jumps (goto) have been deliberately banished to the appendix, and are not covered here, yet goto is also an elementary statement.
- ↑ Exception extensions also define raise as an elementary statement.
- ↑ More correctly: procedure calls.
- ↑ Or, a “compatible” data type, e. g. an integer expression can be stored into a variable of the data type real, but not the other way round. As we progress we will learn more about “compatible” types.
- ↑ In the chapter on sets we will expand this statement.
- ↑ To read the technical definition, see § Expressions, subsection 1 “General” in the ISO standard 7185.
Routines
In the opening chapter routines were already mentioned.
Routines are, as it was described before, reusable pieces of code that can be used over and over again.
Examples of routines are read/readLn and write/writeLn.
You can invoke, call, these routines as many times as you want.
In this chapter you will learn
- how to define your own routines,
- the difference between a definition and declaration, and
- the difference between functions and procedures.
Different routines for different occasions
Routines come in two flavors.
In Pascal, routines can either replace statements, or they replace a (sub‑)expression.
A routine that can be used where statements are allowed is called a procedure.
A routine that is called as part of an expression is a function.
Functions
A function is a routine that returns a value.
Pascal defines, among others, a function odd.
The function odd takes one integer-expression as a parameter and returns false or true, depending on the parity of the supplied parameter (in layman’s terms: it returns true if the value is not divisible by 2).
Let’s see the function odd in action:
program functionDemo(input, output);
var
  x: integer;
begin
  write('Enter an integer: ');
  readLn(x);
  if odd(x) then
  begin
    writeLn('Now this is an odd number.');
  end
  else
  begin
    writeLn('Boring!');
  end;
end.
Odd(x) is pronounced “odd of x”.
First, the expression in parentheses is evaluated.
Here it is simply x, the variable’s value to be precise, but a more complex expression is allowed too, as long as it eventually evaluates to an integer-expression.
The value of this expression, the actual parameter, is then handed to a (in this case invisible) block of code that processes the input, performs some calculations on it, and returns false or true according to the calculation’s findings.
The function’s returned value is ultimately filled in in place of the function call.
You can, in your mind, read false / true in place of odd(x), although this is dynamic depending on the given input.
You are only allowed to call functions where you can put an expression. The following program is wrong:
program lostFunction;
begin
  odd(42);
end.
The function call odd(42) evaluates to false. But false is not a statement. You can only put statements between begin and end, no expressions.[fn 1]
Procedures
Procedures on the other hand cannot be used as part of an expression. You can only call procedures where statements are allowed.
A routine can either be a function or a procedure. In some programming languages the routine used to retrieve data from the console can be used like a function, but this is not the case in Pascal. The following program will not compile:
program strayProcedure(input, output);
begin
  if readLn(input) = '' then
  begin
    writeLn('Error: No input supplied.');
  end;
end.
ReadLn refers to a procedure, thus it does not return anything, yet at this specific position a value has to be inserted so the if‑branch language construct and the equal comparison make sense.
Effects
A procedure may use functions, and the other way around.
Do not understand a function as a mere substitute for an expression.
In the following section we will learn why.
Rationale
The dichotomy of routines, distinguishing between a procedure and a function, is meant to gently push the programmer to write “clean” programs.
Doing so, a routine does not conceal whether it is just a replacement for a sequence of statements or shorthand for a complex, difficult to write out expression.
This kind of notation works without introducing nasty pseudo types like, for example, void in the C programming language where every routine is a function, but the “invalid” data type void will allow you to make it (in part) behave like a procedure.
Definition
Defining routines follows a pattern you are already familiar with since your very first program.
A program is, in some regards, like a special routine:
You can run it as many times as you want through OS-defined means.
A program’s definition looks almost just like a routine’s.
A routine is defined by
- a header, and
- a block
in that order.
The routine header shows a couple of differences depending on whether it is a function or procedure.
We will first take a look at blocks, since these are the same for both types of routines.
Block
A block is the synthesis of a productive part (statements) and (optional) declarations and definitions. In Standard Pascal (as laid out by the ISO standard 7185) a block has a fixed order:[fn 2]
- constant definitions (the const-section)
- type definitions (the type-section)
- variable declarations (the var-section)
- routine declarations and definitions
- sequence (begin … end, possibly empty)
All items but the last one, the productive part, are optional.
Sections (const, type, or var-section) may not be empty. Once you specify a section heading, you have to define/declare at least one symbol in the just started section.
In Extended Pascal (EP), the fixed order restriction has been lifted.
There, sections and routine declarations and definitions may occur as many times as needed and do not necessarily have to adhere to a particular order.
The consequences are detailed in the chapter “Scopes”.
For the remainder of this book we will refer to EP’s definition of block, because all major compilers support this.
Nevertheless, the order defined by Standard Pascal is a good guideline:
It makes sense to define types before there is a section that may use those types (i. e. the var-section).
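The following sketch lays out a program block in the Standard Pascal order. All identifiers (blockOrderDemo, limit, counter, i, report) are invented for this illustration; the procedure header anticipates the syntax explained in the next section.

```pascal
program blockOrderDemo(output);
const
  { 1. constant definitions }
  limit = 3;
type
  { 2. type definitions }
  counter = integer;
var
  { 3. variable declarations }
  i: counter;

{ 4. routine declarations and definitions }
procedure report(n: counter);
begin
  writeLn(n:1, ' of ', limit:1);
end;

{ 5. the sequence }
begin
  i := 2;
  report(i);
end.
```

The program prints "2 of 3". Note that the type counter is defined before the var-section that uses it.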
Header
A routine header consists of
- the word function or procedure,
- an identifier identifying this routine,
- possibly a parameter list, and,
- lastly, in the case of functions, the data type of an expression a call to this function results in, the result data type.
The parameter list for routines also defines the data type of every single parameter.
Thus, the header of the function odd could look like this:
function odd(x: integer): Boolean;
Take notice of the colon (:) after the parameter list separating the function’s result data type.
You can view a function as a sort of special variable declaration, which also separates an identifier from its data type with a colon, except that in the case of a function the “variable’s” value is computed dynamically.
Formal parameters, i. e. parameters in the context of a routine header, are separated by a semicolon. Consider the following procedure header:
procedure printAligned(x: integer; tabstop: integer);
Note that every routine header is terminated with a semicolon.
Body
While the routine header tells the processor (usually a compiler), “Hey, there’s a routine with the following properties: […]”, it is not enough. You have to “flesh out”, give the routine a body. This is done in the subsequent block.
Inside the block all parameters can be read as if they were variables.
Function result
In the sequence of the block defining a function there is automatically a variable of the function’s name. You have to assign a value to it at least once, so the function, mathematically speaking, becomes defined. Consider this example:
function getRandomNumber: integer;
begin
  { chosen by fair dice roll, }
  { guaranteed to be random }
  getRandomNumber := 4;
end;
Note that the block did not contain a var-section declaring the variable getRandomNumber, but it is already implicitly declared by the function’s header:
Both the name and the data type are part of the function header.
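Putting header, block and result assignment together, a complete function definition and its use might look like the following sketch. The function twice and the program name functionDefinitionDemo are hypothetical names chosen for this illustration.

```pascal
program functionDefinitionDemo(output);

{ header: name, parameter list, result data type }
function twice(x: integer): integer;
begin
  { assigning to the function's name defines the result }
  twice := 2 * x;
end;

begin
  { the call is part of an expression }
  writeLn(twice(21):1);    { prints 42 }
end.
```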
Declaration
A routine declaration happens most of the time implicitly. Declaring a routine, or in general any identifier, refers to the process of giving the processor (i. e. usually a compiler) information in order to correctly interpret your program source code. This information is not directly encoded in your executable program, but it is implicitly there. Examples are:
- A variable declaration tells the processor to install proper provisions in order to reserve some memory space. This chunk of memory will be interpreted according to its associated data type. However, neither the variable’s name, nor the data type are in any way stored in your program. Only the processor knows about this information as it is reading your source code file.
- A routine header constitutes a routine declaration (which is usually directly followed by its definition[fn 3]). Here again, the information given in a routine header is not stored directly in the executable file, but it ensures the processor (the compiler) will correctly transform your source code.
- Likewise, type declarations merely serve the purpose of clean and abstract programming, but those declarations do not end up in the executable program file.[fn 4]
Declarations make an identifier known to denote a certain object (“object” mathematically speaking).
Definitions on the other hand will, hence their name, define what this object exactly is.
Whether it is a value of a constant, the value of a variable, or the steps taken in a routine (the statement sequence), data defined through definitions will result in specific code in your executable file, which may vary according to the information given in related declarations;
writing a variable possessing the data type integer is fundamentally different from writing a value of the type real.
The code for properly storing, calculating and retrieving integer and real values differs, but the computer is not aware of that.
It just performs the given instructions; the circumstance that a certain set of instructions resembles operations on Pascal’s data type real, for instance, is, so to speak, a “coincidence”.
Calling routines
Routing
Routines are selected based on their signature. A routine signature consists of
- the routine’s name,
- the data types of all arguments, and
- (implicitly) their correct order.
Thus the signature of the function odd reads odd(integer).
The function named odd accepts one integer value as the first (and only) argument.
Overloading
Many Pascal dialects, such as the FPC, allow you to declare and define routines of the same name, but differing formal parameters. This is usually called overloading; it is an extension and not part of Standard Pascal (ISO 7185). When calling a routine there must be exactly one routine of that name that accepts parameters with their corresponding data types.
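A minimal sketch of overloading, assuming a compiler that supports it such as the FPC: the procedure name show and the program name overloadDemo are invented, and the overload directive is a non-standard addition that some dialects require and others merely accept.

```pascal
program overloadDemo(output);

{ signature: show(integer) }
procedure show(x: integer); overload;
begin
  writeLn('integer: ', x:1);
end;

{ signature: show(char) }
procedure show(x: char); overload;
begin
  writeLn('char: ', x);
end;

begin
  show(42);     { selects show(integer) }
  show('A');    { selects show(char) }
end.
```

The compiler picks the routine by comparing the data type of the actual parameter with the available signatures.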
Pre-defined routines
signature | description | returned value’s type |
---|---|---|
abs(integer) | absolute value of argument | integer |
odd(integer) | parity (true if the given value is not divisible by two) | Boolean |
sqr(integer) | the value squared | integer |
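The three routines from the table can be tried out with a short sketch; the program name predefinedDemo is made up for this illustration.

```pascal
program predefinedDemo(output);
begin
  { absolute value: the sign is dropped }
  writeLn(abs(-7):1);    { prints 7 }
  { square: the value multiplied by itself }
  writeLn(sqr(-7):1);    { prints 49 }
  { parity: true for values not divisible by two }
  if odd(7) then
  begin
    writeLn('7 is odd');
  end;
end.
```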
Persistent variables
Some compilers, such as the FPC, allow you to use constants as if they were variables, but with a different lifetime.
In the following example the “constant” numberOfInvocations exists for the entire duration of program execution, but is only accessible in the scope it was declared in.
program persistentVariableDemo(output);
{$ifDef FPC}
  // allow assignments to _typed_ “constants”
  {$writeableConst on}
{$endIf}
procedure foo;
const
  numberOfInvocations: integer = 0;
begin
  numberOfInvocations := numberOfInvocations + 1;
  writeLn(numberOfInvocations);
end;
begin
  foo;
  foo;
  foo;
end.
The program will print 1, 2, and 3, one number per call.
Lines 2, 4, and 5 contain specially crafted comments that instruct the compiler to support persistent variables.
These comments are non-standard, yet some are explained in the appendix, chapter “Preprocessor Functionality”.
Note, the concept of typed “constants” is not standardized. Some object-oriented programming extensions will give nicer tools to implement such behavior as demonstrated above. We primarily explained the concept of persistent variables to you, so you can read and understand source code by other people.
Benefit
Routines can be used as many times as you want. They are no tools of mere “text substitution”: The definition of a routine is not “copied” to the place where it is called, the call site. The size of the executable program file remains about the same.
Utilizing routines can also be and usually is beneficial to the development progress of a program. By splitting up a programming project into smaller understandable problems you can focus on solving isolated issues as part of the big task. This approach is known as divide and conquer. We now ask you to slowly shift toward thinking more about your programming tasks before you start typing anything. You may need to spend more time on thinking about, for example, how to structure a routine’s parameter list. What information, what parameters, does this routine require? Where and how can a recurring pattern be generalized through a routine definition? Identifying such questions needs time and expertise, so do not be discouraged if you are not seeing everything the task’s sample answers show. You will learn through your mistakes.
Keep in mind, though, routines are no panacea. There are situations, very specific situations, where you do not want to use routines. Recognizing those, however, is outside this book’s scope. For the sake of this textbook, and in 99% of all your programming projects, you want to use routines if possible. Modern compilers can even recognize some situations where a routine was “unnecessary”, yet the only gain is that your source code becomes more structured and thus readable, albeit at the expense of being more abstract and therefore complex.[fn 5]
Tasks
Separate routines, printM, printI, printS, printP, will significantly speed up development.
program mississippi(output);
const
width = 8;
procedure printI;
begin
writeLn( '# ':width);
writeLn( '# ':width);
writeLn( '# ':width);
writeLn( '# ':width);
writeLn( '# ':width);
writeLn;
end;
procedure printM;
begin
writeLn('# #':width);
writeLn('## ##':width);
writeLn('# ## #':width);
writeLn('# #':width);
writeLn('# #':width);
writeLn;
end;
procedure printP;
begin
writeLn( '### ':width);
writeLn( '# # ':width);
writeLn( '### ':width);
writeLn( '# ':width);
writeLn( '# ':width);
writeLn;
end;
procedure printS;
begin
writeLn(' ### ':width);
writeLn(' # # ':width);
writeLn(' ## ':width);
writeLn('# # ':width);
writeLn(' ### ':width);
writeLn;
end;
begin
printM;
printI;
printS;
printS;
printI;
printS;
printS;
printI;
printP;
printP;
printI;
end.
Notes:
- ↑ Some dialects of Pascal are not so strict about that: The FPC has the option {$extendedSyntax on} which will allow the program above to compile anyway.
- ↑ The label-section has intentionally been omitted.
- ↑ The Extended Pascal standard allows so-called “forward declarations” [remote directive]. A forward declaration of a routine is just the declaration, no definition.
- ↑ Some compilers support the generation of non-standardized “run-time type information” (RTTI). By enabling RTTI, type declarations do produce data that is stored in your program.
- ↑ One such compiler optimization is called inlining. This will effectively copy a routine definition to the call site. Pure functions even stand to benefit by being defined as isolated functions, provided the compiler does support appropriate optimizations.
Enumerations
One powerful notational as well as syntactical tool of Pascal is the declaration of custom enumeration data types.
Handling
Notion
An enumeration data type is a finite list of named discrete values. Enumerations virtually give names to individual integer values; however, you cannot (directly) do arithmetic operations on them.
Declaration
An enumeration data type is declared by following the data type identifier and an equal sign with a parenthesized, non-empty comma-separated list of (new, not previously used) identifiers.
type
  weekday = (Monday, Tuesday, Wednesday, Thursday, Friday,
    Saturday, Sunday);
The individual list items refer to specific values the data type may assume. The data type identifier identifies the data type as a whole.
Operations
Once an enumeration data type has been declared, you can use it like any other data type:
var
  startOfWeek: weekday;
begin
  startOfWeek := Sunday;
end.
The variable startOfWeek is restricted to assume only legal values of the data type weekday.
Note that Sunday is not enclosed by typewriter quotation marks (') which usually indicate a string literal.
The identifier Sunday indicates a value in its own right.
Ordinal values
Automatism
Every enumeration data type declaration implicitly defines an order.
The comma-separated list is by definition a sorted list.
The built‑in function ord, short for ordinal value, lets you obtain the ordinal value of an enumeration element, that is, an integer value unique to that enumeration member.
The first element of an enumeration is numbered 0.
The second, if applicable, has the number 1, and so forth.
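The automatic numbering can be observed with a small test program. This is only a sketch; the program name ordinalDemo is made up for illustration, and the weekday type is the one declared earlier in this chapter.

```pascal
program ordinalDemo(output);
type
    weekday = (Monday, Tuesday, Wednesday, Thursday, Friday,
        Saturday, Sunday);
begin
    { ord yields the zero-based position in the declaration list }
    writeLn('ord(Monday)  = ', ord(Monday):1);  { 0 }
    writeLn('ord(Tuesday) = ', ord(Tuesday):1); { 1 }
    writeLn('ord(Sunday)  = ', ord(Sunday):1);  { 6 }
end.
```

Since seven identifiers are listed, the last one, Sunday, receives the ordinal value 6.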
Override
Some compilers, such as the FPC, allow you to specify explicit indices for some, or even all elements of an enumeration:
type
month = (January := 1, February, March, April, May, June,
July, August, September, October, November, December);
Here, January will have the ordinal value 1.
All following items have an ordinal value greater than 1.
The automatic assignment of numbers still ensures every enumeration member has a unique number within the entire enumeration data type.
February will have the ordinal value 2, March the value 3, and so on.
The value 0, however, is not assigned to any element of that enumeration.
Specifying explicit indices is a non-standard extension.
In FPC’s {$mode Delphi} you need to use a plain equal sign (=) instead of :=.
This is also referred to as “C‑style enumeration declaration”, since the programming language C uses that syntax.
Inverse
Pascal does not provide a generic function that lets you determine the enumeration element based on a number.
There is no function returning January, for instance, if it is supplied with the integer value 1.[fn 1]
Neighbors
The standard functions pred and succ, short for predecessor and successor respectively, are automatically defined for every enumeration data type.
These functions return the previous or next value of an enumeration value.
For example, succ(January) will return February, as it is the successor of the value January.
However, pred(January) will fail as there is no member prior to January.
An enumeration list is not cyclical.
Although in real life January follows December, the enumeration data type month does not “know” that.
The EP standard allows a second optional integer parameter to be supplied to either pred or succ.
succ(January, 2) is identical to succ(succ(January)), yet shorter and more convenient; pred(January, -2) returns the same value too.
Using this functionality you can obtain an enumeration value given its index.
succ(Monday, 3) evaluates to the weekday value that has the ordinal value 3, thus effectively providing a means for an inverse ord function.
However, you need to know the first element of the enumeration, and the enumeration must not use any explicit indices in its declaration (unless all indices coincide with the automatic numbering pattern).
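The following sketch walks through a month enumeration using only the one-parameter forms of succ and pred, so it does not rely on the EP extension. Note that, unlike the declaration shown earlier, this month type is assumed to be declared without an explicit index, so January has the ordinal value 0. The program name neighborDemo and the variable current are made up for illustration.

```pascal
program neighborDemo(output);
type
    month = (January, February, March, April, May, June,
        July, August, September, October, November, December);
var
    current: month;
begin
    current := succ(January);         { February }
    writeLn(ord(current):1);          { prints 1 }
    current := succ(succ(current));   { April }
    writeLn(ord(current):1);          { prints 3 }
    { guard against leaving the list: pred(January) is an error }
    if current <> January then
    begin
        current := pred(current);     { March }
    end;
    writeLn(ord(current):1);          { prints 2 }
end.
```

The guard before pred is one possible way to avoid stepping off the beginning of the list.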
Operators
Enumeration data type values are automatically eligible to be used with several operators. Since every enumeration value has an ordinal value, they can be ordered and you can test for that. The relational operators <, >, <=, >=, = and <> work in conjunction with enumeration values.
For example, January < February will evaluate to true, because January has a smaller ordinal value than February.
Although technically you can compare apples and oranges (spoiler alert: they are unequal), all relational operators only work in conjunction with two values of the same kind.
In Pascal, you cannot compare a weekday value with a month value.
Nonetheless, something like ord(August) > ord(Monday) is legal, since you are then in fact comparing integer values.
Note that arithmetic operators (+, -, and so on) do not work with enumeration data types, despite their ordinal values.
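A minimal sketch of these comparisons follows. It assumes both enumerations are declared without explicit indices; the program name comparisonDemo is made up. How a Boolean value is spelled in the output (for instance in upper case) depends on the compiler, so the comments only state the resulting truth value.

```pascal
program comparisonDemo(output);
type
    weekday = (Monday, Tuesday, Wednesday, Thursday, Friday,
        Saturday, Sunday);
    month = (January, February, March, April, May, June,
        July, August, September, October, November, December);
begin
    writeLn(Monday < Friday);             { true:  0 < 4 }
    writeLn(Saturday >= Sunday);          { false: 5 >= 6 does not hold }
    writeLn(ord(August) > ord(Monday));   { true:  7 > 0, integer comparison }
    { The following line would be rejected, the operand types differ: }
    { writeLn(August > Monday); }
end.
```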
Boolean as an enumeration data type
Definition
The data type Boolean is a built‑in special enumeration data type.
It is guaranteed that
- ord(false) = 0,
- ord(true) = 1, and, in consequence,
- pred(true) = false.
Logical operators
Boolean is the only enumeration data type on which operations can be performed directly using logical operators.
Negation
The most basic operator is the negation.
It is a unary operator, that means it expects only one operand.
In Pascal it uses the keyword not.
By preceding a Boolean expression with not (and some separator such as a space character), the expression is negated.
expression | result |
---|---|
not true | false |
not false | true |
Conjunction
While this may be pretty straightforward, the so-called logical conjunction, indicated by and, might not be.
The truth table for it looks like this:
value of tired | value of intoxicated | result of tired and intoxicated |
---|---|---|
false | false | false |
false | true | false |
true | false | false |
true | true | true |
In EE this is frequently written as ⋅ (“times”) or even omitted, because (like in mathematics) an invisible “times” is assumed.
Given that the ordinal values of false and true are as defined above, you could calculate the and result by multiplying them.
Disjunction
A little more confusing, because it may contradict everyday language, is the word or.
If either operand is true, or both are, the overall expression’s result becomes true.
value of raining | value of snowing | result of raining or snowing |
---|---|---|
false | false | false |
false | true | true |
true | false | true |
true | true | true |
Electrical engineers frequently use the symbol + to denote this operation.
With respect to Boolean’s ordinal values, though, you must “define” that 1 + 1 is still 1.
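The relation between the logical operators and arithmetic on ordinal values can be checked with a short program. This is merely an illustrative sketch; the program name booleanArithmetic and the procedure compare are made up. Each output line shows, from left to right, ord(a and b), the product of the ordinal values, ord(a or b), and the sum of the ordinal values.

```pascal
program booleanArithmetic(output);

{ prints the logical results next to their arithmetic counterparts }
procedure compare(a, b: Boolean);
begin
    write(ord(a and b):2, ord(a) * ord(b):2);
    writeLn(ord(a or b):2, ord(a) + ord(b):2);
end;

begin
    compare(false, false);   { 0 0 0 0 }
    compare(false, true);    { 0 0 1 1 }
    compare(true, false);    { 0 0 1 1 }
    compare(true, true);     { 1 1 1 2 }
end.
```

The product always matches the conjunction. The sum matches the disjunction except in the last row, where it yields 2 instead of 1, which is why the extra “definition” is needed.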
Precedence
Like the usual rule in mathematics “multiplication and division first, then addition and subtraction”, a conjunction is evaluated before a disjunction. However, since the negation is a unary operator, it is evaluated first in any case. That means you must be really careful not to forget parentheses. The expression
not hungry or thirsty
is fundamentally different to
not (hungry or thirsty)
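To see the difference, consider one concrete combination of values. The following sketch (the program name precedenceDemo is made up) evaluates both expressions for hungry being false and thirsty being true.

```pascal
program precedenceDemo(output);
var
    hungry, thirsty: Boolean;
begin
    hungry := false;
    thirsty := true;
    { not binds tightest: (not hungry) or thirsty = true or true }
    writeLn(not hungry or thirsty);      { true }
    { parentheses first: not (false or true) = not true }
    writeLn(not (hungry or thirsty));    { false }
end.
```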
Ranges
Ordinal types
Enumeration data types belong to the category of ordinal data types. The ordinal data types are:
- integer,
- char, and
- all enumeration data types, including Boolean.
What they all have in common is that each of their values can be mapped to a distinct integer value.
The ord function lets you retrieve that value.
Intervals
Sometimes it makes sense to restrict a set of values to a certain range.
For instance, the hours on a military time clock may show values from 0 up to and including 23.
Yet the data type integer will permit other values too.
Pascal allows you to declare (sub‑)range data types.
A (sub‑)range data type has a host data type, e. g. integer, and two limits: one lower and one upper limit.
A range is specified by giving the limits in ascending order, separated by two periods back-to-back (..):
type
majuscule = 'A'..'Z';
The limits may be given as any computable expression, as long as it does not depend on run-time data.[fn 2] For example constants (that have already been defined) may be used:
type
integerNonNegative = 0..maxInt;
Note, we named this range integerNonNegative and not nonNegativeInteger, because this facilitates alphabetical sorting in some documentation tools or in IDEs.
Restriction
A variable of a (sub‑)range data type may only assume values within the range. If a value outside the legal range is assigned to the variable, the program aborts. The following error message may appear (memory address at the end can vary):
./a.out: value out of range (error #300 at 402a54)
The corresponding test program has been compiled with GPC. Other compilers may emit different messages.
The default configuration of the FPC, however, ignores this.
Assigning out-of-range values to variables will not yield an error (if it depends on run-time data).
The developers of the FPC cite compatibility with other compilers, which decided to ignore out-of-range values for speed reasons.[fn 3]
You need to specifically request that illegal values cannot be assigned to ordinal type variables.
This can be done by placing a specially crafted comment before any (crucial) assignments:
{$rangeChecks on} (case-insensitive) or {$R+} for short (case-sensitive) will ensure illegal values are not assigned and the program aborts if any attempts are made anyway.
Specifying this compiler switch once in your source code file is sufficient.
FPC’s ‑Cr command-line switch has the same effect.
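A possible test program for this behavior is sketched below. It assumes the FPC and the compiler switch described above; the program name rangeCheckDemo, the type hour and the variable names are made up for illustration.

```pascal
program rangeCheckDemo(input, output);
{$rangeChecks on}
type
    hour = 0..23;
var
    reading: integer;
    currentHour: hour;
begin
    write('Enter the current hour: ');
    readLn(reading);
    { with range checks enabled, a value outside 0..23 aborts here }
    currentHour := reading;
    writeLn('It is ', currentHour:1, ' o''clock.');
end.
```

Entering a value such as 17 lets the program finish normally, whereas a value such as 25 should trigger a run-time error at the assignment. Without the switch, the FPC would silently accept the illegal value.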
Selections
With the advent of enumeration data types, it may become cumbersome and tedious to check for values just using if‑branches.
Explanation
The case selection statement unites multiple exclusive if‑branches in one language construct.[fn 4]
case sign(x) of
-1:
begin
writeLn('You have entered a negative number.');
end;
0:
begin
writeLn('The number you have entered is sign-less.');
end;
1:
begin
writeLn('That is a positive number.');
end;
end;
Between case and of any expression that evaluates to an ordinal value may appear.
After that, -1:, 0: and 1: are case labels.
These case labels mark the start of alternatives.
After a case label follows a statement.
-1, 0 and 1 denote case values.
Every case label consists of a non-empty comma-separated list of case values, followed by a colon (:).
All case values have to be legal constant values or constant expressions that are compatible with the comparison expression, that is, what is written between case and of.
Every specified case value may appear in only one case label; no case value can appear twice.
It is not necessary to put them in order according to their ordinal value, although it can make your source code more readable.
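A case label listing several case values can be sketched with the weekday enumeration from the beginning of this chapter. The program name weekendDemo and the variable today are made up for illustration.

```pascal
program weekendDemo(output);
type
    weekday = (Monday, Tuesday, Wednesday, Thursday, Friday,
        Saturday, Sunday);
var
    today: weekday;
begin
    today := Saturday;
    case today of
        { one case label may list several case values }
        Monday, Tuesday, Wednesday, Thursday, Friday:
        begin
            writeLn('Off to work.');
        end;
        Saturday, Sunday:
        begin
            writeLn('Enjoy the weekend.');
        end;
    end;
end.
```

Because every weekday value appears in exactly one case label, this case statement covers all legal values of today.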
Shorthand for many cases
In EP case labels may contain ranges.
program letterReport(input, output);
var
c: char;
begin
write('Give me a letter: ');
readLn(c);
case c of
'A'..'Z':
begin
writeLn('Wow! That’s a big letter!');
end;
'a'..'z':
begin
writeLn('What a nice small letter.');
end;
end;
end.
This shorthand notation allows you to catch many cases.
The case label 'A'..'Z': includes all upper-case letters, without requiring you to list them all individually.
Take care that no range overlaps with other case label values; this is forbidden.
Good processors will complain about such a mistake.
The GPC yields the error message duplicate case-constant in `case' statement, the FPC reports just duplicate case label[fn 5], both telling you some information about the location in your source code.
Fall-back
It is important that any (expected) value of the comparison expression matches one case label.
If the comparison expression evaluates to a value that no case label contains, the program aborts.[fn 6]
If this is not desired, the “Extended Pascal” standard allows a special case label called otherwise (note, without a colon).
This case treats all values that have no explicit case label associated with them.
program asciiTest(input, output);
var
c: char;
begin
write('Supply any character: ');
readLn(c);
case c of
// empty statement, so the control characters are not
// considered by the otherwise-branch as non-ASCII characters
#0..#31, #127: ;
#32..#126:
begin
writeLn('You entered an ASCII printable character.');
end;
otherwise
begin
writeLn('You entered a non-ASCII character.');
end;
end;
end.
otherwise may only appear at the end.
There must be at least one case label beforehand, otherwise (no pun intended) the otherwise case is always taken, rendering the entire case‑statement useless.
BP, that is Delphi, re-uses the word else with the same meaning as otherwise.
The FPC and GPC support both, although GPC can be instructed to only accept otherwise.
Tasks
Write a function that returns the successor of a month value, but for December the value January is returned.
A case statement is just perfect:
function successor(start: month): month;
begin
case start of
January..November:
begin
successor := succ(start);
end;
December:
begin
successor := January;
end;
end;
end;
For the purposes of this exercise (demonstrating that relational operators such as < are automatically defined for enumeration data types) the following is acceptable too:
function successor(start: month): month;
begin
if start < December then
begin
successor := succ(start);
end
else
begin
successor := January;
end;
end;
Yet the case‑implementation is, mathematically speaking, more precise.
In the first implementation, if the parameter is out of range, the program aborts.
The if … then … else implementation will be “wrongly” defined for illegal values too.
value of hasRained | value of streetWet | result of hasRained ⇒ streetWet |
---|---|---|
false | false | true |
false | true | true |
true | false | false |
true | true | true |
Write a Boolean expression in Pascal resulting in the same truth values. Say hasRained and streetWet are Boolean variables; how would you link them so the entire Boolean expression is the same as the mathematical expression hasRained ⇒ streetWet?
Boolean is a built-in enumeration data type. This means it is ordered and thus members of this data type can be put in relational ordering. In programming the most frequent translation of hasRained ⇒ streetWet you will encounter is
not hasRained or streetWet
In Pascal, however, you can also write
hasRained <= streetWet
thanks to Boolean being an enumeration data type. Some people, however, who are not programming in Pascal (e. g. writing personal text messages) may use <= as a way of writing ⇐, which is just the opposite of ⇒ (that means in mathematics streetWet ⇐ hasRained, i. e. with swapped operands, is another equally valid way of writing hasRained ⇒ streetWet). If you are one of those people you may find the shorter expression counterintuitive, because in Pascal <= is in fact ≤ (less than or equal to) and not a ⇐.
Notes:
- ↑ Some compilers, such as the FPC, allow “typecasting”, effectively transforming 1 into January. However, typecasting is not a function; in particular, typecasting does not work properly if the values are out of range (no RTE is generated).
- ↑ This is an “Extended Pascal” (ISO 10206) extension. In Standard “unextended” Pascal (ISO 7185) only constants are allowed.
- ↑ The Pascal ISO standards do allow this. It is at the compiler programmer’s discretion to ignore such errors. Nonetheless, accompanying documents (manuals, etc.) are meant to point that out.
- ↑ This is an analogy. case‑statements are usually not translated into a series of if‑branches.
- ↑ This error message is imprecise. The error message of GPC is more correct. The problem is that a certain value, “case-constant”, appears multiple times.
- ↑ Many compilers do not respect this requirement in their default configuration. The GPC needs to be instructed to be “completely” ISO-compliant (‑‑classic‑pascal, ‑‑extended‑pascal, or just ‑‑case‑value‑checking). In BP, Delphi will just continue, leaving a missing case unnoticed. As of version 3.2.0 the FPC does not regard this requirement at all.
Sets
This chapter introduces you to a new custom data type. Sets are one of the basic structured data types. When programming you will frequently find that some logic can be modeled with sets. Learning and mastering usage of sets is a key skill, since you will encounter them a lot in Pascal.
Basics
Notion
Sets are (possibly empty) aggregations of distinguishable objects. Either a set contains an object, or it does not. An object being part of a set is also referred to as an element of that set.
Let's say we know the objects “apple”, “banana” and “pencil”. The set Fruit ≔ {“apple”, “banana”} contains the objects “apple” and “banana”. “Pencil” is not a member of the set Fruit.
Digitization
When a computer is supposed to store and process a set, it actually handles a series of Boolean values.[fn 1]
Every one of those Boolean values tells us whether a certain element is part of a set.
The computer also stores a Boolean value for every object that is not part of a certain set.
Sets in Pascal
The computer needs to know how many Boolean values it needs to set aside.
In order to achieve this, a set in Pascal requires an ordinal type as its base type.
An ordinal type always has a finite range of permissible discrete values, thus the computer knows beforehand how many Boolean values to reserve, that is, how many elements a set can contain at most.
In consequence, a valid set type declaration is:
type
characters = set of char;
A variable of the data type characters can only contain char values.
This set cannot contain, for instance, 42, which is an integer value, nor is this information stored in any way.
A data type declaration for a set of real is illegal, since the set’s base type, the data type real, is not an ordinal data type.
Remember, in order to qualify as an ordinal data type there must be a means to assign every legal value an integer value.[fn 2]
Sets are particularly useful in conjunction with enumeration data types, which you learned about in the previous chapter. Let’s consider an example in Pascal:
program setDemo;
type
skill = (cooking, cleaning, driving, videogames, eating);
skills = set of skill;
var
slob: skills;
begin
slob := [videogames, eating];
end.
Here, we have declared a variable slob, which represents a set of skill enumeration data type values.
In the penultimate line we populate our set slob with two objects, videogames and eating.
The brackets indicate a set literal.
[videogames, eating] is a set expression which we are assigning to the slob variable.
The set variable slob contains no other objects.
However, the computer still stores five Boolean values, one for every potential member of that set.
The number five is the number of elements in skill, the set’s base type.
The information that cooking, cleaning and driving are not part of the set slob is stored explicitly (by the proper Boolean value false).
Inspecting a set
If we want to learn whether a certain object is part of a set, the set membership operator in yields the corresponding Boolean value the computer uses to store that information.
program setInspection(output);
type
skill = (cooking, cleaning, driving, videogames, eating);
skills = set of skill;
var
slob: skills;
begin
slob := [videogames, eating];
if videogames in slob then
begin
writeLn('Is she a level 45 dungeon master?');
end;
end.
The in operator is one of Pascal’s non-commutative operators.
This means you cannot swap the operands.
On the RHS you always need to write an expression evaluating to a set value, whereas the LHS has to be an expression evaluating to this set’s base type.
Even though we, as humans, can say that 42 in slob is wrong, i. e. false, such a comparison is illegal.
By definition, the slob set can only contain skill values.
Operations
So far, sets probably seemed like a really complicated way of using Boolean values.
The true power of sets lies in a number of distinct operations, making sets an easier, and thus better, alternative to handling two or more individual (but related) Boolean values directly.
Combinations
In Pascal, two sets of the same kind, the same data type, can be combined forming a new set of the respective data type. The following operators are available:
name | mathematical symbol | source code symbol |
---|---|---|
union | ∪ | + |
difference | ∖ | - |
intersection | ∩ | * |
symmetric difference | △ | >< † |
† The symmetric difference operator is only defined in EP.
Union
The result of unifying two sets into one is called union. Let’s say, recently our slob has learned how to drive and does that now too. This can be written as:
slob := slob + [driving];
Now, slob contains all objects it previously held, plus all objects from the other set, [driving].
Difference
Of course sets can be deprived of a set of elements by using the difference operator, in source code written as -.
slob := slob - [];
This removes all objects present in the second set from the first set.
Here, the empty set ([]) does not contain any objects, thus the operation has no effect on slob.
Intersection
Furthermore you can intersect sets. The intersection of two sets is defined as the set of elements both operands contain.
program intersectionDemo;
type
skill = (cooking, cleaning, driving, videogames, eating);
skills = set of skill;
var
slob, blueCollar, common: skills;
begin
blueCollar := [cooking, cleaning, driving, eating];
slob := [driving, videogames, eating];
common := blueCollar * slob;
end.
The set common now (only) contains driving and eating, because those are the objects that are members of both operands.
Symmetric difference
The symmetric difference is the counterpart of the intersection: it is the union of the operands without the elements contained in both sets.
unique := blueCollar >< slob;
Now unique is [cooking, cleaning, videogames], because those are the values from either set, but not both.
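If a compiler does not offer a symmetric difference operator, the definition just given can be spelled out with the other three operators. The following is a sketch under that assumption; the program name symmetricDifferenceDemo is made up.

```pascal
program symmetricDifferenceDemo(output);
type
    skill = (cooking, cleaning, driving, videogames, eating);
    skills = set of skill;
var
    slob, blueCollar, unique: skills;
begin
    blueCollar := [cooking, cleaning, driving, eating];
    slob := [driving, videogames, eating];
    { union minus intersection: elements in exactly one operand }
    unique := (blueCollar + slob) - (blueCollar * slob);
    writeLn(cooking in unique);      { true }
    writeLn(videogames in unique);   { true }
    writeLn(driving in unique);      { false, member of both sets }
end.
```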
Comparisons
Two sets of the same kind, the same data type, can be compared by looking at each element in both sets.
name | mathematical symbol | source code symbol |
---|---|---|
equality | = | = |
inequality | ≠ | <> |
inclusion (subset) | ⊆ | <= |
inclusion (superset) | ⊇ | >= |
element | ∈ | in |
All comparison operators, as before, evaluate to a Boolean value.
Inclusion
The inclusion of two sets means that all objects one set contains are present in another set.
If the expression A <= B evaluates to true, all objects present in the set A are also present in B.
In a Venn diagram you will notice that one circle’s area is completely surrounded by another circle, if not identical to the other circle.
Equality and inequality
The equality of two sets is defined as (A <= B) and (B <= A); the parentheses are needed because and binds more tightly than <=.
All objects contained in the left-hand set are present in the right-hand set and vice versa.
In other words, there is not a single object that is present in just one of the sets.
The inequality is just the negation thereof.
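The comparison operators can be tried out with the skill sets used throughout this chapter. This is a sketch only; the program name setComparisonDemo and the chosen set contents are made up for illustration.

```pascal
program setComparisonDemo(output);
type
    skill = (cooking, cleaning, driving, videogames, eating);
    skills = set of skill;
var
    slob, blueCollar: skills;
begin
    slob := [driving, eating];
    blueCollar := [cooking, cleaning, driving, eating];
    writeLn(slob <= blueCollar);   { true: every member of slob is in blueCollar }
    writeLn(slob >= blueCollar);   { false: slob lacks cooking and cleaning }
    writeLn(slob = blueCollar);    { false }
    { equality spelled out as mutual inclusion, note the parentheses }
    writeLn((slob <= blueCollar) and (blueCollar <= slob));   { false }
end.
```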
Element of
The in operator is the only set operator that does not act on two sets but on one potential set member candidate and a set.
It has been introduced above.
With respect to Venn diagrams, though, you can say that the in operator is “like” pointing with your index finger to a point inside a circle, or outside of it.
Pre-defined set routines
Cardinality
(After initialization) a set contains a certain number of elements at any time.
In mathematics the number of objects being part of a set is called cardinality.
The cardinality of a set can be retrieved using the function card, an EP extension.
emptySet := [];
writeLn(card(emptySet));
This will print 0 as there are no elements in an empty set.
Unfortunately, not all compilers implement the card function.
The FPC does not provide one.
The GPC does supply one, though.
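Where card is unavailable, a stand-in can be written by hand for one specific set type. The sketch below uses a counting loop, which is explained later in this chapter. The function name cardinality is hypothetical and not part of any standard or compiler; it only works for the skills type and assumes cooking and eating are the first and last skill values.

```pascal
program cardinalityDemo(output);
type
    skill = (cooking, cleaning, driving, videogames, eating);
    skills = set of skill;

{ hypothetical stand-in for EP's card, specific to the skills type }
function cardinality(members: skills): integer;
var
    candidate: skill;
    count: integer;
begin
    count := 0;
    { test every possible value of the base type for membership }
    for candidate := cooking to eating do
    begin
        if candidate in members then
        begin
            count := count + 1;
        end;
    end;
    cardinality := count;
end;

begin
    writeLn(cardinality([]):1);                     { prints 0 }
    writeLn(cardinality([videogames, eating]):1);   { prints 2 }
end.
```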
Universe
Originally, Wirth proposed a function all:[1]
all(T) is the set of all values of type T
For example:
superwoman := all(skill);
The set superwoman would contain all available skill values: cooking, cleaning, driving, videogames and eating.
Unfortunately, this proposal never made it into the ISO standards, nor do the FPC or GPC support that function or provide an equivalent.
The only alternative is to use an appropriate set constructor (an EP extension):
[cooking..eating] is equivalent to all(skill), provided that cooking is the first skill and eating the last skill value (referring to the order these items were listed in the data type declaration of skill).
Inclusion and exclusion
Not standardized, but convenient, is BP’s definition of the include and exclude procedures.
These are shorthand for very frequent set manipulations.
The procedures allow you to quickly add or remove one object from one set.
include(recognizedLetters, 'Z');
is identical to
recognizedLetters := recognizedLetters + ['Z'];
but you do not need to type out the set name twice, thus reducing the chance of typing mistakes. Likewise,
exclude(primeSieve, 4);
will do the same as
primeSieve := primeSieve - [4];
Both the FPC and GPC support these handy routines, which are in fact implemented as compiler intrinsics, not actual procedures.
Intermediate usage
Set literals
Stating sets effectively is a required skill when handling sets.
It is important to understand that sets merely store the information whether an object is a member of a set or not.
The set ['A', 'A', 'A'] is identical to ['A'].
Specifying 'A' multiple times does not make it “more” part of that set.
Also, it is not necessary to list members in any particular order.
['X', 'Z', 'Y'] is just as acceptable as ['X', 'Y', 'Z'].
Mathematically speaking, sets are not ordered.
Pascal’s requirement that a set’s base data type has to be an ordinal type is purely a technical requirement.
For readability reasons it is usually sensible, though, to list elements in ascending order.
The EP standard gives you a nice short notation for set literals containing a continuous series of values.
Instead of writing [7, 8, 9, 11, 12, 13] you can also write ranges like [7..9, 11..13], evaluating to the very same value.
Of course, all numbers could also be variables, or expressions in general.
Set literals are always a positive statement of which objects are in a set.
If you want a set of integer values between 0 and 10 without 3, 5 and 7, but do not want to write this set out entirely (i. e. as [0, 1, 2, 4, 6, 8, 9, 10]), you can either write [0..2, 4, 6, 8..10] or the expression [0..10] - [3, 5, 7].
With the latter it is probably a little easier to grasp which objects are in the final set and which are not.
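That both notations denote the same set can be verified with a short program. This is a sketch; the program name setLiteralDemo, the type smallNumbers and the variable names are made up, and the compiler is assumed to accept ranges inside set literals.

```pascal
program setLiteralDemo(output);
type
    smallNumbers = set of 0..10;
var
    listed, subtracted: smallNumbers;
begin
    { positive statement: name every member, using ranges where possible }
    listed := [0..2, 4, 6, 8..10];
    { alternative: start with everything, then remove the unwanted values }
    subtracted := [0..10] - [3, 5, 7];
    writeLn(listed = subtracted);   { true }
end.
```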
Memory restrictions
Although a set of integer is legal and complies with all Pascal standards, many compilers do not support such large sets.
By definition, a set of integer can contain (at most) all values in the range -maxInt..maxInt.
That is a lot (try writeLn(maxInt) or read your compiler’s documentation to find out this value).
On a 64‑bit platform this value (usually) is 2⁶³ − 1, i. e. 9,223,372,036,854,775,807.
As of the year 2020 many computers would quickly run out of main memory if they attempted to hold that many Boolean values.
As a consequence, BP restricts the permissible base types of sets.
In BP the ordinal values of the base type’s largest and smallest values have to be in the range 0..255.
The value 255 is 2⁸ − 1.
As of version 3.2.0, the FPC sets the same limitations.
The GPC allows set definitions beyond 2⁸ elements, although some configuration is required:
You need to specify the ‑‑setlimit command-line parameter or a specially crafted comment in your source code:
program largeSetDemo;
{$setLimit=2147483647} // this is 2³¹ − 1
type
// 1073741823 is 2³⁰ − 1
characteristicInteger = -1073741823..1073741823;
integers = set of characteristicInteger;
var
manyManyIntegers: integers;
begin
manyManyIntegers := [];
include(manyManyIntegers, 1073741823);
end.
This will instruct the GPC that a set of characteristicInteger can only store up to this many characteristicInteger values.
Loops
Now that you have made the acquaintance of enumeration data types and sets, you find yourself dealing with a growing amount of data. Pascal, like many other programming languages, supports a language construct called loops.
Characteristics
Loops are (possibly empty) sequences of statements that are repeated over and over again, or even never, based on a Boolean value.
The sequence of statements is termed loop body.
The loop head contains (possibly implicitly) a Boolean expression determining whether the loop body is executed.
Every time the loop body is run, an iteration is in progress.
The term loop originates from the circumstance that some early models of computers required programs to be fed (“loaded”) via punched paper tape. If a portion of that paper tape was meant to be processed multiple times, that piece of paper tape was cut, bent and temporarily fixated so it formed a physical loop. Thankfully, advancements in computer technology have made it far more convenient to handle repeating code.
Pascal (and many other programming languages) differentiates between two groups of loops:
- counting loops, presented here, and
- conditional loops, presented in a chapter to come.
Counting loops have in common that, before running the first iteration, it can already be determined how many times the loop body will be executed just by evaluating the loop head.[fn 3]
Conditional loops on the other hand are based on an abort condition, i. e. a Boolean expression.
Except for infinite loops, there is no way to tell in advance how many iterations a conditional loop will have without thoroughly (mathematically) analyzing the loop body and loop head, and possibly even considering circumstances beyond the loop.
Counting loops
Counting loops do not necessarily count a quantity. They are named after the fact that they employ a variable, a counting variable. This variable of any ordinal data type (de facto) assigns every iteration a number.
A counting loop is introduced by the reserved word for:
program forLoopDemo(output);
var
i: integer;
begin
for i := 1 to 10 do // loop head
begin // ┐
writeLn(i:3); // ┝ loop body
end; // ┘
end.
After for follows a specially crafted assignment to the counting variable.
Range of counting variable
1 to 10 (with the auxiliary reserved word to) denotes a range of values the counting variable i will assume while executing the loop body.
1 and 10 are both expressions possessing the counting variable’s data type, which means variables or more complex expressions could appear there too, not just constant literals as shown.
This range is like a set.
It may possibly be empty:
The range 5 to 4 is an empty range, since there are no values from 5 up to and including 4.
In consequence, the counting variable will not be assigned any value out of this empty range, as there simply are none available, and the loop body is never executed.
Nevertheless, the range 8 to 8 contains exactly one value, i. e. 8.
During the first iteration the corresponding counting variable, here i, will have the first value out of the given range, the start value; in the example above this is the value 1.
In the successive iteration the variable i has the value 2, and so forth up to and including the final value of the given range, here 10.
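The following sketch shows an empty and a non-empty range whose limits are given by variables instead of literals. The program name rangeDemo and the variable names first and last are made up for illustration.

```pascal
program rangeDemo(output);
var
    i, first, last: integer;
begin
    first := 5;
    last := 4;
    { empty range 5 to 4: the loop body is never executed }
    for i := first to last do
    begin
        writeLn('never printed');
    end;
    last := first + 3;
    { range 5 to 8: prints 5, 6, 7 and 8 on separate lines }
    for i := first to last do
    begin
        writeLn(i:1);
    end;
end.
```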
Immutability of counting variable
It is not necessary to actually utilize the counting variable inside the loop body, but you can use it as long as you are just obtaining its current value.
Inside the loop body of for‑loops it is forbidden to assign any values to the counting variable.
Forbidden assignments include, but are not limited to, putting the counting variable on the LHS of :=; read/readLn may not use the counting variable either.
Tampering with the counting variable is forbidden, because the loop head will effectively employ succ(i) to obtain the next iteration’s value.
The loop head implicitly contains the Boolean expression counting variable ≠ final value.
If the counting variable were manipulated, this condition might never be met, thus destroying the characteristics of a for‑loop.
Preventing the programmer from doing any assignments ensures such an infinite loop is not created, neither accidentally nor deliberately.
Reverse direction
Pascal also allows for‑loops in a reversed direction using the reserved word downto instead of to:
program downtoDemo(output);
var
c: char;
begin
for c := 'Z' downto 'B' do
begin
write(c, ' ');
end;
writeLn('A');
end.
Here, the range is 'Z' down to and including 'B'.
The loop’s terminating condition is still counting variable ≠ final value, but in this case the counting variable c becomes pred(c) (not succ) at the end of each iteration, after the loop body has been executed.
Loops on collections
EP allows you to iterate over discrete aggregations, such as sets. This is particularly useful if you have a routine that needs to be applied to every value of an aggregation. Here is an example to demonstrate the principle:
program forInDemo(output);
var
c: char;
vowels: set of char;
begin
vowels := ['A', 'E', 'I', 'O', 'U'];
for c in vowels do
begin
writeLn(c);
end;
end.
Now you see the word in
again, but in this context c in vowels
is not an expression.
The data type restrictions for in
are still in effect:
On the RHS an aggregation expression is given, whereas the LHS is in this case a variable that has the aggregation’s base data type.
This variable will be assigned every value out of the given aggregation every time an iteration is processed.
Since the RHS just needs an expression, not necessarily a variable, you can shorten the example even further to:
for c in ['A', 'E', 'I', 'O', 'U'] do
Note that, unlike the counting loops above, you are not supposed to make any assumptions about the order in which values are assigned to the loop variable.
It may be ascending, descending, or completely mixed up; the specific order is “implementation defined”, i. e. it depends on the compiler used.
Accompanying documents of the compiler explain in which order the for … in
loop is processed.
Tasks
fruit - fruit
evaluate to?
Sources:
- ↑ Wirth, Niklaus. The Programming Language Pascal (Revised Report ed.). p. 38.
Notes:
- ↑ This is an analogy for explanation. The ISO standards do not set this requirement. Compiler developers are free to choose whatever implementation of the set data type they think is suitable. It would be perfectly OK to just store a list of elements that are present in a set and nothing else. Nevertheless, Delphi, the FPC, and the GPC all store sets as a series of bits (“Boolean values”) as explained here.
- ↑ The data type real does not qualify as an ordinal data type. Although it stores a finite subset of ℚ, the set of rational numbers, so there is a map ℚ_T ↦ ℕ, this depends on the real data type’s precision (T), thus there is no one standard way of defining ord for real values, but many.
- ↑ This statement ignores the goto statement as well as many dialects’ extensions break/leave/continue/cycle, which neither ISO standard 7185 (Standard Pascal) nor 10206 (Extended Pascal) defines.
Arrays
An array is a structure concept for custom data types. It groups elements of the same data type. You will use arrays a lot if you are dealing with lots of data of the same data type.
Lists
Notion
In general, an array is a limited and arranged aggregation of objects, all of which have the same data type, called base type or component type.[1] An array has at least one discrete, bounded dimension, continuously enumerating all its objects. Each object can be uniquely identified by one or more scalar values, called indices, along those dimensions.
Declaration
In Pascal an array data type is declared using the reserved word array
in combination with the auxiliary reserved word of
, followed by the array’s base type:
program arrayDemo(output);
type
dimension = 1..5;
integerList = array[dimension] of integer;
Behind the word array
follows a non-empty comma-separated list of dimensions surrounded by brackets.[fn 1]
All of an array’s dimensions have to be ordinal data types, yet the base type can be of any kind.
If an array has just one dimension, like the one above, we may also call it a list.
Access
A variable of the data type integerList
as declared above, holds five independent integer
values.
Accessing them follows a specific scheme:
var
powerN: integerList;
begin
powerN[1] := 5;
powerN[2] := 25;
powerN[3] := 125;
powerN[4] := 625;
powerN[5] := 3125;
writeLn(powerN[3]);
end.
This program
will print 125
, since it is the value of powerN
that has the index value 3
.
Arrays are like a series of “buckets” each holding one of the base data type’s values.
Every bucket can be identified by a value according to the dimension specifications.
When referring to one of the buckets, we have to name the group, that is the variable name (in this case powerN
), and a proper index surrounded by brackets.
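Because the index may be any expression of the dimension’s data type, arrays combine naturally with for‑loops. The following sketch fills and prints the same five powers of 5 using loops. The program name and the variable i are made up for this illustration, and the exact spacing of the printed numbers depends on the compiler.

```pascal
program arrayLoopDemo(output);
type
    dimension = 1..5;
    integerList = array[dimension] of integer;
var
    powerN: integerList;
    i: dimension;
begin
    { first bucket: 5 to the power of 1 }
    powerN[1] := 5;
    { every further bucket is five times its predecessor }
    for i := 2 to 5 do
    begin
        powerN[i] := powerN[i - 1] * 5;
    end;
    { print index and value of every bucket }
    for i := 1 to 5 do
    begin
        writeLn(i, ': ', powerN[i]);
    end;
end.
```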
Character array
Lists of characters have long had, and frequently still have, special support with respect to I/O and manipulation.
This section is primarily about understanding, as in the next chapter we will get to know a more sophisticated data type called string
.
Direct assignment
String literals can be assigned to array[…] of char
variables using an assignment statement, thus instead of writing:
program stringAssignmentDemo;
type
fileReferenceDimension = 1..4;
fileReference = array[fileReferenceDimension] of char;
var
currentFileReference: fileReference;
begin
currentFileReference[1] := 'A';
currentFileReference[2] := 'Z';
currentFileReference[3] := '0';
currentFileReference[4] := '9';
You can simply write:
currentFileReference := 'AZ09';
end.
Note that you no longer need to specify any index.
You are referring to the entire array variable on the LHS of the assignment symbol (:=
).
This only works for overwriting the values of the whole array.
There are extensions allowing you to overwrite merely parts of a char
array, but more on that in the next chapter.
Most implementations of string literal to char
array assignments will pad the given string literal with insignificant char
values if it is shorter than the variable’s capacity.
Padding a string means filling the remaining positions with other characters in order to meet a certain size.
The GPC uses space characters (' '
), whereas the FPC uses char
values whose ord
inal value is zero (#0
).
Reading and printing
Although not standardized,[fn 2] read
/readLn
and write
/writeLn
usually support writing to and reading from array[…] of char
variables.
program interactiveGreeting(input, output);
type
nameCharacterIndex = 1..20;
name = array[nameCharacterIndex] of char;
var
host, user: name;
begin
host := 'linuxnotebook'; { in lieu of getHostName }
writeLn('Hello! What is your name?');
readLn(user);
write('Hello, ', user, '. ');
writeLn('My name is ', host, '.');
end.
Hello! What is your name?
Aïssata
Hello, Aïssata. My name is linuxnotebook.
In this sample run the line Aïssata is the user’s input; the program was compiled with the FPC.
This works because text files, like the standard files input and output, are understood to be infinite sequences of char values.[fn 3]
Since our array[…] of char is also a sequence of char values, although a finite one, individual values can be copied pretty much directly to and from text files, not requiring any kind of conversion.
Primitive comparison
Unlike other arrays, array[…] of char
variables can be compared using =
and <>
.
if user <> host then
begin
write('Hello, ', user, '. ');
writeLn('My name is ', host, '.');
end
else
begin
writeLn('No. That is my name.');
end;
This kind of comparison only works as expected for identical data types.
It is a primitive character-by-character comparison.
If either array is longer or shorter, an =
comparison will always fail, because not all characters can be compared to each other.
Correspondingly, an <>
comparison will always succeed for array[…] of char
values that differ in length.
Note, the EP standard also defines the EQ
and NE
functions, among many others.
The difference from =
and <>
is that blank padding (i. e. #0
in FPC or ' '
in GPC) has no significance in EQ
and NE
.
In consequence, that means using these functions you can compare strings and char
arrays regardless of their respective maximum capacity and still get the naturally expected result.
The =
and <>
comparisons on the other hand look at the memory’s internal representation.
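As a minimal sketch of the primitive comparison, consider two variables of the identical data type. The program and variable names are made up for this illustration, and the sketch assumes a compiler that, like the ones discussed in this chapter, accepts string literal assignments and the = and <> operators for array[…] of char.

```pascal
program charArrayComparisonDemo(output);
type
    fileReference = array[1..4] of char;
var
    a, b: fileReference;
begin
    a := 'AZ09';
    b := 'AZ09';
    { all four characters match }
    if a = b then
    begin
        writeLn('identical');
    end;
    { change one character: the arrays now differ }
    b[4] := '8';
    if a <> b then
    begin
        writeLn('different');
    end;
end.
```

Both messages are printed, because every character position is compared one by one.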
Matrices
An array’s base type can be any data type,[fn 4] even another array. If we want to declare an “array of arrays” there is a short syntax for that:
program matrixDemo(output);
const
columnMinimum = -30;
columnMaximum = 30;
rowMaximum = 10;
rowMinimum = -10;
type
columnIndex = columnMinimum..columnMaximum;
rowIndex = rowMinimum..rowMaximum;
plot = array[rowIndex, columnIndex] of char;
This has already been described above. The last line is identical to:
plot = array[rowIndex] of array[columnIndex] of char;
It can be expanded to two separate declarations allowing us to “re-use” the “inner” array data type:
row = array[columnIndex] of char;
plot = array[rowIndex] of row;
Note that in the latter case plot
uses row
as the base type which is an array by itself.
Yet in the short notation we specify char
as the base type, not a row
but its base type.
When an array was declared to contain another array, there is a short notation for accessing individual array items, too:
var
curve: plot;
x: columnIndex;
y: rowIndex;
v: integer;
begin
// initialize
for y := rowMinimum to rowMaximum do
begin
for x := columnMinimum to columnMaximum do
begin
curve[y, x] := ' ';
end;
end;
// graph something
for x := columnMinimum to columnMaximum do
begin
v := abs(x) - rowMaximum;
if (v >= rowMinimum) and (v <= rowMaximum) then
begin
curve[v, x] := '*';
end;
end;
This corresponds to the array’s declaration.
It is vital to ensure the indices you are specifying are indeed valid.
In the latter loop the if
branch checks for that.
Attempting to access non-existent array values, i. e. by supplying illegal indices, may crash the program or, worse, remain undetected, thus causing mistakes that are difficult to find.
A program compiled with the GPC will terminate with
./a.out: value out of range (error #300 at 402a76)
With the FPC, range checks of this kind are enabled by the ‑Cr command-line switch or by placing the compiler directive comment {$rangeChecks on} in your source code. Confer also chapter “Enumerations”, subsection “Restriction”.
Note, the “unusual” order of x
and y
has been chosen to facilitate drawing an upright graph:
// print graph, note reverse `downto` direction
for y := rowMaximum downto rowMinimum do
begin
writeLn(curve[y]);
end;
end.
That means it is still possible to refer to entire “sub”-arrays as a whole. You are not forced to specify every dimension an array has, provided the shorter reference makes sense.
Array data types that have exactly two dimensions are also called matrices, singular matrix.
In mathematics a matrix does not necessarily have to be homogeneous, but could contain different “data types”.
Real values
As introduced in one of the first chapters the data type real
is part of the Pascal programming language.
It is used to store integer values in combination with an optional fractional part.
Real literals
In order to distinguish integer
literals from real
literals, specifying real
values in your source code (and also for read
/readLn
) differs slightly.
The following source code excerpt shows some real
literals:
program realDemo;
var
x: real;
begin
x := 1.61803398;
x := 1E9;
x := 500e-12;
x := -1.7320508;
x := +0.0;
end.
To summarize the rules:
- A real value literal always contains either a . as a radix mark, or an E/e to separate an exponent (the $x$ in $\times 10^{x}$), or both.
- There is at least one Western-Arabic decimal digit before and one after the . (if there is any).
- The entire number and the exponent may each be preceded by a sign; a positive sign is optional.
0.0 accepts a sign, but is in fact (in mathematics) a sign-less number. -0.0 and +0.0 denote the same value.
As always, number literals cannot contain spaces.
Limitations
The real
data type has many limitations you have to be aware of in order to effectively use it.
First of all, we want to re-emphasize an issue that was mentioned when data types were introduced:
In computing real
variables can only store a subset of rational numbers (ℚ).[2]
That means, for example, you cannot store the (mathematically speaking) real number (ℝ) .
This number is an irrational number (i. e. not a rational number).
If you cannot write a number as a finite real
literal, it is impossible to store it in a system using a finite amount of memory, such as computer systems do.
Fortunately, in EP three constants aid your usage of real
values.
minReal
is the smallest positive real
value.
In conjunction with the constant maxReal
, it is guaranteed that all arithmetic operations within the range these constants delimit produce, quote, “reasonable approximations”.[3]
It is not specified what exactly constitutes a “reasonable” approximation, thus it can, for example, mean that maxReal - 1
yields as “an approximation” maxReal
.[fn 5]
Also, it is quite possible that real
variables can store larger values than maxReal
.
epsReal
is short for “epsilon real
”.
In mathematics the lowercase Greek letter ε (epsilon) frequently denotes an arbitrarily small positive value, yet not zero.
According to the ISO standard 10206 (“Extended Pascal”), epsReal
is the result of subtracting 1.0
from the smallest value that is greater than 1.0
.[3]
No other value can be represented between this value and 1.0
, thus epsReal
represents the highest precision available, but just at that point.[fn 6]
Most implementations of the real
data type will show a significantly varying degree of precision.
Most notably, the precision of real
data type implementations complying with the IEEE standard 754 format, decays toward the extremes, when approaching (and going beyond) -maxReal
and maxReal
.
Therefore you usually use, if at all, a reasonable multiple of epsReal
that fits the given situation.
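If your compiler does not provide epsReal, the gap just above 1.0 can be estimated by repeated halving. The following is only a sketch. The program name and the variables epsilon and probe are made up for this illustration, and the printed value may differ from a compiler’s epsReal, for instance if intermediate results are kept in a higher precision than real.

```pascal
program epsilonEstimateDemo(output);
var
    epsilon, probe: real;
begin
    epsilon := 1.0;
    probe := 1.0 + epsilon / 2.0;
    { halve `epsilon` as long as adding half of it to 1.0 }
    { still yields a stored value greater than 1.0 }
    while probe > 1.0 do
    begin
        epsilon := epsilon / 2.0;
        probe := 1.0 + epsilon / 2.0;
    end;
    writeLn(epsilon);
end.
```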
Transfer functions
Pascal’s strong typing system prevents you from assigning real
values to integer
variables.
The real
value may contain a fractional part that an integer
variable cannot store.
Pascal defines, as part of the language, two standard functions addressing this issue in a well-defined manner.
- The function trunc, short for “truncation”, simply discards any fractional part and returns, as an integer, the integer part. As a consequence this is effectively rounding a number toward zero. trunc(-1.999) will return the value -1.
- If this feels “unnatural”, the round function rounds a real number to its closest integer value. round(x) is (regarding the resulting value) equivalent to trunc(x + 0.5) for non-negative values, and equivalent to trunc(x - 0.5) for negative x values.[fn 7] In other words, this is what you were probably taught in grade school, or the first rounding method you learned in homeschooling. It is commonly referred to as “commercial rounding”.
Both functions will fail if there is no such integer
value fulfilling the function’s respective definition.
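A short sketch contrasting both transfer functions follows. The program name is made up for this illustration. Values lying exactly halfway between two integers, such as 2.5, are deliberately left out here, because some compilers treat them differently from the definition given above.

```pascal
program transferFunctionDemo(output);
begin
    { truncation discards the fractional part, i. e. rounds toward zero }
    writeLn(trunc(1.999));  { 1 }
    writeLn(trunc(-1.999)); { -1 }
    { rounding picks the closest integer value }
    writeLn(round(2.4));    { 2 }
    writeLn(round(2.6));    { 3 }
    writeLn(round(-2.6));   { -3 }
end.
```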
There is no function if you want to (explicitly) “transfer” an integer
value to a real
value.
Instead, one uses arithmetically neutral operations:
- integerValue * 1.0 (using the multiplicative neutral element), or
- integerValue + 0.0 (using the additive neutral element)
These expressions make use of the fact, as it was mentioned earlier as a passing remark in the chapter on expressions, that the expression’s overall data type will become real
as soon as one real
value is involved.[4]
It is not guaranteed that all possible integer values can be stored as real variables.[fn 8] This primarily concerns large values, but it is important to understand that the data type integer is the best choice to accurately store integral, whole-numbered values nonetheless.
Printing real
values
By default write
/writeLn
print real
values using “scientific notation”.
The following program prints the real value constant pi, an extension that is not part of Standard Pascal but was adopted by many compilers.
program printPi(output);
begin
writeLn(pi);
end.
3.141592653589793e+00
The printed representation consists of, in this order:
- a sign, where a positive sign is replaced with a blank
- one digit
- a dot
- a positive number of post-decimal digits
- an E (uppercase or lowercase)
- the sign of the exponent, but this time a positive sign is always written and a zero exponent is preceded by a plus sign, too
- the exponent value, with a fixed minimum width (here 2) and leading zeros
While this style is very universal, it may also be unusual for some readers.
The E notation in particular is rather archaic nowadays, usually only seen on pocket calculators, i. e. devices lacking sufficient display space.
Luckily, write
/writeLn
allow us to customize the displayed style.
A minimum total width can be specified for real parameters, too, but this also shows more digits:
program printPiDigits(output);
begin
writeLn(pi:40);
end.
3.141592653589793238512808959406186e+00
Here, 40 refers to the entire width including a sign, the radix mark, the e and the exponent representation.
The procedures write
/writeLn
accept for real
type values (and only for real
values) another colon-separated format specifier.
This second number determines the (exact) number of post-decimal digits, the “fraction part”.
Supplying two format specifiers also disables scientific notation.
All real
values are printed using regular positional notation.
That may mean for “large” numbers such as 1e100
printing a one followed by a hundred zeros (just for the integer part).
program realFormat(output);
var
x: real;
begin
x := 248e9 + 500e-6;
writeLn(x:32:8);
end.
248000000000.00048828
The entire width, including the . and in the case of negative numbers the -, is 32 characters. After the . follow 8 digits. Bear in mind the precise number, especially the fractional part, may vary.
In some regions and languages it is customary to use a ,
(comma) or other character instead of a dot as a radix mark.
Pascal’s on-board write
/writeLn
procedures will, on the other hand, always print a dot, and read/readLn will likewise accept only dots as radix marks.
Nevertheless, all current Pascal programming suites, Delphi, FPC and GPC provide appropriate utilities to overcome this restriction.
For further details we refer to their manuals.
This issue should not keep us from continuing learning Pascal.
The representations that write/writeLn (and in EP also writeStr) generate are rounded with respect to the last printed digit.
program roundedWriteDemo(output);
begin
writeLn(3.75:4:1);
end.
3.8
This is not an effect of the data type real’s limitations. It was verified that the computer used for this demonstration could indeed store precisely the specified value. The rounding you see in this particular case is not due to any technical circumstances.
Comparisons
First of all, all (arithmetic) comparison operators do work with real
values.
The operators =
and <>
, though, are particularly tricky to handle.
In most applications you do not compare real
values to each other when checking for equality or inequality.
The problem is that numbers such as ⅓ cannot be stored exactly with a finite series of binary digits, only approximated, and there is not just one valid approximation for ⅓ but many legitimate ones.
The =
and <>
operators, however, compare, so to speak, specific bit patterns.
This is usually not desired (for values that cannot be represented exactly).
Instead, you want to ensure the values you are comparing are within a certain range, like:
function equal(x, y, delta: real): Boolean;
begin
equal := abs(x - y) <= delta;
end;
Delphi and the FPC’s standard RTS provide the function sameValue
as part of the math
unit.
There is no need to program something other people have already programmed for you; use the available resources.
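To see the tolerance-based comparison in use, here is a small sketch built around the equal function from above. The program name, the variable third and the tolerance 1e-9 are arbitrary choices for this illustration. A suitable tolerance always depends on the magnitude of the values you are comparing.

```pascal
program realComparisonDemo(output);
var
    third: real;

function equal(x, y, delta: real): Boolean;
begin
    equal := abs(x - y) <= delta;
end;

begin
    { one third can only be stored as an approximation }
    third := 1.0 / 3.0;
    { three times that approximation lies within 1e-9 of 1.0 }
    if equal(3.0 * third, 1.0, 1e-9) then
    begin
        writeLn('close enough');
    end;
end.
```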
Division
Now that we know the data type used for storing (a subset of) rational numbers, in Pascal known as real
, we can perform and use the result of another arithmetic operation:
The division.
Flavors
Pascal defines two different division operators:
- The div operator performs an integer division and discards, if applicable, any remainder. The expression’s resulting data type is always integer. The div operator only works if both operands are integer expressions.
- The, probably more familiar, operator / (a forward slash) divides the LHS number (the dividend) by the RHS number (the divisor), too, but a /-division always yields a real type value.[4] This is also the case if the fractional part is zero. A “remainder” does not exist for the / operation.
The div
operation can be put in terms of /
:
dividend div divisor = trunc(dividend / divisor)
However, this is only a semantic equivalence;[fn 9] it is not how the result is actually calculated.
The reason is, the /
operator would first convert both operands to real
values and since, as explained above, not all integer
values can necessarily be represented exactly as real
values, this would produce results potentially suffering from rounding imprecisions.
The div
operator on the other hand is a pure integer
operator and works with “integer precision”, that means no rounding is involved in actually calculating the div
result.
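The difference between both operators can be sketched as follows. The program name is made up for this illustration, and the format specifiers :4:1 merely keep the real results readable.

```pascal
program divisionDemo(output);
begin
    { integer division discards the remainder }
    writeLn(7 div 2);     { 3 }
    { the quotient is truncated toward zero }
    writeLn((-7) div 2);  { -3 }
    { `/` yields a real value, even if the fractional part is zero }
    writeLn(6 / 2:4:1);   { 3.0 }
    writeLn(7 / 2:4:1);   { 3.5 }
end.
```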
Off limits divisor
Note, since there is no generally accepted definition for division by zero, a zero divisor is illegal and will cause your program to abort. If your divisor is not a constant and depends on run-time data (such as a variable read from user input), you should check that it is non-zero before doing a division:
if divisor <> 0 then
begin
x := dividend / divisor;
// ...
end
else
begin
writeLn('Error: zero divisor encountered when calculating rainbow');
end;
Alternatively, you can declare data types excluding zero, so any assignment of a zero value will be detected:
type
natural = 1..maxInt;
var
divisor: natural;
begin
write('Enter a positive integer: ');
readLn(divisor); // entering a non-positive value will fail
Some Pascal dialects introduce the concept of “exceptions” that can be used to identify such problems. Exceptions may be mentioned again in the “Extensions” part of this Wikibook.
Arrays as parameters
Arrays can be copied with one simple assignment dataOut := dataIn;
.
This requires, however, as it is customary with Pascal’s strong type safety, that both arrays are assignment-compatible:
That means their base type and dimension specifications are the same.[fn 10]
Because calling a routine involves invisible assignments, writing general-purpose code dealing with lots of different situations would be virtually impossible if the entire program had to use one array type only. In order to mitigate this situation, conformant array type parameters allow writing routines accepting differing array dimension lengths. Array dimension data types still have to match.
Let’s look at an example program using this feature:
program tableDemo(output);
procedure printTableRows(data: array[minimum..maximum: integer] of integer);
var
i: integer; // or in EP `i: type of minimum;` [preferred alternative]
begin
for i := minimum to maximum do
begin
writeLn(i:20, ' | ', data[i]:20);
end;
end;
A conformant-array parameter looks pretty similar to a regular array variable declaration, but the dimensions are specified differently.
Usually, when declaring new arrays you provide constant values as dimension limits.
Since we do not want constants, though, we name placeholder identifiers for the actual dimension limits of any array printTableRows
will receive.
In this case they are named minimum
and maximum
, joined by ..
in between, indicating a range.
minimum
and maximum
become variables inside the definition of printTableRows
, but you are not allowed to assign any values to them.[fn 11]
You are not allowed to declare new identifiers bearing the same name as the array boundary variables.
In Pascal all constants implicitly have an unambiguously determinable data type.
Since our array limit identifiers are in fact variables they require a data type.
The : integer
indicates both minimum
and maximum
have the data type integer
.
In a conformant-array parameter, the short notation for nested arrays uses a ; to separate multiple dimensions, thus resembling a regular parameter list.
Once we have declared and defined printTableRows
we can use it with lots of differently sized arrays:
var
table: array[0..3] of integer;
nein: array[9..9] of integer;
begin
table[0] := 1;
table[1] := 2;
table[2] := 4;
table[3] := 8;
printTableRows(table);
nein[9] := 9;
printTableRows(nein);
end.
Delphi and the FPC (as of version 3.2.0 released in 2020) do not support conformant-array parameters, but the GPC does.
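The short notation with ; for a two-dimensional conformant-array parameter can be sketched as follows. All identifiers are made up for this illustration, and the program only compiles with a compiler that supports conformant-array parameters, such as the GPC.

```pascal
program matrixParameterDemo(output);
var
    grid: array[1..2, 1..3] of integer;
    r, c: integer;

{ two dimensions, separated by `;` like entries of a parameter list }
procedure printMatrix(data: array[
        rowMinimum..rowMaximum: integer;
        columnMinimum..columnMaximum: integer] of integer);
var
    row, column: integer;
begin
    for row := rowMinimum to rowMaximum do
    begin
        for column := columnMinimum to columnMaximum do
        begin
            write(data[row, column]:5);
        end;
        writeLn;
    end;
end;

begin
    { fill the matrix with a small multiplication table }
    for r := 1 to 2 do
    begin
        for c := 1 to 3 do
        begin
            grid[r, c] := r * c;
        end;
    end;
    printMatrix(grid);
end.
```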
Logarithms
Special support
Prior to the 21st century, logarithm tables and slide rules were heavily utilized tools in manual calculations, so much so that it led to the inclusion of two basic functions in Pascal.
- The function exp exponentiates a number to the base $e$, Euler’s number. The value of exp(x) is equivalent to the mathematical term $e^{x}$.
- The function ln, short for Latin “logarithmus naturalis”, takes the natural logarithm of a positive number. “Natural” refers to $e$ again.
Both functions always return real
values.
Introduction
Since the use of logarithms is not necessarily taught in all curricula, or you might just need a refresher, here is a short primer: The basic idea of logarithms is that all operations are lifted one level.
logarithm level | $\ln x + \ln y$ | $\ln x - \ln y$ | $y \cdot \ln x$
---|---|---|---
real level | $x \cdot y$ | $x / y$ | $x^{y}$
On the lifted level many operations become simpler, especially if the numbers are large. For instance, you can perform a rather easy addition if you actually mean to take the product of two numbers. For this you have to “lift” all operands one level up: This is done by taking the logarithm. Pay particular attention to $x^{y}$: On the logarithm level $y$ is a non-“logarithmized” factor, you only take the logarithm of $x$.
Once you are done, you have to descend one level again to get the actual “real” result of the intended operation (on the underlying level).
The reverse operation of ln
is exp
.[fn 12]
To put this principle in Pascal terms:
x * y = exp(ln(x) + ln(y))
Remember, x
and y
have to be positive in order to be valid parameters to ln
.
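As a quick check of this identity, the following sketch computes a product both directly and via the logarithm level. The program name and the values 6.0 and 7.0 are arbitrary choices for this illustration.

```pascal
program logarithmProductDemo(output);
var
    x, y: real;
begin
    x := 6.0;
    y := 7.0;
    { direct multiplication on the “real level” }
    writeLn(x * y:10:4);
    { addition on the “logarithm level”, then descending again }
    writeLn(exp(ln(x) + ln(y)):10:4);
end.
```

Both lines should show 42.0000, although the second value is only an approximation that happens to round to the same four post-decimal digits.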
Application
Taking the logarithm and then exponentiating values again are considered “slow” operations and introduce a certain overhead. In programming, overhead means taking steps that are not directly related to the actual underlying problem, but only facilitate solving it. In general, overhead is avoided, since (at first) it takes us even farther away from the solution.
Nevertheless, detouring via the logarithm level can overcome range limitations of the real data type if intermediate results are out of its range, but it is known that the final result will definitely be within the range of real again.
program logarithmApplicationDemo(output);
const
operand = maxReal;
var
result: real;
begin
// for comparison
writeLn(maxReal:40);
result := sqr(operand);
result := sqrt(result);
writeLn(result:40);
// “lift” one level
result := ln(operand);
result := 2.0 * result; // corresponds to sqr(…)
result := 0.5 * result; // corresponds to sqrt(…)
// reverse `ln`: bring `result` “back down”
result := exp(result);
writeLn(result:40);
end.
1.7976930000000000495189307440746532950903200318892038e+308
Inf
1.7976929999999315921963138504476453672053533033977331e+308
The intermediate result of sqr(operand) exceeds the range of real, rendering any subsequent results invalid. In this particular implementation this situation is displayed as Inf (infinity). Since we know that reversing this operation by taking the principal square root results in a storable result again, we can perform the same operation with logarithms instead.
The shown output was generated by the program being compiled with the GPC. The program was executed on a 64-bit platform with an FPU using 80-bit numbers.
As you can see, this goes to the detriment of precision. It is a compromise between fast operations and “accurate enough” results.
The best solution is, of course, finding a better algorithm.
The above demonstration is effectively $\sqrt{x^{2}}$, that is abs(x)
(remember, squaring a number always yields a non-negative number).
This operation’s result will stay in the range of real
.
Tasks
All tasks, including those in the following chapters, can be solved without conformant-array parameters. This takes account of the fact that not all major compilers support them.[fn 13] Nonetheless, using routines with conformant-array parameters is often the most elegant solution.
Write a program that prints the following multiplication table:
1 2 3 4 5 6 7 8 9 10
2 4 6 8 10 12 14 16 18 20
3 6 9 12 15 18 21 24 27 30
4 8 12 16 20 24 28 32 36 40
5 10 15 20 25 30 35 40 45 50
6 12 18 24 30 36 42 48 54 60
7 14 21 28 35 42 49 56 63 70
8 16 24 32 40 48 56 64 72 80
9 18 27 36 45 54 63 72 81 90
10 20 30 40 50 60 70 80 90 100
writeLn
). The generation of data and printing data shall be implemented by separate routines (these routines may not call each other).
program multiplicationTable(output);
const
xMinimum = abs(1);
xMaximum = abs(10);
yMinimum = abs(1);
yMaximum = abs(10);
// NB: Only Extended Pascal allows constant definitions
// based on expressions. The following two definitions
// are illegal in Standard Pascal (ISO 7185).
zMinimum = xMinimum * yMinimum;
zMaximum = xMaximum * yMaximum;
Calculating the maximum and minimum expected value now (as constants) has the advantage that the compiler will emit a warning during compilation if any value exceeds maxInt
.
The abs
were inserted as a means of documentation:
The program only works properly for non-negative values.
type
x = xMinimum..xMaximum;
y = yMinimum..yMaximum;
z = zMinimum..zMaximum;
table = array[x, y] of z;
Using z
as the table
array’s base type (and not just integer
) has the advantage that if we accidentally implement the multiplication incorrectly, assigning out-of-range values will fail.
For such a trivial task like this one it is of course irrelevant, but for more difficult situations deliberately constricting the allowed range can thwart programming mistakes.
Do not worry if you just used a plain integer
.
var
product: table;
Note, the product
variable has to be declared outside and before populateTable
and printTable
are defined.
This way both routines refer to the same product
variable.[fn 14]
procedure populateTable;
var
factorX: x;
factorY: y;
begin
for factorX := xMinimum to xMaximum do
begin
for factorY := yMinimum to yMaximum do
begin
product[factorX, factorY] := factorX * factorY;
end;
end;
end;
It is also possible to reuse previously calculated values, make use of the fact that the table can be mirrored along the diagonal axis, or do other “optimization stunts”.
The important thing for this task, though, is to correctly nest the for
loops.
procedure printTable;
var
factorX: x;
factorY: y;
begin
for factorY := yMinimum to yMaximum do
begin
for factorX := xMinimum to xMaximum do
begin
write(product[factorX, factorY]:5);
end;
writeLn;
end;
end;
An advanced implementation would, of course, first determine the maximum expected length and store it as a variable, instead of using the hardcoded format specifier :5
.
This, though, is out of this task’s scope.[fn 15]
It should just be mentioned that hardcoded values like this one are considered bad style.
begin
populateTable;
printTable;
end.
real
value literals shorter than five characters, all denoting the value “positive one”. What are they?
real
value literals denoting the value “positive one”:
1.0
+1.0
1.00
1E0
1E+0
1E-0
1E00
1.0
. This exercise is meant to sensitize you to the fact that (unlike integer
values) real
number values can have many valid representations. Note, some compilers will accept literals such as 1.
, too, but this is non-standard. The GPC will only accept it in non-ISO-compliant modes, and even then emits a warning.
Write a function that calculates the n-th integer power of a positive number. Restrict the parameters’ data types as much as possible. The function should return 0 if the result is invalid (i. e. out of range).
type
naturalNumber = 1..maxInt;
wholeNumber = 0..maxInt;
{**
\brief iteratively calculates the integer power of a number
\param base the (positive) base in x^n
\param exponent the (non-negative) exponent in x^n
\return \param base to the power of \param exponent,
or zero in the case of an error
**}
function power(base: naturalNumber; exponent: wholeNumber): wholeNumber;
var
accumulator: wholeNumber;
begin
{ anything [except zero] to the power of zero is defined as one }
accumulator := 1;
for exponent := exponent downto 1 do
begin
{ if another “times `base`” would exceed the limits of `integer` }
{ we invalidate the entire result }
if accumulator > maxInt div base then
begin
accumulator := 0;
end;
accumulator := accumulator * base;
end;
power := accumulator;
end;
exponent := exponent
just to satisfy Pascal’s syntax requirements. A good compiler will optimize that away. Note that the EP standard provides the integer
power operator pow
.[fn 16]
base
values as well. If your compiler supports the EP procedure halt
, your function
should print an error message and terminate the program
if base = exponent = 0, because there is no universally agreed definition for 0^0.
{**
  \brief iteratively calculates the integer power of a number
  \param base the non-zero base in x^n
  \param exponent the (non-negative) exponent in x^n
  \return \param base to the power of \param exponent,
          or zero in the case of an error
  The program aborts if base = 0 = exponent.
**}
function power(base: integer; exponent: wholeNumber): integer;
var
  accumulator: integer;
  negativeResult: Boolean;
begin
  if [base, exponent] = [0] then
  begin
    writeLn('Error in `power`: base = exponent = 0, but 0^0 is undefined');
    halt;
  end;
set
was chosen to sensitize you to that possibility. You will, nevertheless, usually and most probably write (base = 0) and (base = exponent)
or similar, which is just as valid.
  { anything [except zero] to the power of zero is defined as one }
  accumulator := 1;
  negativeResult := (base < 0) and odd(exponent);
  { calculating the _positive_ power of base^exponent }
  { simplifies the invalidation condition in the loop below }
  base := abs(base);
  if base > 1 then
  begin
    for exponent := exponent downto 1 do
    begin
      { if another “times `base`” would exceed the limits of `integer` }
      { we invalidate the entire result }
      if accumulator > maxInt div base then
      begin
        accumulator := 0;
      end;
      accumulator := accumulator * base;
    end;
  end
  else if (base = 0) and (exponent > 0) then
  begin
    { zero to any positive power is zero }
    accumulator := 0;
  end;
if
branch may not be as apparent as it should be: Because we earlier extended the range of possible base
values to all integer
values, it has also become possible to specify 0
. However, remember division by zero is illegal. Since our invalidation condition relies on div base
we need to take precautionary steps.
  if negativeResult then
  begin
    accumulator := -1 * accumulator;
  end;
  power := accumulator;
end;
- The user will terminate his input with an empty line. Print this instruction beforehand.
- When done, print the message.
- When printing, a line may be at most 80 characters long (or whatever is reasonable for you). You are allowed to presume the user’s input lines are at most 80 characters long.
- Ensure you only wrap lines at space characters (unless there are no space characters in a line).
More tasks you can solve can be found on the following Wikibook pages:
- A-level Computing 2009/AQA/Problem Solving, Programming, Data Representation and Practical Exercise/Fundamentals of Programming/One-Dimensional Arrays
- A-level Computing 2009/AQA/Problem Solving, Programming, Data Representation and Practical Exercise/Fundamentals of Programming/Two-Dimensional Arrays
Sources:
- ↑ Jensen, Kathleen; Wirth, Niklaus. Pascal – user manual and report (4th revised ed.). p. 56. doi:10.1007/978-1-4612-4450-9. ISBN 978-0-387-97649-5. "An array type consists of a fixed number of components (defined when the array type is introduced) all having the same type, called the component type."
- ↑ This limitation comes from Pascal: ISO 7185:1990 (Report). ISO/IEC. 1991. p. 16. "real-type. The required type-identifier real shall denote the real-type. […] The values shall be an implementation-defined subset of the real numbers, denoted as specified in 6.1.5 by signed-real." The end of the last sentence implies only writable, those you can specify in your source code, are legal real values. For example, the value is different from , the decimal representation of which would be infinitely long (a. k. a. irrational number), thus the actual, correct value of cannot appear in source code as a “real” value.
- ↑ a b Joslin, David A. (1989-06-01). "Extended Pascal – Numerical Features". Sigplan Notices. 24 (6): 77–80. doi:10.1145/71052.71063. "The programmer can obtain some idea of the real range and precision from the (positive) implementation-defined constants MINREAL, MAXREAL and EPSREAL. Arithmetic in the ranges [-maxreal,-minreal], 0, and [minreal,maxreal] "can be expected to work with reasonable approximations", whereas outside those ranges it cannot. As what constitutes a "reasonable approximation" is a matter of opinion, however, and is not defined in the standard, this statement may be less useful than it appears at first sight. The measure of precision is on somewhat firmer ground: EPSREAL is the commonly employed measure of (typically floating-point) accuracy, i.e the smallest value such that 1.0 + epsreal > 1.0."
- ↑ a b Jensen, Kathleen; Wirth, Niklaus. Pascal – user manual and report (4th revised ed.). p. 20. doi:10.1007/978-1-4612-4450-9. ISBN 978-0-387-97649-5. "As long as at least one of the operands is of type Real (the other possibly being of type Integer) the following operators yield a real value: * multiply, / divide (both operands may be integers, but the result is always real), + add, - subtract"
Notes:
- ↑ Some (old) computers did not know the bracket characters. Seriously, that’s not a joke. Instead, a substitute bigram was used: var signCharacter: array(.Boolean.) of char;, and signCharacter(.true.) := '-';. You may encounter this kind of notation in some (old) textbooks on Pascal. Anyway, using brackets is still the preferred method.
- ↑ Only I/O concerning a packed array[1..n] of componentType, where n is greater than 1 and componentType is or is a subrange of char, is standardized. However, in this part of the book you are not introduced to the concept of packing, the packed keyword. Therefore, the shown behavior is non-standard.
- ↑ More precisely, text files are (possibly empty) sequences of lines, each line consisting of a (possibly empty) sequence of char values.
- ↑ Some compilers (such as the FPC) allow zero-sized data types [not allowed in any ISO standard]. If that is the case, an array that has a zero-size base type will be rendered ineffective, virtually forfeiting all characteristics of arrays.
- ↑ Modern real arithmetic processors can indicate a precision loss, i. e. when the result of an arithmetic operation had to be “approximated”. However, there is no standardized way to access this kind of information from your Pascal source code, and usually this kind of signaling is also not favorable, since the tiniest precision loss will set off the alarm, thus slowing down your program. Instead, if it matters, one uses software that allows arbitrary precision arithmetics, like for example the GNU Multiple Precision Arithmetic Library.
- ↑ This number is not completely arbitrary. The most prevalent real number implementation IEEE 754 uses a “hidden bit”, making the value 1.0 special.
- ↑ Not all compilers comply with this definition of the standard. The FPC’s standard round implementation will round in the case of equidistance toward even numbers. Knowing this is relevant for statistical applications. The GPC uses for its round implementation functionality provided by the hardware. As such, the implementation is hardware-dependent, on its specific configuration, and may deviate from the ISO 7185 standard definition.
- ↑ Given the most prevalent implementations Two’s complement for integer values and IEEE 754 for real values, you have to consider the fact that (virtually) all bits in an integer contribute to its (mathematical) value, whereas a real number stores values for the expression mantissa * base pow exponent. In very simple terms, the mantissa stores the integer part of a value, but the problem is that it occupies fewer bits than an integer would use, thus there is (for values that require more bits) a loss in information (i. e. a loss in precision).
- ↑ The exact technical definition reads like: The value of dividend div divisor shall be such that abs(dividend) - abs(divisor) < abs((dividend div divisor) * divisor) <= abs(dividend) where the value shall be zero if abs(dividend) < abs(divisor); otherwise, […]
- ↑ Furthermore, both arrays have to be either packed or “unpacked”.
- ↑ The EP standard calls this characteristic protected.
- ↑ It is important that both functions use one common base, in this case it is .
- ↑ The ISO standard 7185 (“Standard Pascal”) calls this, lack of conformant-array parameters, “Level 0 compliance”. “Level 1 compliance” includes support for conformant array parameters. Of the compilers presented in Getting started only the GPC is a “Level 1”-compliant compiler.
- ↑ This style of programming is slightly disfavored, keyword “global variables”, but as for now we do not know appropriate syntax (var parameters) to not do that.
- ↑ For extra credit: You can make use of the fact that (assuming that zMaximum is positive) . You can use this value to find out the minimum number of digits required.
- ↑ In EP there also exists a real power operator **. The difference is similar to that of the division operators: pow only accepts integer values as operands and yields an integer value, whereas ** always yields a real value. Your choice for either of which, again, should be based on the required degree of precision.
Strings
The data type string(…)
is used to store a finite sequence of char
values.
It is a special case of an array
, but unlike an array[…] of char
the data type string(…)
has some advantages facilitating its effective usage.
The data type string(…)
as presented here is an Extended Pascal extension, as defined in the ISO standard 10206.
Due to its high relevance in practice, this topic has been put into the Standard Pascal part of this Wikibook, right after the chapter on arrays.
Many compilers have a different conception of what constitutes a string. Consult their manuals for their idiosyncratic differences. Rest assured, the GPC supports string(…) as explained here.
Properties
Capacity
Definition
The declaration of a string
data type always entails a maximum capacity:
program stringDemo(output);
type
  address = string(60);
var
  houseAndStreet: address;
begin
  houseAndStreet := '742 Evergreen Trc.';
  writeLn('Send complaints to:');
  writeLn(houseAndStreet);
end.
After the word string
follows a positive integer number surrounded by parentheses.
This is not a function call.[fn 1]
Implications
Variables of the data type address
as defined above will only be able to store up to 60
independent char
values.
Of course it is possible to store fewer, or even 0
, but once this limit is set it cannot be expanded.
Inquiry
String
variables “know” about their own maximum capacity:
If you use writeLn(houseAndStreet.capacity)
, this will print 60
.
Every string
variable automatically has a “field” called capacity
.
This field is accessed by writing the respective string
variable’s name and the word capacity
joined by a dot (.
).
This field is read-only:
You cannot assign values to it.
It can only appear in expressions.
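The read-only nature of this field can be illustrated with a short sketch. It assumes an EP-compliant compiler (for instance the GPC); the program name capacityDemo and the loop variable i are made up for this illustration.

```pascal
program capacityDemo(output);
type
  address = string(60);
var
  houseAndStreet: address;
  i: integer;
begin
  houseAndStreet := '742 Evergreen Trc.';
  { `capacity` is determined by the data type and never changes }
  writeLn(houseAndStreet.capacity);
  { the field may appear in any expression, e. g. as a loop limit: }
  { this draws a ruler as wide as the maximum capacity }
  for i := 1 to houseAndStreet.capacity do
  begin
    write('-');
  end;
  writeLn;
  { ↯ houseAndStreet.capacity := 80; would be rejected }
end.
```

Using the field as a loop limit, rather than repeating the literal 60, keeps the program correct if the definition of address is changed later.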
Length
All string
variables have a current length.
This is the total number of valid char
values every string
variable currently contains.
To query this number, the EP standard defines a new function called length
:
program lengthDemo(output);
type
  domain = string(42);
var
  alphabet: domain;
begin
  alphabet := 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
  writeLn(length(alphabet));
end.
The length
function returns a non-negative integer
value denoting the supplied string’s length.
It also accepts char
values.[fn 2]
A char
value has by definition a length of 1
.
It is guaranteed that the length
of a string
variable will always be less than or equal to its corresponding capacity
.
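The relationship between length and capacity can be observed with the following sketch. It assumes an EP-compliant compiler; the program and variable names are invented for this example, and the exact spacing of the printed numbers depends on your compiler’s default field widths.

```pascal
program lengthVersusCapacityDemo(output);
var
  token: string(10);
begin
  token := '';
  { length is 0, capacity is 10 }
  writeLn(length(token), ' of ', token.capacity);
  token := 'Pascal';
  { length has grown to 6, capacity is still 10 }
  writeLn(length(token), ' of ', token.capacity);
  { a `char` value always has a length of one }
  writeLn(length('x'));
end.
```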
Compatibility
You can copy entire string values using the :=
operator provided the variable on the LHS has the same or a greater capacity than the RHS string expression.
This is different from a regular array
’s behavior, which would require dimensions and size to match exactly.
program stringAssignmentDemo;
type
  zipcode = string(5);
  stateCode = string(2);
var
  zip: zipcode;
  state: stateCode;
begin
  zip := '12345';
  state := 'QQ';
  zip := state; // ✔
  // zip.capacity > state.capacity
  // ↯ state := zip; ✘
end.
As long as no clipping occurs, i. e. the omission of values because of a too short capacity, the assignment is fine.
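The “no clipping” rule can be made explicit with a run-time check before the assignment. The following is merely a sketch, assuming an EP-compliant compiler; the names narrow and wide are invented for this example.

```pascal
program guardedAssignmentDemo(output);
var
  narrow: string(2);
  wide: string(5);
begin
  wide := 'ABCDE';
  { only assign if the current value fits into the target }
  if length(wide) <= narrow.capacity then
  begin
    narrow := wide;
  end
  else
  begin
    writeLn('value does not fit, assignment skipped');
  end;

  wide := 'AB';
  { now the value is short enough, even though wide.capacity = 5 }
  if length(wide) <= narrow.capacity then
  begin
    narrow := wide;
    writeLn(narrow);
  end;
end.
```

The check compares the current length of the source with the capacity of the target, because it is the value actually being copied that must fit.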
Index
It is worth noting that otherwise strings are internally regarded as arrays.[fn 3]
Like a character array you can access (and alter) every array element independently by specifying a valid index surrounded by brackets.
However, there is a big difference with respect to validity of an index.
At any time, you are only allowed to specify indices that are within the range 1..length
.
This range may be empty, specifically if length
is currently 0
.
It is not possible to change the current length by manipulating individual string components:
program stringAccessDemo;
type
  bar = string(8);
var
  foo: bar;
begin
  foo := 'AA';     { ✔ length ≔ 2 }
  foo[2] := 'B';   { ✔ }
  foo[3] := 'C';   { ↯: 3 > length }
end.
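To actually lengthen a string you have to assign a new value to the entire variable. One possible way, sketched here under the assumption of an EP-compliant compiler, uses the concatenation operator + that is introduced further below in this chapter; the program name is made up.

```pascal
program stringGrowDemo(output);
var
  foo: string(8);
begin
  foo := 'AA';
  foo[2] := 'B';     { index 2 is within 1..length: fine }
  { assigning a whole new value is what changes the length }
  foo := foo + 'C';
  writeLn(foo);          { ABC }
  writeLn(length(foo));  { the length is now 3 }
end.
```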
Standard routines
In addition to the length
function, EP also defines a few other standard functions operating on strings.
Manipulation
The following functions return strings.
Substring
In order to obtain just a part of a string
(or char
) expression, the function subStr(stringOrCharacter, firstCharacter, count)
returns a sub-string of stringOrCharacter
having the non-negative length count
, starting at the positive index firstCharacter
.
It is important that firstCharacter + count - 1
is a valid character index in stringOrCharacter
, otherwise the function causes an error.[fn 4]
program substringDemo(output);
begin
  writeLn(subStr('GCUACGGAGCUUCGGAGUUAG', 7, 3));
  { char index:   1  4  7 … }
end.
GAG
firstCharacter
index. Here we wanted to extract the third codon. However, firstCharacter
is not simply 2 * 3
but 2 * 3 + 1
. Indexing characters in a string
variable starts at 1
. Note, a sophisticated implementation for encoding codons would not make use of string
, but define a custom enumeration data type.
For string
-variables, the subStr
function is the same as specifying myString[firstCharacter..firstCharacter+count-1]
.[fn 5]
Evidently, if the firstCharacter
value is some complicated expression, the subStr
function should be preferred to prevent any programming mistakes.
string
.
program substringOverwriteDemo(output);
var
  m: string(35);
begin
  m := 'supercalifragilisticexpialidocious ';
  m[21..35] := '-yadi-yada-yada';
  writeLn(m);
end.
supercalifragilistic-yadi-yada-yada
string
.
Furthermore, the third parameter to subStr
can be omitted:
This will simply return the rest of the given string
starting at the position indicated by the second parameter.[fn 6]
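A brief sketch comparing the two-parameter and three-parameter forms follows. It assumes an EP-compliant compiler; the program name and the sample text are chosen freely.

```pascal
program substringRestDemo(output);
var
  sentence: string(40);
begin
  sentence := 'Extended Pascal';
  { char index 10 is the 'P' }
  writeLn(subStr(sentence, 10));     { Pascal }
  { the same result with an explicit count: 10 + 6 - 1 = 15 = length }
  writeLn(subStr(sentence, 10, 6));  { Pascal }
  writeLn(subStr(sentence, 1, 8));   { Extended }
end.
```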
Remove trailing spaces
The trim(source)
function returns a copy of source
without any trailing space characters, i. e. ' '
.
In LTR scripts any blanks to the right are considered insignificant, yet in computing they take up (memory) space.
It is advisable to prune strings before writing them, for example, to a disk or other long-term storage media, or transmission via networks.
Concededly memory requirements were a more relevant issue prior to the 21st century.
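The effect of trim is easiest to see by inspecting lengths before and after. This is a minimal sketch, assuming an EP-compliant compiler; the brackets in the output merely make the blanks visible.

```pascal
program trimDemo(output);
var
  padded: string(12);
begin
  padded := 'Hello   ';
  writeLn(length(padded));          { 8: five letters plus three blanks }
  writeLn(length(trim(padded)));    { 5: trailing blanks removed }
  writeLn('[', trim(padded), ']');  { [Hello] }
  { only _trailing_ blanks are removed, leading ones are kept }
  writeLn('[', trim('  x  '), ']'); { [  x] }
end.
```

Note that trim returns a new value; the variable padded itself remains unchanged unless you assign the result back to it.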
First occurrence of substring
The function index(source, pattern)
finds the first occurrence of pattern
in source
and returns the starting index.
All characters from pattern
match the characters in source
at the returned offset:
source index | 1 | 2 | 3 | 4 | 5 | 6 | 7 | match |
---|---|---|---|---|---|---|---|---|
source | Z | Y | X | Y | X | Y | X | |
pattern at offset 1 | X | Y | X | | | | | ✘ |
pattern at offset 2 | | X | Y | X | | | | ✘ |
pattern at offset 3 | | | X | Y | X | | | ✔ |
Note, to obtain the second or any subsequent occurrence, you need to use a proper substring of the source
.
Because the “empty string” is, mathematically speaking, present everywhere, index(characterOrString, '')
always returns 1
.
Conversely, because any non-empty string cannot occur in an empty string, index('', nonEmptyStringOrCharacter)
always returns 0
, in the context of strings an otherwise invalid index.
The value zero is returned if pattern
does not occur in source
.
This will always be the case if pattern
is longer than source
.
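The following sketch reproduces the situation from the table above and also shows one possible way of finding a subsequent occurrence via subStr. It assumes an EP-compliant compiler; all identifiers are invented for this example, and the second search presumes that the first hit is not at the very end of source (otherwise subStr would be called with an invalid index).

```pascal
program indexDemo(output);
var
  source: string(16);
  first, second: integer;
begin
  source := 'ZYXYXYX';
  first := index(source, 'XYX');
  writeLn(first);   { 3 }

  { search again in the remainder, just past the start of the first hit }
  second := index(subStr(source, first + 1), 'XYX');
  if second > 0 then
  begin
    { translate the offset back into an index of `source` }
    second := second + first;
  end;
  writeLn(second);  { 5 }

  writeLn(index(source, 'ABC'));  { 0: pattern does not occur }
  writeLn(index(source, ''));     { 1: the empty string is found right away }
end.
```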
Operators
The EP standard introduced an additional operator for strings of any length, including single characters.
The +
operator concatenates two strings or characters, or any combination thereof.
Unlike the arithmetic +
, this operator is non-commutative, that means the order of the operands matters.
expression | result |
---|---|
'Foo' + 'bar' | 'Foobar' |
'' + '' | '' |
'9' + chr(ord('0') + 9) + ' Luftballons' | '99 Luftballons' |
Concatenation is useful if you intend to save the data somewhere.
Supplying concatenated strings to routines such as write
/writeLn
, however, may possibly be disadvantageous:
The concatenation, especially of long strings, first requires allocating enough memory to accommodate the entire resulting string.
Then, all the operands are copied to their respective location.
This takes time.
Hence, in the case of write
/writeLn
it is advisable (for very long strings) to use their capability of accepting an arbitrary number of (comma-separated) parameters.
Note, the common LOC
stringVariable := 'xyz' + someStringOrCharacter + …;
is equivalent to
writeStr(stringVariable, 'xyz', someStringOrCharacter, …);
The latter is particularly useful if you also want to pad the result or need some conversion.
Writing foo:20
(minimum width of 20
characters possibly padded with spaces ' '
to the left) is only acceptable using write
/writeLn
/writeStr
. WriteStr
is an EP extension.
The GPC, the FPC and Delphi are also shipped with a function concat
performing the very same task.
Read the respective compiler’s documentation before using it, because there are some differences, or just stick to the standardized +
operator.
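The padding and conversion capabilities of writeStr can be sketched as follows. The example assumes an EP-compliant compiler; the identifiers line and count are hypothetical.

```pascal
program writeStrDemo(output);
var
  line: string(40);
  count: integer;
begin
  count := 99;
  { `count` is converted to characters and padded to a width of 4, }
  { 'ok' is padded with blanks on the left to a width of 5 }
  writeStr(line, 'items:', count:4, ' (', 'ok':5, ')');
  writeLn(line);          { items:  99 (   ok) }
  writeLn(length(line));  { 6 + 4 + 2 + 5 + 1 = 18 characters }
  { for mere output no intermediate string is necessary: }
  writeLn('items:', count:4, ' (', 'ok':5, ')');
end.
```

The last statement produces the same line of text without first building a string in memory, which is the approach recommended above for very long strings.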
Sophisticated comparison
All functions presented in this subsection return a Boolean
value.
Order
Since every character in a string has an ordinal value, we can think of a method to sort them. There are two flavors of comparing strings:
- One uses the relational operators already introduced, such as =, > or <=.
- The other one is to use dedicated functions like LT or GT.
The difference lies in their treatment of strings that vary in length.
While the former will bring both strings to the same length by padding them with space characters (' '
), the latter simply clips them to the shortest length, but taking into account which one was longer (if necessary).
function name | meaning | operator |
---|---|---|
EQ | equal | = |
NE | not equal | <> |
LT | less than | < |
LE | less than or equal to | <= |
GT | greater than | > |
GE | greater than or equal to | >= |
All these functions and operators are binary, that means they expect and accept only exactly two parameters or operands respectively. They can produce different results if supplied with the same input, as you will see in the next two sub-subsections.
Equality
Let’s start with equality.
- Two strings (of any length) are considered equal by the EQ function if both operands are of the same length and their values, i. e. the character sequences that actually make up the strings, are the same.
- An =‑comparison, on the other hand, augments any “missing” characters in the shorter string by using the padding character space (' ').[fn 7]
program equalDemo(output);
const
  emptyString = '';
  blankString = ' ';
begin
  writeLn(emptyString = blankString);
  writeLn(EQ(emptyString, blankString));
end.
True
False
emptyString
got padded to match the length of blankString
, before the actual character-by-character =
‑expression took place.
To put this relationship in Pascal terms you already know:
(foo = bar) = EQ(trim(foo), trim(bar))
The actual implementation is usually different, because trim
can be, especially for long strings, quite resource-consuming (time, as well as memory).
As a consequence, an =
‑comparison is usually used if trailing spaces are insignificant, but are still there for technical reasons (e. g. because you are using an array[1..8] of char
).
Only EQ
ensures both strings are lexicographically the same.
Note that the capacity
of either string is irrelevant.
The function NE
, short for not equal, behaves accordingly.
Less than
A string is determined to be “less than” another one by sequentially reading both strings simultaneously from left to right and comparing corresponding characters. If all characters match, the strings are said to be equal to each other. However, if we encounter a differing character pair, processing is aborted and the relation of the current characters determines the overall string’s relation.
first operand | 'A' | 'B' | 'C' | 'D' |
---|---|---|---|---|
second operand | 'A' | 'B' | 'E' | 'A' |
determined relation | = | = | < | ⨯ |
If both strings are of equal length, the LT
function and the <
‑operator behave the same.
LT
actually even builds on top of <
.
Things get interesting if the supplied strings differ in length.
- The LT function first cuts both strings to the same (shorter) length (substring).
- Then a regular comparison is performed as demonstrated above. If the shortened, common-length versions turn out to be equal, the (originally) longer string is said to be greater than the other one.
A <‑comparison, on the other hand, compares all remaining “missing” characters to ' ', the space character. This can lead to differing results:
program lessThanDemo(output);
var
  hogwash, malarky: string(8);
begin
  { ensure ' ' is not chr(0) or maxChar }
  if not (' ' in [chr(1)..pred(maxChar)]) then
  begin
    writeLn('Character set presumptions not met.');
    halt; { EP procedure immediately terminating the program }
  end;
  hogwash := '123';
  malarky := hogwash + chr(0);
  writeLn(hogwash < malarky, LT(hogwash, malarky));
  malarky := hogwash + '4';
  writeLn(hogwash < malarky, LT(hogwash, malarky));
  malarky := hogwash + maxChar;
  writeLn(hogwash < malarky, LT(hogwash, malarky));
end.
False True
True True
True True
<
‑comparison, the “missing” fourth character in hogwash
is presumed to be ' '
. The fourth character in malarky
is compared against ' '
.
The situation above has been provoked artificially for demonstration purposes, but this can still become an issue if you are frequently using characters that are “smaller” than the regular space character, like for instance if you are programming on a 1980s 8‑bit Atari computer using ATASCII.
The LE
, GT
, and GE
functions act accordingly.
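The following sketch contrasts the operators with their function counterparts for operands that differ only in trailing blanks. It assumes a fully EP-compliant mode (GPC users: see the note on padding); the comments state the truth values to be expected according to the rules described above, the exact spelling of Boolean output is up to your compiler.

```pascal
program comparisonFamilyDemo(output);
begin
  { padding makes the operands equal, EQ insists on equal lengths }
  writeLn('ab' = 'ab  ', ' ', EQ('ab', 'ab  '));   { true, false }
  writeLn('ab' <> 'ab  ', ' ', NE('ab', 'ab  '));  { false, true }
  { after clipping to two characters both are equal, }
  { so for GT the originally longer operand wins }
  writeLn('ab  ' > 'ab', ' ', GT('ab  ', 'ab'));   { false, true }
  { for GE the shorter first operand is regarded as smaller }
  writeLn('ab' >= 'ab  ', ' ', GE('ab', 'ab  '));  { true, false }
end.
```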
Details on string
literals
Inclusion of delimiter
In Pascal string
literals start with and are terminated by the same character.
Usually this is a straight (typewriter’s) apostrophe ('
).
Troubles arise if you want to actually include that character in a string
literal, because the character you want to include into your string is already understood as the terminating delimiter.
Conventionally, two straight typewriter’s apostrophes back-to-back are regarded as an apostrophe image.
In the produced computer program, they are replaced by a single apostrophe.
program apostropheDemo(output);
var
  c: char;
begin
  for c := '0' to '9' do
  begin
    writeLn('ord(''', c, ''') = ', ord(c));
  end;
end.
Each double-apostrophe is replaced by a single apostrophe.
The string still needs delimiting apostrophes, so you might end up with three consecutive apostrophes like in the example above, or even four consecutive apostrophes (''''
) if you want a char
-value consisting of a single apostrophe.
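The four-apostrophe case and an apostrophe in the middle of a literal can be tried out with this small sketch; the program and variable names are invented, and length is the EP function introduced earlier in this chapter.

```pascal
program apostropheCharDemo(output);
var
  quote: char;
begin
  { delimiter, apostrophe image (two apostrophes), delimiter }
  quote := '''';
  writeLn(quote);                  { a single apostrophe }
  writeLn('It''s');                { It's }
  { the apostrophe image counts as just one character }
  writeLn(length('It''s'));        { 4 }
  writeLn(quote, 'quoted', quote); { 'quoted' }
end.
```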
Non-permissible characters
A string
is a linear sequence of characters, i. e. along a single dimension.
As such the only illegal “character” in strings is the one marking line breaks (new lines). The string literal in the following piece of code is unacceptable, because it spans across multiple (source code) lines.
welcomeMessage := 'Hello!
All your base are belong to us.';
You are nevertheless allowed to use the OS-specific code indicating EOLs, yet the only cross-platform (i. e. guaranteed to work regardless of the used OS) procedure is writeLn
.
Although not standardized, many compilers provide a constant representing the environment’s character/string necessary to produce line breaks.
In FPC it is called lineEnding
.
Delphi has sLineBreak
, which is also understood by the FPC for compatibility reasons.
The GPC’s standard module GPC
supplies the constant lineBreak
.
You will first need to import
this module before you can use that identifier.
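A portable alternative to the rejected multi-line literal above is to emit one line per writeLn. The following sketch relies on nothing but Standard Pascal; the program name is made up.

```pascal
program multiLineDemo(output);
begin
  { every `writeLn` ends the current line in the way }
  { that is appropriate for the operating system in use }
  writeLn('Hello!');
  writeLn('All your base are belong to us.');
end.
```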
Remainder operator
The final Standard Pascal arithmetic operator you are introduced to, after learning to divide, is the remainder operator mod
(short for modulo).
Every integer
division (div
) may yield a remainder.
This operator evaluates to this value.
i | -3 | -2 | -1 | 0 | 1 | 2 | 3 |
---|---|---|---|---|---|---|---|
i mod 2 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
i mod 3 | 0 | 1 | 2 | 0 | 1 | 2 | 0 |
Similar to all other division operations, the mod
operator does not accept a zero value as the second operand.
Moreover, the second operand to mod
must be positive.
Computer scientists and mathematicians use many different definitions for the result in the case of a negative divisor.
Pascal avoids any confusion by simply declaring negative divisors as illegal.
The mod
operator is frequently used to ensure a certain value remains in a specific range starting at zero (0..n
).
Furthermore, you will find modulo in number theory.
For example, the definition of prime numbers says “not divisible by any other number”.
This expression can be translated into Pascal like this:
expression | x is divisible by d |
---|---|
mathematical notation | d ∣ x |
Pascal expression | x mod d = 0 |
odd(x) is shorthand for x mod 2 <> 0.[fn 8]
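Both typical uses of mod, keeping a value within a range and testing divisibility, are sketched below. The program uses Standard Pascal only; the constant slots and the naive trial division are illustrative assumptions, not an efficient primality test.

```pascal
program modDemo(output);
const
  slots = 7;
var
  day, d, candidate: integer;
  isPrime: Boolean;
begin
  { wrap-around: `day mod slots` always lies within 0..slots-1 }
  for day := 5 to 9 do
  begin
    write(day mod slots:2);
  end;
  writeLn;  { the values 5, 6, 0, 1 and 2 }

  { trial division: is `candidate` divisible by any smaller d > 1? }
  candidate := 91;
  isPrime := candidate > 1;
  for d := 2 to candidate - 1 do
  begin
    if candidate mod d = 0 then
    begin
      isPrime := false;
    end;
  end;
  { 91 = 7 * 13, so it is not a prime number }
  writeLn(candidate, ' is prime: ', isPrime);
end.
```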
Tasks
array[n..m] of string(c)
?
string(…)
is basically a special case of an array
(namely one consisting of char
values), you can access a single character from it just like usual: v[i, p]
where i
is a valid index in the range n..m
and p
refers to the character index within 1..length(v[i])
.
true
if, and only if a given string(…)
contains non-blank characters (i. e. other characters than ' '
).
program spaceTest(input, output);
type
  info = string(20);
{**
  \brief determines whether a string contains non-space characters
  \param s the string to inspect
  \return true if there are any characters other than ' '
*}
function containsNonBlanks(s: info): Boolean;
begin
  containsNonBlanks := length(trim(s)) > 0;
end;
// … remaining code for testing purposes only …
Note, that this function (correctly) returns false
if supplied with an empty string (''
). Alternatively you could have written:
containsNonBlanks := '' <> s;
string(…)
data type to work properly. Remember, in these exercises there is no “best” solution.
program
that reads a string(…)
and transposes every letter in it by 13 positions with respect to the original character’s place in the English alphabet, and then outputs the modified version. This algorithm is known as “Caesar cipher”. For simplicity assume all input is lower-case.
program rotate13(input, output);
const
  // we will only operate ("rotate") on these characters
  alphabet = 'abcdefghijklmnopqrstuvwxyz';
  offset = 13;
type
  integerNonNegative = 0..maxInt;
  sentence = string(80);
var
  secret: sentence;
  i, p: integerNonNegative;
begin
  readLn(secret);
  for i := 1 to length(secret) do
  begin
    // is current character in alphabet?
    p := index(alphabet, secret[i]);
    // if so, rotate
    if p > 0 then
    begin
      // The `+ 1` in the end ensures that p
      // in the following expression `alphabet[p]`
      // is indeed always a valid index (i.e. not zero).
      p := (p - 1 + offset) mod length(alphabet) + 1;
      secret[i] := alphabet[p];
    end;
  end;
  writeLn(secret);
end.
array[chr(0)..maxChar] of char
) would have been acceptable, too, but care must be taken in properly populating it.
Note, it is not guaranteed that expressions such as succ('A', 13)
will yield the expected result. The range 'A'..'Z'
is not necessarily contiguous, so you should not make any assumptions about it. If your solution makes use of that, you must document it (e. g. “This program only runs properly on computers using the ASCII character set.”).
string
is a palindrome, that means it can be read forward and backwards producing the same meaning/sound provided word gaps (spaces) are adjusted accordingly. For simplicity assume all characters are lower-case and there are no punctuation characters (other than whitespace).
program palindromes(input, output);
type
  sentence = string(80);
{
  \brief determines whether a lower-case sentence is a palindrome
  \param original the sentence to inspect
  \return true iff \param original can be read forward and backward
}
function isPalindrome(original: sentence): Boolean;
var
  readIndex, writeIndex: integer;
  derivative: sentence;
  check: Boolean;
begin
  check := true;
  // “sentences” that have a length of one, or even zero characters
  // are always palindromes
  if length(original) > 1 then
  begin
    // ensure `derivative` has the same length as `original`
    derivative := original;
    // the contents are irrelevant, alternatively [in EP] you could’ve used
    //writeStr(derivative, '':length(original));
    // which would’ve saved us the “fill the rest with blanks” step below
    writeIndex := 1;
    // strip blanks
    for readIndex := 1 to length(original) do
    begin
      // only copy significant characters
      if not (original[readIndex] in [' ']) then
      begin
        derivative[writeIndex] := original[readIndex];
        writeIndex := writeIndex + 1;
      end;
    end;
    // fill the rest with blanks
    for writeIndex := writeIndex to length(derivative) do
    begin
      derivative[writeIndex] := ' ';
    end;
    // remove trailing blanks and thus shorten length
    derivative := trim(derivative);
    for readIndex := 1 to length(derivative) div 2 do
    begin
      check := check and (derivative[readIndex] =
        derivative[length(derivative) - readIndex + 1]);
    end;
  end;
  isPalindrome := check;
end;
var
  mystery: sentence;
begin
  writeLn('Enter a sentence that is possibly a palindrome (no caps):');
  readLn(mystery);
  writeLn('The sentence you have entered is a palindrome: ',
    isPalindrome(mystery));
end.
original
string
. For demonstration purposes the example shows if not (original[readIndex] in [' ']) then
. In fact an explicit set list would have been more adequate, i. e. if original[readIndex] in ['a', 'b', 'c', …, 'z'] then
. Do not worry if you simply wrote something to the effect of if original[readIndex] <> ' ' then
, this is just as fine given the task’s requirements.
LT('', '')
?
function
that determines whether a year in the Gregorian calendar is a leap year. Every fourth year is a leap year, but every hundredth year is not, unless it is the fourth century in a row.
mod
operator you just saw:
{
  \brief determines whether a year is a leap year in Gregorian calendar
  \param x the year to inspect
  \return true, if and only if \param x meets leap year conditions
}
function leapYear(x: integer): Boolean;
begin
  leapYear := (x mod 4 = 0) and (x mod 100 <> 0) or (x mod 400 = 0);
end;
function
isLeapYear
in Delphi’s and the FPC’s sysUtils
unit
or in GPC’s GPC
module
. Whenever possible reuse code already available.
function
returning the leap year property of a year, write a binary function
returning the number of days in a given month and year.
case
-statement. Recall that there must be exactly one assignment to the result variable:
type
  { a valid day number in Gregorian calendar month }
  day = 1..31;
  { a valid month number in Gregorian calendar year }
  month = 1..12;
{
  \brief determines the number of days in a given month of a Gregorian year
  \param m the month whose day number count is requested
  \param y the year (relevant for leap years)
  \return the number of days in a given month and year
}
function daysInMonth(m: month; y: integer): day;
begin
  case m of
    1, 3, 5, 7, 8, 10, 12:
    begin
      daysInMonth := 31;
    end;
    4, 6, 9, 11:
    begin
      daysInMonth := 30;
    end;
    2:
    begin
      daysInMonth := 28 + ord(leapYear(y));
    end;
  end;
end;
dateUtils unit
provide a function
called daysInAMonth
. You are strongly encouraged to reuse it instead of your own code.
More exercises can be found in:
Notes:
- ↑ In fact this is a discrimination of what EP calls “schema”. Schemata will be explained in detail in the Extensions Part of this Wikibook.
- ↑ This functionality is useful if you are handling constants you or someone might change at some point. Per definition the literal value ' ' is a char value, whereas '' (“null-string”) or '42' are string literals. In order to write generic code, length accepts all kinds of values that could denote a finite sequence of char values.
- ↑ In fact the definition essentially is packed array[1..capacity] of char.
- ↑ This means, in the case of empty strings, only the following function call could be legal: subStr('', 1, 0). It goes without saying that such a function call is very useless.
- ↑ The string variable may not be bindable when using this notation.
- ↑ Omitting the third parameter in the case of empty strings or characters is not allowed. subStr('', 1) is illegal, because there is no “character 1” in an empty string. Also, subStr('Z', 1) is not allowed, because 'Z' is a char-expression and as such always has a length of 1, rendering any need for a “give me the rest of/subsequent characters of” function obsolete.
- ↑ If you are a GPC user, you will need to ensure you are in a fully-EP-compliant mode, for example by specifying ‑‑extended‑pascal on the command line. Otherwise, no padding occurs. The Standard (unextended) Pascal, as per ISO standard 7185, does not define any padding algorithm.
- ↑ The actual implementation of odd may be different. On many processor architectures it is usually something to the effect of the x86 instruction and x, 1.
Records
The key to successful programming is finding the "right" structure of data and program.
—Niklaus Wirth[1]
After you have learned to use an array, this chapter introduces you to another data type structure concept called record. Like an array, the use of records primarily serves the purpose of allowing you to write clean, structured programs. It is otherwise optional.
Concept
You briefly saw a record in the first chapter. While an array is a homogeneous aggregation of data, that means all members have to have the same base data type, a record is potentially, but not necessarily, an aggregation of data having various different data types.[2]
Declaration
A record data type declaration looks pretty much like a collection of variable declarations:
program recordDemo;
type
(* a standard line on a text console *)
line = string(80);
(* 1st grade through 12th grade *)
grade = 1..12;
(* encapsulate all administrative data in one structure *)
student = record
firstname: line;
lastname: line;
level: grade;
end;
The declaration begins with the word record and ends with end. In between you declare fields, or members, the member elements of the entire record.
Here again the semicolon has the function of separating members. The keyword end will actually terminate a record declaration. Note how in the following correct example there is no semicolon after the last member’s declaration:
program recordSemicolonDemo;
type
sphere = record
radius: real;
volume: real;
surface: real
end;
All record members have to bear distinct names within the record declaration itself. For instance, in the student example above, declaring two member elements named level will be rejected.
There is no requirement on how many fields you have to declare. An “empty” record is also possible:[fn 1]
type
emptyRecord = record
end;
Many fields of the same data type
Similar to the declaration of variables you can define multiple fields of the same data type at once by separating identifiers with a comma. The previous declaration of sphere could also be written as:
type
sphere = record
radius, volume, surface: real;
end;
Most Pascal veterans and style guides, however, discourage the use of this shorthand notation (for variable and record declarations as well as in formal parameter lists).
It is only reasonable if all declared identifiers absolutely always have the same data type, that is, if it is virtually guaranteed you will never want to change the data type of just one field in the comma-separated list.
If in doubt, use the longhand.
In programming, convenience plays a tangential role.
Use
By declaring a record variable you immediately have the entire set of “sub”‑variables at hand. Accessing them is done by specifying the record variable’s name, plus a dot (.), followed by the record field’s name:
var
posterStudent: student;
begin
posterStudent.firstname := 'Holden';
posterStudent.lastname := 'Caulfield';
posterStudent.level := 10;
end.
You already saw the dot notation in the previous chapter on strings, where appending .capacity to the name of a string(…) variable refers to the respective variable’s character capacity. This is not a coincidence.
program dotNoGo(output); { This program does not compile. }
type
line = string(80);
quizItem = record
question: line;
answer: line;
end;
var
response: line;
challenge: quizItem;
begin
writeLn(line.capacity); { ↯ `line` is not a variable }
writeLn(response.capacity); { ✔ correct }
writeLn(quizItem.question); { ↯ `quizItem` refers to a data type }
{ Data type declarations (as per definition) do not reserve any memory }
{ thus you cannot “read/write” from/to a data type. }
writeLn(challenge.question); { ✔ correct }
end.
The dot (.) notation is only valid if there is memory.[fn 2]
Advantages
But why and when do we want to use a record? At first glance and in the given examples so far it may seem like a troublesome way to declare and use multiple variables. Yet the fact that a record is handled as one unit entails one big advantage:
- You can copy entire record values via a simple assignment (:=).
- This means you can pass much data at once: A record can be a parameter of routines, and in EP functions can return them as well.[fn 3]
Evidently you want to group data together that always appear together. It does not make sense to group unrelated data, just because we can. Another quite useful advantage is presented below in the section on variant records.
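Both points can be sketched in a few lines. The point type and the printPoint procedure are made up for this illustration and do not appear elsewhere in this chapter; the program should work with any Standard Pascal compiler.

```pascal
program recordCopyDemo(output);
type
    { hypothetical example type }
    point = record
            x: real;
            y: real;
        end;

{ a whole record is passed as one single parameter }
procedure printPoint(p: point);
begin
    writeLn('(', p.x:1:2, ', ', p.y:1:2, ')');
end;

var
    origin, cursor: point;
begin
    origin.x := 0.0;
    origin.y := 0.0;
    { one assignment copies all fields at once }
    cursor := origin;
    { changing the copy does not affect the original }
    cursor.x := 3.5;
    printPoint(origin);
    printPoint(cursor);
end.
```

The first line printed shows the unchanged origin, the second one the modified copy.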
Routing override
As you saw earlier, referring to members of a record can get a little tedious, because we are repeating the variable name over and over again. Fortunately, Pascal allows us to abbreviate things a bit.
With-clause
The with-clause allows us to eliminate repeating a common prefix, specifically the name of a record variable.[3]
begin
with posterStudent do
begin
firstname := 'Holden';
lastname := 'Caulfield';
level := 10;
end;
end.
All identifiers that identify values are first looked for in the record scope of posterStudent. If there is no match, all variable identifiers outside of the given record are considered too.
Of course it is still possible to denote a record member by its full name. E. g. in the source code above it would be perfectly legal to still write posterStudent.level within the with-clause. Concededly, this would defeat the purpose of the with-clause, but sometimes it may still be beneficial to emphasize the specific record variable just for documentation. It is nevertheless important to understand that the FQI, the fully-qualified identifier, the one with a dot in it, does not lose its “validity”.
In principle, all components of structured values “containing dots” can be abbreviated with with. This is also true for the data type string you have learned in the previous chapter.
program withDemo(input, output);
type
{ Deutsche Post „Maxi-Telegramm“ }
telegram = string(480);
var
post: telegram;
begin
with post do
begin
writeLn('Enter your telegram. ',
'Maximum length = ',
capacity, ' characters.');
readLn(post);
{ … }
end;
end.
Here, within the with-clause capacity, and for that matter post.capacity, refer to post.capacity.
Multiple levels
If multiple with-clauses ought to be nested, there is the short notation:
with snakeOil, sharpTools do
begin
…
end;
which is equivalent to:
with snakeOil do
begin
with sharpTools do
begin
…
end;
end;
It is important to bear in mind that identifiers in sharpTools are searched first, and only if there is no match, identifiers in snakeOil are considered.
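The lookup order can be made visible with a short program. This is a sketch only: the text above does not define the types of snakeOil and sharpTools, so the bottle and toolbox record types below are assumptions chosen so that both records have a field called id.

```pascal
program withOrderDemo(output);
type
    { hypothetical record types; both have a field called `id` }
    bottle = record
            id: integer;
            weight: integer;
        end;
    toolbox = record
            id: integer;
        end;
var
    snakeOil: bottle;
    sharpTools: toolbox;
begin
    snakeOil.id := 1;
    snakeOil.weight := 50;
    sharpTools.id := 2;
    with snakeOil, sharpTools do
    begin
        { `id` is found in `sharpTools` first, so this prints 2 }
        writeLn(id:1);
        { `sharpTools` has no `weight`, so `snakeOil.weight` is used: 50 }
        writeLn(weight:1);
        { the fully-qualified identifier still reaches the hidden field: 1 }
        writeLn(snakeOil.id:1);
    end;
end.
```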
Variant records
In Pascal a record is the only data type structure concept that allows you to, so to speak, alter its structure during run-time, while a program is running. This super practical property of record permits us to write versatile code covering many cases.
Declaration
Let’s take a look at an example:
type
centimeter = 10..199;
// order of female, male has been chosen, so `ord(sex)`
// returns the [minimum] number of non-defective Y chromosomes
sex = (female, male);
// measurements according to EN 13402 size designation of clothes [incomplete]
clothingSize = record
shoulderWidth: centimeter;
armLength: centimeter;
bustGirth: centimeter;
waistSize: centimeter;
hipMeasurement: centimeter;
case body: sex of
female: (
underbustMeasure: centimeter;
);
male: (
);
end;
The variant part of a record starts with the keyword case, which you already know from selections. After that follows a record member declaration, the variant selector, but instead of a semicolon you put the keyword of thereafter. Below that follow all possible variants. Each variant is marked by a value out of the variant selector’s domain, here female and male. Separated by a colon (:) follows a variant denoter surrounded by parentheses. Here you can list additional record members that are only available if a certain variant is “active”. Note that all identifiers across all alternatives must be unique. The individual variants are separated by semicolons, and there can be at most one variant part, which has to appear at the end. Because you will need to be able to list all possible variants, the variant selector has to be an ordinal data type.
Use
Using variant records requires you to first select a variant. Variants are “activated” by assigning a value to the variant selector. Note, variants are not “created”; they all already exist at program startup. You merely need to make a choice.
boobarella.body := female;
boobarella.underbustMeasure := 69;
Only after assigning a value to the variant selector and as long as this value remains unchanged, you are allowed to access any fields of the respective variant. It is illegal to reverse the previous two lines of code and attempt accessing the underbustMeasure field while body is not defined yet and, more importantly, does not bear the value female.
It is certainly permissible to change the variant selector later in your program and then use a different variant, but all previously stored values in the variant part relinquish their validity and you cannot restore them. If you switch back the variant to a previous, original value, you will need to assign all values in that variant anew.
Application
This concept opens up new horizons: You can design your programs more interactively in a neat fashion. You can now choose a variant based on run-time data (data that is read while the program is running). Because at any time (after the first assignment of a value to the variant selector) only one variant is “active”, your program will crash if it attempts reading/writing values of an “inactive” variant. This is a desirable behavior, because that is the whole idea of having distinct variants. It guarantees your program’s overall integrity.
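Choosing a variant from run-time data could look like the following sketch. The shape record, its fields and the prompts are invented for this illustration. The program sets the variant selector first and only afterwards touches the fields of the variant that is then active.

```pascal
program variantDemo(input, output);
type
    { hypothetical types for illustration }
    shapeKind = (circle, rectangle);
    shape = record
            case kind: shapeKind of
                circle: (
                        radius: real;
                    );
                rectangle: (
                        width: real;
                        height: real;
                    );
        end;
var
    s: shape;
    answer: char;
begin
    write('Circle or rectangle? (c/r) ');
    readLn(answer);
    { the variant is selected based on what the user typed }
    if answer = 'c' then
    begin
        s.kind := circle;
        write('Radius: ');
        readLn(s.radius);
    end
    else
    begin
        s.kind := rectangle;
        write('Width and height: ');
        readLn(s.width, s.height);
    end;
    { the selector tells us which fields may be read }
    case s.kind of
        circle:
        begin
            writeLn('Area: ', 3.14159265 * sqr(s.radius):1:2);
        end;
        rectangle:
        begin
            writeLn('Area: ', s.width * s.height:1:2);
        end;
    end;
end.
```

Querying the selector in a case-statement before reading any variant field is a common way to make sure only the active variant is accessed.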
Anonymous variants
Pascal also permits having anonymous variant selectors, that is selectors not bearing any name. The implications are
- you cannot explicitly select (nor query) any variant, so
- in turn all variants are considered “active” at the same time.
“But wasn’t this the object of the exercise?” you might ask. Yes, indeed, since there is no named selector your program cannot keep track which variant is supposed to work and which one is “defective”. You are responsible to determine which variant you can sensibly read/write at present.
program anonymousVariantsDemo(output);
type
bitIndex = 0..(sizeOf(integer) * 8 - 1);
exposedInteger = record
case Boolean of
false: (
value: integer;
);
true: (
bit: set of bitIndex;
);
end;
var
i: exposedInteger;
begin
i.bit := [4];
writeLn(i.value);
end.
16
The value 16 is (and this should be considered “a coincidence”) 2^4. We stress that none of the Pascal standards makes any statement regarding internal memory structure. A high-level programming language is not concerned about how data is stored; it does not even know the notion of “bits”, “voltage high”/“voltage low”.
This concept exists in many other programming languages too. In the programming language C, for instance, it is called a union.
Conditional loops
So far we have been exclusively using counting loops. This is great if you can predict in advance the number of iterations, how many times the loop’s body needs to be executed. Yet every so often it is not possible to formulate a proper expression determining the number of iterations in advance.
Conditional loops allow you to make the execution of the next iteration dependent on a Boolean expression. They come in two flavors:
- Head-controlled loop, and
- tail-controlled loop.
The difference is, the loop’s body of a tail-controlled loop is executed at least once in any case, whereas a head-controlled loop might never execute the loop body at all. In either case, a condition is evaluated over and over again and determines whether the loop continues.
Head-controlled loop
A head-controlled loop is frequently called while-loop because of its syntax.
program characterCount(input, output);
type
integerNonNegative = 0..maxInt;
var
c: char;
n: integerNonNegative;
begin
n := 0;
while not EOF do
begin
read(c);
n := n + 1;
end;
writeLn('There are ', n:1, ' characters.');
end.
$ cat ./characterCount.pas | ./characterCount
There are 240 characters.
$ printf '' '' | ./characterCount
There are 0 characters.
The loop’s condition is a Boolean expression framed by the words while and do. The condition must evaluate to true for any (subsequent) iteration to occur. As you can see from the output, in the second case, it may even be zero times: Evidently for empty input n := n + 1 was never executed.
EOF is shorthand for EOF(input). This standard function returns true if there is no further data available to read, commonly called end of file. It is illegal, and will horribly fail, to read from a file if the respective EOF function call returns true.
Unlike a counting loop, you are allowed to modify data the conditional loop’s condition depends on.
const
(* instead of a hard-coded length `64` *)
(* you can write `sizeOf(integer) * 8` in Delphi, FPC, GPC *)
wordWidth = 64;
type
integerNonNegative = 0..maxInt;
wordStringIndex = 1..wordWidth;
wordString = array[wordStringIndex] of char;
function binaryString(n: integerNonNegative): wordString;
var
(* temporary result *)
binary: wordString;
i: wordStringIndex;
begin
(* initialize `binary` with blanks *)
for i := 1 to wordWidth do
begin
binary[i] := ' ';
end;
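(* the control variable of a `for` loop is undefined after the loop, *)
(* so `i` is explicitly positioned at the rightmost character *)
i := wordWidth;
(* if n _is_ zero, the loop's body won't be executed *)
binary[i] := '0';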
(* reverse Horner's scheme *)
while n >= 1 do
begin
binary[i] := chr(ord('0') + n mod 2);
n := n div 2;
i := i - 1;
end;
binaryString := binary;
end;
The n the loop’s condition depends on will be repeatedly divided by two. Because the division operator is an integer division (div), at some point the value 1 will be divided by two and the arithmetically correct result 0.5 is truncated (trunc) toward zero. Yet the value 0 does not satisfy the loop’s condition anymore, thus there will not be any subsequent iterations.
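The same pattern, a value that shrinks in every iteration until the condition fails, can be tried out with a smaller program. This sketch is not part of the original text; it counts the decimal digits of a hard-coded number by dividing by ten instead of two.

```pascal
program digitCountDemo(output);
var
    n, digits: integer;
begin
    n := 2024;
    digits := 0;
    { 2024 -> 202 -> 20 -> 2 -> 0: the body runs four times }
    while n >= 1 do
    begin
        n := n div 10;
        digits := digits + 1;
    end;
    { prints 4 }
    writeLn(digits:1);
end.
```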
Tail-controlled loop
In a tail-controlled loop the condition appears below the loop’s body, at the foot. The loop’s body is always run once before the condition is evaluated at all.
program repeatDemo(input, output);
var
i: integer;
begin
repeat
begin
write('Enter a positive number: ');
readLn(i);
end
until i > 0;
writeLn('Wow! ', i:1, ' is a quite positive number.');
end.
The loop’s body is encapsulated by the keywords repeat and until.[fn 4] After until follows a Boolean expression. In contrast to a while loop, the tail-controlled loop always continues, always keeps going, until the specified condition becomes true. A true condition marks the end. In the above example the user will be prompted again and again until they eventually comply and enter a positive number.
Date and time
This section introduces you to features of Extended Pascal as defined in the ISO standard 10206. You will need an EP‑compliant compiler to use those features.
Time stamp
In EP there is a standard data type called timeStamp. It is declared as follows:[fn 5]
type
timeStamp = record
dateValid: Boolean;
timeValid: Boolean;
year: integer;
month: 1..12;
day: 1..31;
hour: 0..23;
minute: 0..59;
second: 0..59;
end;
As you can see from the declaration, timeStamp also contains data fields for a calendar date, not just the time as indicated by a standard clock.
Getting a time stamp
EP also defines a unary procedure that populates a timeStamp variable with values. GetTimeStamp assigns values to all members of a timeStamp record passed in the first (and only) parameter. These values represent the “current date” and “current time” as at the invocation of this procedure. However, in the 1980s not all (personal/home) computers had a built-in “real time” clock. Therefore, the ISO standard 10206, devised before the 21st century, states that the word “current” is “implementation-defined”. The dateValid and timeValid fields were specifically inserted to address the issue that some computers simply do not know the current date and/or time. When reading values from a timeStamp variable, it is still advisable to check their validity first after having getTimeStamp fill them out.
If getTimeStamp was unable to obtain a “valid” value, it will set
- day, month and year to a value representing January 1, 1 CE, but also dateValid to false.
- In the case of time, hour, minute and second all become 0, a value representing midnight. The timeValid field becomes false.
Both are independent from each other, so it may certainly be the case that just the time could be determined, but the date is invalid.
Note that the Gregorian calendar was introduced during the year 1582 CE, so the timeStamp data type is generally useless for any dates before 1583 CE.
Printable dates and times
Having obtained a timeStamp, EP furthermore supplies two unary functions:
- date returns a human-readable string representation of day, month and year, and
- time returns a human-readable string representation of hour, minute and second.
Both functions will fail and terminate the program if dateValid or timeValid indicate an invalid datum respectively. Note, the exact format of string representation is not defined by the ISO standard 10206.
program
:
program dateAndTimeFun(output);
var
ts: timeStamp;
begin
getTimeStamp(ts);
if ts.dateValid then
begin
writeLn('Today is ', date(ts), '.');
end;
if ts.timeValid then
begin
writeLn('Now it is ', time(ts), '.');
end;
end.
Today is 11 Nov 2024.
Now it is 04:16:42.
dateValid and timeValid are false.
Summary on loops
This is a good time to take inventory and reiterate all kinds of loops.
Conditional loops
Conditional loops are the tools of choice if you cannot predict the total number of iterations.
head-controlled loop | tail-controlled loop |
---|---|
while condition do begin … end; | repeat begin … end until condition; |
condition must evaluate to true for any (including subsequent) iterations to occur. | condition must be false for any subsequent iteration to occur. |
It is possible to formulate either loop as the other one, but usually one of them is more suitable. A tail-controlled loop is particularly suitable if you do not have any data yet to make a judgment, to evaluate a proper condition prior to the first iteration.
Counting loops
Counting loops are good if you can predict the total number of iterations before entering the loop.
counting up loop | counting down loop |
---|---|
for controlVariable := initialValue to finalValue do begin … end; | for controlVariable := initialValue downto finalValue do begin … end; |
After each non-final iteration controlVariable becomes succ(controlVariable). controlVariable must be less than or equal to finalValue for another iteration to occur. | After each non-final iteration controlVariable becomes pred(controlVariable). controlVariable must be greater than or equal to finalValue for another iteration to occur. |
Both the initialValue and finalValue expressions are evaluated exactly once.[4] This is very different to conditional loops.
Inside counting loops’ bodies you cannot modify the counting variable, only read it. This prevents you from any accidental manipulations and ensures the calculated predicted total number of iterations will indeed occur.
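That the final value is evaluated only once can be observed directly. In the following sketch (the variable names are chosen freely), the variable used as the final value is changed inside the loop body, which is allowed because it is not the counting variable. The number of iterations nevertheless stays at three.

```pascal
program forEvaluationDemo(output);
var
    i, limit: integer;
begin
    limit := 3;
    { `limit` is read exactly once, before the first iteration }
    for i := 1 to limit do
    begin
        { this change has no influence on the number of iterations }
        limit := limit + 10;
        writeLn('iteration ', i:1, ', limit is now ', limit:1);
    end;
end.
```

A while loop with the condition i <= limit would behave differently here, because its condition is evaluated anew before every iteration.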
Loops on aggregations
If you are using an EP-compliant compiler, you furthermore have the option to use a for … in loop on sets.
program forInDemo(output);
type
characters = set of char;
var
c: char;
parties: characters;
begin
parties := ['R', 'D'];
for c in parties do
begin
write(c:2);
end;
writeLn;
end.
Tasks
You have made it this far, and it is quite impressive how much you already know. Since this chapter’s concept of a record should not be too difficult to grasp, the following exercises mainly focus on training. A professional computer programmer spends most of their time thinking about what kind of implementation, using which tools (e. g. array “vs.” set), is the most useful/reasonable. You are encouraged to think first, before you even start typing anything. Nonetheless, sometimes (esp. due to your lack of experience) you need to just try things out, which is fine if it is intentional. Aimlessly searching for a solution does not distinguish an actual programmer.
record contain another record?
array to contain another array, this is quite possible for a record too. Write a test program to see for yourself. The important thing is to note that the dot-notation can be expanded indefinitely (myRecordVariable.topRecordFieldName.nestedRecordFieldName.doubleNestedRecordFieldName). Evidently at some point it becomes too difficult to read so use this wisely.
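Such a test program might look like the following sketch. The calendarDate and person types and their field names are made up for this purpose.

```pascal
program nestedRecordDemo(output);
type
    { hypothetical types for illustration }
    calendarDate = record
            year: integer;
            month: 1..12;
            day: 1..31;
        end;
    person = record
            { a record field whose data type is another record }
            birthday: calendarDate;
            shoeSize: integer;
        end;
var
    someone: person;
begin
    someone.shoeSize := 42;
    { the dot notation is simply extended by one more level }
    someone.birthday.year := 1934;
    someone.birthday.month := 2;
    someone.birthday.day := 15;
    { `birthday` is looked up in `someone`, then its fields are in scope }
    with someone, birthday do
    begin
        writeLn(year:1, '-', month:1, '-', day:1);
    end;
end.
```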
while true do
begin
…
end;
The condition needs to be negated in a repeat … until loop:
repeat
begin
…
end
until false;
true, or expressions that can never be fulfilled (in the case of a repeat … until loop), are not. For instance, given that i was an integer the loop while i <= maxInt do will run indefinitely, because i can never exceed maxInt[fn 6] and thus break the loop’s condition. Therefore be reminded to carefully formulate expressions for conditional loops and ensure they will eventually reach a terminating state. Otherwise it can be frustrating for the user of your program.
while
-loop:
repeat
begin
imagineJumpingSheep;
sheepCount := sheepCount + 1;
waitTwoSeconds;
end
until asleep;
while
-loop even begins:
imagineJumpingSheep;
sheepCount := sheepCount + 1;
waitTwoSeconds;
while not asleep do
begin
imagineJumpingSheep;
sheepCount := sheepCount + 1;
waitTwoSeconds;
end;
repeat … until
-loop is more suitable in this case.
program that takes the output of the command getent passwd as input and only prints the first field/column of every line. In a file, fields are separated by a colon (:). Your program will list all known user names.
getent passwd | ./cut1 (the file name of your executable program may differ).
program cut1(input, output);
const
separator = ':';
var
line: string(80);
begin
while not EOF(input) do
begin
{ This reads the _complete_ line, but at most}
{ line.capacity characters are actually saved. }
readLn(line);
writeLn(line[1..index(line, separator)-1]);
end;
end.
index
will return the index of the colon character which you do not want to print, thus you will need to subtract 1
from its result. This program
will evidently fail if a line does not contain a colon.
program
so only user names whose UID is greater than or equal to 1000
. The UID is stored in the third field.
program cut2(input, output);
const
separator = ':';
minimumID = 1000;
var
line: string(80);
nameFinalCharacter: integer;
uid: integer;
begin
while not EOF do
begin
readLn(line);
nameFinalCharacter := index(line, separator) - 1;
{ username:encryptedpassword:usernumber:… }
{ ↑ `nameFinalCharacter + 1` }
{ ↑ `… + 2` is the index of the 1st password character }
uid := index(subStr(line, nameFinalCharacter + 2), separator);
{ Note that the preceding `index` did not operate on `line` }
{ but an altered/different/independent “copy” of it. }
{ This means, we’ll need to offset the returned index once again. }
readStr(subStr(line, nameFinalCharacter + 2 + uid), uid);
{ Read/readLn/readStr automatically terminate reading an integer }
{ number from the source if a non-digit character is encountered. }
{ (Preceding blanks/space characters are ignored and }
{ the _first_ character still may be a sign, that is `+` or `-`.)}
if uid >= minimumID then
begin
writeLn(line[1..nameFinalCharacter]);
end;
end;
end.
subStr
can be omitted effectively meaning “give me the rest of a string
.” Note that this programming task mimics (some of) the behavior of . Use programs/source code that has already been programmed for you whenever possible. Reinventing the wheel is not necessary. Nonetheless, this basic task is a good exercise. On a RHEL system you may rather want to set minimumID
to 500
.
program
meets all requirements. Note, an implementation using an array[1..limit] of Boolean
would have been perfectly fine as well, although the shown set of natural
implementation is in principle preferred.
program eratosthenes(output);
type
{ in Delphi or FPC you will need to write 1..255 }
natural = 1..4095;
{$setLimit 4096}{ only in GPC }
naturals = set of natural;
const
{ `high` is a Borland Pascal (BP) extension. }
{ It is available in Delphi, FPC and GPC. }
limit = high(natural);
{ Note: It is important that `primes` is declared }
{ in front of `sieve` and `list`, so both of these }
{ routines can access the _same_ variable. }
var
primes: naturals;
{ This procedure sieves the `primes` set. }
{ The `primes` set needs to be fully populated }
{ _before_ calling this routine. }
procedure sieve;
var
n: natural;
i: integer;
multiples: naturals;
begin
{ `1` is by definition not a prime number }
primes := primes - [1];
{ find the next non-crossed number }
for n := 2 to limit do
begin
if n in primes then
begin
multiples := [];
{ We do _not_ want to remove 1 * n. }
i := 2 * n;
while i in [n..limit] do
begin
multiples := multiples + [i];
i := i + n;
end;
primes := primes - multiples;
end;
end;
end;
{ This procedure lists all numbers in `primes` }
{ and enumerates them. }
procedure list;
var
count, n: natural;
begin
count := 1;
for n := 2 to limit do
begin
if n in primes then
begin
writeLn(count:8, '.:', n:22);
count := count + 1;
end;
end;
end;
{ === MAIN program === }
begin
primes := [1..limit];
sieve;
list;
end.
sieve
task from the list
task, both routine definitions and the main part of the program
at the bottom remain quite short and are thus easier to understand.
program
that reads an arbitrary, not predetermined number of numerical values from input
and at the end prints on output
the arithmetic mean.
program arithmeticMean(input, output);
type
integerNonNegative = 0..maxInt;
var
i, sum: real;
count: integerNonNegative;
begin
sum := 0.0;
count := 0;
while not eof(input) do
begin
readLn(i);
sum := sum + i;
count := count + 1;
end;
{ count > 0: do not do division by zero. }
if count > 0 then
begin
writeLn(sum / count);
end;
end.
Note that using a data type excluding negative numbers (here we named it integerNonNegative) mitigates the issue that count may flip the sign, a condition known as overflow. This would cause the program to fail if count := count + 1 became too large and effectively fell out of the range 0..maxInt.
maxReal
, no programmatic way to tell that sum
became too large or too small rendering it severely inaccurate, because any value of sum
may be legit nevertheless.
time function
that returns a string
in the “American” time format 9:04 PM
. This may look easy at first, but it can become quite a challenge. Have fun!
time
itself. However, the output of time
itself is not standardized, so we will need to define everything by ourselves:
type
timePrint = string(8);
function timeAmerican(ts: timeStamp): timePrint;
const
hourMinuteSeparator = ':';
anteMeridiemAbbreviation = 'AM';
postMeridiemAbbreviation = 'PM';
type
noonRelation = (beforeNoon, afterNoon);
letterPair = string(2);
var
{ contains 'AM' and 'PM' accessible via an index }
m: array[noonRelation] of letterPair;
{ contains a leading zero accessible via a Boolean expression }
z: array[Boolean] of letterPair;
{ holds temporary result }
t: timePrint;
begin
{ fill `t` with spaces }
writeStr(t, '':t.capacity);
This fallback value (in the case ts.timeValid is false) allows the programmer/“user” of this function to “blindly” print its return value. There will be a noticeable gap in the output. Another sensible “fallback” value would be an empty string.
with ts do
begin
if timeValid then
begin
m[beforeNoon] := anteMeridiemAbbreviation;
m[afterNoon] := postMeridiemAbbreviation;
z[false] := '';
z[true] := '0';
writeStr(t,
((hour + 12 * ord(hour = 0) - 12 * ord(hour > 12)) mod 13):1,
hourMinuteSeparator,
z[minute < 10], minute:1, ' ',
m[succ(beforeNoon, hour div 12)]);
This is the most complicated part of this problem. First of all, all number parameters to writeStr are explicitly suffixed with :1 as the minimum-width specification, because there are some compilers that would otherwise assume, for example, :20 as a default value. Since we know that timeStamp.hour is in the range 0..23 we can use the div and mod operations as demonstrated. However, we will need to take account of an hour value of 0, which is usually denoted as 12:00 AM (and not zero). A conditional “shift” by 12 using the shown Boolean expression and ord “fixes” this. Furthermore, here is a brief reminder that in EP the succ function accepts a second parameter.
end;
end;
timeAmerican := t;
end;
Sources:
- ↑ Wirth, Niklaus (1979). "The Module: a system structuring facility in high-level programming languages". proceedings of the symposium on language design and programming methodology. Berlin, Heidelberg: Springer. Abstract. doi:10.1007/3-540-09745-7_1. ISBN 978-3-540-09745-7. https://link.springer.com/content/pdf/10.1007%2F3-540-09745-7_1.pdf. Retrieved 2021-10-26.
- ↑ Cooper, Doug. "Chapter 11. The record Type". Oh! Pascal! (third ed.). p. 374. ISBN 0-393-96077-3. […] records have two unique aspects:
First, the stored values can have different types. This makes records potentially heterogeneous—composed of values of different kinds. Arrays, in contrast, hold values of just one type, so they're said to be homogeneous.
[…]{{cite book}}
:|edition=
has extra text (help); line feed character in|quote=
at position 269 (help); syntaxhighlight stripmarker in|chapter=
at position 17 (help) - ↑ Wirth, Niklaus (1973-07-00). The Programming Language Pascal (Revised Report ed.). p. 30.
Within the component statement of the with statement, the components (fields) of the record variable specified by the with clause can be denoted by their field identifier only, i.e. without preceding them with the denotation of the entire record variable.
- ↑ Jensen, Kathleen; Wirth, Niklaus. Pascal – user manual and report (4th revised ed.). p. 39. doi:10.1007/978-1-4612-4450-9. ISBN 978-0-387-97649-5.
The initial and final values are evaluated only once.
Notes:
- ↑ This kind of record will not be able to store anything. In the next chapter you will learn the one (and only) instance where it could be useful.
- ↑ Indeed, most compilers consider the dot as a dereferencing indicator, and the field name denotes a static offset from a base memory address.
- ↑ In Standard (“unextended”) Pascal, ISO standard 7185, a function can only return “simple data type” and “pointer data type” values.
- ↑ Actually the shown begin … end is redundant since repeat … until constitute a frame in their own right. For pedagogical reasons we teach you to always use begin … end nevertheless wherever a sequence of statements usually appears. Otherwise you might change your repeat … until loop to a while … do loop, forgetting to surround the loop’s body statements with a proper begin … end frame.
- ↑ The packed designation has been omitted for simplicity.
- ↑ According to most compilers’ definition of maxInt. The ISO standards merely require that all arithmetic operations in the interval -maxInt..maxInt work absolutely correctly, but it is conceivable (although unlikely) that more values are supported.
Pointers
The new data type presented in this chapter adds another layer of abstraction to your repertoire: Pointers are by far the most complicated data type. If you master them, you have got what it takes to tackle even the supreme discipline of assembly programming. So, let’s get started!
Indirection
In Pascal there are two kinds of variables.
- So far we have been using static variables. They exist during the entire execution of a block, e. g. while a program is running or just during execution of a routine.
- There is another kind called dynamic variables.[fn 1] They do not necessarily “exist” during the entire block. That means there is no static memory allocated, but the used memory space varies each time the program runs.
When using static variables, the compiler[fn 2] knows in advance which memory chunk will be used.[fn 3] Dynamic variables, however, are, as their name suggests, dynamic in that they will occupy different, unpredictable memory segments.
Memory is referred to by addresses.
An address is, in CS, simply a number, an integer
value so to speak.[fn 4]
When you want to refer to a certain memory block, you use its address.
A value of a pointer data type stores an address. This address can then be used to access the memory it refers to. A pointer, however, is just that: it points, but it makes no statement as to “whom”, that is to which variable, this block of memory “belongs.”
Declaration
In Pascal, a pointer data type declaration starts with a ↑
(up arrow), or alternatively and more frequently the ^
(caret) character, followed by the name of a data type.
program pointerDemo(input, output);
type
  charReference = ^char;
A variable of this pointer data type can point to a single char
value (and no other data type).
In Pascal all pointer data types have to indicate the data type of the value the pointer is referring to.
This is because a pointer alone is just an address:
An address merely points to the start of a memory block.
There is no statement with respect to this block’s size, its length.
The domain restriction, the specification of targeted value’s data type, tells the compiler “how large” a memory block will be, and in consequence how to properly read and write, how to access it.
A pointer data type is the only data type that can refer to a data type that has not been declared yet. Below you will see a usage scenario, but first let’s continue with the basics.
Allocating memory
When you are declaring a variable in the var
‑section, you are declaring a static variable.
In the following code fragment c
is a static variable, thus its memory location is already known.
var
  c: charReference;
begin
  { artificially stall the program without breakpoints }
  readLn;
At this point, there is no memory space allotted to a char
value yet.
There is already space to store a pointer value, the address of a char
value, but we do not yet have any space in which to store the char
value itself.
In Pascal you will first need to invoke the procedure new
to get memory assigned to your program
.
New
takes one pointer variable as an argument and will reserve enough memory space to hold one value of the pointer’s domain.
  new(c);
After this operation
- you occupy additional memory for, in this case, one char value the program previously did not “own”, and
- c, the pointer variable itself, will give us the address of this newly allocated memory.
As with all variables of any kind, the memory space we have acquired now is totally undefined (uninitialized).
Dereferencing
To use the memory we just gained we will have to follow a pointer.
This is done by appending ↑
(or usually ^
) to the name of the pointer variable.
  c^ := 'X';
  writeLn(c^);
This action is called dereferencing.
The pointer is a (kind of) reference to the underlying char
value.
This char
value does not have a name, but you use the pointer to access it anyway.
On this dereferenced variable we can perform all operations permissible on the pointer’s domain data type.
I. e. here we are allowed to assign a char
value 'X'
to it, and then use it, for instance, in a writeLn
as demonstrated above.
Note that something like c := 'X'
will not work, because in this case c
simply refers to the pointer, the address storage.
- The expression c has the data type charReference.
- The expression c^ has the data type char.
In Pascal it is forbidden to directly assign addresses to pointers, other than by using new
.
For the special case of nil
, see below.
Releasing memory
After invoking new
the respective memory is exclusively reserved for your program
.
This memory management occurs outside of your program
.
It is a typical task of the respective OS.
To reverse the operation of new
, there is a dedicated procedure
“unreserving” memory: Dispose
.
  readLn;
  dispose(c);
  readLn;
end.
Dispose
takes a pointer variable and releases the memory previously allocated with new
.
After a dispose
you may no longer follow, that is dereference, the pointer.
Nevertheless the pointer itself still stores the address where, in this case, the referenced char
value was.
Meanwhile, the “freed” memory may be used again for something or by someone else.
Lifetime
In Pascal, memory of dynamic variables remains reserved
- as long as it is accessible, that means at least one pointer must point to it, or
- until you specifically request to “unreserve” memory.
If a chunk of memory is rendered inaccessible by some operation, it is automatically released.
This can happen implicitly:
In the program
above the pointer variable c
is “gone” upon program
termination.
Because this variable is/was the only pointer (left) pointing to our previously reserved char
value, there is an automatic “invisible” dispose
.
In that respect, the explicit dispose
on our part was not necessary.
However, unfortunately not all compilers comply with this specification as laid out in the Pascal ISO standards.
For instance, Delphi, as well as the FPC (even in its {$mode ISO}
compatibility mode, as of version 3.2.0) will not issue an automatic dispose
.
There, an explicit dispose
is necessary.[fn 5]
Rest assured, using the GPC it is not necessary though; the GPC fully complies with the ISO standard 7185 level 1.
Note that memory accessibility is transitive: This means that, for instance, a pointer pointing to a pointer pointing to the memory still satisfies the accessibility requirement.
Indication
The additional housekeeping of allocating and releasing memory may seem like quite a hassle, so when does that make sense?
- All variables declared in a var-section need to indicate their size in advance. For some applications, however, you do not know how much data you will need to store and process. Pointers are a means to overcome this limitation. Further below we will explore how.
- Pointer values can be used to represent graphs, networks, of data, allowing you to put everything into relation with each other. This means you do not need to store the same datum multiple times. A pointer value is usually, with respect to its memory requirements, a comparatively small data type. Handling pointers trades lower memory space demand for increased complexity.
Furthermore, pointer values are frequently used to implement variable parameters of routines:
Due to its smaller size passing a single pointer value can be faster than passing, that means copying, for instance an entire array
.
This kind of use of pointers is completely transparent.
Pascal equips you with an adequate language construct;
you will learn more about variable parameters in the chapter on scopes.
Links
Nil pointers
All pointers can be assigned a literal value nil
.
The nil
pointer value represents the notion “not pointing anywhere in particular.”
Coincidentally, nil
is the only pointer value that could be used for a pointer literal.
const
  nowhere = nil;
There is no other pointer value that you could possibly specify anywhere in your source code.
This also means you cannot explicitly compare a pointer against any specific pointer value except nil
.
Note that nil
is fundamentally different from an uninitialized variable.
You are allowed to read the value of a pointer that has been assigned the value nil
, but you are still forbidden to attempt reading the value of a variable that has not been assigned any value at all.
Attempting to dereference a pointer that currently possesses the value nil constitutes a fatal error.
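A minimal sketch of how nil is typically used as a guard. The program name nilGuardDemo, the type integerReference and the value 1968 are merely illustrative and do not appear elsewhere in this chapter:

```pascal
program nilGuardDemo(output);
type
  integerReference = ^integer;
var
  p: integerReference;
begin
  { Give the pointer a defined value before it is ever inspected. }
  p := nil;
  if p = nil then
  begin
    writeLn('p does not point anywhere yet');
  end;
  new(p);
  p^ := 1968;
  { Guard the dereference: only follow the pointer if it is not nil. }
  if p <> nil then
  begin
    writeLn(p^:1);
  end;
  dispose(p);
  { The old address must not be followed anymore, so record that fact. }
  p := nil;
end.
```

Resetting a pointer to nil after dispose is a convention, not a requirement; it merely makes a later test such as p <> nil meaningful.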
Permissible operators
In the introduction we used an analogy comparing pointers to integer
values.
However, this is really just that: an analogy.
Unlike integer
values, pointers are by no means “ordered”; they do not belong to the class of ordinal data types.
There is no ord
, succ
, pred
defined for a pointer. Ordering comparison operators like <
or >=
do not apply to pointers either, and every arithmetic operator is invalid in combination with a pointer value.
The only operators applicable to pointers are[fn 6]
- =, do two pointer values refer to the same address,
- <>, do two pointer values refer to different addresses, and
- :=, the assignment of a pointer value, either nil or the value of an already defined pointer variable of the same data type, to a pointer variable.
It may seem at first like quite a restriction, but it prevents you from doing potentially harmful, or even just stupid stuff.
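The three operators can be seen together in the following sketch. All identifiers (pointerOperatorDemo, integerReference, a, b) are chosen for illustration only:

```pascal
program pointerOperatorDemo(output);
type
  integerReference = ^integer;
var
  a, b: integerReference;
begin
  new(a);
  a^ := 1;
  { := copies the address, not the integer: a and b now share one value. }
  b := a;
  if a = b then
  begin
    writeLn('a and b refer to the same integer');
  end;
  { A change made through b is therefore visible through a as well. }
  b^ := 2;
  writeLn(a^:1);
  { A second new gives b an integer of its own; a keeps the first one. }
  new(b);
  b^ := a^;
  if a <> b then
  begin
    writeLn('a and b refer to different integers');
  end;
  { Comparing the dereferenced values is an ordinary integer comparison. }
  if a^ = b^ then
  begin
    writeLn('but both integers hold the same value');
  end;
  dispose(a);
  dispose(b);
end.
```

Note the difference between a = b, which compares addresses, and a^ = b^, which compares the values stored at those addresses.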
Chicken or egg
Pointers are the only data type that can be declared using a data type yet to be declared.[fn 7]
This circumstance makes it possible to declare data types containing pointers, possibly to the data type being declared at hand or other yet to be declared data types.
This is possible because a pointer to foo
has the same memory requirements as a pointer to bar
or any other data type.
The domain restriction of a pointer is not (necessarily/explicitly) stored in the program
.
In the following code fragment numberListItem
is not yet declared, but you are still allowed to declare a new pointer data type with it anyway:
program listDemo(input, output);
type
  numberListItemReference = ^numberListItem;
  numberListItem = record
      value: real;
      nextItemLocation: numberListItemReference;
    end;
Yet you cannot reverse the order of the declarations of numberListItemReference
and numberListItem
;
the compiler cannot magically conclude nextItemLocation
is a pointer until it has actually seen/read the respective declaration.
Putting things together
Now we can use this data structure to dynamically store a series of numbers. Pay attention to when the pointer is dereferenced in the following code:
var
  numberListStart: numberListItemReference;
begin
  new(numberListStart);
  readLn(numberListStart^.value);
  new(numberListStart^.nextItemLocation);
  readLn(numberListStart^.nextItemLocation^.value);
  dispose(numberListStart^.nextItemLocation);
  dispose(numberListStart);
end.
The entire program
contains one static variable.
Only the variable numberListStart
was declared by you.
During run-time, however, while the program
is running you will have at one point two additional real
values at your disposal.
Take notice of this example’s order of dispose
statements:
The supplied pointer variable must be valid, so a reverse order would not be possible in this specific case here.
Concededly, this example could have been better implemented by simply declaring two real
variables.
The true power of pointers becomes apparent when you, unlike in the above code, use pointers as a means of abstraction.
This chapter’s exercises will delve into that.
Routines
In particular, let’s first explore a special kind of pointer: Routine parameters, that is functional and procedural parameters, are parameters of routines that allow you to statically modify the routine’s behavior by virtually passing the address of another routine. Let’s see how this works.
Declaration and use
In the formal parameter list of a routine you can declare a parameter that looks just like a routine signature:
program routineParameter(output);
procedure fancyPrint(function f: integer);
begin
  writeLn('❧ ', f:1, ' ☙')
end;
Inside the definition of fancyPrint
you can use the parameter f
as if it was a regular function
declared and defined before and outside of fancyPrint
.
However, at this point it is not known what function will be used.
The actual parameter f
is in fact a pointer.[fn 8]
We only know that this pointer’s “domain” is any function
without parameters and returning an integer
value, but this is already all we need to know.
One routine fits it all
To call this kind of routine you will need to specify an appropriate routine designator that matches the signature with regard to order, number and data types of parameters and, if applicable, the returned value’s data type.
function getRandom: integer;
begin
  { chosen by fair dice roll: guaranteed to be random }
  getRandom := 4
end;
function getAnswer: integer;
begin
  { the answer to the ultimate question of life, the universe and everything }
  getAnswer := 42
end;
begin
  fancyPrint(getRandom);
  fancyPrint(getAnswer)
end.
To supply a routine parameter value to a routine, simply name a compatible routine. Note that in this case you never specify any parameters, because you are not making a call here, but the called routine will do so “on behalf” of you. Specifying the routine’s name, and thus passing its address, is sufficient to achieve that.
Standard routines such as writeStr (EP) or sin cannot be used that way,[fn 9] because they are an integral part of the language. There is no (singular) routine definition for them.
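A routine parameter may itself have parameters. The following sketch is not part of the programs above; printTable, square and negate are hypothetical names. Depending on your compiler, you may need to enable an ISO compatibility mode before this parameter syntax is accepted.

```pascal
program routineParameterTable(output);

function square(x: integer): integer;
begin
  square := x * x;
end;

function negate(x: integer): integer;
begin
  negate := -x;
end;

{ Calls whatever function is passed as f for the arguments 1 to count. }
procedure printTable(function f(x: integer): integer; count: integer);
var
  i: integer;
begin
  for i := 1 to count do
  begin
    writeLn(i:1, ' -> ', f(i):1);
  end;
end;

begin
  { Only the routine names are passed; printTable supplies the arguments. }
  printTable(square, 3);
  printTable(negate, 3);
end.
```

The first call prints the squares 1, 4 and 9, the second call the values -1, -2 and -3, although printTable itself was written only once.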
Caveats
For a beginner, pointers are difficult to tame. Without experience, you will frequently observe (for you) “unexpected” behaviors. Some pitfalls are presented here.
with-clause
Special care must be taken when using pointers in conjunction with a with
-clause.
The expressions listed at the top of a with
-clause are evaluated once before executing any following statement.
During the entire with
-statement the expressions using the “short” notation will actually use an invisible transient value.
This speeds up execution, because the same value is not evaluated over and over again, but there is also a caveat in it.
Surprisingly, the long notation using an FQI can become invalid, while the short notation at first seems to be still valid.
The following program
demonstrates the issue:
program withDemo(output);
type
  foo = record
      magnitude: integer;
    end;
  fooReference = ^foo;
var
  bar: fooReference;
begin
  new(bar);
  bar^.magnitude := 42;
  with bar^ do
  begin
    dispose(bar);
    bar := nil;
    { Here, bar^.magnitude would fail horribly, }
    { but you can still do the following: }
    writeLn(magnitude);
  end;
end.
When you compile and run this program,
- you will notice that it prints anything but 42, but
- it should be rather astonishing that it still prints anything at all.
The writeLn(magnitude)
does actually use a “hidden (pointer) variable” and not bar
.
This variable’s value was evaluated one time at the top of the with
-clause.
The compiler does not (and cannot) complain that bar
meanwhile became invalid.
You are not making any assignments to the actually utilized hidden variable (i. e. it is still considered bearing a valid value), thus there is no reason for complaints.
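A defensive pattern, sketched here under the assumption that only the field’s value is needed afterwards: copy the value into an ordinary variable (the names withSafeDemo and savedMagnitude are hypothetical) and release the memory only after the with-statement has ended.

```pascal
program withSafeDemo(output);
type
  foo = record
      magnitude: integer;
    end;
  fooReference = ^foo;
var
  bar: fooReference;
  savedMagnitude: integer;
begin
  new(bar);
  bar^.magnitude := 42;
  with bar^ do
  begin
    { Copy what is still needed while the record is guaranteed to exist. }
    savedMagnitude := magnitude;
  end;
  { No with-statement refers to bar^ anymore, so releasing it is safe. }
  dispose(bar);
  bar := nil;
  writeLn(savedMagnitude:1);
end.
```

This variant prints 42, because the value is read before the dynamic variable is released.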
Limits
This section primarily concerns users of Delphi and the FPC, as well as possibly some other compilers. Users of the GPC could skip this section, but understanding the theory is encouraged.
Memory is not an infinite resource. This has some grave implications.
Most OSs try their best to fulfill the processes’ requests. Using a non-ISO-compliant compiler, the following program is doomed to fail though:
program oomDemo;
var
  p: ^integer;
begin
  while true do
  begin
    { every iteration reserves another integer and never releases it }
    new(p);
  end;
end.
Although this program overwrites the previous pointer value, thus rendering the previously associated integer value inaccessible, the now inaccessible memory is still exclusively reserved for your program. Depending on OS internals and also the compiler used to compile your program, your computer will eventually freeze (become unresponsive to any input) or (on a robust OS) the OS will just kill your program (jargon for terminating it immediately without giving it any chance to fix the problem) and reclaim the once reserved but never released memory.
There is no means to check whether any subsequent new
will exhaust the finite resource memory.
On multi-tasking OSs it is possible that between the time you query the amount of free memory space and the time you actually request additional memory, another program
running at the same time has acquired memory so there is none, or not enough left for you.
This kind of situation is known as time-of-check to time-of-use.
You simply have to ask for more memory in a make-or-break manner.
This issue is of rather theoretical concern for the scope of this textbook. A standard desktop computer manufactured in the 21st century or later will not run out of memory for any programming exercise given here. This does not mean you may waste memory.
Do not hoard memory: To mitigate potential OOM conditions, it is generally sensible to dispose memory as soon as you are certain it will not be used anymore.
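As a sketch of that advice, the following variation on oomDemo releases every integer right after its last use, so at most one dynamic variable exists at any time. The name noLeakDemo and the loop bound of 100 are arbitrary choices for illustration.

```pascal
program noLeakDemo(output);
type
  integerReference = ^integer;
var
  p: integerReference;
  i: integer;
  sum: integer;
begin
  sum := 0;
  for i := 1 to 100 do
  begin
    new(p);
    p^ := i;
    sum := sum + p^;
    { Release the integer before the next new(p) overwrites the address. }
    dispose(p);
  end;
  { 1 + 2 + ... + 100 }
  writeLn(sum:1);
end.
```

Because each new is paired with a dispose inside the same iteration, the memory demand of this loop stays constant no matter how often it runs.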
Tasks
Modify the program listDemo
so it accepts an unknown number of items. The program
should print the total number of items first and then a list of items.
function readNumber: numberListItemReference;
var
  result: numberListItemReference;
begin
  new(result);
  with result^ do
  begin
    readLn(value);
    nextItemLocation := nil;
  end;
  readNumber := result;
end;

{ === MAIN ============================================================= }
var
  numberListRoot: numberListItemReference;
  currentNumberListItem: numberListItemReference;
  numberListLength: integer;
begin
  writeLn('Enter numbers and finish by abandoning input:');

  { input - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - }
  numberListRoot := readNumber;
  numberListLength := 1;
  currentNumberListItem := numberListRoot;
  while not EOF(input) do
  begin
    with currentNumberListItem^ do
    begin
      nextItemLocation := readNumber;
      currentNumberListItem := nextItemLocation;
    end;
    numberListLength := numberListLength + 1;
  end;

  { output - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - }
  writeLn('You’ve entered ', numberListLength:1, ' numbers as follows:');
  currentNumberListItem := numberListRoot;
  while currentNumberListItem <> nil do
  begin
    with currentNumberListItem^ do
    begin
      writeLn(value);
      currentNumberListItem := nextItemLocation;
    end;
  end;

  { release memory - - - - - - - - - - - - - - - - - - - - - - - - - - - }
  while numberListRoot <> nil do
  begin
    { Remember the successor _before_ calling dispose(…): afterwards the
      record, including its nextItemLocation field, must not be accessed
      anymore, neither via the pointer nor from within a with-clause. }
    currentNumberListItem := numberListRoot;
    numberListRoot := numberListRoot^.nextItemLocation;
    dispose(currentNumberListItem);
  end;
end.
Write a procedure
that accepts a real function
and graphs its function values similar to this:
                                        *
                                               *
                                                       *
                                                             *
                                                                   *
                                                                        *
                                                                            *
                                                                              *
                                                                               *
                                                                              *
                                                                            *
                                                                        *
                                                                    *
                                                              *
                                                       *
                                               *
                                        *
                                *
                         *
                  *
            *
       *
   *
 *
*
 *
   *
      *
           *
                 *
                        *
                               *
                                       *
For that complete the following procedure
:
program graphPlots(output);
const
  lineWidth = 80;
procedure plot(
    function f(x: real): real;
    xMinimum: real; xMaximum: real; xDelta: real;
    yMinimum: real; yMaximum: real
  );
  {
    this is the part you are supposed to implement
  }
function wave(x: real): real;
begin
  wave := sin(x);
end;
begin
  plot(wave, 0.0, 6.283, 0.196, -1.0, 1.0);
end.
An implementation of plot
could look like this:
procedure plot(
    function f(x: real): real;
    xMinimum: real; xMaximum: real; xDelta: real;
    yMinimum: real; yMaximum: real
  );
var
  x: real;
  y: real;
  column: 0..lineWidth;
begin
  x := xMinimum;
  while x < xMaximum do
  begin
    y := f(x);
    { always reset `column` in lieu of doing that in an `else` branch }
    column := 0;
    { is function value within window? }
    if (y >= yMinimum) and (y <= yMaximum) then
    begin
      { move everything toward zero }
      y := y - yMinimum;
      { scale [yMinimum, yMaximum] range to [0..79] range }
      y := y * (lineWidth - 1) / (yMaximum - yMinimum);
      { convert to integer }
      column := round(y) + 1;
    end;
The following use of write
/writeLn
is actually an EP extension. In Standard Pascal as laid out in ISO standard 7185 all format specifiers need to be positive integer
values. Extended Pascal also allows a zero value. While for printing integer
values the width specifier still indicates a minimum width, for char
and string
values it means the exact width. Thus the following can print a blank line when column
is zero, i. e. when the function value is outside of the window.
    writeLn('*':column);
It should be an easy feat for you to adapt the writeLn
line should your compiler not support this EP extension.
    x := x + xDelta;
  end;
end;
By implementing plot
in such a generic way, i. e. accepting a functional parameter, you can reuse it for any other function you wish.
Notes:
- ↑ The Pascal ISO standards call this idea identified variables.
- ↑ For the sake of simplicity we say this was the compiler’s task. Usually it is rather the task of a linker, the link editor, which determines and substitutes specific addresses.
- ↑ Actually the compiler does not know which (physical) memory will be used, but another abstraction layer called virtual memory, administered by the OS, permits us to think that way.
- ↑ This is an analogy for explanation purposes. The range of integer values does not necessarily correspond to permissible pointer values (i. e. addresses). For instance on x32 targets pointers have 32 significant bits, but an integer value occupies 64 bits.
- ↑ Failing to release memory, a defect known as a memory leak, will probably go unnoticed at first. Your program will compile and run without a proper dispose. However, eventually the finite resource “memory” will be exhausted. If there is not sufficient memory available, any new will fail and terminate the program immediately.
- ↑ Some manuals call the ↑/^ an “operator”. This language, however, is imprecise. The ↑ does not alter the state of your program, do an operation, but merely instructs the compiler to treat an identifier differently than it would without the arrow’s presence.
- ↑ The declaration of the pointer data type and the referenced data type must occur in the same scope, in the same block, in other words in one and the same type-section.
- ↑ This is an implementation detail that is not specified by the ISO standards, although in fact most compilers will implement this as a pointer.
- ↑ Some compilers do not have this restriction, yet the ISO standards require an “activation”, which simply does not happen for standard routines.
Files
Ever wondered how to process bulks of data? Files are the solution in Pascal. You were already acquainted with some basics in the input and output chapter. Here we will elaborate more details as far as the ISO standard 7185 “Pascal” defines them. The “Extended Pascal” ISO standard 10206 defines even more features, but these will be covered in the second part of this WikiBook.
File data types
So far we have been only handling text files, i. e. files possessing the data type text
, but there are more file types.
Concept
Mathematically speaking, a file is a bounded finite sequence. That means,
- components are oriented along an axis (sequence),
- component values are chosen from one domain (bounded), and
- there is a certain number of components present (finite).
To put this in fancy math symbols:
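One way to write this down, with symbol names chosen here purely for illustration (F for the file, D for the domain, n for the number of components):

```latex
F = (c_1, c_2, \ldots, c_n), \qquad c_i \in D \ \text{for all}\ 1 \le i \le n, \qquad n \in \mathbb{N}_0
```

An empty file corresponds to the case n = 0.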
Declaration
In Pascal we can declare file data types by specifying file of recordType
, where recordType
needs to be a valid record data type.
A permissible record data type can be any data type, except another file data type (including text
) or a data type containing such.
That means an array
of file data types, or a record
having a file
as a component is not permitted.
Let’s see an example:
program fileDemo(output);
type
  integerFile = file of integer;
With a variable of the data type integerFile
we can access a file containing only one kind of data, integer
values (the domain restriction).
var
  temperatures: integerFile;
  i: integer;
Note, the variable temperatures
is not a file by itself.
This Pascal variable merely provides us with an abstract “handle”, something that permits us, the program
, to get a hold of the actual file (as described in § Concept).
Modes
All files have a current mode. Upon declaration of a file variable, this mode is, as usual, undefined. In Standard Pascal as defined by the ISO standard 7185 you can choose from either generation or inspection mode.
Generation mode
In order to write to a file you will need to call the standard built-in procedure
named rewrite
.
Rewrite
will attempt opening a file for writing from the start.
begin
  rewrite(temperatures);
The file
immediately becomes empty, hence its name rewrite.
Extended Pascal also has the non-destructive procedure extend
.
Only after a file has been successfully opened for writing do all write routines become legal. Attempting to write to a file that has not been opened for writing will constitute a fatal error.
  write(temperatures, 70);
  write(temperatures, 74);
All parameters to write
after the destination
(here temperatures
) have to be of the destination
file’s recordType
.
There must be at least one.
Only if the destination
is a text
file, various built-in data types are permitted.
Note that the procedure(s) writeLn
(and readLn
) can only be applied to text
files.
Other files do not “know” the notion of lines, therefore the …Ln
procedures cannot be applied to them.
Inspection mode
In order to read a file you will need to call the standard built-in procedure
named reset
.
Reset
will attempt opening a file for reading from the start.
  reset(temperatures);
  while not EOF(temperatures) do
  begin
    read(temperatures, i);
    writeLn(i);
  end;
end.
Note that after reset(temperatures)
you cannot write anything to that file anymore.
Modes are exclusive:
Either you are writing or reading.[fn 1]
Application
The main and most apparent “advantage” of a file
might be:
Unlike an array
we do not need to specify a size in advance, in our source code.
The file
can be as large as needed.
Yet an array
can be copied with a :=
assignment.
Entire files cannot be copied this way.
The main “disadvantage” of a file
might be:
Access is only sequential.
We have to start reading and writing a file
from the start.
If we want to have, say, the 94th record, we need to advance 93 times and also take account of the possibility that there might be fewer than 94 records available.[fn 2]
The words advantage and disadvantage were put between quotation marks, because a programming language cannot judge/rate what is “better” or “worse”. It is the programmer’s task to make the assessment. Files are especially suitable for I/O of unpredictable length, for instance user input.
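Reaching the 94th record, as mentioned above, could be sketched as follows. The names nthRecordDemo, readNth and squares are hypothetical, and the file is filled with the squares of 1 to 100 merely to have something to read.

```pascal
program nthRecordDemo(output);
type
  integerFile = file of integer;
var
  squares: integerFile;
  i: integer;
  wanted: integer;

{ Reads the n-th component of f, counting from 1, into value.        }
{ Returns false, leaving value meaningless, if f has fewer than n    }
{ components. Assumes n is at least 1.                               }
function readNth(var f: integerFile; n: integer; var value: integer): Boolean;
var
  position: integer;
begin
  reset(f);
  position := 1;
  { Sequential access: skip the n - 1 preceding components one by one. }
  while (position < n) and not EOF(f) do
  begin
    read(f, value);
    position := position + 1;
  end;
  if EOF(f) then
  begin
    readNth := false;
  end
  else
  begin
    read(f, value);
    readNth := true;
  end;
end;

begin
  rewrite(squares);
  for i := 1 to 100 do
  begin
    write(squares, i * i);
  end;
  if readNth(squares, 94, wanted) then
  begin
    writeLn(wanted:1);
  end
  else
  begin
    writeLn('fewer than 94 records');
  end;
end.
```

With 100 components in the file the program prints 8836, the square of 94. Asking for, say, the 101st record would take the else branch instead of reading past the EOF.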
Primitive routines
So far we have been using only read
/readLn
and write
/writeLn
.
These procedures are convenient and perfect for everyday use.
However, Pascal also gives you comparatively “low-level” access to files through get
and put
.
Buffer
Every file variable is associated with a buffer.
A buffer is a temporary storage space.
Everything you read from and write to a file
passes through this storage space before the actual read or write action is communicated to the OS.[fn 3]
Buffered I/O is chosen for performance reasons.
In Pascal we can access one, the “current” component of the buffer by appending ↑
to the variable name, just as if it was a pointer.
The data type of this dereferenced value is the recordType
as in our declaration.
So if we have
var
  foobar: file of Boolean;
the expression foobar↑
has the data type Boolean
.
To put everything into relation to each other let’s take a look at a diagram. This diagram is about understanding and shows a very specific situation. Focus on the relationships:
The upper part is in the purview of the OS.
The lower part is in the purview of the (our) program
.
The data of the file, here a sequence of 16 integer
values in total, are exclusively managed by the OS.
Any access of the data is done via the OS.
Directly reading or writing is not possible.
We ask the OS to copy the first 4 integer
data values for us into our buffer.
We do so, because copying 4 integers individually is slower than copying them all together in one go.[fn 4]
Sliding window
The three different storage locations – the actual data file, the internal buffer, and the buffer variable – work together in providing us a “view” of the file. If we overlay everything that contains the same information, we get the following image:
Here, the second quartet of integers was loaded into the internal buffer (green background). The file buffer points to the second component of the internal buffer. This is represented by a bluish hue over the sixth component of the entire file. Everything else is shaded, meaning we can view and manipulate only the sixth component.
Advancing the window
This sliding window can be advanced (in the rightwards direction, i. e. in the direction of EOF) with the routines get
and put
.
Both advance the file buffer to point to the next item in the internal buffer.
Once the internal buffer has been completely processed, the next batch of components is loaded or stored.
Calling get
is only legal while a file is in inspection mode; likewise, put
is only legal while a file is in generation mode.
Using the window
Get
and put
take one non-optional parameter, a file
(or text
) variable.
Put
takes the current contents of the buffer variable and ensures they are written to the actual file.
Let’s see this in action.
Consider the following program
:
program getPutDemo(output);
type
  realFile = file of real;
var
  score: realFile;
begin
The following table shows in the right-hand column the state of score
, the contents and where the sliding window is at (blue background).
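source code | state after successful operation
---|---
rewrite(score); | file contents: (empty); score^: undefined
score^ := 97.75; | file contents: (empty); score^: 97.75
put(score); | file contents: 97.75; score^: undefined
score^ := 98.38; | file contents: 97.75; score^: 98.38
put(score); | file contents: 97.75, 98.38; score^: undefined
score^ := 100.00; | file contents: 97.75, 98.38; score^: 100.00
{ For demonstration purposes: no `put(score)` here. } | (unchanged)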
Now let’s print the file score
we just filled with some real
values.
For a change we use get
.
Like read
/readLn
, get
is only allowed if not EOF
:
reset(score);
while not EOF(score) do
begin
writeLn(score^);
get(score);
end;
end.
Note that this prints just two real
values:
9.775000000000000E+01
9.838000000000000E+01
The third real
value, although defined, was never written to the file, because no corresponding put(score) followed the assignment.
Requirements
As mentioned above, get
may only be called when the specified file is in inspection mode, whereas put
may only be called when the file is in generation mode.
More specifically, calling get(F)
is only allowed when EOF(F)
is false
, and calling put(F)
is only allowed when EOF(F)
is true
.
In other words, reading past the EOF is forbidden, while writing has to occur at the EOF.
After successfully calling rewrite(F)
(or the EP procedure extend(F)
) the value of EOF(F)
becomes true
.
Any subsequent put(F)
does not alter this value.
After calling reset(F)
the value of EOF(F)
depends on whether the given file is empty.
Any subsequent get(F)
may change this value from false
to true
(never in the reverse direction).
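The rules above can be traced with a short program. This is only a minimal sketch, assuming a compiler with full ISO 7185 file support such as the GPC (see the Support section regarding Delphi and the FPC); the program name eofStates and the file variable f are made up for this illustration.

```pascal
program eofStates(output);
var
	f: file of integer;
begin
	rewrite(f);
	writeLn(EOF(f)); { true: in generation mode we are always at the EOF }

	f^ := 42;
	put(f);
	writeLn(EOF(f)); { still true: put does not alter EOF }

	reset(f);
	writeLn(EOF(f)); { false: the file holds one component }

	get(f);
	writeLn(EOF(f)); { true: we moved past the only component }
end.
```

How exactly a Boolean value is spelled when written to a text file is left to the implementation, so the comments only name the values.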
As you know, it is forbidden to read a variable that was not previously defined (i. e. you have to assign a value beforehand). Because it involves reading the buffer value, writing a buffer is only allowed if it was previously defined. Consider the following faulty code snippet:
temperatures^ := 88;
put(temperatures); { ✔ Good. Will successfully write 88. }
put(temperatures); { ↯ Bad. temperatures^ is not defined. }
put(temperatures); { ↯ temperatures^ still not defined. }
Both get and put advance the sliding window. Only the first put(temperatures) reads the defined value temperatures↑. The second and any following put(temperatures) would, however, read an undefined temperatures↑.
Text buffer
The buffer value of a text
has some special behavior.
A text
file is essentially a file of char
.
Everything presented in this chapter can be applied to a text
file just as if it were a file of char
.
However, as repeatedly emphasized, a text
file is structured into lines, each line consisting of a (possibly empty) sequence of char
values.
When EOLn(input)
becomes true
, the buffer variable input↑
returns a space character (' '
).
Thus when using buffer variables the only way to distinguish between a space character as part of a line, and a space character terminating a line is to call the function EOLn
.
Rationale: Various operating systems employ different methods of marking the end of a line. It has to be marked somehow, because this information cannot be magically deduced out of nowhere. However, there are multiple strategies out there. This is really inconvenient for the programmer who cannot take account of everything. Pascal has therefore chosen that, regardless of the specific EOL marker used, the buffer variable contains a simple space character at the end of a line. This is predictable, and predictable behavior is good.
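To illustrate why EOLn has to be consulted, here is a small sketch that counts the “real” space characters and the lines of input separately. The program name countSpaces and its variables are invented for this example; it assumes a compiler that supports buffer variables (see the Support section).

```pascal
program countSpaces(input, output);
var
	spaces, lines: integer;
begin
	spaces := 0;
	lines := 0;
	while not EOF(input) do
	begin
		{ Ask EOLn first: at the end of a line input^ is a space, too. }
		if EOLn(input) then
		begin
			lines := lines + 1;
		end
		else
		begin
			if input^ = ' ' then
			begin
				spaces := spaces + 1;
			end;
		end;
		get(input);
	end;
	writeLn('space characters: ', spaces:1);
	writeLn('lines: ', lines:1);
end.
```

If the two tests were swapped, every end of line would be miscounted as a space character.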
Purpose
It is worth noting that all functionality of read
/readLn
and write
/writeLn
can at their heart be based on get
and put
respectively.
Here are some basic relationships:
If f
refers to a file of recordType
variable and x
is a recordType
variable, read(f, x)
is equivalent to
x := f^;
get(f);
Similarly, write(f, x)
is equivalent to
f^ := x;
put(f);
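Both equivalences can be combined, for instance to copy one file into another component by component. The following is a sketch under the assumption of full ISO 7185 file support; the names copyViaBuffer, source and destination are hypothetical.

```pascal
program copyViaBuffer(output);
type
	integerFile = file of integer;
var
	source, destination: integerFile;
	i: integer;
begin
	{ Fill source using the expanded form of write. }
	rewrite(source);
	for i := 1 to 3 do
	begin
		source^ := i * i;
		put(source);
	end;

	{ Copy component by component through the buffer variables. }
	reset(source);
	rewrite(destination);
	while not EOF(source) do
	begin
		destination^ := source^;
		put(destination);
		get(source);
	end;

	{ Check the result with a plain read. }
	reset(destination);
	while not EOF(destination) do
	begin
		read(destination, i);
		writeLn(i:1);
	end;
end.
```

The program prints the values 1, 4 and 9, one per line. Note that no intermediate variable is needed for the copy itself: the value travels directly from one buffer variable to the other.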
For text
variables the relationships are not as straightforward.
The behavior depends on the various destination/source variables’ data types.
Nonetheless, one simple relationship is, if f
refers to a text
variable, readLn(f)
is equivalent to
while not EOLn(f) do
begin
get(f);
end;
get(f);
The latter get(f)
actually “consumes” the newline marker.
Support
Unfortunately, of the compilers presented in the opening chapter, Delphi and the FPC do not support all ISO 7185 functionality.
- Delphi and the FPC require files to be explicitly associated with file names before performing any operations. It is required to back any kind of file by a file in background memory (e. g. on disk). How this works will be explained in the second part of this book, since ISO standard 10206 “Extended Pascal” defines some means for that, too.
- The FPC provides the procedures get and put, and file variable buffers only in {$mode ISO} or {$mode extendedPascal}. Delphi does not support this at all.
Rest assured, everything works fine if you are using the GPC. The authors cannot make a statement regarding the Pascal‑P compiler since they have not tested it.
Tasks
file
variable is initialized. That means a mode has to be selected by invoking reset
or rewrite
first.
Think of reset
/rewrite
as a special kind of new
and the file variable as a pointer. You may only dereference the pointer (= append ↑
) if it was previously defined.
program
that merges repeating space characters ' '
into a single space character. (A filter program means, process input
and write to output
with the specified rule applied on the given input.) Extra credit: Write a solution that does not declare any additional variables (i. e. there is no var
-section).
program mergeRepeatingSpace(input, output);
const
{ Choose any character, but ' ' (a single space). }
nonSpaceCharacter = 'X';
begin
output^ := nonSpaceCharacter;
while not EOF do
begin
Since input↑
contains a space character when we are at the EOL, the only correct way of emitting a new line is using writeLn
.
WriteLn
does not use the buffer variable.
In other words, output↑
may contain any value now.
if EOLn then
begin
writeLn;
In this branch of the if
statement, input↑
holds a space character.
However this instance of space character should not trigger the repeating space character detection.
Therefore we assign a non-space character to output↑
(now acting as a “previous character variable”).
output^ := nonSpaceCharacter;
end
else
begin
if [output^, input^] <> [' '] then
In Extended Pascal using the string
/char
concatenation operator +
you could write:
if output^ + input^ <> '' then
Remember that the plain =
‑comparison pads both operands to the same length using space characters.
begin
write(input^);
end;
output^ := input^;
{ The buffer variable (`output↑`) now contains the previous character. }
end;
get(input);
end;
end.
Boolean
variable as a flag whether the preceding character was non-newline space character.
program
that reads from input
and only writes the last input char
value to output
. On a standard Linux or FreeBSD system you can test your program
with the command line echo -n '123H' | ./printLastCharacter
. The ‑n
option flag is important. Otherwise your program
might just display a single space (' '
) character. Alternatively, you may use printf '123H' | ./printLastCharacter
. With either variant your program
should write a line consisting of the single character H
.
program printLastCharacter(input, output);
begin
{ We cannot output anything, unless there is at least one character. }
if not EOF(input) then
begin
while not EOF(input) do
begin
{ After `get(input)`, `input↑` becomes undefined once
we reach `EOF(input)`. Therefore copy it beforehand. }
output^ := input^;
get(input);
end;
put(output);
writeLn(output);
end;
end.
By specifying input
in the program
parameter list, the post-assertions of reset
become true. That means, there has been an implicit (= invisible) get(input)
before our begin
in the second line, and only after that the value of EOF(input)
becomes defined.
If you happen to have a compiler supporting Extended Pascal’s halt
procedure
, you would eliminate one indentation level:
{ We cannot output anything, unless there is at least one character. }
if EOF(input) then
begin
halt;
end;
while not EOF(input) do
Notes:
- ↑ Extended Pascal, as defined by ISO standard 10206, also permits an update mode, i. e. reading and writing at the same time, yet this is only possible for “direct-access files” (files that are indexed).
- ↑ Extended Pascal, ISO 10206, knows “direct-access files”. Such a file type allows accessing, say, the 94th record in an easy and fast manner, yet it cannot “grow” as needed.
- ↑ This is an implementation detail and not a requirement imposed by the programming language. The mere presence of an OS is already beyond Pascal’s horizon. Nonetheless, this description is a common scheme.
- ↑ This is of course under the presumption that we actually need them. Unnecessarily copying data that will not be used later on is a waste of computing time.
Scopes
Part Ⅱ
Extensions
Units
In original Standard Pascal all functionality of a program other than the standard functions Pascal already defines had to be defined in one file, the program
source code file.
While in the context of teaching the sources remained rather short, entire applications quickly became cluttered despite various comments structuring the text.
Quite soon different attempts to modularize programs emerged. The most notable implementation that remains in use to this day is UCSD Pascal’s concept of units.
UCSD Pascal units
A UCSD Pascal unit is like a program
except that it cannot run on its own, but is supposed to be used by program
s.
A unit
can define constants, types, variables and routines just like any program
, but there is no executable portion that can be run independently.
Using a unit means that this unit becomes a part of the program;
it is like copying the entire source code from the unit
to the program
, but not quite the same.
Usually, units are stored in separate files thus incredibly cleaning up the program
’s source code file.
However, this is not a set requirement, since after an end.
a module is considered complete and another module may follow.
From now on, as another layer of abstraction, module refers to either a program or a unit. (In FP a library is another type of module.)
Defining units
A unit
definition shows many similarities to a regular program
, but with many additional features.
Header
The first line of a unit
looks like this:
unit myGreatUnit;
Unlike a program
there is no parameter list.
A unit
is a self-contained unit of certain functionality, thus cannot be parameterized in any way.[fn 1]
This line also declares a new identifier, in this example myGreatUnit
.
MyGreatUnit
becomes the first component of the so-called fully-qualified identifier.
More on that later.
Parts
The unit
concept provides means to encapsulate its definitions, so that the programmer using the unit
does not need to know how certain functionality is implemented.
This is done by splitting the unit
into two parts:
- the interface part, and
- the implementation part.
A programmer using another unit only needs to know how to use the unit:
This is outlined in the interface
part.
The programmer who is programming the unit on the other hand will need to implement the unit’s functionality in the implementation
part.
Thus a bare minimum unit looks like this:
unit myGreatUnit;
interface
implementation
end.
The interface
part has to come before the implementation
part.
Also note that unit
s terminate with an end.
just as a program
does.
The interface
part of a unit
consists of a block, except that it cannot contain any statements.
The interface
is merely declaratory.
All identifiers defined in the interface
part will become “public”, i. e. a programmer using the unit will have access to them.
All identifiers defined in the implementation
part, on the other hand, are “private”:
they are only available within the unit’s own implementation
part.
There is no way to circumvent this separation of exported and “private” code.
Example
unit randomness;
// public - - - - - - - - - - - - - - - - - - - - - - - - -
interface
// a list of procedure/function signatures makes
// them usable from outside of the unit
function getRandomNumber(): integer;
// a definition (an implementation) of a routine
// must not be in the interface-part
// private - - - - - - - - - - - - - - - - - - - - - - - - -
implementation
function getRandomNumber(): integer;
begin
// chosen by fair dice roll
// guaranteed to be random
getRandomNumber := 4;
end;
end.
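The randomness unit exports everything it defines. To show the separation of “public” and “private” identifiers, here is a further sketch. The unit temperature, its function and its constants are hypothetical and merely serve as an illustration; it assumes a compiler supporting UCSD Pascal units, such as the FPC.

```pascal
unit temperature;

interface

{ public: programs using this unit may call this function }
function celsiusToFahrenheit(celsius: real): real;

implementation

{ private: these constants are invisible outside of the unit }
const
	scale = 1.8;
	offset = 32.0;

function celsiusToFahrenheit(celsius: real): real;
begin
	celsiusToFahrenheit := celsius * scale + offset;
end;

end.
```

A program using this unit can call celsiusToFahrenheit, but any attempt to refer to scale or offset would be rejected by the compiler, since both are defined in the implementation part only.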
Using units
Import
Now, it is great that we have finally outsourced some code, but the point of all of this is to use the outsourced code.
For this, UCSD Pascal defines the uses
clause.
A uses
clause instructs the compiler to import another unit’s code and familiarize itself with all identifiers declared in the interface
part of that unit.
Thus, all identifiers from the unit’s interface
part become available, as if they were part of the module importing them via the uses
clause.
Here is an example:
program chooseNextCandidate(input, output);
uses
// imports a unit
randomness;
begin
writeLn('next candidate: no. ', getRandomNumber());
end.
Note that the program
chooseNextCandidate
neither defines nor declares the function getRandomNumber
, but nevertheless uses it.
Since getRandomNumber
’s signature is listed in the interface
part of randomness
, it is available for other modules using that module.
Each program may have at most one uses clause. It has to appear right after the program header.
Uses
clauses are allowed in any module.
Of course it is possible to use other units inside a unit
.
Moreover, you are allowed to have two uses
clauses in one unit
, one in the interface
and one in implementation
part each.
The units listed in the interface
part’s uses
clause propagate, that means they also become available to the module that uses such a unit.[fn 2][fn 3]
Namespaces
Now, programming with units would have been a hassle if all units that were ever programmed had to explicitly define exclusive identifiers.
But this is not the case.
With the advent of modules all modules implicitly constitute a namespace.
A namespace is a self-contained scope: identifiers need to be unique only within it.
You are quite welcome to define your own getRandomNumber
and still use the randomness
unit.
In order to distinguish between identifiers coming from various namespaces, identifiers can be qualified by prepending the namespace name to the identifier, separated by a dot.
Thus, randomness.getRandomNumber
unambiguously identifies the getRandomNumber
function exported by the randomness
unit.
This notation is called fully-qualified identifier, or FQI for short.
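The following sketch shows both spellings side by side. It assumes the randomness unit from the example above is available to the compiler; the program name namespaceDemo and the value 7 are arbitrary choices for this illustration.

```pascal
program namespaceDemo(output);
uses
	randomness;

{ Our own function happens to have the same name as the unit's. }
function getRandomNumber: integer;
begin
	getRandomNumber := 7;
end;

begin
	{ The short name refers to the program's own function: 7. }
	writeLn(getRandomNumber);
	{ The fully-qualified identifier refers to the unit's function: 4. }
	writeLn(randomness.getRandomNumber);
end.
```

The program's own definition takes precedence for the short name, so the unit's function remains reachable only through its FQI.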
Precedence
Dependencies
More features
Initialization and Finalization section
Distribution without source code
Unit design
There are several considerations that should be accounted for:
- Whenever some code might be useful for other programs too, you may want to create a separate unit.
- One unit should provide all functionality necessary in order to be useful, however,
- a unit should not provide features that are unrelated to its main purpose.
- Your unit’s usability largely depends on a well-defined interface. Requiring knowledge of the specific implementation is usually an indicator of bad code.
Special units
Run-time system
Some compilers use units for providing certain functionality that serves the gray zone between a compiler’s actual task and a program
(i. e. what you write).
Most notably, Delphi, the FPC as well as the GPC provide a run-time system (RTS) that includes all standard routines defined as part of the language (e. g. writeLn
and ord
).
In Delphi and the FPC this unit is called system
, whereas the GPC comes with the GPC
unit.
These units are sometimes referred to as run-time library, RTL for short.
Knowing what the RTS’s unit is called could be useful, since this implies that all identifiers of the RTL are part of one namespace.
That means, in (for example) Delphi and the FP one may refer to the standard function abs
by both its short name and the FQI system.abs
.
The latter may be required if you are shadowing the abs
function in the current scope, but need to use Pascal’s own abs
function.
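As an illustration, consider this sketch, assuming the FPC (where the RTS unit is called system). The program name shadowAbs is made up, and the user-defined abs is deliberately unhelpful so the difference becomes visible.

```pascal
program shadowAbs(output);

{ This definition shadows the standard function abs. }
function abs(x: integer): integer;
begin
	{ Deliberately unhelpful: returns the argument unchanged. }
	abs := x;
end;

begin
	writeLn(abs(-5));        { our own abs: -5 }
	writeLn(system.abs(-5)); { the RTL's abs: 5 }
end.
```

Shadowing standard routines like this is rarely a good idea, but the FQI provides a way out if it happens.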
Debugging
The FPC comes with a special unit heapTrc
(heap trace).
This unit provides a memory manager.
It is used to find out whether the program
fails to release any memory blocks it earlier reserved for itself.
Allocating memory and never handing it back to the OS is called a “memory leak” and is a defect that should be fixed.
Due to the heapTrc
unit’s intrusive behavior into Pascal’s memory management, it also needs to be loaded very soon after the system
unit has been loaded.
Hence, FPC forbids you to explicitly include the heapTrc
unit in the uses
clause, but provides the -gh
compiler command-line switch that will ensure inclusion of that unit.
The heapTrc
unit is only used at the development stage.
It can print a memory report after the program
’s final end.
.
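To try this out, you can use a deliberately faulty program such as the following sketch. The program name leak and the file name leak.pas are assumptions made for this example.

```pascal
program leak(output);
var
	p: ^integer;
begin
	new(p);
	p^ := 42;
	writeLn(p^);
	{ Deliberately missing: dispose(p); }
end.
```

Compiled with fpc -gh leak.pas, the program should print, after its regular output, a heap report mentioning one memory block that was not released. The exact wording of the report depends on the FPC version. Adding the missing dispose(p) makes the complaint disappear.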
The heapTrc
unit is somewhat easy to use, but also limited in its features.
We recommend using dedicated debugging and profiling tools such as valgrind(1)
as knowing how to use such tools will serve you well if you ever switch programming languages.
If you specify the -gv
switch on fpc(1)
’s invocation, the FPC will insert debugging information for usage with valgrind(1)
.
Other modularization implementations
The Extended Pascal standard lays out a specification for module
s.
These provide advanced means of modularization.
However, neither the FPC nor Delphi supports this; only the GPC does.
Tasks
finalization
section as hook to achieve that behavior:
unit friendly;
interface
implementation
finalization
begin
writeLn('Goodbye!');
end;
end.
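A program merely has to list the unit in its uses clause to obtain this behavior. The following sketch assumes the friendly unit above has been compiled with the FPC; the program name greeter is hypothetical.

```pascal
program greeter(output);
uses
	friendly;
begin
	writeLn('Hello!');
end.
```

The program prints Hello! followed by Goodbye!, although the program itself never writes the latter: the unit's finalization section runs after the program's final end.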
Notes:
Object Oriented Programming
Object-oriented Pascal allows the programmer to structure applications around classes, that is, types bundling data with the routines operating on them. This can save development time and make programs more flexible.
This is a sample program (tested with the Free Pascal compiler) that stores the number 1 in the private variable One, increases it by one, and then prints it.
program types; // this is a simple program
{$mode objFPC} // the FPC accepts the class keyword only in an object-oriented mode
type MyType=class
private
One:Integer;
public
function Myget():integer;
procedure Myset(val:integer);
procedure Increase();
end;
function MyType.Myget():integer;
begin
Myget:=One;
end;
procedure MyType.Myset(val:integer);
begin
One:=val;
end;
procedure MyType.Increase();
begin
One:=One+1;
end;
var
NumberClass:MyType;
begin
NumberClass:=MyType.Create; // creating instance
NumberClass.Myset(1);
NumberClass.Increase();
writeln('Result: ',NumberClass.Myget());
NumberClass.Free; // destroy instance
NumberClass := Nil;
end.
This example is very basic and does not yet show what makes OOP useful. Far more elaborate examples can be found in Delphi and Lazarus, both of which make heavy use of object-oriented programming.
Exporting to Libraries
Foreign Function Interfaces
Objective Pascal
Generics
Miscellaneous Extensions
The last Pascal-related standards were published in 1990: ISO standard 7185 “[Standard] Pascal” and ISO standard 10206 “Extended Pascal”. But IT has not stopped evolving since. Several compiler manufacturers continued to extend Pascal, and this chapter presents some of these extensions.
Inline assembly
Since TP version 1.0 it has been possible to include assembly language inside your Pascal source code.
This is called inline assembly.
While normal Pascal is surrounded by a begin … end
frame, assembly language can be framed by asm … end
.
Here is an example that can be compiled with the FPC:
program asmDemo(input, output, stdErr);
{$ifNDef CPUx86_64}
{$fatal only for x86_64}
{$endIf}
var
foo: int64;
begin
write('Enter an integer: ');
readLn(foo);
// This directive will tell FPC
// a certain assembly language style is used
// within the asm...end frame.
{$asmMode intel}
asm
mov rax, [foo] // rax ≔ foo^
// ensure foo is positive
test rax, rax // set flags according to rax (SF ≔ sign bit)
jns @is_positive // if ¬SF then goto is_positive
neg rax // rax ≔ −rax
@is_positive:
// NOTE: Here we assume the popcnt instruction
// was supported by the processor,
// but this is bad style.
// You ought to use the cpuid instruction
// (if available) in order to determine
// whether popcnt is available.
popcnt rax, rax // rax ≔ popCnt(rax)
mov [foo], rax // foo ≔ rax
// An array of strings after the asm-block closing ‘end’
// tells the compiler which registers have changed
// (you do not want to mess with the compiler’s understanding
// which registers mean what)
end ['rax'];
writeLn('Your number has a binary digital sum of ', foo, '.');
end.
Writing inline assembly code is useful if you have special knowledge about data and the compiler generates inefficient code. You can try to optimize for speed or size in order to mitigate performance bottlenecks.
Delphi, the FPC, and the GPC all support asm
frames, but each with a few subtle differences.
We therefore refer you to the respective compiler’s manual; after all, this book is about programming in Pascal.
Resource strings
Run-time type information
Managed types
Thread variables
Preprocessor Functionality
Syntax Cheat Sheet
Syntax cheat sheet
- monospaced denotes keywords and syntax
- [ ] denotes optional syntax
- | denotes multiple possible syntaxes
- ( ) denotes grouped syntax
Statements
syntax | definition | availability |
---|---|---|
if condition then begin statement(s) end; | Conditional statement | standard |
while condition do begin statement(s) end; | while loop | standard |
repeat statement(s) until condition; | repeat loop | standard |
with variable do begin statement(s) end; | Eases the use of a variable or pointer variable of a structured type by omitting the dot notation for the variable. | standard |
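The four statements from the table can be seen together in the following short sketch. The program name statementsDemo, the type point and the variables are invented for this illustration.

```pascal
program statementsDemo(output);
type
	point = record
		x, y: integer;
	end;
var
	i: integer;
	p: point;
begin
	i := 0;
	{ while loop: the condition is checked before every iteration }
	while i < 3 do
	begin
		i := i + 1;
	end;
	{ repeat loop: the condition is checked after every iteration }
	repeat
		i := i - 1;
	until i = 0;
	{ with: x and y refer to p.x and p.y }
	with p do
	begin
		x := 1;
		y := 2;
	end;
	{ conditional statement }
	if p.x < p.y then
	begin
		writeLn('x < y');
	end;
end.
```

The program prints a single line reading x < y.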
Appendix
Jumps
Usage of jumps is deemed bad practice. This topic has deliberately not been covered in any of the previous chapters, but for the sake of completeness it is explained in the appendix.
Using labels
Standard Pascal includes the infamous "goto" statement. It redirects the computer to a labeled statement somewhere else in the program. Labels are unsigned integers, although some compilers allow them to be words.
1: write('Enter an even number please: '); (*Try to get an even number*)
readln(number);
even := (number mod 2) = 0;
if even then goto 2; (*Skip ahead*)
writeln('That was an odd number.');
goto 1; (*Try again*)
2: writeln('Thanks!'); (*Finished!*)
Declaring labels
If you use any labels, you are required to declare them ahead of time just as you would variables or constants.
program GetEvenNumber;
label 1, 2;
var
number: integer;
even: Boolean;
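Putting the declarations and the labeled statements together yields the following complete program. This is a sketch in Standard Pascal; with the FPC you may additionally have to enable jumps, for example with the {$goto on} directive or by selecting {$mode ISO}.

```pascal
program getEvenNumber(input, output);
label
	1, 2;
var
	number: integer;
	even: Boolean;
begin
1:	write('Enter an even number please: '); (*Try to get an even number*)
	readLn(number);
	even := (number mod 2) = 0;
	if even then goto 2;                    (*Skip ahead*)
	writeLn('That was an odd number.');
	goto 1;                                 (*Try again*)
2:	writeLn('Thanks!');                     (*Finished!*)
end.
```

The program name in the heading carries the parameters input and output here because the program reads from and writes to the standard text files.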
Avoiding labels
In some early programming languages, labeled jumps were the primary way of controlling the flow of execution. Too many labels and "goto" statements will lead to unreadable code. Today, we would generally be expected to rewrite such code with the more specific control flow structures, like so:
repeat
write('Enter an even number, please: ');
readln(number);
even := (number mod 2) = 0;
if not even then writeln('That was an odd number.')
until even;
writeln('Thanks!');
Noteworthy types
type | definition | size (in bytes) | availability |
---|---|---|---|
AnsiChar | one ANSI-standard textual character | 1 | |
AnsiString | an array of ANSI-standard characters of indefinite length with an optional size constraint | 1 * number of characters | |
Boolean | true or false | 1 | standard |
Byte | whole number ranging from 0 to 255 | 1 | |
Cardinal | synonym depending on processor type (16 bit=Word, 32 bit=LongWord, 64 bit=QWord) | varies (2, 4, 8) | |
Char | one textual character (likely ASCII) | 1 | standard |
Comp | a floating point number ranging 19-20 digits that is effectively a 64-bit integer | 8 | |
Currency | a floating point number ranging 19-20 digits that is a fixed point data type | 8 | |
Double | a floating point number ranging 15-16 digits | 8 | |
DWord | whole number ranging from 0 to 4,294,967,295 | 4 | |
Extended | a floating point number ranging 19-20 digits | 10 | |
Int64 | whole number ranging from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 | 8 | |
Integer | synonym depending on processor type (16 bit=SmallInt, 32 bit=LongInt, 64 bit=Int64) | varies (2, 4, 8) | standard |
LongInt | whole number ranging from -2,147,483,648 to 2,147,483,647 | 4 | |
LongWord | whole number ranging from 0 to 4,294,967,295 | 4 | |
Pointer | an untyped pointer holding an address | 32 bit=4, 64 bit=8 | standard |
PtrUInt | an unsigned integer type of the same size as a pointer, so pointer values can be converted to it | 32 bit=4, 64 bit=8 | |
QWord | whole number ranging from 0 to 18,446,744,073,709,551,615 | 8 | Free Pascal |
Real | a floating point number whose range is platform dependent | 4 or 8 | standard |
ShortInt | whole number ranging from -128 to 127 | 1 | |
ShortString | an array of textual characters of up to 255 elements (likely ASCII) with an optional size constraint | 1 * number of characters (max 255) | standard |
Single | a floating point number ranging 7-8 digits | 4 | |
SmallInt | whole number ranging from -32,768 to 32,767 | 2 | |
String | synonym for ShortString (or AnsiString with the $H preprocessor directive turned on) | 1 * number of characters (max 255) | standard |
UInt64 | whole number ranging from 0 to 18,446,744,073,709,551,615 | 8 | Free Pascal, Delphi 8 or later |
WideChar | one UTF-16 code unit | 2 | |
WideString | an array of UTF-16 code units of indefinite length with an optional size constraint | 2 * number of characters | |
Word | whole number ranging from 0 to 65,535 | 2 |
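Since several sizes depend on the compiler and the target platform, it can be worthwhile to check them yourself. The following sketch relies on the sizeOf function, an extension provided by the FPC and Delphi that is not part of Standard Pascal; the program name typeSizes is made up.

```pascal
program typeSizes(output);
begin
	{ Fixed sizes: 1, 2, 4 and 8 bytes respectively. }
	writeLn('ShortInt: ', sizeOf(shortInt));
	writeLn('SmallInt: ', sizeOf(smallInt));
	writeLn('LongInt:  ', sizeOf(longInt));
	writeLn('Int64:    ', sizeOf(int64));
	{ The following sizes vary with compiler mode and platform. }
	writeLn('Integer:  ', sizeOf(integer));
	writeLn('Real:     ', sizeOf(real));
	writeLn('Pointer:  ', sizeOf(pointer));
end.
```

Only the first four lines are the same everywhere; the remaining values tell you which of the alternatives listed in the table applies to your setup.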
Noteworthy preprocessor directives
directive | description | value(s) | example | availability |
---|---|---|---|---|
$COPERATORS | allows use of C-style operators | OFF or ON | {$COPERATORS ON}
i += 5;
i -= 5;
i *= 5;
i /= 5; |
Free Pascal |
$DEFINE | defines a symbol for the preprocessor (if '$macro on', can have a value assigned) | symbol name (:= value if '$macro on') | {$DEFINE Foo}
{$DEFINE Bar := 5} |
standard |
$H | implies whether the String type is a ShortString or AnsiString | - or + | Delphi, Free Pascal | |
$I | inserts a file's contents into the current source code | filename | {$I hello.txt} |
standard |
$IF | begins a preprocessor conditional statement | compile-time boolean expression | {$IF DEFINED(DELPHI) OR DECLARED(Foo)} |
standard |
$IFDEF | begins a preprocessor conditional statement depending if a preprocessor symbol is defined | preprocessor symbol | {$IFDEF MSWINDOWS} |
standard |
$IFNDEF | begins a preprocessor conditional statement depending if a preprocessor symbol is not defined | preprocessor symbol | {$IFNDEF UNIX} |
standard |
$IFOPT | begins a preprocessor conditional statement depending on the status of a preprocessor switch | compiler option | {$IFOPT D+} |
standard |
$INCLUDE | inserts a file's contents into the current source code | filename | {$INCLUDE hello.txt} |
standard |
$INLINE | allows inline functions and procedures | OFF or ON | unit Foo;
{$INLINE ON}
interface
function Give5: integer; inline;
implementation
function Give5: integer;
begin
Give5 := 5;
end;
end. |
Free Pascal |
$MACRO | allows defined symbols to hold values | OFF or ON | {$MACRO ON}
{$DEFINE Foo := 7} |
Free Pascal |
$MODE | sets the Pascal dialect | DELPHI, FPC, MACPAS, OBJFPC, TP | Free Pascal | |
$R | embeds a resource file into the code | a file name | {$R *.dfm} |
Delphi, Free Pascal |
$STATIC | allows use of the 'static' keyword | OFF or ON | unit Foo;
{$STATIC ON}
{$MODE OBJFPC}
interface
type
Bar = class
function Baz: string; static;
end;
implementation
function Bar.Baz: string;
begin
Result := 'This function is not part of a Bar instance.';
end;
end. |
Free Pascal |
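Conditional directives are always used in groups. The following sketch shows a complete group; besides $IFDEF from the table it uses the closing directives $ELSE and $ENDIF, which are not listed above but are available in the FPC and Delphi. The program name platformDemo is hypothetical.

```pascal
program platformDemo(output);
begin
	{$IFDEF UNIX}
	writeLn('compiled for a Unix-like system');
	{$ELSE}
	writeLn('compiled for some other system');
	{$ENDIF}
end.
```

Only one of the two writeLn statements ever reaches the compiler proper; the other one is discarded before compilation, depending on whether the symbol UNIX is defined for the target platform.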
Register
- Block
- Compiler
- Constants
- else
- Enumerations
- Identifiers
- if
- Pointers
- Routines
- Sets
- Statement
- Variables