Foundations of Computer Science/Printable version

From Wikibooks, open books for an open world


Foundations of Computer Science

The current, editable version of this book is available in Wikibooks, the open-content textbooks collection, at
https://en.wikibooks.org/wiki/Foundations_of_Computer_Science

Permission is granted to copy, distribute, and/or modify this document under the terms of the Creative Commons Attribution-ShareAlike 3.0 License.

Introduction

Have you ever wondered what computing is and how a computer works? What exactly is computer science? Why—beyond the obvious reasons—is it important? What do computer scientists do? What types of problems do they work on? What approaches do they use to solve those problems? How, in general, do computer scientists think?

Question 1: What do you think of when you hear "computer science?" Write a paragraph or list, or draw an image or diagram of what comes to mind.

Question 2: What are the parts of computer science that are most interesting or important to you currently? Why?

When you hear the term "computer science" perhaps you think of a specific computer. Or someone you know who works with computers. Or a particular computer use, say online games or social networks. There are many, many different aspects of computing and computer science.

There are a number of reasons why it is useful and important to know something about computer science. Computers affect many, many aspects of our lives in different ways. For many people, computers are playing or will play a significant role in the work they do, in their recreational pursuits, in how they communicate with others, in their education, in their health care, etc. Think about the many different ways you encounter computers and computing, either directly or indirectly, in your daily life.

What, more specifically, will this book cover? The foremost purpose of this text is to give you a greater understanding of the fundamentals of computer science: What is computer science, anyway? Is it the same as computer programming? What is a computer? For example, most people would agree that a "laptop computer" is a computer, as is a "tablet computer", but what about a smartphone? And how do computers work? For example, we can store not only numbers and text in computers, but also images, video files, and audio files; how do computers handle such disparate data? And what are some interesting and important subareas of computer science? For example, what is important to know about subareas such as computer graphics, networking, or databases? And why is any of this important? Isn't it sufficient for most people just to use computers, rather than have a deeper understanding of computers and computer science?

These are all fundamental questions about computing, and in this book we'll look at them and other questions. In summary, one purpose of this book is to provide an overview of computer science that not only exposes you to computer science fundamentals—such as how a computer works on a rudimentary level—but also explores why these fundamentals are important.

There are two parts of this overview that are particularly important: while the main theme is an overview of computer science, two essential subthemes are how mathematics is used in computer science and how computer science affects, and is affected by, society.

Both subthemes fit well in an overview of computer science book. Computer science relies heavily on mathematics (in fact, some colleges have computer science and mathematics programs in a joint department). Certain uses of mathematics in computer science are obvious—for example, in computational tools such as spreadsheets—but there are also many less obvious ways that mathematics is essential to computer science. For example at the lowest level in a computer, data (whether that data is numeric, text, audio, video, etc.) is all represented in binary, i.e., as strings of 0's and 1's. This means that to understand something very basic about computers you need to understand binary numbers and operations.

Computers also affect society in many ways, from the use of computer-generated imagery in films, to large government or commercial databases, to the multiple societal effects of the Internet. And society affects computers, for example through user behavior and through different types of regulation.

While mathematics and technology and society might seem too different to be included comfortably in the same book, there are actually many computer science topics that are useful to explore from both perspectives—in a sense, these different viewpoints are "two sides of the same coin." For example, one topic in the book is computer security. Mathematics plays a role in security, for example in encryption. And computer security also has many societal aspects, for example national security, infrastructure security, and individual security. Most of the topics in this text similarly have both mathematical underpinnings and societal aspects, and exploring these topics from both perspectives will result in a richer understanding.

What this book isn't

There are a number of different types of introductory computer science books. So, in addition to explaining what this text is, it is also useful to state what it is not.

This is not a programming book. Programming is a central activity in computer science, but it is not the whole of computer science. Because programming is important, we'll spend some time on it. However, because computer science is much more than programming, and because this is an overview book, that time will be only a small part of this work.

This is not a computer applications book. Many other books cover basic computer applications. For example, a popular choice is teaching how to use a word processor, a spreadsheet, a database management program, and presentation software. These and other applications are important parts of computer science, and so in this book you will get a chance to learn about some applications that might be new to you. However—like programming—using applications is only part of learning about computer science, and so application use will be only a small part of this book.

This is not a "computer literacy" or "computer fluency" book. There are a variety of definitions of computer literacy or computer fluency. For example, the Wikipedia definition, derived from a report from the U.S. Congress of Technology Assessment, is "the knowledge and ability to use computers and related technology efficiently, with a range of skills covering levels from elementary use to programming and advanced problem solving."[1] Parts of this book will involve using computers to gain a variety of skills. For example, you will do a variety of computer-related tasks such as performing web searches, constructing web pages, doing elementary computer programming, and working with databases. However, this is just one part, rather than the totality, of the text. So this book shares some characteristics of a computer literacy book, but overall it has a wider focus than that type of a textbook.

This is not a "great ideas in computer science" book. One current trend in computer science introductory materials is to study computer science through its important, fundamental ideas.[2] And this book does cover some key ideas. For example, an early topic we'll study is how all data in computers, whether those data are numeric, text, video, or others, are represented within the computer as 0's and 1's. In general, the topics in the book are fundamental to computer science. However, this text also differs from a great ideas book. It is not focused solely on ideas, but explores broadly a number of computer-related issues, subtopics, and computer skills. Moreover, this book focuses more on mathematical thinking, and on technology and society, than a typical great ideas book would.

In addition to programming, applications, computer fluency, and great ideas, there are a number of other types of introductory computer science textbooks. Some survey a variety of computer science topics. Others focus on professional software development practices. Still others look at computing through a particular "lens" such as networks or computational biology. And so on. This book shares some characteristics with these other types of books, but also has significant differences. In particular, the biggest difference is that this book blends an overview of computer science with a strong emphasis on mathematics, and on society and technology; this is a balance of emphases that has a number of advantages, but is not usually seen in introductory computer science courses.

What is this book about?

Both mathematical thinking and technology and society are significant parts of this book. Many textbooks present an introduction to computer science through programming, or through how computers work, or through some other aspect of computing. However, there is not a suitable text that combines an overview of computer science with both sufficient mathematical and sufficient society and technology emphases.

At first glance, it might seem odd that a book introducing computer science would deal with liberal education. What does computer science have to do with liberal education? Understanding computers well involves exploring them from a variety of different viewpoints. This includes understanding not only how computers work—including, for example, the mathematical underpinnings of computer science—but also how they affect, and are affected by, society. In summary, to have a good understanding of computers and computer science it is important to explore them from a variety of perspectives, including the perspectives embodied in liberal education.

Mathematical thinking

Question 3. What do you think of when you hear the word "mathematics?" Write a paragraph or list, or draw an image or diagram of what comes to mind.

Question 4. Based on your experience with computers, write a list of some places where mathematics is used in computing.

What do computers and mathematics have in common? Why is it appropriate for an overview of computer science book to require mathematical thinking?

Much of the use of mathematics in this book is applying mathematical ideas and operations to solve computer science problems. There are a number of important mathematical underpinnings of computer science, and so understanding computer science involves being able to solve mathematical problems involving these underpinnings. At the same time, the different uses of mathematics in this text exemplify characteristics of mathematics as a whole, and of the close tie between the fields of mathematics and computer science. For instance, the mathematics in the book illustrates the following:

  • The reliance of many key ideas in computer science, such as data representation, on mathematics.
  • The use of special mathematics- or logic-related notation and terminology in many parts of computer science.
  • The ability to represent and work with many different types of data in the computer, and the related ability to represent and work with quantities in different representations using a variety of operations.
  • The need for rigor in solving problems, analyzing situations, or specifying computational processes.
  • The use of numbers and arithmetic in solving computational problems. However, rather than being simple arithmetic problems, these problems often have some special characteristics such as involving repeated operations, or involving extremely large or extremely small numbers.
  • The existence of a variety of different algorithms for solving such diverse problems as pattern matching, counting specified values in a table of data, or finding the shortest path between two nodes in a graph.

Solving many of the problems in this book will involve doing some mathematics, and therefore manipulating mathematical or logical symbols. Here are a few examples:

  • In exploring low-level logical operations you'll need to manipulate binary representation and logical operators.
  • In studying the growth rate of algorithms you'll need to work with the Ο and Θ notations commonly used by computer scientists.
  • In specifying computational processes you'll need to use "pseudocode" or a programming language. These share many notational characteristics with mathematical or logical symbols, especially when the computational processing involves a large number of numeric computations.

The level of mathematics in this book is introductory-level college mathematics. As such, the mathematics is not advanced, and there is no mathematical prerequisite for this book beyond the requirements needed for general college admission. At the same time, the mathematics in this book goes beyond high school mathematics even though many of the types of mathematics used in this text appear in some high school mathematics courses.

As an example, one appearance of mathematics in this book is binary (or base 2) representation. This is a topic that often appears in high school mathematics courses, and the basics of binary representation are not complicated. In this book we review such basics as how to convert numbers between decimal (base 10) and binary representation, and how to do simple operations such as adding two binary numbers. However, we also use binary representation in additional ways that underpin the workings of computers. Here are a few examples:

  • We'll look at a few different ways to represent numbers in binary representation. For example, integers are often represented in binary not using the usual straightforward binary representation, but in "two's complement" form. So part of this book is learning not only about the "usual" binary representation, but also about these alternatives.
  • We'll look at various issues with binary representation, such as the number of "bits" used, that are important in determining the range and precision of numbers used by computers.
  • In addition to representing numbers, we will also look at how computers use binary representation to represent and operate on other types of data such as text, colors, and images.
  • In addition to basic operations such as binary addition, we will also look at other operations on binary representations. For example, logical operations are important in masking colors in image processing, and in implementing arithmetic operations in low-level computer hardware.

In summary, even though many of the mathematical topics in this book appear in high school mathematics, they go beyond the usual high school treatment of those topics in breadth or depth.

Technology and society

Question 5. What do you think of when you hear "technology and society?" Write a paragraph or list, or draw an image or diagram of what comes to mind.

Question 6. Based on your experience with computing, write a list of examples of how computing affects, and is affected by, society.

The topic of this book is computers and computing. Computers have affected society in numerous and diverse ways, some of which we'll explore in this book. And current and future computer applications will affect society in even more ways.

Through this book you should get an understanding of how computers work. This includes understanding the basics of computer hardware and computer software.

More broadly, however, computer science relies on results from other areas of science, engineering, and related fields. The most prominent example we will see in this text is the various ways that mathematics is essential to computer science.

Technology affects society. However, it is not a one-way street. Society also affects technology. For example, society fosters technology by means such as government support for research. As another example, different individuals, businesses, and other organizations adopt and use technology in ways often not foreseen by the technology's creators.

In this book we'll look at a variety of instances of how society affects technology. These include government funding for the early Internet, Internet regulation, how business considerations affect computing products, and societal aspects of computer security.

In many topics in computers and society there are multiple stakeholders. These can include individual users, developers, companies (producers, consumers, and intermediaries), government bodies, professional organizations, and other types of organizations. These different stakeholders often have different views and different goals.

In this book we will often look at technology and society issues from numerous perspectives. Sometimes we will focus on a specific perspective or the role of a specific stakeholder. However, other times we will explore issues more broadly: Who are the stakeholders? What is their role in this issue? What are their goals?

One often hears conflicting views on computer and society issues. Computers are beneficial for society. Computers are harmful to society. The Internet is making it easier for people to communicate and is bringing people together. The Internet is making people more isolated. Computers and automation are robbing people of jobs. Computers and automation create jobs.[3]

In this book we'll often explore issues that are contentious and/or complicated. How do we avoid a superficial, one-sided understanding of such issues? How do we resolve conflicting claims about such issues?

Computing technology not only has had massive effects on society, but continues to affect society. Not a day goes by without some technological advance involving computing. In many ways the "computer revolution" is just beginning.

One goal of this book is that you'll learn enough about computing in general, about trends in computing, and about computing and society that you'll be able to evaluate new technology. Note that "evaluate" might mean different things in different contexts. For instance, it might mean give an informed projection about whether a new computer product will be successful or not. Or it might mean predict future computer advances in a certain area. Or it might mean analyze whether a new computer application is likely to be more beneficial than harmful.

Additional questions for thought and discussion

Here are some additional introductory questions.

Question 7. How do you use computers? List the most important ways.

Question 8. Write down a list of movies in which computing plays a major role. For each movie, indicate whether computing is portrayed as beneficial, harmful, beneficial in some ways but harmful in others, or neutral.

Question 9. Do you think computers, on the whole, have more positive effects than negative ones, more negative ones than positive, or about equal positive and negative effects? Why?

Question 10. List some ways computers are beneficial to society. Then list some ways they are harmful.

Question 11. Suppose you were to write a novel, play, screenplay, etc. about some aspect of computers and society. Describe what the theme or themes of your work would be.

Question 12. What does "technology" mean? What are some important ways you use technology in your daily life?

Question 13. Suppose you had to write a short essay or short story entitled "Computers and Me." What would be some key points or themes in that work?

Question 14. Suppose you had to write a short essay or short story entitled "Technology and Me." What would be some key points or themes in that work?

Notes

  1. See Computer literacy at the English Wikipedia. Accessed May 20, 2015.
  2. For example, see Denning, Peter. "Great Principles of Computer Science". Retrieved May 20, 2015. This site organizes principles into seven categories: computation, communication, coordination, recollection, automation, evaluation, and design. There are a number of good ideas, insights, and frameworks in this and related approaches, and in fact many of the key ideas in this book will relate in some way to Denning's principles.
  3. See Putnam, Robert D. (August 2001). Bowling Alone: The Collapse and Revival of American Community. Simon and Schuster. ISBN 978-0-7432-0304-3. Retrieved 29 May 2015.


What is Computing

In this course, we try to focus on computing principles (big ideas) rather than computer technologies, which are tools and applications of the principles. Computing is defined by a set of principles or ideas that underlie a myriad of technologies created based on those principles. Technologies can be complex and constantly evolving, but the principles stay the same. In the second half of the course, we will study various technologies to demonstrate the power of computing and how the principles are applied.

In addition to principles of computing and technologies there are practices of computing - what professionals do to advance computing.

The chart below illustrates the difference between principles of computing and practices of computing. Principles underlie technologies and practices. A consumer exploits the power of computing through the applications built for various tasks. We believe everyone needs to know the principles of computing because such principles are widely applicable. Professionals in the field of computing need to know the two ends and everything in the middle - the practices (activities and skills that make computing useful and effective).

This chart illustrates the difference between principles of computing and practices of computing.

We will use the terms computing and computation interchangeably throughout the book.

Principles of Computing

Computing is fundamentally about information processes. One of the big ideas of computing is that information processes can be carried out purely mechanically via symbol manipulation. The agent that does the computing, whether a thinking human being or a machine (computer), does not matter. Toward the end of the book we will see this is true for all modern computers - digital computers manipulate two symbols (zero and one) blindly according to instructions.

An Analogy

The following analogy from the "Thinking as Computation" book [1] illustrates the idea. Imagine that we have the following table of symbols.

    a  b  c  d  e  f  g  h  i  j
a   aa ab ac ad ae af ag ah ai aj
b   ab ac ad ae af ag ah ai aj ba
c   ac ad ae af ag ah ai aj ba bb
d   ad ae af ag ah ai aj ba bb bc
e   ae af ag ah ai aj ba bb bc bd
f   af ag ah ai aj ba bb bc bd be
g   ag ah ai aj ba bb bc bd be bf
h   ah ai aj ba bb bc bd be bf bg
i   ai aj ba bb bc bd be bf bg bh
j   aj ba bb bc bd be bf bg bh bi

The symbols can be any set of symbols; we pick letters from the English alphabet for simplicity. We can define a procedure P that takes two symbols ('a' through 'j') as input and produces two symbols in the same set as output. Internally, the procedure uses the first input symbol to find a row that starts with that symbol, then uses the second input symbol to find a column with that symbol at the top, and then returns the pair of symbols at the intersection. It is not hard to imagine that such a table-lookup procedure can be done purely mechanically (blindly) by a simple agent (e.g. a device or a machine). Of course a human being can do it, but this type of symbol manipulation requires no human intelligence. Two conclusions can be drawn from this thought experiment:

  • Symbol manipulation can be done mechanically.
  • The machine that performs the manipulation does not need to know the meaning of the symbols nor the purpose of the manipulation.

This procedure becomes meaningful if we know how to interpret the symbols. For example, if the symbols 'a' through 'j' represent the quantities 0 through 9 respectively, this procedure performs single-digit decimal addition. For instance, P(d, f) = P(3, 5) = ai = 08, which is the correct result of 3+5 (a sketch of this lookup in code appears after the table below). The following table is essentially the same as the previous one except that it uses symbols that are meaningful to humans.

    0  1  2  3  4  5  6  7  8  9
0   00 01 02 03 04 05 06 07 08 09
1   01 02 03 04 05 06 07 08 09 10
2   02 03 04 05 06 07 08 09 10 11
3   03 04 05 06 07 08 09 10 11 12
4   04 05 06 07 08 09 10 11 12 13
5   05 06 07 08 09 10 11 12 13 14
6   06 07 08 09 10 11 12 13 14 15
7   07 08 09 10 11 12 13 14 15 16
8   08 09 10 11 12 13 14 15 16 17
9   09 10 11 12 13 14 15 16 17 18
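
To make the table-lookup idea concrete, here is a minimal sketch in Python (an illustration, not part of the original analogy). It assumes the interpretation above, with the symbols 'a' through 'j' standing for the quantities 0 through 9; the function name P and the way the table is built are just convenient choices for the sketch.

SYMBOLS = "abcdefghij"

# Build the lookup table once: TABLE[(row, column)] = pair of output symbols.
# Arithmetic is used here only to construct the table; the lookup itself
# is a purely mechanical match on symbols.
TABLE = {}
for i, row in enumerate(SYMBOLS):
    for j, col in enumerate(SYMBOLS):
        total = i + j
        TABLE[(row, col)] = SYMBOLS[total // 10] + SYMBOLS[total % 10]

def P(x, y):
    """Return the pair of symbols at the intersection of row x and column y."""
    return TABLE[(x, y)]

print(P('d', 'f'))   # prints 'ai', which under the 0-9 interpretation encodes 08, i.e. 3 + 5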

Now that we have a simple procedure P that can instruct a simple agent to add two single digit decimal numbers, we can devise a new procedure P1 that can add three single-digit decimal numbers as shown in the following chart.

This chart illustrates a way to build the P1 procedure from the P procedure.

The new procedure P1 employs three instances of procedure P to add three decimal digits and return two digits as the result. We can view the procedures as machines with inputs and outputs, and the lines as pipes that allow the symbols to go from one place to another. It is not hard to imagine that an agent that can carry out P can also carry out P1, since P1 is entirely made up of instances of P. Note that the dotted rectangle represents the new procedure P1 made up of instances of P, and the answer given by P1 for the sample inputs is correct. Again, the symbols used in the process can be any set of symbols, because internally only simple table lookups are performed.
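
As a rough illustration of the wiring just described (one plausible arrangement; the chart itself is not reproduced here), the following sketch builds P1 out of three calls to the P function from the previous sketch:

def P1(x, y, z):
    """Add three single-digit symbols using three instances of P."""
    t1, s1 = P(x, y)      # x + y  -> carry symbol t1, sum symbol s1
    t2, s2 = P(s1, z)     # s1 + z -> carry symbol t2, sum symbol s2
    _, t3 = P(t1, t2)     # t1 + t2 is at most 2, so this never carries
    return t3 + s2        # two output symbols: "tens" digit then "ones" digit

print(P1('i', 'h', 'g'))  # prints 'cb', which encodes 21, i.e. 8 + 7 + 6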

Now imagine that we could use P1 to construct more complex procedures, for example procedure P2 in the following chart.

This chart illustrates a way to build a P2 procedure from a P1 procedure.

P2 uses P1 to add two double-digit numbers; in fact, we can simply add more P1s to the design to deal with any number of digits.

By now we can make the following observations:

  • Any machine that can perform P can also perform P1, P2, and so on.
  • We have built procedures that perform seemingly more intelligent activities by composing simpler ones, while keeping them doable by simple machines.

If we follow the same line of reasoning, it is not hard to imagine we can create increasingly more complex procedures to instruct the simple machine to do progressively more intelligent things, such as

  • integer subtraction
  • compare two integers (subtract and check the sign of the result)
  • integer multiplication (repeated addition)
  • represent fractions using a pair of integers and do arithmetic on them
  • use matrices of integers to represent systems of equations and solve them using matrix operations
  • use systems of equations to model complex physical systems and perform numerical simulations of these systems

In summary, from this example we can see that simple symbolic operations can be assembled to form larger procedures that perform amazing activities through computational processes. Such activities are not limited to numerical calculations. If we can represent abstract ideas as symbols (just as we represent abstract quantities as concrete numbers) and devise procedures to manipulate the symbols according to the relations among the ideas, we can model reasoning as computational processes. This is what computer science is fundamentally about - information processes with two essential components: representations and a sequence of rules for manipulating the representations. Note that it has nothing to do with electronics or physics. The machine that carries out such processes does not need to know the meaning of the symbols nor why the process yields correct results. The machine only needs to follow the procedures (a set of rules) blindly.

As an example, you can read about a mechanical computer (difference engine) designed by Charles Babbage that can tabulate polynomial functions:

A difference engine: computing the solution to a polynomial function

Another Analogy

Richard Feynman used another similar analogy (file clerk) to explain how computers work from the inside out in his Computer Heuristics Lecture (1 hour 15 mins): http://www.youtube.com/watch?v=EKWGGDXe5MA

History

We have now seen that computing is, in essence, a certain kind of manipulation of symbols. A computer's ability to perform amazing tasks depends on its ability to manipulate symbols according to well-defined rules. In fact, digital computers manipulate only two symbols - zeros and ones. The intelligence of computing lies in the design and implementation of the rules/programs.

In the future, when talking about computer machinery, we will see computers are constructed using such principles.

You may wonder where these ideas came from. Many people in history made significant contributions to the ideas of computing and computers. Gottfried Leibniz (1646–1716), a German philosopher, is considered the first person to dream of reducing reasoning to calculation and building a machine capable of carrying out such calculations. He observed that in arithmetic we represent abstract quantities using symbols and manipulate the symbols to get useful results according to rules. He dreamed that we could likewise represent abstract ideas using symbols and reason about those ideas through similar concrete symbol manipulation, just as we do in arithmetic. Such manipulations give us correct results not because whoever does the manipulation is intelligent, but because the rules of manipulation mirror the relationships between quantities and the logical relationships between ideas.

Because of Leibniz's dream, we now have computer science and universal machines called computers. A computer is fundamentally a physical device that can manipulate symbols following very simple logic rules. Almost all computers are electronic because that happens to be a cheaper and easier way to build them. Computer science is fundamentally about the information processes (dealing with abstract ideas) that take place through symbol manipulation, which follows a recipe (a set of rules). Such recipes are also known as algorithms. No wonder so many computer programming books are called cookbooks :) In computer science we study how to represent information and how to design and apply algorithms to get meaningful results. There are usually many ways to perform the same task. Comparing algorithms for evaluation purposes is called algorithm (complexity) analysis. Communicating an algorithm (recipe) to a computer is called programming/software development. The languages we use for such communication are called computer programming languages. The artifacts of programming are computer programs, or software. The engineering discipline we try to apply in the software development process to produce quality software is called software engineering. So computer science is more about problem solving than about computers. Computing science is probably a more appropriate name for this discipline.

Practices of Computing

Principles are fundamental ideas that permeate all aspects of computing. Practices are not principles, but they are worth identifying because they are the central activities of computing professionals. Practices, sometimes called "know-how", define someone's skill set and level of competency: beginner, competent, and expert. The four core practices of computing, as identified in the Great Principles of Computing project, are:[2]

  • Programming (including multilingual programming practice)
  • Systems and systems thinking
  • Modeling, validating, testing, and measuring
  • Innovating

Programming is an integral part of computer science because it allows us to explore abstract ideas in computer science in concrete ways. It is also an exciting creative process, which brings a great deal of satisfaction when we can make computers do useful things. In this course we will program in a very high-level graphical programming environment to explore ideas in computer science.

Donald Knuth regards programming as akin to composition: well-written programs are a pleasure for others, or for yourself, to read. He believes that programming is triply rewarding:

  • beautiful code (aesthetic)
  • do useful work (humanitarian)
  • get paid (economic)

Programming a computer is essentially teaching the computer how to do things. As we mentioned previously, computers are simple machines that strictly follow orders. For a computer to do the right task, the instructions in our program must be correct and logical. Programs that are executable on a computer are software - serving as the brain of the computer. Software with errors is called buggy software. Testing software on an actual computer can help catch most bugs in it. Testing provides almost immediate feedback on the quality of our programs so that we can fix bugs and improve them. Because of this, we believe programming makes us better thinkers and learners. We will see later why it is hard to prove the correctness of programs.

References

  1. Levesque, Hector J. Thinking as Computation. MIT Press. ISBN 9780262016995
  2. Denning, Peter, The Great Principles of Computing, http://denninginstitute.com/pjd/GP/GP-site/welcome.html


Information Representation

Introductory problem

Computers often represent colors as a red-green-blue (RGB) set of numbers, called a "triple", where each of the red, green, and blue components is an integer between 0 and 255. For example, the color (255, 0, 10) has full red, no green, and a small amount of blue. Write an algorithm that takes as input the RGB components for a color, and returns a message indicating the largest component or components. For example, if the input color is (100, 255, 0), the algorithm should output "Largest component(s): green". And if the input color is (255, 255, 255), then the algorithm should output "Largest component(s): red, green, blue".
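
One possible solution, written as a short Python sketch (the function and variable names are just illustrative choices, not notation used elsewhere in the book):

def largest_components(red, green, blue):
    """Return a message naming the largest RGB component(s)."""
    biggest = max(red, green, blue)
    names = []
    if red == biggest:
        names.append("red")
    if green == biggest:
        names.append("green")
    if blue == biggest:
        names.append("blue")
    return "Largest component(s): " + ", ".join(names)

print(largest_components(100, 255, 0))    # Largest component(s): green
print(largest_components(255, 255, 255))  # Largest component(s): red, green, blue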

Overview of this chapter

One amazing aspect of computers is they can store so many different types of data. Of course computers can store numbers. But unlike simple calculators they can also store text, and they can store colors, and images, and audio, and video, and many other types of data. And not only can they store many different types, but they can also analyze them, and they can transmit them to other computers. This versatility is one reason why computers are so useful, and affect so many areas of our lives.

To understand computers and computer science, it is important to know something about how computers deal with different types of data. Let's return to colors. How are colors stored in a computer? The introductory problem states one way: as an RGB triple. This is not the only possible way. RGB is just one of many color systems. For example, sometimes colors are represented as an HSV triple: by hue, saturation, and value. However, RGB is the most common color representation in computer programs.

This leads to a deeper issue: how are numbers stored in a computer? And why is it important anyway that we understand how numbers, and other different types of data, are stored and processed in a computer? This chapter deals with these and related questions. In particular, we will look at the following:

  1. Why is this an important topic?
  2. How do computers represent numbers?
  3. How do computers represent text?
  4. How do computers represent other types of data such as images?
  5. What is the binary number system and why is it important in computer science?
  6. How do computers do basic operations such as addition and subtraction?

Goals

Upon completing this chapter, you should be able to do the following:

  1. Explain how, at the lowest level, computers represent both numeric and text data, as well as other types of data such as color data.
  2. Explain and use the basic terminology in this area: bit, byte, megabyte, RGB triple, ASCII, etc.
  3. Convert numbers and text from one representation to another.
  4. Convert integers from one representation to another, for example from decimal representation to two's complement representation.
  5. Add and subtract numbers written in unsigned binary or in two's complement representation.
  6. Explain how the number of bits used to represent data affects the range and precision of the representation.
  7. Explain in general how computers represent different types of data such as images.
  8. Do calculations involving amounts of memory or download times for certain datasets.

Data representation and mathematics

How is data representation related to liberal education and mathematics? As you might guess, there is a strong connection. Computers store all data in terms of binary (i.e., base 2) numbers. So to understand computers it is necessary to understand binary. Moreover, you need to understand not only binary basics, but also some of the complications such as the "two's complement" notation discussed below.

Binary representation is important not only because it is how computers represent data, but also because so much of computers and computing is based on it. For example, we will see it again in the chapter on machine organization.

Data representation and society and technology

The computer revolution. That is a phrase you often hear used to describe the many ways computers are affecting our lives. Another phrase you might hear is the digital revolution. What does the digital revolution mean?

Nowadays, many of our devices are digital. We have digital watches, digital phones, digital radio, digital TVs, etc. However, previously many devices were analog: "data ... represented by a continuously variable physical quantity".[1] Think, for example, of an old watch with second, minute, and hour hands that moved continuously (although very slowly for the minute and hour hands). Compare this with many modern-day watches that show a digital representation of the time such as 2:03:23.

This example highlights a key difference between analog and digital devices: analog devices rely on a continuous phenomenon and digital devices rely on a discrete one. As a second example of this difference, an analog radio receives audio radio broadcast signals which are transmitted as radio waves, while a digital radio receives signals which are streams of numbers.[2]

The digital revolution refers to the many digital devices, their uses, and their effects. These devices include not only computers, but also other devices or systems that play a major role in our lives, such as communication systems.

Because digital devices usually store numbers using the binary number system, a major theme in this chapter is binary representation of data. Binary is fundamental to computers and computer science: to understand how computers work, and how computer scientists think, you need to understand binary. The first part of this chapter therefore covers binary basics. The second part then builds on the first and explains how computers store different types of data.

Representation basics

Introduction

Computing is fundamentally about information processes. Each computation is a certain manipulation of symbols, which can be done purely mechanically (blindly). If we can represent information using symbols and know how to process the symbols and interpret the results, we can access valuable new information. In this section we will study information representation in computing.

The algorithms chapters discuss ways to describe a sequence of operations. Computer scientists use algorithms to specify behavior of computers. But for these algorithms to be useful they need data, and so computers need ways to represent data.[3]

Information is conveyed as the content of messages, which when interpreted and perceived by our senses, causes certain mental responses. Information is always encoded into some form for transmission and interpretation. We deal with information all the time. For example, we receive information when we read a book, listen to a story, watch a movie, or dream a dream. We give information when we write an email, draw a picture, act in a show or give a speech. Information is abstract but it is conveyed through concrete media. For instance, a conversation on the phone communicates information but the information is represented by sound waves and electronic signals along the way.

Information is abstract/virtual and the media that carry the information must be concrete/physical. Therefore before any information can be processed or communicated it must be quantified/digitized: a process that turns information into (data) representations using symbols.

People have many ways to represent even a very simple number. For example, the number four can be represented as 4 or IV or |||| or 2 + 2, and so on. How do computers represent numbers? (Or text? Or audio files?)

The way computers represent and work with numbers is different from how we do. Since early computer history, the standard has been the binary number system. Computers "like" binary because it is extremely easy for them. However, binary is not easy for humans. While most of the time people do not need to be concerned with the internal representations that computers use, sometimes they do.

Why binary?

Suppose you and some friends are spending the weekend at a cabin. The group will travel in two separate cars, and you all agree that the first group to arrive will leave the front light on to make it easier for the later group. When the car you are in arrives at the cabin you will be able to tell by the light if your car arrived first. The light therefore encodes two possibilities: on (the other group has already arrived) or off (the other group hasn't arrived yet).

To convey more information you could use two lights. For example, both off could mean the first group hasn't arrived yet, the first light off and second on indicate the first group has arrived but left to get supplies, the first on and second off that the group arrived but left to go fishing, and both on that the group has arrived and hasn't left.

Note the two key ideas here. First, a light can be on or off (we don't allow different levels of light, multiple colors, or other options): just two possibilities. Second, if we want to represent more than two choices we can use more lights.

This "on or off" idea is a powerful one. There are two and only two distinct choices or states: on or off, 0 or 1, black or white, present or absent, large or small, rough or smooth, etc.—all of these are different ways of representing possibilities. One reason the two-choice idea is so powerful is it is easier to build objects—computers, cameras, CDs, and so on—where the data at the lowest level is in two possible states, either a 0 or a 1.[4]

In computer representation, a bit (i.e., a binary digit) can be a 0 or a 1. A collection of bits is called a bitstring. A bitstring that is 8 bits long is called a byte. Bits and bytes are important concepts in computer storage and data transmission, and later on we'll explain them further along with some related terminology and concepts. But first we will look at the basic question of how a computer represents numbers.

A brief historic aside

Claude Shannon is considered the father of information theory because he was the first person to study and build mathematical models for information and the communication of information. He also made many other significant contributions to computing. His seminal paper "A mathematical theory of communication" (1948) changed our view of information, laying the foundation for the information age. Shannon discovered that the fundamental unit of information is a yes-or-no answer to a question, or one bit with two distinct states, which can be represented by only two symbols. He also founded the design theory of digital computers/circuits by proving that propositions of Boolean algebra can be used to build a "logic machine" capable of carrying out general computation (manipulation of two types of symbols). Data, a term closely related to information, refers to representations of information. We will use the terms information representation and data interchangeably.

External and internal information representation

Information can be represented on different levels. It is helpful to separate information representations into two categories: external representation and internal representation. External representation is used for communication between humans and computers. Everything we see on a computer monitor or screen, whether it is text, image, or motion picture, is a representation of certain information. Computers also represent information externally using sound and other media, such as touch pads for the blind to read text.

Internally all modern computers represent information as bits. We can think of a bit as a digit with two possible values. Since a bit is the fundamental unit of information it is sufficient to represent all information. It is also the simplest representation because only two symbols are needed to represent two distinct values. This makes it easy to represent bits physically - any device capable of having two distinct states works, e.g. a toggle switch. We will see later that modern computer processors are made up of tiny switches called transistors.

Review of the decimal number system

When bits are put together into sequences they can represent numbers. We are familiar with representing quantities with numbers. Numbers are concrete symbols representing abstract quantities. With ten fingers, humans conveniently adopted the base ten (decimal) numbering system, which requires ten different symbols: the Arabic numerals 0 through 9, which we all know and use every day. Each symbol represents a power of ten depending on the position the symbol is in.

So, for example, the number one hundred twenty-four is 1×10^2 + 2×10^1 + 4×10^0 = 124. We can emphasize this by writing the powers of 10 over the digits in 124:

10^2  10^1 10^0
   1     2    4

So if we take what we know about base 10 and apply it to base 2 we can figure out binary. But first recall that a bit is a binary digit and a byte is 8 bits. In this chapter most of the binary numbers we talk about will be one byte long.

(Computers actually use more than one byte to represent most numbers. For example, most numbers are actually represented using 32 bits (4 bytes) or 64 bits (8 bytes). The more bits, the more different values you can represent: a single bit permits 2 values, 2 bits give 4 values, 3 bits give 8 values, ..., 8 bits give 256 values, and in general n bits give 2^n values. However, when looking at binary examples we'll usually use 8-bit numbers to keep the examples manageable.)

This base ten system used for numbering is somewhat arbitrary. In fact, we commonly use other base systems to represent quantities of different natures: base 7 for days in a week, base 60 for minutes in an hour, 24 for hours in a day, 16 for ounces in a pound, and so on. It is not hard to see that base 2 (two symbols) is the simplest base system, because with fewer than two symbols we cannot represent change (and therefore cannot convey any information).

Unsigned binary

When we talk about decimal, we deal with 10 digits—0 through 9 (that's where decimal comes from). In binary we have only two digits; that's why it's called binary. The digits in binary are 0 and 1. You will never see any 2's or 3's, etc. If you do, something is wrong. A bit will always be a 0 or a 1.

Counting in binary proceeds as follows:

    0     (decimal 0) 
    1     (decimal 1) 
   10     (decimal 2) 
   11     (decimal 3) 
  100     (decimal 4) 
  101     (decimal 5) 
  ...

An old joke runs, "There are 10 types of people in the world. Those who understand binary and those who don't."

The next thing to think about is what values are possible in one byte. Let's write out the powers of two in a byte:

2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
128  64   32   16   8    4    2    1 

As an example, the binary number 10011001 is 1×128 + 0×64 + 0×32 + 1×16 + 1×8 + 0×4 + 0×2 + 1×1 = 153. Note each of the 8 bits can be either a 0 or a 1. So there are two possibilities for the leftmost bit, two for the next bit, two for the bit after that, and so on: two choices for each of the 8 bits. Multiplying these possibilities together gives 2^8, or 256, possibilities. In unsigned binary these possibilities represent the integers from 0 (all bits 0) to 255 (all bits 1).

All base systems work in the same way: the rightmost digit represents the quantity of the base raised to the zeroth power (recall that anything raised to the 0th power is 1), and each digit to the left represents a quantity that is base times larger than the one represented by the digit immediately to its right. The binary number 1001 represents the quantity 9 in decimal, because the rightmost 1 represents 2^0 = 1, the zeroes contribute nothing at the 2^1 and 2^2 positions, and the leftmost 1 represents 2^3 = 8. When we use different base systems it is necessary to indicate the base as a subscript to avoid confusion. For example, we write 1001₂ to indicate the number 1001 in binary (which represents the quantity 9 in decimal). The subscript 2 means "binary": it tells the reader that the numeral does not represent a thousand and one in decimal. This example also shows us that representations have no intrinsic meaning. The same pattern of symbols, e.g. 1001, can represent different quantities depending on the way it is interpreted. There are many other ways to represent the quantity 9₁₀ (read this as "nine in base 10", i.e., decimal); for instance, the symbol 九 represents the same quantity in Chinese.

As the same quantity can be represented differently, we can often change the representation without changing the quantity it represents. As shown before, the binary representation 1001₂ is equivalent to the decimal representation 9₁₀ - both represent exactly the same quantity. In studying computing we often need to convert between decimal representation, which we are most familiar with, and binary representation, which is used internally by computers.

Binary to decimal conversion

Converting the binary representation of a non-negative integer to its decimal representation is a straightforward process: summing up the quantities each binary digit represents yields the result. For example, 10011001₂ = 128 + 16 + 8 + 1 = 153₁₀.
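
As a small illustration (a sketch of the summation, not the only way to do the conversion), the following Python function carries out exactly this process:

def binary_to_decimal(bits):
    """Sum the power of 2 for each bit that is a 1."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        if bit == '1':
            total += 2 ** position   # rightmost bit is 2^0, next is 2^1, and so on
    return total

print(binary_to_decimal("10011001"))  # 128 + 16 + 8 + 1 = 153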

Decimal to binary conversion

One task you will need to do in this book, and which computer scientists often need to do, is to convert a decimal number to or from a binary number. The last subsection showed how to convert binary to decimal: take each power of 2 whose corresponding bit is a 1, and add those powers together.

Suppose we want to do a decimal to binary conversion. As an example, let's convert the decimal value 75 to binary. Here's one technique that relies on successive division by 2:

75/2   quotient=37   remainder=1
37/2   quotient=18   remainder=1
18/2   quotient=9    remainder=0
9/2    quotient=4    remainder=1
4/2    quotient=2    remainder=0
2/2    quotient=1    remainder=0
1/2    quotient=0    remainder=1

We then take the remainders bottom-to-top to get 1001011. Since we usually work with groups of 8 bits, if the result doesn't fill all eight bits, we add zeroes at the front until it does. So we end up with 01001011.
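
Here is the same successive-division technique as a short Python sketch (one possible phrasing, padded to 8 bits as in the example above):

def decimal_to_binary(n, width=8):
    """Repeatedly divide by 2, collecting the remainders from bottom to top."""
    bits = ""
    while n > 0:
        bits = str(n % 2) + bits   # each remainder becomes the next bit, read bottom-up
        n = n // 2                 # the quotient feeds the next division
    return bits.rjust(width, '0')  # pad with leading zeroes to fill the byte

print(decimal_to_binary(75))  # 01001011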

Binary mathematics

Addition of binary numbers

In addition to storing data, computers also need to do operations such as addition of data. How do we add numbers in binary representation?

Addition of bits has four simple rules, shown here as four vertical columns:

  0      0      1      1
+ 0    + 1    + 0    + 1
=========================
  0      1      1      10

Now if we have a binary number consisting of multiple bits we use these four rules, plus "carrying". Here's an example:

  00110101
+ 10101100
==========
  11100001

Here's the same example, but with the carried bits listed explicitly, i.e., a 0 if there is no carry, and a 1 if there is. When 1+1=10, the 0 is kept in that column's solution and the 1 is carried over to be added to the next column left.

  0111100
  00110101
+ 10101100
==========
  11100001

We can check binary operations by converting each number to decimal: with both binary and decimal we're doing the same operations on the same numbers, but with different representations. If the representations and operations are correct the results should be consistent. Let's look one more time at the example addition problem we just solved above. Converting 00110101₂ to decimal produces 53₁₀ (do the conversion on your own to verify its accuracy), and converting 10101100₂ gives 172₁₀. Adding these yields 53 + 172 = 225, which, when converted back to binary, is indeed 11100001₂.

But binary addition doesn't always work quite right:

  01110100
+ 10011111
==========
 100010011  

Note there are 9 bits in the result, but there should only be 8 in a byte. Here is the sum in decimal:

  116
+ 159 
=====
  275

Note that 275 is greater than 255, the maximum we can hold in an 8-bit number. This results in a condition called overflow. Overflow is not an issue if the computer can go to a 9-bit binary number; however, if the computer has only 8 bits set aside for the result, overflow means that a program might not run correctly or at all.
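
The following sketch mimics an 8-bit adder using the binary_to_decimal and decimal_to_binary helpers sketched earlier in this chapter: it keeps only the lowest 8 bits of the sum and reports whether overflow occurred. (This is an illustration of the idea, not how adder hardware actually works.)

def add_8bit(a_bits, b_bits):
    """Add two 8-bit numbers, keeping only 8 bits of the result."""
    total = binary_to_decimal(a_bits) + binary_to_decimal(b_bits)
    overflow = total > 255                      # would the true sum need a 9th bit?
    return decimal_to_binary(total % 256), overflow

print(add_8bit("00110101", "10101100"))  # ('11100001', False): 53 + 172 = 225 fits in 8 bits
print(add_8bit("01110100", "10011111"))  # ('00010011', True):  116 + 159 = 275 overflows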

Subtraction of binary numbers

Once again, let's start by looking at single bits:

  0      0      1      1
- 0    - 1    - 0    - 1
========================
  0     -1      1      0

Notice that in the 0 - 1 case (shown as -1 above), what we actually do is produce a 1 as the result and borrow from the next column to the left. So let's apply this to an 8-bit problem:

  10011101
- 00100010
==========
  01111011

which is the same as (in base 10),

  157
-  34
======
  123

Here's the binary subtraction again with the borrowing shown:

  1100010
  10011101
- 00100010
==========
  01111011

Most people find binary subtraction significantly harder than binary addition.
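
In the same spirit as the addition sketch, and again assuming the two conversion helpers from earlier in the chapter, we can check a binary subtraction by doing the arithmetic in decimal (a sketch for checking results, not how hardware subtracts):

def subtract_8bit(a_bits, b_bits):
    """Subtract two 8-bit numbers by converting to decimal and back.

    Assumes a_bits represents a value at least as large as b_bits.
    """
    difference = binary_to_decimal(a_bits) - binary_to_decimal(b_bits)
    return decimal_to_binary(difference)

print(subtract_8bit("10011101", "00100010"))  # 01111011, i.e. 157 - 34 = 123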

Other representations

You might have had questions about the binary representation in the last section. For example, what about negative numbers? What about numbers with a fractional part? Aren't all those 0's and 1's difficult for humans to work with? These are good questions. In this and a couple of other sections we'll look at a few other representations that are used in computer science and are related to binary.

Hexadecimal

Computers are good at binary. Humans aren't. Binary is hard for humans to write, hard to read, and hard to understand. But what if we want a number system that is easier to read but still is closely tied to binary in some way, to preserve some of the advantages of binary?

One possibility is hexadecimal, i.e., base 16. But using a base greater than 10 immediately presents a problem. Specifically, we run out of digits after 0 to 9 — we can't use 10, 11, or greater because those have multiple digits within them. So instead we use letters: A is 10, B is 11, C is 12, D is 13, E is 14, and F is 15. So the digits we're using are 0 through F instead of 0 through 9 in decimal, or instead of 0 and 1 in binary.

We also have to reexamine the value of each place. In hexadecimal, each place represents a power of 16. A two-digit hexadecimal number has a 16's place and a 1's place. For example, D8 has D in the 16's place, and 8 in the 1's place:

16^1  16^0  <- hexadecimal places showing powers of 16
16    1     <- value of these places in decimal (base 10)
D     8     <- our sample hexadecimal number

So the hexadecimal number D8 equals 13×16 + 8 = 216 in decimal. Note, however, that any two-digit hexadecimal number can represent the same amount of information as one byte of binary. (That's because the largest two-digit hex number is FF = 255, the same maximum as 8 bits of binary.) So hexadecimal is easier for us to read or write.

When working with a number, there are times when which representation is being used isn't clear. For example, does 10 represent the number ten (so the representation is decimal), the number two (the representation is binary), the number sixteen (hexadecimal), or some other number? Often, the representation is clear from the context. However, when it isn't, we use a subscript to clarify which representation is being used, for example 10₁₀ for decimal, versus 10₂ for binary, versus 10₁₆ for hexadecimal.

Hexadecimal numbers can have more hexadecimal digits than the two we've already seen. For example, consider FF0581A4₁₆, which uses the following powers of 16:

16^7  16^6  16^5  16^4  16^3  16^2  16^1  16^0
F     F     0     5     8     1     A     4

So in decimal this is 15×16^7 + 15×16^6 + 0×16^5 + 5×16^4 + 8×16^3 + 1×16^2 + 10×16^1 + 4×16^0 = 4,278,550,948.

Hexadecimal doesn't appear often, but it is used in some places, for example sometimes to represent memory addresses (you'll see this in a future chapter) or colors. Why is it useful in such cases? Consider a 24-bit RGB color with 8 bits each for red, green, and blue. Since 8 bits requires 2 hexadecimal digits, a 24-bit color needs 6 hexadecimal digits, rather than 24 bits. For example, FF0088 indicates a 24-bit color with a full red component, no green, and a mid-level blue.

Now there are additional types of conversion problems:

* Decimal to hexadecimal
* Hexadecimal to decimal
* Binary to hexadecimal
* Hexadecimal to binary

Here are a couple examples involving the last two of these.

Let's convert the binary number 00111100 to hexadecimal. To do this, break it into two 4-bit parts: 0011 and 1100. Now convert each part to decimal and get 3 and 12. The 3 is a hexadecimal digit, but 12 isn't. Instead recall that C is the hexadecimal representation for 12. So the hexadecimal representation for 00111100 is 3C.

Rather than going from binary to decimal (for each 4-bit segment) and then to hexadecimal digits, you could go from binary to hexadecimal directly.

Hexadecimal digits and their decimal and binary equivalents: first, base 16 (hexadecimal), then base 10 (decimal), then base 2 (binary).

16 10  2 <- bases
===========
0  0   0000
1  1   0001
2  2   0010
3  3   0011
4  4   0100
5  5   0101
6  6   0110
7  7   0111
8  8   1000
9  9   1001
A  10  1010
B  11  1011
C  12  1100
D  13  1101
E  14  1110
F  15  1111

Now let's convert the hexadecimal number D6 to binary. D is the hexadecimal representation for 13, which is 1101 in binary. 6 in binary is 0110. Put these two parts together to get 11010110. Again we could skip the intermediate conversions by using the hexadecimal and binary columns above.
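
For readers who want to experiment, here is a small Python sketch of both conversions from this section, using the built-in int() and format() functions for base changes:

def bin_to_hex(bits):
    # interpret the string as base 2, then write it out in base 16
    return format(int(bits, 2), "X")

def hex_to_bin(hex_digits):
    # interpret the string as base 16, then write it out in base 2,
    # padded to 4 bits per hexadecimal digit
    return format(int(hex_digits, 16), "0" + str(4 * len(hex_digits)) + "b")

print(bin_to_hex("00111100"))   # 3C
print(hex_to_bin("D6"))         # 11010110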

Text representation

[edit | edit source]

A piece of text can be viewed as a stream of symbols, each of which can be represented/encoded as a sequence of bits, resulting in a stream of bits for the text. Two common encoding schemes are ASCII code and Unicode. ASCII code uses one byte (8 bits) to represent each symbol and can represent up to 256 (2^8) different symbols, which include the English alphabet (in both lower and upper cases) and other commonly used symbols. Unicode extends ASCII code to represent a much larger number of symbols using multiple bytes. Unicode can represent any symbol from any written language and much more.
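
As a brief illustration (a Python sketch, not part of the original text), ord() gives a character's numeric code and encode() turns text into bytes:

print(ord("A"))                  # 65
print(format(ord("A"), "08b"))   # 01000001, the 8-bit pattern for 'A'
print("Hi".encode("ascii"))      # b'Hi' -- one byte per symbol
print("café".encode("utf-8"))    # b'caf\xc3\xa9' -- the 'é' takes two bytes in Unicode (UTF-8)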

Image, audio, and video files

[edit | edit source]

Images, audio, and video are other types of data. How computers represent these types of data is fascinating but complex. For example, there are perceptual issues (e.g., what types of sounds can humans hear, and how does that affect how many numbers we need to store to reliably represent music?), size issues (as we'll see below, these types of data can result in large file sizes), standards issues (e.g., you might have heard of JPEG or GIF image formats), and other issues.

We won't be able to cover image, audio, and video representation in depth: the details are too complicated, and can get very sophisticated. For example, JPEG images can rely on an advanced mathematical technique called the discrete cosine transform. However, it is worth examining a few key high-level points about image, audio, and video files:

  1. Computers can represent not only basic numeric and text data, but also data such as music, images, and video.
  2. They do this by digitizing the data. At the lowest level the data is still represented in terms of bits, but there are higher-level representational constructs as well.
  3. There are numerous ways to encode such data, and so standard encoding techniques are useful.
  4. Audio, image, and video files can be large, which presents challenges in terms of storing, processing and transmitting these files. For this reason most encoding techniques use some sophisticated types of compression.

Images

[edit | edit source]

A perceived image is the result of light beams physically coming into our eyes and triggering nerves to send signals to our brain. In computing, an image is simulated by a grid of dots (called pixels, for "picture element"), each of which has a particular color. This works because our eyes cannot tell the difference between the original image and the dot-based image if the resolution (number of dots used) is high enough. In fact, the computer screen itself uses such a grid of pixels to display images and text.

"The largest and most detailed photograph of our galaxy ever taken has been unveiled. The gigantic nine-gigapixel image captures more than 84 million stars at the core of the Milky Way. It was created with data gathered by the Visible and Infrared Survey Telescope for Astronomy (VISTA) at the European Southern Observatory's Paranal Observatory in Chile. If it was printed with the resolution of a newspaper it would stretch 30 feet long and 23 feet tall, the team behind it said, and has a resolution of 108,200 by 81,500 pixels."[5]

While this galaxy image is obviously an extreme example, it illustrates that images (even much smaller images) can take significant computer space. Here is a more mundane example. Suppose you have an image that is 1500 pixels wide, and 1000 pixels high. Each pixel is stored as a 24-bit color. How many bytes does it take to store this image?

This problem describes a straightforward but naive way to store the image: for each row, for each column, store the 24-bit color at that location. The answer is 1500 × 1000 pixels multiplied by 24 bits/pixel, divided by 8 bits per byte = 4.5 million bytes, or about 4.5MB.
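
The same arithmetic, written as a short Python sketch:

width, height = 1500, 1000      # pixels
bits_per_pixel = 24
size_bytes = width * height * bits_per_pixel / 8   # 8 bits per byte
print(size_bytes)               # 4500000.0, i.e. about 4.5 MB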

Note the file size. If you store a number of photographs or other images you know that images, and especially collections of images, can take up considerable storage space. You might also know that most images do not take 4.5MB. And you have probably heard of some image storage formats such as JPEG or GIF.

Why are most image sizes tens or hundreds of kilobytes rather than megabytes? Most images are stored not in a direct format, but using some compression technique. For example, suppose you have a night image where the entire top half of the image is black ((0,0,0) in RGB). Rather than storing (0,0,0) as many times as there are pixels in the upper half of the image, it is more efficient to use some "shorthand." For example, rather than having a file that has thousands of 0's in it, you could have (0,0,0) plus a number indicating how many pixels at the start of the image (if you read it line by line from top to bottom) have color (0,0,0).

This leads to a compressed image: an image that contains all, or most, of the information in the original image, but in a more efficient representation. For example, if an original image would have taken 4MB, but the more efficient version takes 400KB, then the compression ratio is 4MB to 400KB, or about 10 to 1.
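
To make the "shorthand" idea concrete, here is a toy run-length encoder in Python over a single row of pixels; this is only a sketch of the general idea, and real formats such as PNG or JPEG are far more sophisticated:

def run_length_encode(pixels):
    runs = []                                 # list of (color, count) pairs
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1] = (p, runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((p, 1))               # start a new run
    return runs

row = [(0, 0, 0)] * 1000 + [(255, 255, 255)] * 8   # mostly black pixels
print(run_length_encode(row))
# [((0, 0, 0), 1000), ((255, 255, 255), 8)] -- 2 runs instead of 1008 pixel values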

Complicated compression standards, such as JPEG, use a variety of techniques to compress images. The techniques can be quite sophisticated.

How much can an image be compressed? It depends on a number of factors. For many images, a compression ratio of, say, 10:1 is possible, but this depends on the image and on its use. For example, one factor is how complicated an image is. An uncomplicated image (say, as an extreme example, if every pixel is black[6]), can be compressed a very large amount. Richer, more complicated images can be compressed less. However, even complicated images can usually be compressed at least somewhat.

Another consideration is how faithful the compressed image is to the original. For example, many users will trade some small discrepancies between the original image and the compressed image for a smaller file size, as long as those discrepancies are not easily noticeable. A compression scheme that doesn't lose any image information is called a lossless scheme. One that does is called lossy. Lossy compression will give better compression than lossless, but with some loss of fidelity.[7]

In addition, the encoding of an image includes other metadata, such as the size of the image, the encoding standard, and the date and time when it was created.

Video

[edit | edit source]

It is not hard to imagine that videos can be encoded as a series of image frames, with synchronized audio tracks also encoded using bits.

Suppose you have a 10 minute video, 256 x 256 pixels, 24 bits per pixel, and 30 frames of the video per second. You use an encoding that stores all bits for each pixel for each frame in the video. What is the total file size? And suppose you have a 500 kilobit per second download connection; how long will it take to download the file?

This problem highlights some of the challenges of video files. Note the answer to the file size question is (256 × 256) pixels × 24 bits/pixel × 30 frames per second × 10 minutes × 60 seconds/minute = approximately 28 Gb (Gb means gigabits). This is about 28/8 = 3.5 gigabytes. With a 500 kilobit per second download rate, this will take 28 Gb / 500 Kbps, or about 56,000 seconds. This is over 15 hours, longer than many people would like to wait. And the time will only increase if the number of pixels per frame is larger (e.g., in a full screen display), or the video length is longer, or the download speed is slower.
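
The arithmetic in this example can be checked with a few lines of Python (a sketch of the calculation only):

bits = 256 * 256 * 24 * 30 * 10 * 60   # pixels/frame * bits/pixel * frames/s * seconds
print(bits / 1e9)                      # about 28.3 gigabits
print(bits / 8 / 1e9)                  # about 3.5 gigabytes
print(bits / 500e3 / 3600)             # about 15.7 hours at 500 kilobits per second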

So video file size can be an issue. However, it does not take 15 hours to download a ten minute video; as with image files, there are ways to decrease the file size and transmission time. For example, standards such as MPEG make use not only of image compression techniques to decrease the storage size of a single frame, but also take advantage of the fact that a scene in one frame is usually quite similar to the scene in the next frame. There's a wealth of information online about various compression techniques and standards, storage media, etc.[8]

Audio

[edit | edit source]

It might seem, at first, that audio files shouldn't take anywhere as much space as video. However, if you think about how complicated audio such as music can be, you probably won't be surprised that audio files can also be large.

Sound is essentially vibration, or collections of sound waves travelling through the air. Humans can hear sound waves that have frequencies between 20 and 20,000 cycles per second.[9] To avoid certain undesirable artifacts, audio files need to use a sample rate of at least twice the highest frequency. So, for example, CD music is usually sampled at 44,100 Hz, or 44,100 times per second.[10] And if you want a stereo effect, you need to sample on two channels. For each sample you want to store the amplitude using enough bits to give a faithful representation. CDs usually use 16 bits per sample. So a minute of music takes 44,100 samples/second × 16 bits/sample × 2 channels × 60 seconds/minute ÷ 8 bits/byte = about 10.5MB per minute. This means a 4 minute song will take about 40MB, and an hour of music will take about 630 MB, which is (very) roughly the amount of memory a typical CD will hold.[11]
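
The same CD-audio arithmetic as a short Python sketch:

bytes_per_minute = 44100 * 16 * 2 * 60 / 8   # samples/s * bits/sample * channels * s/min, then bits to bytes
print(bytes_per_minute / 1e6)                # about 10.6 MB per minute
print(4 * bytes_per_minute / 1e6)            # a 4-minute song: roughly 42 MB
print(60 * bytes_per_minute / 1e6)           # an hour of music: roughly 635 MB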

Note, however, that if you want to download a 40 MB song over a 1Mbps connection, it will take 40MB/1Mbps, which comes to about 320 seconds. This is not a long time, but it would be desirable if it could be shorter. So, not surprisingly, there are compression schemes that reduce this considerably. For example, there is an MPEG audio compression standard that will compress a 4 minute song to about 4MB, a considerable reduction.[12]

Sizes and limits of representations

[edit | edit source]

In the last section we saw that a page of text could take a few thousand bytes to store. Images files might take tens of thousands, hundreds of thousands, or even more bytes. Music files can take millions of bytes. Movie files can take billions. There are databases that consist of trillions or quadrillions of bytes of data.

Computer science has special terminology and notation for large numbers of bytes. Here is a table of memory amounts, their powers of two, and the approximate American English word for each amount.

1 kilobyte (KB) — 2^10 bytes — thousand bytes
1 megabyte (MB) — 2^20 bytes — million bytes
1 gigabyte (GB) — 2^30 bytes — billion bytes
1 terabyte (TB) — 2^40 bytes — trillion bytes
1 petabyte (PB) — 2^50 bytes — quadrillion bytes
1 exabyte (EB) — 2^60 bytes — quintillion bytes

There are names for still larger quantities of this type.[13]

Kilobytes, megabytes, and the other sizes are important enough for discussing file sizes, computer memory sizes, and so on, that you should know both the terminology and the abbreviations. One caution: file sizes are usually given in terms of bytes (or kilobytes, megabytes, etc.). However, some quantities in computer science are usually given in terms involving bits. For example, download speeds are often given in terms of bits per second. "Mbps" is an abbreviation for megabits (not megabytes) per second. Notice the 'b' in Mbps is lower case, while the 'B' in MB (megabytes) is capitalized.

In the context of computer memory, the usual definition of kilobytes, megabytes, etc. is a power of two. For example, a kilobyte is 2^10 = 1,024 bytes, not a thousand. In some other situations, however, a kilobyte is defined to be exactly a thousand bytes. This can obviously be confusing. For the purposes of this book, the difference will usually not matter. That is, in most problems we do, an approximation will be close enough. So, for example, if we do a calculation and find a file takes 6,536 bytes, then you can say this is approximately 6.5 KB, unless the problem statement says otherwise.[14]

All representations are limited in multiple ways. First, the number of different things we can represent is limited because the number of combinations of symbols we can use is always limited by the physical space available. For instance, if you were to represent a decimal number by writing it down on a piece of paper, the size of the paper and the size of the font limit how many digits you can put down. Similarly, in a computer the number of bits that can be stored physically is also limited. With three binary digits we can generate 2^3 = 8 different representations/patterns, namely 000, 001, 010, 011, 100, 101, 110, and 111, which conventionally represent 0 through 7 respectively. Keep in mind that representations do not have intrinsic meanings, so three bits could just as well represent eight different things of some other kind. With n bits we can represent 2^n different things, because each bit can be either one or zero and 2^n is the total number of combinations we can get, which limits the amount of information we can represent.
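
A small Python sketch makes the counting argument concrete (the choice of n = 3 is just for illustration):

from itertools import product

n = 3
patterns = ["".join(bits) for bits in product("01", repeat=n)]
print(patterns)                 # ['000', '001', '010', '011', '100', '101', '110', '111']
print(len(patterns), 2 ** n)    # 8 8 -- n bits give 2^n distinct patterns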

Another type of limit is due to the nature of the representations. For example, one third can never be represented precisely in a decimal format with a fractional part because there would be an infinite number of threes after the decimal point. Similarly, one third cannot be represented precisely in binary format either; in other words, it is impossible to represent one third as the sum of a finite list of powers of two. However, in a base-three numbering system one third can be represented precisely as 0.1, because the one after the point represents a power of three: 3^-1 = 1/3.

Notes and references

[edit | edit source]
  1. Analog at Wiktionary.
  2. Actually, it's more complicated than that because some devices, including some digital radios, intermix digital and analog. For example, a digital radio broadcast might start in digital form, i.e., as a stream of numbers, then be converted into and transmitted as radio waves, then received and converted back into digital form. Technically speaking the signal was modulated and demodulated. If you have a modem (modulator-demodulator) on your computer, it fulfills a similar function.
  3. Actually we need not only data, but a way to represent the algorithms within the computer as well. How computers store algorithm instructions is discussed in another chapter.
  4. Of course how a 0 or 1 is represented varies according to the device. For example, in a computer the common way to differentiate a 0 from a 1 is by electrical properties, such as using different voltage levels. In a fiber optic cable, the presence or absence of a light pulse can differentiate 0's from 1's. Optical storage devices can differentiate 0's and 1's by the presence or absence of small "dents" that affect the reflectivity of locations on the disk surface.
  5. [1]
  6. You might have seen modern art paintings where the entire work is a single color.
  7. See, for example, [2] for examples of the interplay between compression rate and image fidelity.
  8. For example, see [3] and the links there.
  9. This is just a rough estimate since there is much individual variation as well as other factors that affect this range.
  10. Hz, or Hertz, is a measurement of frequency. It appears in a variety of places in computer science, computer engineering, and related fields such as electrical engineering. For example, a computer monitor might have a refresh rate of 60Hz, meaning it is redrawn 60 times per second. It is also used in many other fields. As an example, in most modern day concert music, A above middle C is taken to be 440 Hz.
  11. See, for example, [4] for more information about how CDs work. In general, there is a wealth of web sites about audio files, formats, storage media, etc.
  12. Remember there is also an MPEG video compression standard. MPEG actually has a collection of standards: see Moving Picture Experts Group on Wikipedia.
  13. See, for example, binary prefixes.
  14. The difference between "round" numbers, such as a million, and powers of 2 is not as pronounced for smaller numbers of bytes as it is for larger. A kilobyte is 1,024 bytes, which is only 2.4% more than a thousand. A megabyte is 1,048,576 bytes, about 4.9% more than one million. A gigabyte is 1,073,741,824 bytes, about 7.4% more than a billion, and a terabyte is 1,099,511,627,776 bytes, about 10.0% more than a trillion. In most of the file size problems we do, we'll be interested in the approximate size, and being off by 2% or 5% or 10% won't matter. But of course there are real-world applications where it does matter, so when doing file size problems keep in mind we are doing approximations, not exact calculations.


Algorithms and Programs

Algorithms and Programs

[edit | edit source]

An algorithm can be defined as a set of steps used to solve a specific problem. For example, a cook may use a recipe when preparing a specific type of food. Similarly, in computer science, algorithms are the conceptual solutions used to create programs. It is important to distinguish an algorithm from a program. The implementation of an algorithm is known as a program.

Defining information processes

[edit | edit source]

Computing is about information processes. Once information is represented concretely using different patterns of symbols, it can be processed to derive new information. We learned that computers use the binary system internally to represent everything as sequences of bits - zeros and ones. Chapter 1 of the Blown to Bits book talks about the digital explosion of bits as the result of innovations in computing and technology, which enable us to turn information into bits and share it with unprecedented speed.

Creating information processes is the topic of this chapter. We will learn that information processes start with conceptual solutions to problems and then can be implemented (coded) in ways understandable to machines. The conceptual solutions are called algorithms and the executable implementations are called programs.

What is an algorithm?

[edit | edit source]

Algorithm is a rather fancy name for a simple idea: a step-by-step solution to a problem. Avi Wigderson once said that an algorithm is a common language for nature, humans, and computers. The idea has been around for a long time. You are already familiar with many algorithms, such as tying your shoes, making coffee, sending an email, and cooking a dish according to a recipe. Algorithms in computing are designed for computers to follow. Imagine we have built a machine that can perform the single digit addition procedure described in chapter one. Recall the procedure performs the addition using simple table lookup. If we give the machine two digits and ask it to perform the operation, it gives two digits back as the answer. Of course the numbers in the input and the output have to be represented (encoded) properly. Even though the machine doesn't understand addition, it should be able to perform the addition correctly. However, the machine will not perform the addition unless it is instructed to do so. The command (with input values) that signals the machine to perform an addition is called an instruction. It is not hard to imagine using the addition procedure to create other, more complex procedures that can perform more impressive activities. Before we can create such procedures we must identify a problem and find a conceptual solution to it. Often the conceptual solution is one that can be carried out manually by a person. This conceptual solution is an algorithm.

Why study algorithms?

[edit | edit source]

Algorithms are a centerpiece of the computer science discipline. As we discussed in chapter one, computing can be done blindly or purely mechanically by a simple device. The intelligence of any computation (information process) lies in the algorithm that defines it.

For an algorithm to be useful, it must be correct - the steps must be logical and specific for machines to carry out - and efficient - it must finish in a reasonable amount of time. The correctness and efficiency of algorithms are two key issues in the study of algorithms.

Programs are implemented algorithms

[edit | edit source]

Studying algorithms allows us to solve problems conceptually regardless of the machines that carry out the solutions. An algorithm must communicate a conceptual solution in an unambiguous and human-understandable fashion. A notational system for describing algorithms should allow us to describe and reason about ideas on paper. Once an algorithm's correctness is verified on paper, it can be implemented as a program understandable to a particular machine.

Formal definition of algorithm

[edit | edit source]

Alan Turing was the first person to study algorithms mathematically by creating a universal machine model, later called the Turing machine. He also showed that in some circumstances computation is unavoidable - we have to carry out the computation to get the result - which helped separate computing from mathematics (the birth of computer science). The Turing machine can represent/encode information in a standard form and interpret and update the representation according to rules (algorithms), which are themselves part of the representation. This machine model is simple yet powerful. In fact, it is the most powerful computing model known to computer scientists: a Turing machine can perform any computation done by any machine we could ever build. The notion of Turing machine equivalence is defined based on this idea.

The Turing machine model allows us to study algorithms in the abstract. For instance, we can view each algorithm as a state machine: an algorithm always starts with a state - represented by a data representation of the input and the internal information - and moves through a number of states to its final state as the result of performing the operations prescribed in the algorithm. When the number of possible initial states approaches infinity, the same algorithm can generate a potentially infinite number of computations. This explains why it is hard to verify the correctness of an algorithm through testing, as the initial states can be too many to exhaustively enumerate.

Define algorithms

[edit | edit source]

An algorithm is simply a set of steps that allow us to solve a particular problem. A step is a unit of work that is unambiguous and can be done in a fixed amount of time. For example, bringing a pot of water to a boil is a step in the tea-making process/algorithm. In computing we deal with representations of information (data), so a step can be adding two integers and storing the result in a variable. We will explain how to define and use variables later.

The definition of a unit of work depends on what the agent performing the work can do. Algorithms in computing are necessarily informed by the capabilities of computing machines. Recall that algorithms must be implemented/described in a programming language understandable to a machine before the machine can perform the task. There are many different programming languages, and therefore different ways to express the same algorithm. The only language understandable to a specific machine is called its machine language. Machine language instructions consist of zeros and ones (binary bits) because computers are fundamentally machines that manipulate two types of symbols. Each different type of machine is designed to understand its own native language - patterns of zeros and ones - because machines can have very different computing hardware. As you can imagine, writing programs in machine languages can be very hard. Normally we write programs to express our algorithms in high level languages - languages that are close to our natural language, e.g. English. Then we use tools (compilers and interpreters) to translate our programs in high level languages into machine languages, which is analogous to using a personal interpreter when we travel abroad and don't understand the native language. To run the same program on a different machine we can simply recompile it or use a different interpreter. High level languages hide the differences between machines, allowing us to write programs in a machine-independent way, which is a huge time saver. When we write programs in high level languages we use an abstraction that is supported by all computers. For instance, if a high level language allows the expression of an addition, we assume it can be done by all computers.

Programming languages (high level or machine level) are tools for expressing algorithms to machines. When we create algorithms to solve problems conceptually, we want to create them independent of any particular language. A well-designed recipe ought to work for different cooks in different kitchens. So the steps or units of work must be defined in terms of a higher abstraction - a set of common data structures, operations, and control structures that all languages support. Creating algorithms conceptually using abstractions allows us humans to think at a higher level, closer to the problem domain we know. When an algorithm is implemented in a particular language, the abstract steps can be mapped to specific expressions in that language. We are confident the chain of tools we have can translate the solution into machine-level executable code. The following diagram shows that the same algorithm can be implemented in a variety of programming languages and the resulting programs can be executed on different machine models (maybe after some translation).

This diagram shows that the same algorithm can be implemented in a variety of programming languages and the resulting programs can be run on different machine models.

Here are the common operations and control structures we can assume all high level languages support:

  • data structures: variables of single value and list of values.
  • operations: arithmetic operations, comparisons (relational operations), and logical operations (and, or, and not)
  • control structures: sequential (one after another), conditional (selective on a condition), and repetition

Here is an algorithm defined in pseudo code (natural language). This algorithm finds the largest number in a list of numbers with the following steps:

  1. set max (a variable) to the value of the first number in the list (store and retrieve values).
  2. go through the list one number at a time comparing each number to max, if the number is greater than max replace max with the number (conditional and repetition).
  3. the value stored in max is the answer.

We know the solution is correct because it is so simple: we know how to carry it out manually, even though we would certainly not solve the problem using the process expressed in the pseudo code. It is necessary to design and express the algorithm in this detailed fashion for computers. Keep in mind that computers are machines that perform computation mechanically and therefore the instructions must be specific. The pseudo code, even though written in natural language, must use the aforementioned constructs (data structures, operations, and control structures) that are common to computers. You may wonder why we cannot ask the computer to look at the whole list and identify the largest number as we humans do. Computers are simple machines that cannot think; they are designed to perform simple operations, e.g. adding two digits via symbol manipulation. Even we, as human beings, would not be able to scan a long list, say a million numbers, and find the largest number at a glance. The algorithm we write must be expressed in terms of what a computer can do and must be scalable to inputs (data sets) of arbitrary sizes. The algorithm we just studied can deal with a list of any size. In fact, it makes little difference to a computer whether the list has three numbers or three million numbers.
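
For comparison, here is one possible Python rendering of the same algorithm; it is a sketch that follows the pseudo code above step by step rather than the book's Scratch version:

def find_largest(numbers):
    max_value = numbers[0]         # step 1: start with the first number
    for number in numbers[1:]:     # step 2: go through the rest of the list
        if number > max_value:
            max_value = number     # replace max when a larger number is seen
    return max_value               # step 3: max holds the answer

print(find_largest([3, 41, 7, 29, 15]))   # 41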

Another way to express the same algorithm is to use a graphical notation called flow-chart.

This flow-chart shows the steps involved in finding the largest number in a list of numbers.

This chart shows the logic of the solution more clearly. There are two conditionals - the checking of a condition and the corresponding actions taken as a result. The top-most conditional defines a repetition (loop) because there is an unconditional branch back to the conditional, as expressed by the arrow with no label.

Both the pseudo code and the flow chart describe the same solution to the same problem. They are different representations of the same idea. The following figure shows an implementation of the algorithm in Scratch.

This stack of Scratch blocks finds the largest number in a list of numbers.

As you can see, a concrete implementation of an algorithm must use the building "blocks" available in the particular implementation language. What is not shown in the figure is the part where the list is populated with data from a file or user input. The structure of the code resembles that of the flow-chart.

In summary, constructing and studying algorithms allows us to study (algorithmic) solutions to problems in a way that is neutral to languages and computing environments. We could use the "finding largest number" algorithm as a single block to form more complex algorithms, but we need to keep in mind that this block is not a unit of work, as the number of steps involved depends on the input size (the length of the list). We will revisit this topic in the future when we talk about functional decomposition (to keep algorithms simple) and algorithm complexity analysis (to count the cost of algorithms).


Programs

Each software program boils down to two components - data (structures) and algorithms. We will study some fundamental algorithms and data structures in computer science.

Example Algorithms

[edit | edit source]

Image encoding/representation

[edit | edit source]

Follow this http://csunplugged.org/sites/default/files/activity_pdfs_full/unplugged-02-image_representation.pdf image representation activity to see how images are encoded, transmitted, and reproduced in fax machines.

Error Detection

[edit | edit source]

Follow this http://csunplugged.org/sites/default/files/activity_pdfs_full/unplugged-04-error_detection.pdf error detection activity to see how the algorithm works to detect, and also correct, single-bit errors.

A similar algorithm, the Luhn algorithm, is used to validate credit card numbers.
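
As a rough illustration (a Python sketch, not part of the CS Unplugged activity): starting from the rightmost digit, the Luhn check doubles every second digit, subtracts 9 from any result greater than 9, sums everything, and accepts the number if the sum is divisible by 10.

def luhn_valid(number):
    total = 0
    for i, digit in enumerate(reversed(number)):
        d = int(digit)
        if i % 2 == 1:       # every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

print(luhn_valid("79927398713"))   # True -- a commonly used Luhn test number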

Text Compression

[edit | edit source]

Text compression is another important task in computing. The following activity demonstrates how a compression algorithm works: http://csunplugged.org/sites/default/files/activity_pdfs_full/unplugged-03-text_compression.pdf

Searching

[edit | edit source]

Why is searching important? We do it on a daily basis, and it is good business too. Google's mission is to organize the world's information and make it universally accessible and useful. Obviously, being able to find the information we need quickly is very useful and profitable.

We can always find a piece of information by going through a list sequentially, checking each item. Could you describe this algorithm using either the pseudo code or the flow-chart notation? Structure-wise this algorithm should resemble the "find largest number" algorithm. This algorithm needs two inputs: the list and the target item we are looking for. The repeated steps are fetching the next item and comparing it to the target. The piece of information used in the comparison is also known as the key because it determines whether a search is successful or not. For instance, if we have a list of students we can search for a student by last name, birthdate, or shoe size; these are search keys.
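
One possible way to write this sequential search, sketched in Python (any implementation following the same pattern would do):

def sequential_search(items, target):
    for index, item in enumerate(items):
        if item == target:
            return index        # found: report where it is
    return -1                   # went through the whole list without a match

print(sequential_search([5, 3, 9, 7], 9))   # 2
print(sequential_search([5, 3, 9, 7], 4))   # -1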

A sequential search is straightforward, but it can be costly if we need to perform it very often. However, if the list is ordered by the search key we can use a much better algorithm by taking advantage of this ordered property of the data. Think about the index of a book, a phone book, or a dictionary. They are all ordered somehow. For instance, home phone numbers in a phone book are usually ordered by owners' last names, and business phone numbers are ordered by business type. Entries in a dictionary or the index of a book are ordered alphabetically. How does this orderedness help us when we search for information? It allows us to guesstimate where the search target is located. The number guessing game illustrates the idea well. If the list of numbers is ordered increasingly, we can always guess that the target number is in the middle of the search range (initially the whole list). If we are not lucky, we can still eliminate half of the candidates - if the number in the middle is greater than the search target, the new search range will be the first half of the previous search range, otherwise the second half. Imagine the reduction in the number of comparisons! We will study this algorithm in much more detail when we discuss algorithm complexity.
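
Here is a sketch of the "guess the middle" idea on a sorted list, written in Python (this is the binary search algorithm we will analyze later):

def binary_search(sorted_items, target):
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        middle = (low + high) // 2
        if sorted_items[middle] == target:
            return middle              # found
        elif sorted_items[middle] > target:
            high = middle - 1          # keep searching the first half
        else:
            low = middle + 1           # keep searching the second half
    return -1                          # not found

print(binary_search([2, 5, 8, 12, 16, 23, 38], 16))   # 4
print(binary_search([2, 5, 8, 12, 16, 23, 38], 7))    # -1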

Social impact

[edit | edit source]

Please watch the following video How algorithms shape our world. Could you explain in what ways algorithms are shaping our world?

In the world of computing, we store and process data (representations of information) by quantifying information. This quantification process reduces the world to what can be counted and measured, and it emphasizes abstraction and efficiency (speed). We must not be fooled into believing that abstractions of reality are the true reality, as in abstractionism. As Frederick Brooks warns us, "Models are intentional oversimplifications to help us with real-life problems that are frighteningly complicated. The map is not the terrain."


Algorithm Design

Algorithm Design

[edit | edit source]

Algorithm design is a method for creating a step-by-step process to solve a problem. One powerful example of algorithm design can be seen when solving a Rubik's cube. When solving a Rubik's cube (of any size), it is important to know the step-by-step instructions, or algorithm, that reach an efficient solution. This is where algorithm design comes into play. There are designs that break down the seemingly complex solution by addressing each layer (first, middle, and last) and the colors.

Please follow the link on solving the last layer of a 3X3 Rubik's cube.

Approaches to algorithm design

[edit | edit source]

Top Down

[edit | edit source]

The top down approach to design starts by examining the full problem; one way to think of it is to look at the big/whole picture first. Once you have assessed the main problem, you divide the problem into smaller components or parts.

The next portion of the top down approach is to begin testing. Initially, there will be portions that are missing because we focused on the bigger picture first. In situations where parts of the problem have not yet been solved, stubs or placeholders are used as temporary stand-ins.

One way to think about the top down approach is in a hierarchical setting, such as a general commanding his troops. The general breaks down a mission by assigning each soldier a specific task to complete, which in turn contributes to a critical part of the overall mission.

Bottom Up

[edit | edit source]

The bottom up approach to algorithm design starts with the smallest units or parts of a problem. This approach solves the smallest units first and then gradually builds out the next layer of the solution. Using this method ensures that the smallest unit has been successfully tested, so that when you start implementing the next sub-solution you know the previous layers already work.

One example is building a car. Each piece of the car is engineered, created, and tested piece by piece. Knowing that the smaller parts work correctly, the parts are gradually added on an assembly line. As the parts are added, you know that the smaller components work because each piece has been thoroughly tested. Eventually, as you walk through this process, the end result is a properly functioning car.

Algorithm Design: Building Blocks

[edit | edit source]

There are basic logic structures and operations involved in algorithm design. Building blocks are necessary to decide how we want to manipulate units of work. The basis of every algorithm is steps or blocks of operations. These steps/blocks of operation can be as simple as adding two numbers together. However, these blocks of operation can also be complex, for example, finding the maximum value in a list of numbers.

Logic structures are important for organizing steps into a process/solution. The following four basic logic structures describe the types of blocks used for creating algorithms (a short code sketch follows the list):

  • procedure/function call - one example would be a single block in Scratch
  • sequence - in order to create a sequence you need a stack of blocks
  • alternatives - use of the if-then-else blocks to indicate alternate solutions for a particular problem
  • iteration - using the "repeat", "for", and "forever" blocks to build loops to solve problems
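
Here is the promised sketch, using Python rather than Scratch blocks, with a line or two for each of the four structures (the function double() is made up purely for illustration):

def double(x):                 # a procedure/function we can call
    return 2 * x

total = 0                      # sequence: statements run one after another
numbers = [4, 7, 1]
for n in numbers:              # iteration: repeat for each number in the list
    if n > 5:                  # alternative: choose between two actions
        total += double(n)     # procedure/function call
    else:
        total += n
print(total)                   # 4 + 14 + 1 = 19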

Most languages have programming constructs for basic operations and logic structures. In order to understand programming constructs you must first learn about constructed language. According to Wikipedia, a constructed language "is a language whose phonology, grammar, and vocabulary has been consciously devised for human or human-like communication, instead of having developed naturally". Similarly, a programming construct is "designed to communicate instructions to a machine, particularly a computer."


Algorithm Complexity

Algorithm Complexity

[edit | edit source]

"An algorithm is an abstract recipe, prescribing a process that might be carried out by a human, by computer, or by other means. It thus represents a very general concept, with numerous applications."—David Harel, "Algorithmics - the spirit of computing".

We have learned that algorithms are conceptual solutions to problems. Computing algorithms can be described/defined in terms of the common units of work that computers can do, so that they are neutral to programming languages and to the execution environment (computers). Algorithm writing is a creative process. As Francis Sullivan noted, "for me, great algorithms are the poetry of computing. Just like verse, they can be terse, allusive, dense, and even mysterious. But once unlocked, they cast a brilliant new light on some aspects of computing." You have seen example algorithms documented using abstract notations such as pseudo code and flowcharts.

Once algorithms are implemented/coded, i.e. represented by concrete symbols, they come alive in the programs that embody them. Programs are executable/runnable by computers. The ability to run different programs that implement all kinds of algorithms is a unique feature of computers as machines. Usually when we buy a machine, e.g. an appliance, we assume it has a set of well-defined functions it can perform. For example, a microwave oven is supposed to warm and cook our food and nothing more; we don't expect a microwave oven ever to be able to wash clothes for us. A computing machine (a computer) is different. We expect it to perform whatever function the program it runs makes it do - the functionality of a computer is extensible. If we liken a computer to a car, programmers are drivers who can make the car do different things (to a certain extent) and users are like passengers taking advantage of the things the car can do. This is another reason why everyone should learn computer programming: it gives you the freedom to make the computer do different things.

Correctness of Algorithms

[edit | edit source]

Algorithms must be correct to be useful. We must examine our algorithms carefully to remove errors before using them to create programs. If an algorithm has errors, it is impossible to create a correct program from it. We often remind students that if their algorithms do not work on paper they won't work on a computer, so we must work out our algorithms on paper first.

Even if the design of an algorithm is correct, we may introduce errors during the programming process. Like any natural language, a programming language has its own syntax consisting of grammatical rules for using the language. When we violate such rules we introduce syntax errors into our program. This type of error is easy to fix; in fact most modern program editors can detect and warn us about such errors. Another type of error is the logic error, which results from the misuse of the language. In other words, our grammatically correct program doesn't make sense, or makes the wrong sense, to the computer. For example, a recipe can be unclear or misleading even though all its sentences are grammatically correct. You may think that computers can also make mistakes when running logically correct programs. This is true but very rarely the case, especially with modern computers equipped with error detection mechanisms. We generally assume computers don't make mistakes; if the program doesn't generate the correct answer, it is the result of human error.

We also call logic errors software bugs. The original "bug" was, in fact, a hardware failure - a moth caught in an electromechanical computer. Now "bug" is used generally to refer to any error/failure in computer systems, both in hardware and in software. When a computer program is buggy, it will produce erroneous results or crash, as the computer may not know what to do next. Another, more subtle kind of bug may cause the computer program to never finish, known as an infinite loop, which is obviously not what we wanted.

Bugs are almost inevitable, as humans make mistakes. How, then, do we fix bugs to ensure the correctness of our programs? We must test our programs to verify their correctness. A test consists of a sample input to a program and the desired output from the program. We run a test by subjecting the program to the sample input and collecting the actual output. If the actual output matches the desired output the program passes the test; otherwise there is a bug in the program or in the test (tests can be buggy too). We usually use a set of tests (a test suite) to exercise different parts of the algorithm. This process is called debugging, as expressed in the following pseudo code:

for each test in the test suite
  run and compare the actual output to the desired output
  if they match move on to the next test
  otherwise fix the bug and repeat the whole process
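
The loop can be sketched in Python; the function add() and its tiny test suite below are hypothetical stand-ins for whatever program is being tested:

def add(a, b):
    return a + b               # the (trivial) program under test

test_suite = [((1, 2), 3), ((0, 0), 0), ((-1, 5), 4)]   # (input, desired output) pairs

for inputs, expected in test_suite:
    actual = add(*inputs)
    if actual == expected:
        print("pass", inputs)
    else:
        print("FAIL", inputs, "expected", expected, "got", actual)   # fix the bug and rerun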

Note that it is very difficult to make our tests exhaustive except for very simple programs. As a program becomes larger and more complex, the number of tests needed to cover all possible cases grows very fast. As Dijkstra said, "program testing can be used to show the presence of bugs, but never to show their absence!" There are techniques for proving the correctness of programs. For instance, microcode for computer processors is often proved correct via formal verification.

"Goodness" of algorithms

[edit | edit source]

There are usually multiple ways to solve the same problem and, therefore, multiple algorithms to make computers solve the problem for us. In addition to solving the problem correctly, we want to be able to compare/evaluate algorithms that solve the same problem, so we must define the criteria/measures of "goodness" in algorithm design. Such criteria or measures can include simplicity, ease of implementation, speed of execution, or preciseness of answers. In computing we care most about solution speed, because a fast algorithm can solve a larger problem or more problems in the same amount of time. This is also known as efficiency - an economic measure of many processes. Often the usefulness of a program depends on the timeliness of its results. For instance, a program that takes 24 hours to predict the next day's weather is almost useless.

Given an algorithm, how do we measure its "speed"? One possible approach is to implement the algorithm, run the resulting program, and measure its execution time - the elapsed time between the start and the end of a program run. There are some challenges in this approach. First, the algorithm must be implemented, which can be a serious undertaking. Second, to compare the execution times of two programs we must subject them to the same input (a.k.a. a workload, e.g. a list of one million numbers to be sorted), and no single choice of input is ideal. Third, the "speed" of a running program is influenced by the execution environment, such as the machine's hardware configuration. Take a recipe for example. Different cooks will surely spend different amounts of time following it. The amount of food needed will surely magnify the difference - as we increase the amount of ingredients, lengthy recipes will take even longer to follow. But there are intrinsic characteristics of the recipes that affect the preparation time. If a recipe involves beating eggs, it can instruct the cook to break each egg and scramble it individually, or to break all the eggs first and scramble them together. Obviously the first method is slower due to the additional steps involved. Algorithms in computing exhibit similar characteristics. Recall that algorithms must be defined in terms of the units of work (steps) computers can perform, so that it is straightforward to implement them in programming languages. The way the steps are ordered and organized in an algorithm can significantly affect the execution time of the programs that implement it.

Since an algorithm is a conceptual solution to a problem, we want to study its "speed" in an abstract sense without actually implementing it. This is known as algorithm analysis in computer science. In this approach we take an algorithm described in pseudo code or a flow chart and count the number of steps (units of work), which is always a function of the input size. In the aforementioned recipe example, the time it takes to follow the first method (break each egg and scramble it individually) is directly proportional to the number of eggs involved. In fact, if only one egg is needed, there is no difference between the two methods. Instead of measuring the steps taken for a particular input size, we focus on the relationship function between the number of steps and the input size, which shows the pattern by which the amount of work (cost) grows as the input size increases. Such functions are also known as growth functions. Then we apply asymptotic analysis, which compares functions as their inputs approach infinity, to simplify the functions: as the input size approaches infinity, the difference between the units of work disappears (we can assume breaking an egg and scrambling it take the same amount of time) and the cost of the most "complex" part of the task dominates the total cost (a part that repeats a step 10 times will dwarf any step that is only done once). For example, suppose a recipe has one quick step (0.0001 seconds per serving) that repeats 10 times and one slow step that takes 10 seconds and doesn't repeat; then N servings (the input size) would cost 0.0001 × 10 × N = 0.001 × N seconds in total on the repeated step, and always 10 seconds on the slow step. When N is bigger than 10,000, the repeated part costs more than the slow part. In asymptotic analysis we can ignore the slow step because its contribution to the total cost is negligible when N approaches infinity. With simplified growth functions we can put algorithms into categories, considering algorithms in each category to have similar performance characteristics. This type of analysis is not completely precise but it can be very useful in studying the nature of algorithms and predicting their performance. We learn how to put algorithms into categories and rank their performance according to the categories they are in. We denote each category using big O notation.

In summary we have discussed the following few key concepts regarding algorithm complexity analysis:

  • Algorithms are conceptual and abstract solutions to problems and programs are concrete executable code computers can run. We can not run algorithms on computers; we can only run programs that implement algorithms. Therefore, the performance of an algorithm is by definition abstract (can not be concretely defined or measured).
  • The goodness measure of an algorithm has to be an intrinsic characteristic of the algorithm itself - something that reflects the "cleverness" of the design regardless the implementation details and the future execution environments.
  • From an economic perspective we care about the cost of algorithms both in terms of time and space. We will focus only on the time cost in this course. We cannot actually measure the time cost of algorithms, but we can study the relationship between the time cost and the input size, which reflects the internal complexity (or cleverness) of algorithms. We represent such relationships as growth functions, with the input size as the variable and the total cost as the value. The growth functions can be simplified by keeping only the dominant term (asymptotic analysis) because the other terms won't matter when the input becomes really large (eventually approaching infinity). Simplified growth functions are put into categories, and algorithms can be ranked by the categories they belong to.
  • Algorithms in the low complexity category will perform better than algorithms in the higher complexity categories when the input size is sufficiently large. We care about large input sizes because any algorithm can solve a small problem fast. With this algorithm analysis technique we can evaluate and compare algorithms to predict the relative performance of the programs implementing such algorithms before actually implementing any of them.
  • Efficiency, another term often used, is inversely proportional to complexity: a more complex algorithm is less efficient because it makes less efficient use of the computing resources.

Examples

[edit | edit source]

There are at least two ways to calculate the sum of all numbers between 1 and N. The following are two algorithms:

  • Algorithm 1: add all the numbers up manually, one by one
  • Algorithm 2: calculate the result using the formula N × (N + 1) / 2

Consider the following question: if we carry out the two algorithms manually, which one would run faster if

  • N = 2?
  • N = 100?
  • N = 1,000,000?

Let's see how the algorithms behave (once implemented) on a computer. The following script implements algorithm 1 using a block that iterates through all the numbers in the range, adding them to the sum one at a time.

A Snap! script that adds all the numbers between two numbers.
A Snap! reporter block that adds all the numbers between two numbers and reports the sum.

The same problem can be solved using algorithm 2, which uses a function to calculate the result, as shown in the following script and reporter block.

A Snap! reporter block that calculates the sum of all numbers in a range and reports the result.
A Snap! block that calculates the sum of all numbers in a range.

Both scripts (programs) take the same input - two numbers that define a range. The number of numbers in the range is the input size or, more properly, the problem size. We assume the units of work are arithmetic operations, assignment operations, and the report operation. Such an assumption is reasonable because those operations indeed take the same amount of time to perform regardless of the operands. If you run the programs and try different input sizes, you will observe that as you increase the input size the execution time of the first program increases steadily, whereas the second program shows no change in execution time. Can we predict such behaviors?

Let's apply algorithm analysis to these two algorithms. The first algorithm loops through the numbers to get the sum, so the relationship function between the cost (the number of steps - additions) and the input size is a + b × N, where N is the input size, a is the cost of creating the script variable, and b is the cost of an addition. Note that both a and b are constants because they don't change for a given computer. We can simplify the function to N because when N approaches infinity the constants don't matter anymore. We assign algorithms with this type of growth function to the linear time category, denoted O(N). The second algorithm always takes the same amount of time because it performs the same set of operations regardless of the input size. It belongs to the constant time category, denoted O(1).
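
For readers not using Snap!, the two algorithms can be sketched in Python as follows; timing them for growing N shows the same linear-versus-constant behavior:

def sum_loop(n):               # algorithm 1: cost grows with N, i.e. O(N)
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

def sum_formula(n):            # algorithm 2: constant amount of work, i.e. O(1)
    return n * (n + 1) // 2

print(sum_loop(100), sum_formula(100))   # 5050 5050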

Let's consider another problem: are the numbers in a list distinct? Here is a straightforward algorithm in pseudo code:

  • Step 1: compare the first number with all of the other numbers in the list. If, at any point, the same number is seen, stop and answer NO.
  • Step 2: repeat Step 1 by taking the next number from the list and comparing it with all of the other numbers.
  • Step 3: after using all numbers in the list, stop and answer YES.

The input size is the size of the list. The more numbers in the list, the longer it takes to answer the question. According to the algorithm, it is possible that the first comparison finds two identical numbers, which answers the question right away. This is good design because at that point it is unnecessary to continue with the rest of the algorithm. When we analyze algorithms, however, we focus on the worst case because the performance of an algorithm depends on the actual input and we should not rely on luck! So the relationship (growth) function between the cost and the input size for this algorithm is roughly N × (N - 1), because in the worst case (all numbers are unique) the algorithm has to compare each number to all the other numbers in the list before it arrives at the answer. Simplifying the function by keeping only the largest dominating term gives N^2, which puts the algorithm in the quadratic category O(N^2). If you are interested you can reproduce and run the following script (an implementation of the algorithm) to verify the predicted performance characteristics of the algorithm.

A Snap! reporter block that tests whether all the numbers in a list are unique.
A Snap! script that tests whether all the numbers in a list are unique.
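
A Python sketch of the same algorithm, for readers not using Snap!; in the worst case (all numbers unique) it compares every number with every other number:

def all_unique(numbers):
    for i in range(len(numbers)):
        for j in range(len(numbers)):
            if i != j and numbers[i] == numbers[j]:
                return False       # stop and answer NO as soon as a duplicate is seen
    return True                    # answer YES after checking every pair

print(all_unique([3, 8, 5, 1]))    # True
print(all_unique([3, 8, 5, 3]))    # False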

Let's look at an algorithm from another category: print all n-digit numbers.

For each number in 0, 1, 2, 3, ..., 9
 use the number as the first digit
 enumerate all (n-1)-digit numbers
 append every (n-1)-digit number to the first digit to form an n-digit number

This example is a little contrived, but it demonstrates a new performance category and will be very useful when we discuss encryption techniques. The algorithm is costly simply because of the amount of output (all n-digit numbers) it has to generate, so the cost of the output part will dominate the total cost. To generate all n-digit numbers we must first generate all (n-1)-digit numbers. To generate all (n-1)-digit numbers we must first generate all (n-2)-digit numbers, and so on. It is easier to study the process backward, starting from the smallest case. Assuming outputting a number is the unit of work (it takes the same time to output any number), the cost to generate all one-digit numbers is 10. To generate all two-digit numbers we have to output 10^2 = 100 numbers. For three-digit numbers the cost is 10^3 = 1,000. Do you see a pattern? To generate all n-digit numbers costs 10^n. The cost of this type of algorithm grows much faster than that of the quadratic ones because the input size is in the exponent. Such algorithms belong to the exponential time category, denoted O(2^n); the base (2 in the notation, 10 in our example) doesn't matter much, because all exponential functions are more similar to each other than they are to quadratic or linear functions.
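
A recursive Python sketch of the enumeration; the length of the returned list is 10^n, which is why the cost grows exponentially with n:

def all_numbers(n):
    if n == 0:
        return [""]                    # one empty prefix to build on
    shorter = all_numbers(n - 1)       # all (n-1)-digit strings
    return [d + rest for d in "0123456789" for rest in shorter]

print(len(all_numbers(2)))    # 100
print(all_numbers(2)[:5])     # ['00', '01', '02', '03', '04']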

Here is a YouTube video illustrating the performance difference between the bubble sort algorithm and the quick sort algorithm: https://www.youtube.com/watch?v=aXXWXz5rF64


Abstraction and Recursion


Programming is easy as long as the programs are small. Inevitably our programs grow larger and larger as we create them to solve increasingly complex problems. One technique we use to keep our algorithms and programs simple is abstraction, an idea widely used in many fields such as art, math, and engineering. In this chapter we will study how to apply this technique in algorithm design and programming.

An abstraction removes details to help us focus our attention. For instance, a car presents a set of simple controls as an interface to its drivers. As long as we know how to use the interface we can drive the car without knowing how it operates under the hood. The internal workings of a car are unnecessary detail to drivers, unless they are race car drivers who need this type of knowledge to drive the car most efficiently. This interface hasn't changed much since the first cars were made. Abstraction also generalizes concepts by extracting common features from specific examples. The car interface extracts common car features (what a driver needs to know to drive a car) from all kinds of cars. Once you learn how to drive one car you have learned how to drive all cars of the same type. This is a powerful idea. Such abstraction also gives car makers the freedom to change the internal design of a car without affecting the users.

Abstraction in Computing


We have learned that algorithm design is an integral part of problem solving using computing and a first step in programming. The hardest part isn't programming/coding but keeping track of details in large programs. There are two primary ways to keep our programs "small": chunking and layering, which are two metaphors for abstraction. Chunking breaks down (decomposes) functionality into smaller units and lets the units interact with each other through well-defined interfaces. For instance, in Snap! you can implement an algorithm as a block, which can then be used anywhere in your script as long as you call the block with a proper sequence of parameters according to its interface. Layering separates the functional units (blocks) into layers to separate concerns and simplify interaction patterns, making the complexity more manageable. The following figure illustrates the idea of layering.


By organizing functional units (blocks) into layers we can simplify the interactions and allow concurrent development of the layers.

In the figure each layer relies on the layer below it to function and provides services to the layer above it. For example, a unit in layer 1 is only allowed to call units in layer 2 below it. All interactions are limited to pairs of layers that are next to each other in the stack of layers. We could replace a layer completely with a new implementation without affecting the rest of the stack, which achieves modularity. On the contrary, if arbitrary interactions are allowed we may end up with a tightly coupled system as shown in the following figure.


Without any restriction any unit (block) can call any other unit.

Abstraction Examples


Using abstraction to achieve simplification and generalization is truly a powerful organizing idea. Recall the thought experiment in chapter one, in which we built a machine that can potentially solve groups of equations. The machine was built through abstraction - we construct larger building blocks using smaller (more elementary) ones and treat each block as a unit ignoring the internal details.

The Snap! environment allows us to construct programs in a similar fashion. When a block is defined it becomes a new building block (user defined). The block can be arbitrarily complex, but to use it we only need to know the interface - its name and its list of parameters. The unnecessary details are hidden from the users greatly simplifying the thinking involved in programming.

Let's look at a concrete example. In order to make a sprite draw different equilateral shapes we can create a block for drawing each type of shape, such as triangles, squares, and pentagons. A block in Snap! is an abstraction that hides details and represents a certain behavior/intention. The following block draws a square with a specific size.


This Snap! block draws a square with a specific size.

To draw a triangle we can use a similar logic structure.


This Snap! block draws an equilateral triangle.

The draw triangle block repeats three times to draw each side with a turn of 120 degrees. Do you know why the sprite has to turn 120 degrees to form a 60-degree angle between two sides? I hope you have noticed that the same logic structure can be used to create blocks for drawing any equilateral shape. Instead of creating a block for each shape we can generalize the task into one block that can draw any such shape. This block needs an additional piece of information - the number of sides.


This Snap! block can draw any equilateral shape.

With the number of sides we can determine the internal angle of the shape, which is all we need to draw it. Please check out this resource if you are not sure how to calculate the internal angle from the number of sides. This block can serve as an abstraction of the task of drawing equilateral shapes (polygons). You may have noticed that the length of the sides is hard-coded (typed in as a constant, not a parameter). What if we want to draw shapes of different sizes? We can further generalize the function of the block by adding another parameter and using it to control the side length.


This Snap! block draws any equilateral polygon of any size.

Through this example, we have demonstrated defining blocks to abstract away details of a task and generalizing a solution to solve more problems.
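For readers who want to try the same generalization outside Snap!, here is a rough Python turtle sketch (our own naming, not the book's blocks) of the fully generalized version:

  import turtle

  # Draw any equilateral polygon: both the number of sides and the side length
  # are parameters, mirroring the generalized Snap! block above.
  def draw_polygon(sides, length):
      for _ in range(sides):
          turtle.forward(length)
          turtle.left(360 / sides)   # the exterior angle, i.e. the sprite's turn

  draw_polygon(3, 100)   # an equilateral triangle
  draw_polygon(5, 60)    # a pentagon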

Recursion


Recursion is a pattern that is self-similar - the whole consists of smaller parts that are structurally similar to the whole. For example, a tree consists of branches that look like smaller trees. Similarly, a directory tree of a file system on a computer and an ancestry tree in genealogy exhibit a similar pattern. The following figure shows a recursive tree.


Tree created using the Logo programming language and relying heavily on recursion.

Self-similarity allows us to define concepts that exhibit such a pattern in a more concise and elegant way. A tree can be either a trunk with no branches or a trunk with a number of branches, each of which is a tree. This definition covers all possible tree structures. How would you describe the following picture?

If you were to do it by delving into finer and finer details repeatedly, it would never end. Can you define the picture recursively? Another example is the definition of the factorial function in math: 1! = 1 and n! = n × (n-1)! for all n > 1. This recursive definition not only defines factorial but also describes a way to calculate it. For example, 5! can be calculated from 4!, which is 4 times 3!, which is 3 times 2!, which is 2 times 1!, which is 1 by definition.
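As a quick sketch in plain Python (mirroring the definition above, not a block from the book), the recursive definition translates directly into code:

  # Recursive factorial: base case 1! = 1, recursive case n! = n * (n-1)!.
  def factorial(n):
      if n == 1:                        # base case
          return 1
      return n * factorial(n - 1)       # recursive case

  print(factorial(5))   # 120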

If the problem we are solving is self-similar - solving the whole problem is a matter of solving the parts that are similar to the whole - the solution we are defining for the whole can be used to solve the subproblems (recursion). The beauty of a recursive solution is that the definition of the problem is the solution, as shown in the factorial example. To design a recursive solution we practice wishful thinking: as we describe the solution to the whole problem, we wish it were already fully defined and could be used to solve the smaller subproblems. In computer programming this is called recursive thinking/programming. To program recursively, we describe the solution to a problem as a function/procedure/block, in which we break the bigger problem into smaller ones and use the very function we are defining to solve the smaller problems. If the problem is finite, eventually the smaller problems become so simple that they are directly solvable; in such cases the recursion stops. We call those cases base cases.

By the time our program reaches all base cases, we have solved the whole problem, because all the subproblems are solved, including the problem we started with. In any recursive function two essential parts must exist: base cases and recursive cases. The recursive cases break a bigger problem into smaller ones; the base cases stop the recursion by solving the directly solvable problems. If both parts exist and are structured properly, the algorithm (function) can solve problems of any size by asking clones of itself to solve partial problems. Recursive problem solving is a powerful idea because it simplifies our thinking: if you can define a problem recursively you can solve it recursively. Recursive solutions are often more elegant and easier to verify, but this approach only lends itself to problems that can be defined recursively.

Before we study some concrete examples we will introduce the concept of function, which makes recursive solutions more manageable. A block in Snap! is considered a function (similar to a math function) if it has the following properties:

  • a function takes an arbitrary number of inputs (zero or more)
  • a function always returns/reports exactly one result value
  • for the same input a function always reports the same result value
  • the execution of a function has no side effects to the environment

With such restrictions, functions in Snap! are blocks that perform a task in isolation (in their own world) and hand off the result to be further processed. According to this definition, which blocks in the following list are functions?


Some blocks in this list are functions.

Recursion Examples


Consider the binary search algorithm:

Find an item (target) in a sorted list
 if the list is empty, the target cannot be found
 consider the item in the middle, if it matches the target you are done 
 otherwise, decide which half to search next

Searching one half of the ordered list is just a smaller version of the overall problem, so we can apply the same algorithm, wishing the algorithm were already fully defined. This process goes on until a list is empty or the search target is found.

Clearly the base cases are

  • if the list is empty, then the target cannot be found
  • if target in the middle of a list, then the target is found

The recursive cases are as follows (assuming the list is sorted in ascending order; a Python sketch of the full algorithm appears after this list):

  • if the item in the middle of a list is smaller than the search target, continue searching in the second half of the list
  • otherwise, continue searching in the first half of the list
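Putting the base cases and recursive cases together, a compact Python sketch (our own illustration; the book's version is a Snap! block, and note that Python lists are 0-based while Snap! lists are 1-based) might look like this:

  # Recursive binary search on a list sorted in ascending order.
  # Reports the index of the target, or -1 if the target is not in the list.
  def binary_search(items, target, start, end):
      if start > end:                    # base case 1: empty range, not found
          return -1
      middle = (start + end) // 2
      if items[middle] == target:        # base case 2: found at the middle
          return middle
      if items[middle] < target:         # recursive case: search the second half
          return binary_search(items, target, middle + 1, end)
      return binary_search(items, target, start, middle - 1)   # ... or the first half

  print(binary_search([1, 3, 5, 7, 9], 9, 0, 4))   # 4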


Recursion Revisited


Recursive solutions provide another powerful way to solve self-similar problems. The example that we will examine is the binary search solution.

How Binary Search Works


The process identifies a target item in a sorted list. We start with the base cases, which can be solved directly: if the middle item equals the target, the search is complete and its index is reported; if the list is empty, the search reports -1, meaning not found. Otherwise, we decide whether the target is greater or less than the middle item to dictate which half of the list to search.

Next, if the middle item is greater than the target the first half of the list is searched; otherwise we search the second half of the list. This is the same problem that we originally started with, making this a recursive, self-similar problem.

The image below is the binary search code implementation. The base cases and recursive cases are labeled.

Binary search is used to find an item (target) in a sorted list. It consists of base cases and recursive cases that make up the recursive solution.

Binary Search: Abstraction Simplification


One excellent way to simplify the binary search solution is using abstraction. The image previously shown uses a helper block called "middle between" which is used to find the middle index when given the first and last index of a sorted list (see below).

The helper block used in a binary search to find the middle index between the first and last index given as input.

Another way to simplify the binary search solution is to add another level of abstraction. The helper block below shows the target and list being passed as input into the "binary search for" block (see below). The actual recursive solution is implemented in the recursive search block, which also takes the first and last index of the sorted list. The "binary search for" block allows users to call the recursive search without being required to provide the first and last indices. The user only sees the information and/or blocks necessary to start the binary search, not the recursive search or the behind-the-scenes details of the solution.

The binary search helper block that adds another level of abstraction that searches for the target in a list by calling the recursive search block.

Binary Search: Tracing


We have previously discussed simplifying our interface for the recursive search using "binary search for", which takes the target and a list as inputs (see the example below). Then the "recursive search for" block is called and the first base case is checked immediately: our target, 9, is not equal to 5. Since the target 9 is greater than 5 (the element at the middle index), we search the second half of the list (index 4 to index 5). The process is repeated and the list is split again, checking the element at index 4. The target is greater than 7, so the recursive search is repeated once more on the second half of the list. Base case 2 then immediately determines that the target 9 equals the 9 in the list, which is located at the middle index 5.

Now, since the target has been found at index 5, the recursive search reports back to the second recursive search, which in turn reports back to the first recursive search. Finally, index 5 is reported back to the user at the top level of the "binary search for" block.
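To reproduce a similar trace with a plain Python version (using an assumed list [1, 3, 5, 7, 9] and 0-based indices, whereas the Snap! example uses 1-based indices), a print statement inside the function shows each recursive call:

  # The same recursive search as before, with a print to show each call.
  def search(items, target, start, end):
      print("searching indices", start, "to", end)
      if start > end:
          return -1
      mid = (start + end) // 2
      if items[mid] == target:
          return mid
      if items[mid] < target:
          return search(items, target, mid + 1, end)
      return search(items, target, start, mid - 1)

  print(search([1, 3, 5, 7, 9], 9, 0, 4))
  # searching indices 0 to 4   (middle item 5, go right)
  # searching indices 3 to 4   (middle item 7, go right)
  # searching indices 4 to 4   (middle item 9, found) -> reports 4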

This image shows how to trace the binary search when the target is equal to 9. Step by step the base case and recursive cases are used for reporting.

In the event the target is not found in a list, the binary search reports -1 (see below). The same recursive search calls are made, but on the last call the start index is greater than the end index, which indicates that the target is not found and the end of the search range has been passed.

This image shows how to trace the binary search when the target is not found in the sorted list. The solution reports -1 or not found if the target is not in the list.

Koch Curve


In 1904, Helge von Koch discovered the von Koch snowflake curve, "a continuous curve important in the study of fractal geometry" (3). The Koch curve starts with a single line segment that is 1 unit long. Then, the segment is replaced with four new segments, each one-third the length of the original, also called the generator. The process is repeated forever and creates the Koch curve. The image below shows this process for stages 0 to 2.

This shows the generator (stage 0) and iterations of stage 1 and 2 of the Koch curve.

The length of the Koch curve is infinite. The curve is another interesting implementation of recursion where it is self-similar; the curve copies itself over and over.
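A recursive Python turtle sketch of the Koch curve (our own illustration, stopping at a finite depth instead of repeating forever):

  import turtle

  # Draw one Koch curve of the given length, applying the generator `depth` times.
  def koch(length, depth):
      if depth == 0:
          turtle.forward(length)        # base case: an ordinary line segment
          return
      koch(length / 3, depth - 1)       # replace the segment with four smaller ones
      turtle.left(60)
      koch(length / 3, depth - 1)
      turtle.right(120)
      koch(length / 3, depth - 1)
      turtle.left(60)
      koch(length / 3, depth - 1)

  koch(300, 3)   # a stage-3 approximation of the Koch curve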

Koch Snowflake


Using the same Koch curve generator on all three sides of an equilateral triangle we see repeated iterations eventually start to look like a snowflake (please see Koch Snowflake). The snowflake has infinite perimeter and finite area.

Examining each stage of the Koch snowflake, a pattern emerges for the number of segments per side, the segment length, and the total length of the curve, as seen below. The number of segments per side is four times that of the previous stage, so at stage n there are 4^n segments per side. The segment length is divided into thirds at each stage, so at stage n each segment has length (1/3)^n. Because the original stage is an equilateral triangle with three sides, the total length at stage n is 3 × 4^n × (1/3)^n = 3 × (4/3)^n.

The Koch Snowflake pattern can be seen by tracking the different iterations.

Exploration Questions


Studying the Koch curve and Koch snowflake establishes patterns that show how the same process is repeated to generate each stage. We can use the recursive process to solve abstractions of the same problem.

  • What would the total length of the N-th iteration be? Look at the patterns made by the numbers both before and after simplifying.
  • What do you expect the Koch Curve to look like? In other words, what would you expect to happen if you repeated this infinitely many times?
  • What is the length of the Koch Curve?
  • Can you estimate the area at each stage? What is the area of the final snowflake?

Reviewing the previous questions we can start to observe the behavior of the Koch Curve and Koch Snowflake.

  1. Based on the patterns established, it is clear that the total length of the N-th iteration is 3 × (4/3)^N (a quick numeric check follows this list).
  2. As the Koch curve is infinitely repeated it starts to look smoother, as the lines appear closer and closer together although never touching.
  3. The length of the Koch curve is infinite (after applying the generator an infinite number of times).
  4. The area of the final snowflake is bounded (finite).
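A few lines of Python (a quick check of the pattern, not part of the book) confirm the total length 3 × (4/3)^N at each stage:

  # Total boundary length of the Koch snowflake at stage n, starting from a
  # triangle with sides of length 1: (number of segments) x (segment length).
  for n in range(5):
      segments = 3 * 4 ** n
      segment_length = (1 / 3) ** n
      print(n, segments * segment_length)   # 3.0, 4.0, 5.33..., 7.11..., 9.48...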


  1. Niels Fabian Helge von Koch. (2014). In Encyclopædia Britannica. Retrieved from http://www.britannica.com/EBchecked/topic/958515/Niels-Fabian-Helge-von-Koch


Higher Order Functions


Higher order functions offer a more powerful way to generalize solutions to problems by allowing blocks to take blocks as parameters and to return a block as the return value. All other functions are called first order functions. An example of a higher order function in math is the derivative, which takes a function as input and produces another function (the derivative of the first) as output. In computer science, a map function takes an arbitrary function and a data set (e.g. a list) and applies the function to each and every item in the set. Another example is the reduce (or fold) function, which takes an input function and a data set and produces the aggregation of all items in the data set using the function. For instance, if the input function is addition, the reduce function returns the sum of all items in the data set; if the input function is multiplication, the reduce function produces the product of all items. Higher order functions also allow us to create compositions of existing functions on the fly. For example, given two functions f and g we can create a new function h so that h(x) = g(f(x)).

Examples


The following script uses the built in map block/function to apply the same block/function to each and every element in a list.

The built in map block (function) is used to apply a block (function) to each and every element in a list.

To apply a different function we simply need to find or implement the function and use it as the first parameter to the map block. The next example uses the multiplication function, which takes two parameters. Snap! is smart enough to detect that and use each element of the list as both parameters when the function is applied, so the resulting list contains the elements of the original list multiplied by themselves.

The built in map block (function) is used to apply a block (function) to each and every element in a list. Because the function passed in takes two parameters, when the function is applied to an element of the list the same element is used as both parameters.

This map block generalizes the pattern of applying the same function to multiple data items into a single block. It doesn't simplify the programming itself, because someone has to write the map block (check the source code to see how complicated it is), but it makes programmers happier because it keeps the thinking part simpler. As programmers we are freed from worrying about iterating over lists, so that we can focus on the function that needs to be applied to the list.
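In Python (a rough text equivalent of the Snap! scripts above, using the built-in map), the same idea looks like this:

  numbers = [1, 2, 3, 4, 5]

  # Apply a function to each and every element of the list.
  doubled = list(map(lambda x: x + x, numbers))   # [2, 4, 6, 8, 10]
  squared = list(map(lambda x: x * x, numbers))   # each element multiplied by itself
  print(doubled, squared)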

reduce


The following two examples use the built-in reduce function in Snap! to calculate the sum and the product of a list of numbers.

This block applies the addition function to each and every item in the list to calculate the total.

This block uses the built-in reduce function in Snap! to calculate the product of a list of numbers.

Note that the reduce (combine with) function can take any function with two input parameters to aggregate a list of values to a single value. By using higher order functions we can create generalized solutions that can be customized to solve a larger set of problems.
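Again as a sketch in Python, functools.reduce plays the same role as Snap!'s combine (reduce) block:

  from functools import reduce

  numbers = [1, 2, 3, 4, 5]

  # Aggregate the whole list with a two-input function.
  total = reduce(lambda a, b: a + b, numbers)      # sum: 15
  product = reduce(lambda a, b: a * b, numbers)    # product: 120
  print(total, product)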

return blocks as data


The following block demonstrates the use of blocks as data. In this block two reporter blocks are taken in as parameters, which are then used to form the reported value - a new block. The new block applies the two input reporter blocks to some yet-unknown input value. Note that the "ring" (gray frame) around the reported value is very important: whatever is enclosed in the "ring" is treated as data, not as a program to run. The application of the two functions is not evaluated immediately but is simply returned as data, which is exactly what we want in this higher order function.

This block takes two (reporter) blocks as input parameters and reports a new block as the output. The function of the new block is the composition of the two functions represented by the two input blocks - when the new block is called it takes the input to the new block, applies the two functions (specified when the new block is created) to the input, and reports the result.

To use the composed function we can call the compose function with the two input functions and use the "call" block to execute the composed block against an input value as shown in the following script.

The compose block is called to form a composed block (function) from the two input blocks. The composed block is then applied to the input value of 3. The result of this block call is 1+2+3=6.

With this "compose" block we can define new functions by combining existing functions on the fly - when we need them. This is a powerful form of generalization that we cannot achieve without higher order functions.
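A minimal Python sketch of the same compose idea (our own function names):

  # Return a new function that applies f first, then g, to its input.
  def compose(f, g):
      def composed(x):
          return g(f(x))
      return composed

  add1 = lambda x: x + 1
  add2 = lambda x: x + 2
  h = compose(add1, add2)
  print(h(3))   # (3 + 1) + 2 = 6, matching the Snap! example above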


The Internet and the Web


The Internet and the Web give us the ability to connect to countless resources and are molding the way our society uses technology for online storage and services. We will use principles previously learned to examine Internet and Web communication. The principles we will examine are:

  • information can be encoded into messages
  • a coordination system is a set of agents interacting with each other toward a common objective
  • messages can hide information

Computer Networks


A computer network is considered a communication sub-system that connects a group of computers enabling them to communicate with each other. When thinking of a computer network you must consider two parts that make it possible:

Hardware:

  • network interface card (NIC) - required in order to connect to a local area network
  • cabling or antennas - required to carry signals for transmission
  • network switches - used to relay signals

Software:

  • Programs - used to process information (bits) using algorithms

Network Standard


Similar to the encoding procedures used for bits, the same idea of a standard must be used for networks. In order for communication to occur, there must be standards for devices, message formats, and procedures of interaction. These standards provide ordered processes for communication.

Once we have the standards in place we can examine what actually makes a network tick. As stated previously, a computer network is made of two parts: hardware and software. The physical hardware paves the way for communication to travel, but by itself does not enable the network. The software (programs) is what makes a computer network work, allowing software-to-software communication.

The focus of this chapter will be on the following three software topics:

  • the Internet protocol suite
  • layers of software
  • abstraction being used for simplification

Background Definitions


Knowing the definitions from the links provided will give you a foundation for the material in this chapter.

Stack of Protocols


When analyzing the protocols needed to allow communication over a network, we see that different protocols are layered to create levels of abstraction. These abstraction layers are used both for the upper and lower layers (see image below).

Shows the stack of network agents used to transmit a message from one computer to another.

Message Analogy

Let's say that Computer A wants to send a message to Computer B. Trace through the steps below to see how a message is sent via the two stacks of agents.

  1. Only A4 and B4 can access the physical mailboxes to send and receive packages
  2. A1 puts the message into packages
  3. A2 adds sequence numbers and tracking numbers to packages
  4. A3 adds address labels
  5. A4 puts the packages in the outbox
  6. packages arrive at B4's inbox
  7. B3 accepts packages addressed to B
  8. B2 uses the sequence numbers to put the packages in order and acknowledges the packages to A2 using the tracking numbers
  9. A2 re-sends a package unless acknowledged
  10. B1 opens the packages to reconstruct the original message

The network protocols work in the same way with A1 to A4 and B1 to B4 being software. The delivery mechanism used between A4 and B4 usually consists of metal wires, fiber optic cables, or radio waves in the air.
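A toy Python sketch (purely illustrative, not a real protocol implementation) of the wrapping and unwrapping the analogy describes:

  # Each sending layer wraps what it gets from the layer above; the receiving
  # stack unwraps in the reverse order to reconstruct the original message.
  message = "HELLO WORLD"
  packets = [message[i:i + 4] for i in range(0, len(message), 4)]   # A1: split into packages
  packets = list(enumerate(packets))                                # A2: add sequence numbers
  packets = [("to: B", pkt) for pkt in packets]                     # A3: add address labels
  # ... A4 hands the packages to the delivery mechanism, B4 receives them ...
  accepted = [pkt for addr, pkt in packets if addr == "to: B"]      # B3: accept packages for B
  accepted.sort(key=lambda pkt: pkt[0])                             # B2: reorder by sequence number
  print("".join(chunk for _, chunk in accepted))                    # B1: reconstruct the message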

Delivery Mechanism

Previously, we established how information is transmitted using Computer A and Computer B. Two delivery mechanisms used today for communication between networks are circuit switching and packet switching. A telephone network requires that a connection be established before communication can occur. For example, when you call someone, the phone rings until the other person picks up or voicemail initiates; this type of communication is known as synchronous communication.

The opposite is true for computer networks which use packet-switching. When using packet-switching, each packet (which is a small package of information) is individually addressed and delivered separately. The process mimics how mail packages are delivered via shared media, i.e. trucks, trains, ships, and airplanes. For instance, when you send a letter, you do not wait until the recipient is ready. This type of communication is called asynchronous communication.

The Internet


We have seen different standards and/or protocols of the Internet. The following describes the different characteristics of the Internet, which will be important when distinguishing the Internet from the Web.

  • An infrastructure for communication (information highway)
  • A global connection of computer networks using the Internet Protocol (IP)
  • Uses layers of communication protocols: IP, TCP, HTTP/FTP/SSH
  • Built on open standards: anyone can create a new internet device
  • Lack of centralized control (mostly)
  • Everyone can use it with simple, commonly available software

The World Wide Web


The World Wide Web is often confused with the Internet as it is used in conjunction with the Internet. The web is only one of the services provided through the Internet. It is important to know the characteristics of the Web (see below):

  • A collection of distributed web pages or documents that can be fetched using the web protocol (HTTP-Hyper-Text Transfer Protocol)
  • A service (application) that uses the Internet as a delivery mechanism
  • Only one of the services that run on the Internet, along with other services: email, file transfer, remote login, etc.

The Web


There are two roles that work together to make up the web: Web servers and Web clients (browsers).

Web servers

  • Software that listens for web page requests and has access to stored web pages
  • Apache, MS Internet Information Server (IIS)

Web clients (browsers)

  • Software that fetches/displays documents fetched from web servers
  • Firefox, Internet Explorer, Safari, Chrome

Uniform Resource Locator (URL)


The Uniform Resource Locator (URL) is an identifier for the location of a page on the web. The system of URLs is hierarchical (see image below).

An example of the different pieces of a URL.
  • edu: a URL for a school (not .com or .org)
  • www.sbuniv.edu: a URL for the Southwest Baptist University (SBU) website
  • www.sbuniv.edu/COBACS/CIS/index.html: a URL to a page on SBU's website under the path

Hyper-Text Markup Language (HTML)


The language used to define web pages is known as HTML. In order to view an example, open another tab and navigate to the Southwest Baptist University CIS Department website. Once you have the page open, right click on the page and select "View Source"; this will allow you to see the HTML code that was used to create the web page. The web page itself may contain hypertext (clickable text that serves as links). A link is just a defined URL that points to another web page. Web pages and links combine to form the Web.

Finding Information on the Web


It is important to know how to find information on the Web. There are several common approaches:

  • Use a hierarchical system (directory) to find the URLs of pages that may have the information
  • Use our knowledge to guess, e.g. start from apple.com to navigate to the page for iPhone 5s
  • Use a search engine
   -we look for information (wherever it is located), not pages
   -we may find information we did not know existed

How a Search Engine Works


One of the main sources for locating resources can be found using a search engine. However, have you ever thought about how they actually work? There is a series of steps that describe exactly what happens when a search engine is used:

  1. Gather information: crawl the web
  2. Keep copies: cache web pages
  3. Build an index
  4. Understand the query
  5. Determine the relevance of each possible result to the query
  6. Determine the ranking of the relevant results
  7. Present the results

Measure of Important Pages

Once a search is performed, relevant pages are returned. However, not all relevant pages are equally important. A web page does not gain importance until it is referenced (linked to) by credible sources. One of Google's innovations is page rank - a measure of the "importance" of a page that takes into account the external references to it. A page is considered more important when more important pages link to it. For example, an online article from the New York Times would have a higher importance, or page rank, than a personal blog because of the number of important pages linking to that article.
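A tiny Python sketch of the idea behind page rank (simplified, with a made-up link graph; real search engines add many refinements):

  # Pages repeatedly pass their importance to the pages they link to.
  links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}   # hypothetical link graph
  rank = {page: 1 / len(links) for page in links}
  damping = 0.85
  for _ in range(50):
      new_rank = {}
      for page in links:
          incoming = sum(rank[p] / len(links[p]) for p in links if page in links[p])
          new_rank[page] = (1 - damping) / len(links) + damping * incoming
      rank = new_rank
  print(rank)   # C gets the highest rank: both A and B link to it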


Encryption


In order to ensure that secure communication takes place, encryption methods must be used. Secure communication over the web is important for areas such as e-commerce. Encryption is used to encode messages, ensuring that no one but the intended recipient knows the content of the message.

Messages are transferred over the Internet in the form of packets. When you think about packets, they are more like postcards than letters: the content of each packet is plaintext (exposed for all to see) as the bits are transmitted.

The best way to protect these packets during transmission and after reception is to use encryption techniques. Encryption is simply the process of converting information (plaintext) into unintelligible text (ciphertext) to prevent unwanted parties from reading the message. In order for the recipient to understand the ciphertext they must use a decryption method; decryption reverses the process, turning ciphertext back into plaintext.

The two parts, encryption and decryption, make up what is known as cryptography. Cryptography (secret writing) is the practice and study of techniques for secure communication in the presence of third parties. It is not a new practice and has been around since about 2000 B.C.

Caesar Cipher


The Caesar cipher is an example of a substitution cipher. This cipher uses a letter-by-letter translation to encrypt messages. A cipher is simply a method (algorithm) used to transform a message into an obscured form and reversing the transformation. An example of this particular cipher can be seen below where you replace each letter in the top row by the corresponding letter on the bottom row:

Caesar Cipher example.

With the Caesar cipher there are 25 possible variations, one for each different amount of shifting. The key to the encryption and decryption rule is the amount of the shift. If we know that a Caesar cipher was used, we could try all 25 possible shifts of the alphabet to decrypt the message. Tools have also been created to encrypt and decrypt messages produced with this cipher.
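A small Python sketch of a Caesar cipher (letters only; the shift amount is the key):

  # Shift every letter by a fixed amount; shifting back by the same amount decrypts.
  def caesar(text, shift):
      result = ""
      for ch in text.upper():
          if ch.isalpha():
              result += chr((ord(ch) - ord("A") + shift) % 26 + ord("A"))
          else:
              result += ch
      return result

  secret = caesar("ATTACK AT DAWN", 3)
  print(secret)               # DWWDFN DW GDZQ
  print(caesar(secret, -3))   # ATTACK AT DAWN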

Substitution Ciphers


Substitution ciphers are ciphers in which one symbol is substituted for another according to a uniform rule. The example below shows a substitution table that defines a rule for reordering the letters of the alphabet. How many possible reorderings could be used with the example below?

Substitution cipher
Number of methods possible.

These types of ciphers appear to be unbreakable, but that is not true. Frequency analysis is used to decode substitution ciphers. This technique for breaking general substitution ciphers uses the frequencies with which letters appear in a language. The image below shows the original message encoded with symbols. We will use frequency analysis to decode the cipher.

Original encoded ciphertext.

After we have replaced the most used characters with E and T we can begin using other common symbols and sentence structure to fill in the gaps.

Process of using conjectural decoding.

Finally, after replacing symbols with frequently used letters we see the entire message displayed below.

Complete ciphertext message.
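As a hint of how frequency analysis starts, a few lines of Python (with a short, made-up ciphertext) count the symbols so the most common ones can be matched against frequent letters such as E and T:

  from collections import Counter

  ciphertext = "XLMW MW E WIGVIX QIWWEKI"   # hypothetical ciphertext
  counts = Counter(ch for ch in ciphertext if ch.isalpha())
  print(counts.most_common(3))   # the most frequent symbols are the first guesses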

Vigenère Cipher


The Vigenère cipher is similar to the Caesar cipher, but it uses multiple Caesar ciphers to encode a message. For a long time the Vigenère cipher was considered unbreakable, until the 1800s when Charles Babbage discovered a way to break it.

This table shows the key to the cipher thomasbbryan. This cipher was used by an attorney named Thomas B. Bryan in 1894 to communicate with his client.
Key description

The substitution table shown above is used to encrypt and decrypt messages. We use the second column, "thomasbbryan", to uniquely identify the table. This key specifies which ciphers are used.

The Vigenère cipher was considered unbreakable until a method to decode it was discovered in 1863. Although the cipher is no longer secure, it was a great enhancement to secure communication at the time.
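A compact Python sketch of the Vigenère idea (letters only, with a made-up message and the "thomasbbryan" key from the table above; the key repeats over the message and each key letter selects a Caesar shift):

  # Each letter of the repeating key selects its own Caesar shift.
  def vigenere(text, key, decrypt=False):
      result = ""
      for i, ch in enumerate(text.upper()):
          shift = ord(key[i % len(key)].upper()) - ord("A")
          if decrypt:
              shift = -shift
          result += chr((ord(ch) - ord("A") + shift) % 26 + ord("A"))
      return result

  c = vigenere("MEETMEATNOON", "THOMASBBRYAN")
  print(c, vigenere(c, "THOMASBBRYAN", decrypt=True))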

Vernam Cipher


The weakness discovered in the Vigenère cipher is the repeated use of the same key. In order to combat this problem the Vernam cipher was created: the key is as long as the plaintext, so no repetition is needed. For example, to encrypt a message of length 100 we would use a key of 100 letters - in effect 100 Caesar ciphers, one per letter. Such a key, used only once, is called a one-time pad. The Vernam cipher was used widely during World War II and the Cold War.

In principle, the one-time pad is as good as it gets when it comes to cryptography, and this can be proved mathematically. The Vernam cipher operates in a similar fashion to the Caesar cipher. Whereas in the Caesar cipher a single number is used as the shift, the Vernam cipher uses many different shifts, a unique one for each letter of the key. Each character is shifted according to the value of the corresponding key letter in the alphabet; for example, if the key letter were 'A' that would lead to a shift of 1.
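A sketch of that letter-by-letter shifting in Python (letters only, using the 'A' = shift of 1 convention described above; the pad must be at least as long as the message and used only once):

  # Vernam-style shifting: every message letter gets its own shift from the pad.
  def vernam(text, pad, decrypt=False):
      result = ""
      for ch, k in zip(text.upper(), pad.upper()):
          shift = ord(k) - ord("A") + 1
          if decrypt:
              shift = -shift
          result += chr((ord(ch) - ord("A") + shift) % 26 + ord("A"))
      return result

  pad = "QWERTYUIOPAS"                 # a made-up random pad, as long as the message
  c = vernam("MEETMEATNOON", pad)
  print(c, vernam(c, pad, decrypt=True))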

When used correctly, the one-time pad is unbreakable, but it is difficult to transmit the one-time pad between the parties without interception. Another challenge is that the one-time pad is impractical: if there were a secure way to transmit a pad as long as the message, the sender might as well use it to send the message itself.

Today, we use more innovative practices to secure communication. Sophisticated ciphers (programs) use shorter keys, which are sequences of bits that both parties agree to keep secret. This works because computers divide ASCII-coded plaintext messages into blocks, and the bits that make up each block are transformed according to a specific method that depends on the secret key.

There are no known shortcuts for breaking well-designed secret key ciphers. Even a brute-force attack is difficult because it requires guessing all possible keys, and the work grows exponentially with the size of the key: increasing the key length by only one bit doubles the work required to break the cipher. By using longer keys we can make the work outgrow the available computing power. Because of this, breaking such ciphers is possible in principle but computationally infeasible, taking hundreds of years or more.

One challenge that comes with using secret key encryption is that the number of keys required increases as the number of network members increases: for each pair of members a new shared secret key must be created, and creating unique keys becomes more complicated as more combinations are needed. Another challenge is securely establishing a secret key between two parties when no secure channel exists between them.

Public Key Encryption


In 1976, Whitfield Diffie and Martin Hellman proposed the idea of public-key encryption. The idea is to use two mathematically related keys, a public key and a private key. The keys are paired, but it is computationally infeasible to derive one from the other. A message encrypted using the private key can only be decrypted with the public key, and vice versa.

A user picks a secret key, encrypts it with the recipient's public key, and sends the ciphertext to the recipient. The recipient then uses their private key to decrypt the ciphertext and recover the secret key. The private keys are kept secret and are never sent to the other user. The two parties can then communicate using the secret key (also known as a session key). The confidentiality of the exchange is ensured because no one except the recipient can decrypt the message from the initiator.
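A closely related contribution by the same researchers is the Diffie-Hellman key exchange, which lets two parties agree on a session key over a public channel. A toy Python sketch with tiny, insecure numbers (for illustration only):

  # Both parties agree publicly on a prime modulus p and a base g.
  p, g = 23, 5
  a_secret, b_secret = 6, 15       # each party's private value, never transmitted
  A = pow(g, a_secret, p)          # sent publicly by the first party
  B = pow(g, b_secret, p)          # sent publicly by the second party
  key_a = pow(B, a_secret, p)      # computed privately by the first party
  key_b = pow(A, b_secret, p)      # computed privately by the second party
  print(key_a, key_b)              # both arrive at the same session key (2)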

The way to ensure the message is from the claimed sender is to use a digital signature scheme. Signatures should be easy for a user to produce, but difficult for anyone else to forge, and digital signatures can be tied to the content of the message being signed. The authenticity of the message is verified because a valid signature can only be produced with the sender's private key, while anyone can check it with the sender's public key.


Simulation


Simulation can be a very powerful way to represent real-world systems, scenarios, and experiments. Simulation is the recreation of a real-world system in a prepared and controlled environment. As we study different objects and environments we see the complexity of measuring, analyzing, and emulating events such as those in nature. Although it is impossible to achieve 100% accuracy and detail, modeling and simulation are used to create an environment as close as possible to the real-world system. In order to achieve a higher level of accuracy and detail we focus on the most important aspects of the system we hope to simulate. The higher the granularity of detail, the more likely we are to accurately predict what will happen in the real-world environment.

The three motivating factors behind using simulation instead of real world experimentation are:

  • Control - gives us the ability to explore problems that are utterly out of reach because of our lack of control over a real-world situation. For example, we have no control over a storm or hurricane, but simulating its paths can be beneficial in predicting future effects and climate changes.
  • Cost - conducting experiments in a real system can be costly both in time and in money. For example, instead of crashing several cars for testing, automotive manufacturers can use simulations to mimic different car crashes, angles, and scenarios without spending time and money on actual cars.
  • Safety - certain experiments can be dangerous or harmful, and simulations let scientists study them safely. Scientists are able to simulate events such as virus outbreaks, aircraft engine failure, and even testing nuclear bomb material (more info on atomic bomb simulations).

Modeling


Modeling is the process of describing how the components of a simulation look and behave. During this process we model the behavior and interactions of all components. Although we use loose approximations, these simulated versions can be surprisingly accurate recreations of the real-world system.


Artificial Intelligence


What is A.I.?


Artificial Intelligence (AI) is the idea of building a system that can mimic human intelligence. How we judge intelligence is based on how people plan, learn, process natural language, move and manipulate objects, perceive, and create. These various areas guide the engineering and development of artificial intelligence.

The Turing Test & A.I.


One concept that is important to note is that computers only perform the algorithms and programs they are given; they cannot inherently create algorithms on their own. We have seen in previous chapters that algorithms can morph or change, but not without being given an algorithm for doing so.

Previous chapters have broached the topic of the Turing Test. The Turing Test is a theoretical standard that asks whether a human judge, conversing with one machine and one human, can determine which is the human and which is the machine. If a machine can trick the human judge into thinking it is human, then it passes the Turing Test.

Although there have been several innovative developments in the field of AI, there are still areas being tested and improved. One such example can be found with CAPTCHAs (Completely Automated Public Turing test to tell Computers and Humans Apart). CAPTCHAs are used to distinguish between humans and computers: even today, computers are unable to identify images such as those generated by CAPTCHAs as well as humans can. Those developing CAPTCHAs also use them as a tool to teach computers to recognize words and/or images that humans are able to identify. A CAPTCHA is considered a reverse Turing test because a computer is determining whether a user is indeed human.

Intelligent Agent Approach


The intelligent agent approach parallels the quest for "artificial flight". The Wright brothers and others stopped their attempts to imitate birds and instead embraced the idea of aerodynamics. The goal is not to duplicate nature, but to use what is known about flying and build on that knowledge.

A major part of this approach is the rational agent, an agent that acts to achieve the best outcome. Therein lies the connection to rationality as a test of intelligence: can this machine or computer mimic the rational behavior of a human?

History of A.I.


The concept of AI began around 1943 and became a field of study in 1956 at Dartmouth. AI is not limited to the Computer Sciences disciplines, but can be seen in countless disciplines such as Mathematics, Philosophy, Economics, Neuroscience, psychology and various other areas. The areas of interest in the Computer Science and Engineering field are focused on how we can build more efficient computers. Great advancements have been made in the area of hardware and software.

A.I. Knowledge-based Expert System


An A.I. system will often use a rule-based system to capture knowledge in the form of if-then statements. Another way to think about these rule-based systems is as decision trees. Decision trees use preset rules to determine which decision path to follow based on the input provided. An example of a decision tree or rule-based system is a single-player game in which the player imagines an animal (real or imaginary) and answers a series of questions, which are designed so the computer can guess what the animal is, assuming the player always answers the questions truthfully.
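A miniature rule-based sketch of such a game (made-up rules, just to show the if-then structure):

  # Each question is a rule; the answer selects which branch to follow next.
  def guess_animal():
      if input("Does it live in water? (y/n) ") == "y":
          if input("Does it have fins? (y/n) ") == "y":
              print("I guess: a fish")
          else:
              print("I guess: a frog")
      else:
          if input("Does it have wings? (y/n) ") == "y":
              print("I guess: a bird")
          else:
              print("I guess: a dog")

  guess_animal()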

Machine Learning


There are two well-known definitions of machine learning, an informal one and a formal one. The informal definition is giving computers the ability to learn without being explicitly programmed (Arthur Samuel, 1959). The formal definition is a computer program that learns from experience with respect to some task and improves its performance on that task with experience (Tom Mitchell, 1998).

Supervised Learning


Supervised learning is based on giving the computer the correct answers and having it learn a mapping from inputs to outputs. Examples of supervised machine learning are:

  • United States Postal Service using computers to read zip codes on envelopes and automatically sort mail (i.e. handwritten zip codes)
  • Spam filters - software is trained to learn and distinguish between spam and non-spam messages (i.e. email filters)
  • Facial recognition- used by cameras to focus and via photo editing software to tag persons (i.e. Facebook)

Unsupervised Learning


Unsupervised learning is the reverse of supervised learning: the correct answers are unknown. The goal of unsupervised learning is to discover structure in data, which is closely related to data mining. The computer looks at data to find trends that can drive or assist decision making. Below are examples of unsupervised learning:

  • Clustering algorithm - Used to find patterns in datasets and then group that data into different coherent clusters.
  • Market segmentation - targeting customers based on region, likes, dislikes, when the consumer makes purchases, etc. This is considered targeted marketing.
  • Recommendation systems - systems or software that make recommendations to the consumer as to what they may like based on their preferences (i.e. Netflix, Hulu, etc.).
  • Statistical Natural Language Processing - used to guess the next word or auto-complete words based on correcting/guessing techniques, suggest news stories, or translate texts

Genetic Programming


Genetic Programming is an idea that uses an evolutionary process to improve algorithms.

Future of A.I.


There are many challenges in mimicking human intelligence. Humans acquire common sense that is intuitive but hard to reason about rationally, e.g. the color of a blue car is blue. Deep learning is a branch of A.I. that aims to create algorithms that can acquire such intuition.


Limits of Computing


We have studied some big ideas of computing, which allow us to perform amazing tasks through information processing. You might have gotten the impression that if we can quantify information and design an algorithm, we can solve any problem using computing. In this chapter we will study the theory about the limits of computing. There are both theoretical and practical limits on what computing can do for us.

Turing Machine Revisited


When we talked about the formal definition of "algorithm", we introduced the Turing machine, which is a mathematical model of computation. With this model we can study the nature of computing, including what it can possibly do. Alan Turing contributed greatly to computability theory with his model machine, which is as powerful as any computer we can ever build (Turing completeness). If we can find a problem the Turing machine cannot solve, we have proved that the problem is not computable, i.e., it is not solvable through computing on any computer.

Halting Problem


In 1936 Alan Turing proved that a general algorithm to solve the halting problem for all possible program-input pairs cannot exist, i.e., the halting problem is unsolvable using computing. More specifically, the halting problem is undecidable: it is a decision problem, the simplest type of question with a yes or no answer - will a program eventually halt (stop) when given a particular input? Obviously, it is infeasible simply to run the program on the input to see whether it halts, because the run may take forever if the program contains an infinite loop. The idea is to analyze a program together with its input and determine, without running it to completion, whether it will halt or not.

Alan Turing proved the halting problem is undecidable using a technique called proof by contradiction. It would be very helpful to review some sample proofs on Wikipedia; a classic example is the proof of Euclid's theorem in number theory, which asserts that there are infinitely many prime numbers. All proofs by contradiction start with the assumption that the proposition we are trying to prove is false, follow a logical sequence of valid steps, and arrive at a conclusion that is clearly false or contradictory. Since a statement can be either true or false but never both, the original assumption must have been wrong.

If we assume the halting problem is decidable/solvable, we should be able to design an algorithm and implement it as a program that solves the problem for us. The program would take a program and that program's input as parameters and return the answer - whether the program halts on the input or not. Such a program may sound strange because it takes a program as input, but we have seen such programs before: the higher order function blocks in Snap! take a block (program) as input, and an interpreter takes the source code of a program as data and runs it. Programs are data, and there is no intrinsic difference between the two. The following proof shows that a program that answers the halting question cannot exist.

  1. Assume such a program A exists. A(P, D) -> whether program P halt on input data D.
  2. We can create another program B that takes a program P as input. Inside program B we call program A to determine whether P will halt when given itself as input: if the answer is no (P will not halt on itself), program B halts (returns); if the answer is yes (P will halt on itself), program B loops forever.
 B(P):
   if A(P, P) = yes   # A predicts that P halts when run on itself
     loop forever     # ... so B deliberately runs forever
   else
     return           # ... otherwise B halts right away

  3. What happens if we feed program B to itself as input? Or, more simply, would program B halt on itself? There are two possible outcomes: program B halts on itself, or program B runs forever when given itself as input. The actual outcome depends on the answer of program A called inside program B. If program B halts on itself, then A(B, B) answers yes and, by the design of program B, B should run forever - a contradiction. On the other hand, if program B does not halt on itself, then A(B, B) answers no and B returns right away, i.e., it halts - again a contradiction. Either way we have a contradiction.
  4. So far we have made an assumption, followed a set of logically sound steps, and arrived at a contradiction. What went wrong? The contradiction cannot be true because it is logically impossible. The steps are logically sound, and the only part that could be wrong is the assumption. Therefore the assumption cannot be true, i.e., the halting problem is not decidable.

Here is a YouTube video illustrating the proof: https://www.youtube.com/watch?v=92WHN-pAFCs

Intractable Problems


The halting problem is hard because it is not solvable algorithmically, even in principle. There are other hard problems that are solvable in principle but in practice are close to impossible to solve. As you can see, we can categorize problems by the performance of the best-known algorithms. If a problem can be solved by a fast algorithm, the problem is easy because we can use a computer to solve it quickly. On the contrary, if the best-known algorithm takes a very long time, the problem is hard because computers cannot solve it fast.

Using algorithmic complexity theory we can put each algorithm into a particular category according to the algorithm's complexity. If the big-O notation of a category contains only polynomial terms, the problems solvable by algorithms in this category are called P problems (polynomial time solutions exist), for example O(N), O(N log N), and O(N^2). The P problems are the easy problems for computers. Among the problems without a known polynomial time solution there are problems whose answers, once guessed, can be verified in polynomial time. For example, the integer factorization (or prime decomposition) problem has no known polynomial time solution, but given an answer we can verify it very quickly by simply multiplying the factors and comparing the result to the integer. These types of problems are called NP (Non-deterministic Polynomial) problems.
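A tiny illustration of the verification step for integer factorization (made-up numbers):

  # Checking a claimed factorization is fast: multiply the factors and compare.
  n = 8633
  claimed_factors = [89, 97]
  product = 1
  for factor in claimed_factors:
      product *= factor
  print(product == n)   # True - quick to verify, even though finding the factors
                        # in the first place may take a very long time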

Collectively we call problems that take too long to solve intractable problems; these include problems whose best algorithms run in exponential time, e.g. O(2^N), and problems with polynomial time solutions whose exponent is too large.

If a problem's best algorithmic solution runs in O(2^N) time, then with N = 100 and a computer performing a billion (10^9) operations per second it would take on the order of 10^13 years - far longer than the age of the universe - to solve the problem on that computer.

P vs. NP


Obviously P is a subset of NP, because NP only requires polynomial answer verification time, and being able to solve a problem in polynomial time (P) certainly qualifies. Whether P is a proper subset of NP or, in other words, whether P = NP remains one of the open questions in computer science. Nobody knows the answer. You can win a million dollars if you solve it, as it is one of the Millennium Prize Problems.

To attack this P vs. NP problem, theoretical computer scientists have defined another category called NP-complete problems. The relationships among the three categories are illustrated in the following figure.

Diagram of complexity classes provided that P ≠ NP. The existence of problems within NP but outside both P and NP-complete, under that assumption, was established by Ladner's theorem.[1]

All problems in this category are NP problems sharing one special property: ALL other NP problems can be translated/reduced to each of the NP-complete problems in polynomial time. Because of this property, if we can solve ANY single problem in this category in polynomial time, we have proved that all NP problems are solvable in polynomial time, i.e., P = NP: we can take any NP problem, translate it to the solved NP-complete problem, and solve that in polynomial time. The overall time is still polynomial because a polynomial plus a polynomial is still a polynomial. Thousands of NP-complete problems have been discovered, but none of them has been solved in polynomial time. NP-complete problems are, in a sense, the most difficult known problems.

Most computer scientists believe P ≠ NP because of the implications if it were otherwise. If P = NP, "creative leaps" would disappear, because solving a problem would be as easy as recognizing a correct answer. Most encryption algorithms are computationally secure because breaking them is an NP problem with no known efficient (polynomial time) solution; if P = NP then such encryption could be broken.

There are other unsolved problems in computer science. You can find a list of such problems at https://en.wikipedia.org/wiki/List_of_unsolved_problems_in_computer_science


Computing Machinery


We have studied some fundamental principles of computing and seen the power of computing demonstrated in powerful technologies that operate on these principles. At the beginning we imagined that computing could be done purely mechanically and blindly, through symbol manipulation, if we could build a simple device capable of some simple tasks. In this lesson we will study the history of the development of computers and the principles on which all modern computer hardware operates. We will see that at the physical layer a computer is nothing but a simple device that follows simple rules to manipulate two symbols.

Computer Hardware


Computer hardware that can carry out computation automatically has been around for a long time. What follows is a short history from early computers to the modern computers we have today.

Mechanical Computers


Charles Babbage invented the concept of a programmable computer and designed the first mechanical computers in the early 19th century. His difference engine was a mechanical device designed to tabulate polynomial functions, and his analytical engine was a more general mechanical machine. The input to the machines was programs and data on punched cards, and the output numbers could be punched onto cards or directed to a printer, a curve plotter, and a bell. The machines used ordinary base-10 fixed-point arithmetic. Babbage's analytical engine was the first design for a general-purpose computer that is Turing-complete.

Analog Computers


In the early 20th century, mechanical analog computers were designed to perform increasingly sophisticated computation, e.g., solving differential equations by integration. Analog computers use continuously changeable aspects of physical phenomena, such as electrical, mechanical, or hydraulic quantities, to model computation, an approach that lacks the accuracy of modern digital computers.

Digital Computers


The world's first fully automatic digital computer was the electromechanical programmable computer Z3, made by Konrad Zuse in 1941. The Z3 used electric switches that drive mechanical relays to perform computation. It replaced the decimal system with the binary system and pioneered the use of floating point numbers. Program and data were stored on punched film. The distinction between a digital device and an analog device is whether the representation of values is discrete or continuous. For example, black and white are discrete values, but the infinitely many shades of gray between them (a grayscale) form a continuum.

Electronic Digital Computers


The world's first electronic digital programmable computer was Colossus, built with a large number of valves (vacuum tubes) in 1943. The design was all electronic and was used to help break the German Lorenz teleprinter cipher (a cipher distinct from the Enigma).

Transistor Computers


From 1955 vacuum tubes were replaced by transistors in computer designs, resulting in smaller, more reliable, and more power-efficient machines and giving rise to the "second generation" of computers.

Integrated Circuit Computers


The invention of the integrated circuit (conceived in 1952 and first built in 1958) ushered in a new era of computing machinery: micro-computers. Microprocessors built from integrated circuits are used in the common computing devices you see today: desktop computers, laptop computers, phones, and even greeting cards.

Principles of Digital Computing


The mathematical foundation of digital computing is Boolean logic, invented by George Boole. Claude Shannon proved in the 1930s that electrical switching circuits can compute in binary using Boolean logic, and this became the fundamental principle behind all modern computing devices.

Boolean Algebra


Boolean algebra has three basic operations, AND, OR, and NOT, defined on two values: true and false. The rules for evaluating the three operations are shown in the figure.

The truth table showing the rules for the three basic Boolean operations.

A Boolean operation operates on Boolean values and always results in a Boolean value. For the AND operation the result is true only when both operands are true. The OR operation, on the other hand, results in a false value only when both operands are false. The NOT operation takes one operand and simply negates it. We will see that if we can build electronic circuits that implement these three operations, we can build circuits that perform all kinds of arithmetic and logic functions.
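
For instance, a few lines of Python reproduce the truth table in the figure using the language's built-in and, or, and not operators.

# Enumerate the truth table for the three basic Boolean operations.
for a in (False, True):
    for b in (False, True):
        print(a, b, "| a AND b =", a and b, "| a OR b =", a or b, "| NOT a =", not a)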

The three Boolean operations can be implemented physically using transistors. A transistor is fundamentally a tiny switch, as shown in the figure.

A transistor is fundamentally a tiny switch with three pins. When a logical 1 is applied to the control pin the switch is closed connecting in to out.

When a high voltage (logical 1) is applied to the control pin, the switch is closed, connecting the in pin directly to the out pin. A transistor operates on two voltages, high and low, which can be used to represent the two logical values, true and false, or the two binary values, 1 and 0. We will use a high voltage to physically represent a logical 1 and a low voltage to represent a logical 0.

Transistors and Logic Gates


Transistors are tiny, simple devices, but they are the fundamental building blocks of electronic circuits. For example, we can build a device called a NOT gate using a single transistor, as shown in the figure.

A NOT gate constructed using a single transistor.

If we treat what is inside the red box as a unit, it behaves like a negator, which is known as a NOT gate in digital logic design. As shown in the truth table for this device (next to the figure), when the input is logical 1 the switch is closed, connecting the output to the ground and pulling the output voltage low, signifying a logical 0. On the other hand, when the input is logical 0 the switch stays open, which results in a high voltage on the output line: the output is connected to the power supply through a resistor, and without a current the resistor causes no voltage drop. Once this device is built we can use it as a building block to construct more complicated circuits. We will use the following symbol to represent a NOT gate.

NOT gate

With transistors and the NOT gate we can build a device that performs the AND operation.

An AND gate built using two transistors and a NOT gate.

As shown in the figure, the device performs exactly an AND operation. The output is logical 1 (high voltage) only when both inputs are logical 1, which causes both switches to close, pulling the point before the NOT gate to low. The NOT gate then negates that signal, producing a high voltage, or logical 1, at the output.

Similarly we can build a device for performing the OR operation.

An OR gate built using two transistors and a NOT gate.

As shown in the figure, this kind of parallel structure guarantees that the final output is a high voltage as long as either transistor is closed. In other words, the relationship between the two inputs and the output is the OR logical function.
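
The following toy Python model (an illustration of the idea, not a real circuit simulator) mirrors these constructions: a single pull-down transistor gives NOT, a series pull-down network followed by a NOT gives AND, and a parallel pull-down network followed by a NOT gives OR.

def not_gate(a):
    # One transistor: an input of 1 closes the switch and pulls the output down to 0;
    # otherwise the pull-up resistor keeps the output at 1.
    return 0 if a == 1 else 1

def and_gate(a, b):
    # Two transistors in series pull the intermediate node low only when both inputs
    # are 1 (a NAND); the NOT gate then restores the AND behaviour.
    nand = 0 if (a == 1 and b == 1) else 1
    return not_gate(nand)

def or_gate(a, b):
    # Two transistors in parallel pull the node low when either input is 1 (a NOR);
    # the NOT gate then gives OR.
    nor = 0 if (a == 1 or b == 1) else 1
    return not_gate(nor)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "| AND =", and_gate(a, b), "| OR =", or_gate(a, b), "| NOT a =", not_gate(a))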

Gates to Circuits


With the three basic gates (AND, OR, NOT) we can build any combinational logic circuit. A circuit consists of input wires, gates connected by wires, and output wires. Once a circuit is designed it can be viewed as a black box that implements some logic mapping input to output. Here is the standard circuit construction algorithm:

  1. build a truth table from the desired logic
  2. build a logic expression in sum-of-products form
  3. convert the expression to circuit design using gates

As an example, suppose we want to construct a circuit that tests the equality of two bits. The two inputs are the two bits, represented physically by a high voltage (logical 1) or a low voltage (logical 0). From the desired logic of the circuit we can draw the following truth table:

This logic tests whether two bits are the same/equal.

The first two columns enumerate all possible value combinations of the two input lines. The output is logical 1 (true) only when the two inputs are the same. Based on the truth table, we can derive the following logic expression (sum-of-products form):

(a AND b) OR ((NOT a) AND (NOT b))

To derive the sum-of-products form we look at the lines in the truth table with a logical 1 in the output column. We know the input combinations shown in these lines are supposed to make the output logical 1. We can represent each of these lines using a logic expression; for instance, (a AND b) represents the last line, because when both a and b are logical 1 the expression evaluates to logical 1 according to the definition of the AND operation. Similarly, ((NOT a) AND (NOT b)) represents the first line. If we combine the two cases, we can represent the desired logic using a single expression: (a AND b) OR ((NOT a) AND (NOT b)). If we plug in the inputs for the cases in the truth table, this expression evaluates to the desired output value for each corresponding case. Because we know how to build the devices (gates) that implement the AND, OR, and NOT operations, we can build a device that compares whether two bits are equal. This device performs the operation purely mechanically (blindly), because it does not know the meaning of the operation.
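
A quick Python check confirms that the sum-of-products expression agrees with "the two bits are equal" on every row of the truth table.

# Verify (a AND b) OR ((NOT a) AND (NOT b)) against a == b for all input combinations.
for a in (False, True):
    for b in (False, True):
        expr = (a and b) or ((not a) and (not b))
        assert expr == (a == b)
        print(f"a={int(a)} b={int(b)} -> output={int(expr)}")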

Using the same methodology we can gradually build more and more complicated circuits. For example, we can build a device that adds two binary digits: a one-bit adder. It is just a matter of figuring out the desired logic and constructing the device using the building blocks we already know how to make.

An adder circuit that adds two binary bits.

Once we get the sum-of-products form of the logic expression from the truth table, it is straightforward to construct the circuit, because all we need are the three types of logic gates and wire connections. The following figure shows the design of a one-bit adder (with carry-in) circuit using the three basic logic gates.

The circuit for producing the sum bit of a one bit adder.

We can construct multi-bit adders by connecting multiple one-bit adders. The following figure shows that a two-bit adder can be formed by connecting the carry-out of the first one-bit adder to the carry-in of the second one-bit adder.

A 2-bit adder circuit constructed from two 1-bit adders.
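
The same construction can be sketched in Python (the function names are ours): a one-bit full adder expressed with AND, OR, and NOT, and a 2-bit adder formed by feeding the carry-out of the low-order adder into the carry-in of the high-order adder, just as in the figure.

def xor(x, y):
    # Exclusive OR written with the three basic operations.
    return (x and not y) or ((not x) and y)

def full_adder(a, b, carry_in):
    # The sum bit is 1 when an odd number of inputs are 1; carry-out is 1 when at least two are.
    s = xor(xor(a, b), carry_in)
    carry_out = (a and b) or (a and carry_in) or (b and carry_in)
    return s, carry_out

def two_bit_adder(a1, a0, b1, b0):
    s0, c0 = full_adder(a0, b0, False)   # add the low-order bits first
    s1, c1 = full_adder(a1, b1, c0)      # the carry ripples into the high-order adder
    return c1, s1, s0                    # (carry-out, high sum bit, low sum bit)

# 3 + 1 = 4, i.e., binary 11 + 01 = 100:
print(two_bit_adder(True, True, False, True))   # (True, False, False)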



Computer data storage


Parallel Processing

Computing is fundamentally about information processes. On a digital computer such processes are carried out via symbol manipulations in binary logic. With advances in semiconductor technology we have been able to keep making computers run faster (manipulate bits at a higher speed) by cramming more transistors into computer chips. This is known as Moore's law, which dates from around the 1970s. The trend has slowed and will eventually flatten due to limits in physics, as some physicists have predicted; they have also proposed potential new technologies that may replace semiconductors (silicon) in computer hardware manufacturing.

In the meantime, hardware companies have tweaked their technologies to maintain the growth in hardware capacity. Multicore technology replaces one fast CPU (Central Processing Unit, the brain of a computer) with many slower ones (called cores) to avoid overheating the chip. Even though each core is slower, we get more of them and can get more done if we arrange the work properly. For instance, suppose a strong worker can lift 100 bricks a minute while a normal worker can only lift 34. Three normal workers can still outperform one strong worker even though they are much slower individually. This is the idea of parallel processing.

Traditionally, computer programs have been written to describe sequential processes, which means the steps can only be carried out one at a time, one after another, in a sequence. This type of program works fine on a computer with a single processor because the computer can perform only one symbol manipulation at a time anyway. In fact, we have been reaping the benefit of Moore's law: roughly every two years computer hardware doubled in speed, making our programs run twice as fast without us doing anything. This trend has stopped. Each individual processor (core) is not getting faster, but we have more of them in a computer. As a result, our existing sequential programs will not run faster (and may even run slower on the slower individual cores) even though the hardware's overall capacity has become larger. Until the next generation of computers is invented, we can use parallel computing/processing to solve problems faster.
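
As a minimal sketch of the difference (assuming a multi-core machine and using Python's standard multiprocessing module), the same summation can be written as one sequential loop or split into chunks that run on several worker processes at once.

from multiprocessing import Pool

def partial_sum(bounds):
    # Sum one chunk of the range.
    lo, hi = bounds
    return sum(range(lo, hi))

if __name__ == "__main__":
    n = 10_000_000

    # Sequential version: one step after another on a single core.
    sequential = sum(range(n))

    # Parallel version: split the range into 4 chunks and sum them on 4 worker processes.
    chunks = [(i * n // 4, (i + 1) * n // 4) for i in range(4)]
    with Pool(processes=4) as pool:
        parallel = sum(pool.map(partial_sum, chunks))

    print(sequential == parallel)   # True; only the arrangement of the work changed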

The idea of parallel processing is not new. For example, a car assembly line allows multiple cars to be built at the same time. Even though different parts of different cars are being assembled at any given moment, the assembly line keeps all the workers busy, increasing the throughput (number of cars built per unit time) of the whole system. We can make the workers work faster to further increase the throughput, or we can add another assembly line and hire more workers. This is one form of parallel processing: pipelining. Another form of parallelism divides the whole computing task into parts that can be computed simultaneously and runs them physically on different CPUs (computers). This is similar to putting a jigsaw puzzle together with friends. As you can imagine, having some extra help will definitely solve the puzzle faster, but does that mean the more the better? The answer is no. As the number of helpers increases, the amount of coordination and communication increases even faster. When you have too many people they may start stepping on each other's toes and competing with each other for resources (space and puzzle pieces). This is known as the overhead of parallel processing, and it causes diminishing returns on investment. We can see this pattern clearly when we measure the improvement in performance as a function of the number of workers involved.

In the context of parallel processing/computing we use a metric called speedup to measure the improvement in performance. The achieved speedup equals the execution time of a program without parallel processing divided by the execution time of the same task with parallel processing:

S = T_old / T_new

where:

  • S is the speedup,
  • T_old is the old execution time without parallel processing, and
  • T_new is the new execution time with parallel processing.

If parallel processing makes a program run twice as fast, the speedup is two (a.k.a. a two-fold speedup). Theoretically, as we double the number of workers or resources we can expect a two-fold speedup. In practice it is hard to achieve this ideal speedup because some tasks are not parallelizable. For example, you cannot usually lay the carpet before the floor of a house is constructed, and you cannot always add more painters to get a painting job done faster. Computational tasks often have similar dependency and resource constraints that keep us from fully utilizing the parallel processing systems (e.g., multi-core computers) we have.
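
The following Python sketch (with made-up timings) applies the speedup formula and shows how a fixed, non-parallelizable portion of the work limits the achievable speedup no matter how many workers are added.

def speedup(t_old, t_new):
    # Speedup = execution time without parallel processing / time with parallel processing.
    return t_old / t_new

serial, parallel = 10.0, 90.0       # assumed seconds of inherently serial vs. parallelizable work
t_old = serial + parallel           # 100 seconds with a single worker

for workers in (1, 2, 4, 8, 100):
    t_new = serial + parallel / workers
    print(f"{workers:3d} workers: speedup = {speedup(t_old, t_new):.2f}")
# Even with 100 workers the speedup stays below 10x, because the 10-second serial part remains.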

Exercise:

With a washing machine and a dryer you can work on one load of laundry at a time: you wash it first and then put it into the dryer. Assume the whole task takes an hour. This works perfectly when you have only one load to do, and there is nothing you can do to make it go faster. What if you have many loads of laundry to do? You can at least get one load done every hour. Can you "speed it up"? If the number of loads can be arbitrarily large, what is the shortest average per-load laundry time you can achieve?


References

How to Think Like a Computer Scientist (Python)

A Balanced Introduction to Computer Science (3rd ed)

Fire in the Valley