Abstract Algebra/Printable version

From Wikibooks, open books for an open world


Abstract Algebra

The current, editable version of this book is available in Wikibooks, the open-content textbooks collection, at
https://en.wikibooks.org/wiki/Abstract_Algebra

Permission is granted to copy, distribute, and/or modify this document under the terms of the Creative Commons Attribution-ShareAlike 3.0 License.

Sets

In the so-called naive set theory, which is sufficient for the purpose of studying abstract algebra, the notion of a set is not rigorously defined. We describe a set as a well-defined aggregation of objects, which are referred to as members or elements of the set. If a certain object is an element of a set, it is said to be contained in that set. The elements of a set can be anything at all, but in the study of abstract algebra, elements are most frequently numbers or mathematical structures. The elements of a set completely determine the set, and so sets with the same elements as each other are equal. Conversely, sets which are equal contain the same elements.

For an element $a$ and a set $A$, we can say either $a \in A$, that is, $a$ is contained in $A$, or $a \notin A$, that is, $a$ is not contained in $A$. To state that multiple elements $a, b, c$ are contained in $A$, we write $a, b, c \in A$.

The axiom of extensionality


Using this notation and the symbol $\Rightarrow$, which represents logical implication, we can restate the definition of equality for two sets $A$ and $B$ as follows:

$A = B$ if and only if $(x \in A \Rightarrow x \in B)$ and $(x \in B \Rightarrow x \in A)$.

This is known as the axiom of extensionality.

Comprehensive notation


If it is not possible to list the elements of a set, it can be defined by giving a property that its elements alone possess. The set of all objects with some property $P$ can be denoted by $\{x : P(x)\}$. Similarly, the set of all elements of a set $A$ with some property $P$ can be denoted by $\{x \in A : P(x)\}$. The colon : here is read as "such that". The vertical bar | is synonymous with the colon in similar contexts, so the same set can be written $\{x \in A \mid P(x)\}$. This notation will appear quite often in the rest of this book, so it is important for the readers to familiarize themselves with it now.

As an example of this notation, the set of integers can be written as $\mathbb{Z} = \{x : x \text{ is an integer}\}$, and the set of even integers can be written as $\{x \in \mathbb{Z} : x = 2k \text{ for some integer } k\}$.

Set inclusion


For two sets $A$ and $B$, we define set inclusion as follows: $A$ is included in, or a subset of, $B$, if and only if every member of $A$ is a member of $B$. In other words,

$A \subseteq B \iff (x \in A \Rightarrow x \in B)$,

where the symbol $\subseteq$ denotes "is a subset of", and $\iff$ denotes "if and only if".

By the aforementioned axiom of extensionality, one finds that $A = B \iff (A \subseteq B \text{ and } B \subseteq A)$.

The empty set


One can define an empty set, written $\varnothing$, such that $\forall x : x \notin \varnothing$, where $\forall$ denotes universal quantification (read as "for all" or "for every"). In other words, the empty set is defined as the set which contains no elements. The empty set can be shown to be unique.

Since the empty set contains no elements, it can be shown to be a subset of every set. Similarly, no set but the empty set is a subset of the empty set.

Proper set inclusion


For two sets $A$ and $B$, we can define proper set inclusion as follows: $A$ is a proper subset of $B$ if and only if $A$ is a subset of $B$, and $A$ does not equal $B$. In other words, there is at least one member in $B$ not contained in $A$:

$A \subset B \iff (A \subseteq B) \land (A \neq B)$,

where the symbol $\subset$ denotes "is a proper subset of" and the symbol $\land$ denotes logical and.

Cardinality of sets


The cardinality of a set $A$, denoted by $|A|$, can be said informally to be a measure of the number of elements in $A$. However, this description is only rigorously accurate for finite sets. To find the cardinality of infinite sets, more sophisticated tools are needed.

Set intersection


For sets $A$ and $B$, we define the intersection of $A$ and $B$ by the set $A \cap B$ which contains all elements which are common to both $A$ and $B$. Symbolically, this can be stated as follows:

$A \cap B = \{x : x \in A \land x \in B\}$.

Because every element of $A \cap B$ is an element of $A$ and an element of $B$, $A \cap B$ is, by the definition of set inclusion, a subset of $A$ and of $B$: $A \cap B \subseteq A$ and $A \cap B \subseteq B$.

If the sets $A$ and $B$ have no elements in common, they are said to be disjoint sets. This is equivalent to the statement that $A$ and $B$ are disjoint if $A \cap B = \varnothing$.

Set intersection is an associative and commutative operation; that is, for any sets $A$, $B$, and $C$, $(A \cap B) \cap C = A \cap (B \cap C)$ and $A \cap B = B \cap A$.

By the definition of intersection, one can find that $A \cap \varnothing = \varnothing$ and $A \cap A = A$. Furthermore, $A \subseteq B \iff A \cap B = A$.

One can take the intersection of more than two sets at once; since set intersection is associative and commutative, the order in which these intersections are evaluated is irrelevant. If $A_i$ is a set for every $i$ in some index set $I$, we can denote the intersection of all the $A_i$ by

$\bigcap_{i \in I} A_i = \{x : x \in A_i \text{ for all } i \in I\}$.

In cases like this, $I$ is called an index set, and the sets $A_i$ are said to be indexed by $I$.

In the case of $I = \{1, 2, \ldots, n\}$ one can either write $A_1 \cap A_2 \cap \cdots \cap A_n$ or

$\bigcap_{i=1}^{n} A_i$.

Set union


For sets $A$ and $B$, we define the union of $A$ and $B$ by the set $A \cup B$ which contains all elements which are in either $A$ or $B$ or both. Symbolically,

$A \cup B = \{x : x \in A \lor x \in B\}$.

Since the union of sets $A$ and $B$ contains the elements of $A$ and of $B$, $A \subseteq A \cup B$ and $B \subseteq A \cup B$.

Like set intersection, set union is an associative and commutative operation; for any sets $A$, $B$, and $C$, $(A \cup B) \cup C = A \cup (B \cup C)$ and $A \cup B = B \cup A$.

By the definition of union, one can find that $A \cup \varnothing = A$ and $A \cup A = A$. Furthermore, $A \subseteq B \iff A \cup B = B$.

Just as with set intersection, one can take the union of more than two sets at once; since set union is associative and commutative, the order in which these unions are evaluated is irrelevant. Let $A_i$ be sets for all $i \in I$. Then the union of all the $A_i$ is denoted by

$\bigcup_{i \in I} A_i = \{x : \exists i \in I, \ x \in A_i\}$

(where $\exists$ may be read as "there exists").

For the union of a finite number of sets $A_1, A_2, \ldots, A_n$, that is, $A_1 \cup A_2 \cup \cdots \cup A_n$, one can abbreviate this as

$\bigcup_{i=1}^{n} A_i$.

Distributive laws


Set union and set intersection are distributive with respect to each other. That is,

$A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$ and
$A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$.
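
To make these laws concrete, here is a minimal Python sketch (the sets chosen are arbitrary examples; Python's built-in set type implements the operations above directly):

```python
A = {1, 2, 3, 4}
B = {3, 4, 5}
C = {4, 5, 6}

print(A & B)   # intersection: {3, 4}
print(A | B)   # union: {1, 2, 3, 4, 5}

# The two distributive laws stated above:
assert A | (B & C) == (A | B) & (A | C)
assert A & (B | C) == (A & B) | (A & C)
```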

Cartesian product


The Cartesian product of sets $A$ and $B$, denoted by $A \times B$, is the set of all ordered pairs which can be formed with the first object in the ordered pair being an element of $A$ and the second being an element of $B$. This can be expressed symbolically as

$A \times B = \{(a, b) : a \in A, b \in B\}$.

Since different ordered pairs result when one exchanges the objects in the pair, the Cartesian product is not commutative. The Cartesian product is also not associative. The following identities hold for the Cartesian product for any sets $A, B, C, D$:

$A \times (B \cup C) = (A \times B) \cup (A \times C)$,
$A \times (B \cap C) = (A \times B) \cap (A \times C)$,
$(A \cup B) \times C = (A \times C) \cup (B \times C)$,
$(A \cap B) \times (C \cap D) = (A \times C) \cap (B \times D)$.

The Cartesian product of any set and the empty set yields the empty set; symbolically, for any set $A$, $A \times \varnothing = \varnothing \times A = \varnothing$.

The Cartesian product can easily be generalized to the n-ary Cartesian product, which is also denoted by $\times$. The n-ary Cartesian product forms ordered n-tuples from the elements of $n$ sets. Specifically, for sets $A_1, A_2, \ldots, A_n$,

$A_1 \times A_2 \times \cdots \times A_n = \{(a_1, a_2, \ldots, a_n) : a_i \in A_i \text{ for each } i\}$.

This can be abbreviated as

$\prod_{i=1}^{n} A_i$.

In the n-ary Cartesian product, each $a_i$ is referred to as the $i$-th coordinate of the tuple $(a_1, a_2, \ldots, a_n)$.

In the special case where all the factors are the same set $A$, we can generalize even further. Let $I$ be any index set and let $A^I$ be the set of all functions $f : I \to A$. Then, in analogy with the above, $A^I$ is effectively the set of "$I$-tuples" of elements in $A$, and for each such function $f$ and each $i \in I$, we call $f(i)$ the $i$-th coordinate of $f$. As one might expect, in the simple case when $I = \{1, 2, \ldots, n\}$ for an integer $n$, this construction is equivalent to $A \times A \times \cdots \times A$ ($n$ factors), which we can abbreviate further as $A^n$. We also have the important case of $I = \mathbb{N}$, giving rise to the set of all infinite sequences of elements of $A$, which we can denote by $A^{\mathbb{N}}$. We will need this construction later, in particular when dealing with polynomial rings.
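
As an illustration, the standard library function itertools.product enumerates exactly the tuples of a finite Cartesian product; the sets A and B below are arbitrary examples:

```python
from itertools import product

A = {0, 1}
B = {'x', 'y'}

# Binary Cartesian product A x B: all ordered pairs (a, b).
print(sorted(product(A, B)))   # [(0, 'x'), (0, 'y'), (1, 'x'), (1, 'y')]

# n-ary product A^3: ordered triples; |A^3| = 2^3 = 8.
print(len(list(product(A, repeat=3))))   # 8
```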

Disjoint union


Let $A$ and $B$ be any two sets. We then define their disjoint union, denoted $A \sqcup B$, to be the following: First create copies $A'$ and $B'$ of $A$ and $B$ such that $A' \cap B' = \varnothing$. Then define $A \sqcup B = A' \cup B'$. Notice that this definition is not explicit, like the other operations defined so far. The definition does not output a single set, but rather a family of sets. However, these are all "the same" in a sense which will be defined soon. In other words, there exist bijective functions between them.

Luckily, if a disjoint union is needed for explicit computation, one can easily be constructed, for example $A \sqcup B = (A \times \{0\}) \cup (B \times \{1\})$.

Set difference


The set difference, or relative set complement, of sets $A$ and $B$, denoted by $A \setminus B$, is the set of elements contained in $A$ which are not contained in $B$. Symbolically,

$A \setminus B = \{x \in A : x \notin B\}$.

By the definition of set difference, $A \setminus B \subseteq A$.

The following identities hold for any sets $A, B, C$:

$A \setminus A = \varnothing$,
$A \setminus \varnothing = A$,
$\varnothing \setminus A = \varnothing$,
$(A \setminus B) \cap B = \varnothing$,
$A \setminus (A \setminus B) = A \cap B$,
$A \cup (B \setminus A) = A \cup B$,
$C \setminus (A \cap B) = (C \setminus A) \cup (C \setminus B)$,
$C \setminus (A \cup B) = (C \setminus A) \cap (C \setminus B)$,
$(B \setminus A) \cap C = (B \cap C) \setminus A = B \cap (C \setminus A)$.

The set difference of two Cartesian products can be found as $(A \times B) \setminus (C \times D) = (A \times (B \setminus D)) \cup ((A \setminus C) \times B)$.

The universal set and set complements


We define the universal set, or universe, $U$, to be a set of which every set under consideration is a subset. The complement of any set is then defined to be the set difference of the universal set and that set. That is, for any set $A$, the complement of $A$ is given by $A^c = U \setminus A$. The following identities involving set complements hold true for any sets $A$ and $B$:

De Morgan's laws for sets:
$(A \cup B)^c = A^c \cap B^c$,
$(A \cap B)^c = A^c \cup B^c$,
Double complement law:
$(A^c)^c = A$,
Complement properties:
$A \cup A^c = U$,
$A \cap A^c = \varnothing$,
$\varnothing^c = U$,
$U^c = \varnothing$,
$A \subseteq B \iff B^c \subseteq A^c$.

The set complement can be related to the set difference with the identities $A \setminus B = A \cap B^c$ and $A^c = U \setminus A$.

Symmetric difference


For sets $A$ and $B$, the symmetric set difference of $A$ and $B$, denoted by $A \triangle B$ or by $A \ominus B$, is the set of elements which are contained either in $A$ or in $B$ but not in both of them. Symbolically, it can be defined as

$A \triangle B = \{x : (x \in A \land x \notin B) \lor (x \in B \land x \notin A)\}$.

More commonly, it is represented as

$A \triangle B = (A \cup B) \setminus (A \cap B)$

or as

$A \triangle B = (A \setminus B) \cup (B \setminus A)$.

The symmetric difference is commutative and associative, so that $A \triangle B = B \triangle A$ and $(A \triangle B) \triangle C = A \triangle (B \triangle C)$. Every set is its own symmetric-difference inverse, and the empty set functions as an identity element for the symmetric difference; that is, $A \triangle A = \varnothing$ and $A \triangle \varnothing = A$. Furthermore, $A \triangle B = \varnothing$ if and only if $A = B$.

Set intersection is distributive over the symmetric difference operation. In other words, $A \cap (B \triangle C) = (A \cap B) \triangle (A \cap C)$.

The symmetric difference of two set complements is the same as the symmetric difference of the two sets: $A^c \triangle B^c = A \triangle B$.
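
The identities above are easy to spot-check for small finite sets; a short Python sketch (Python writes the symmetric difference $A \triangle B$ as A ^ B):

```python
A = {1, 2, 3}
B = {2, 3, 4}
C = {3, 4, 5}

assert A ^ B == (A | B) - (A & B) == (A - B) | (B - A)
assert A ^ B == B ^ A                     # commutativity
assert (A ^ B) ^ C == A ^ (B ^ C)         # associativity
assert A ^ A == set() and A ^ set() == A  # self-inverse and identity
assert A & (B ^ C) == (A & B) ^ (A & C)   # intersection distributes
```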

Notation for specific sets


Commonly-used sets of numbers in mathematics are often denoted by special symbols. The conventional notations used in this book are given below.

Natural numbers with 0: $\mathbb{N}_0$ or $\mathbb{Z}_{\geq 0}$
Natural numbers without 0: $\mathbb{N}$
Integers: $\mathbb{Z}$
Rational numbers: $\mathbb{Q}$
Real numbers: $\mathbb{R}$
Complex numbers: $\mathbb{C}$




Equivalence relations and congruence classes

We often wish to describe how two mathematical entities within a set are related. For example, if we were to look at the set of all people on Earth, we could define "is a child of" as a relationship. Similarly, the $<$ operator defines a relation on the set of integers. A binary relation, hereafter referred to simply as a relation, is a binary proposition defined on any selection of the elements of two sets.

Formally, a relation is any arbitrary subset of the Cartesian product between two sets $A$ and $B$, so that for a relation $R$, $R \subseteq A \times B$. In this case, $A$ is referred to as the domain of the relation and $B$ is referred to as its codomain. If an ordered pair $(a, b)$ is an element of $R$ (where, by the definition of $R$, $a \in A$ and $b \in B$), then we say that $a$ is related to $b$ by $R$. We will use $R(a)$ to denote the set

$R(a) = \{b \in B : (a, b) \in R\}$.

In other words, $R(a)$ is used to denote the set of all elements in the codomain of $R$ to which some $a$ in the domain is related.

Equivalence relations


To denote that two elements $a$ and $b$ are related for a relation $R$ which is a subset of some Cartesian product $X \times X$, we will use an infix operator. We write $a R b$ for some $a, b \in X$ with $(a, b) \in R$.

There are very many types of relations. Indeed, further inspection of our earlier examples reveals that the two relations are quite different. In the case of the "is a child of" relationship, we observe that there are some people $A, B$ where neither $A$ is a child of $B$, nor $B$ is a child of $A$. In the case of the $<$ operator, we know that for any two distinct integers $a, b$, exactly one of $a < b$ or $b < a$ is true. In order to learn about relations, we must look at a smaller class of relations.

In particular, we care about the following properties of relations:

  • Reflexivity: A relation $R$ on a set $X$ is reflexive if $a R a$ for all $a \in X$.
  • Symmetry: A relation $R$ is symmetric if $a R b \Rightarrow b R a$ for all $a, b \in X$.
  • Transitivity: A relation $R$ is transitive if $a R b$ and $b R c$ together imply $a R c$ for all $a, b, c \in X$.

One should note that in all three of these properties, we quantify across all elements of the set $X$.

Any relation which exhibits the properties of reflexivity, symmetry and transitivity is called an equivalence relation on $X$. Two elements related by an equivalence relation are called equivalent under the equivalence relation. We write $a \sim_R b$ to denote that $a$ and $b$ are equivalent under $R$. If only one equivalence relation is under consideration, we can instead write simply $a \sim b$. As a notational convenience, we can simply say that $\sim$ is an equivalence relation on a set $X$ and let the rest be implied.

Example: For a fixed integer $p$, we define a relation $\sim_p$ on the set of integers such that $a \sim_p b$ if and only if $a - b = kp$ for some $k \in \mathbb{Z}$. Prove that this defines an equivalence relation on the set of integers.

Proof:

  • Reflexivity: For any $a \in \mathbb{Z}$, it follows immediately that $a - a = 0 = 0 \cdot p$, and thus $a \sim_p a$ for all $a \in \mathbb{Z}$.
  • Symmetry: For any $a, b \in \mathbb{Z}$, assume that $a \sim_p b$. It must then be the case that $a - b = kp$ for some integer $k$, and $b - a = (-k)p$. Since $-k$ is an integer, $b \sim_p a$. Thus, $a \sim_p b \Rightarrow b \sim_p a$ for all $a, b \in \mathbb{Z}$.
  • Transitivity: For any $a, b, c \in \mathbb{Z}$, assume that $a \sim_p b$ and $b \sim_p c$. Then $a - b = kp$ and $b - c = lp$ for some integers $k, l$. By adding these two equalities together, we get $a - c = (k + l)p$, and thus $a \sim_p c$.

Q.E.D.

Remark. In elementary number theory we denote this relation $a \equiv b \pmod{p}$ and say $a$ is equivalent to $b$ modulo $p$.
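
A quick computational check of this example, with the arbitrary choice $p = 5$ and a finite sample of integers, might look like this minimal Python sketch:

```python
p = 5

def related(a, b):
    """a ~ b iff a - b is an integer multiple of p."""
    return (a - b) % p == 0

sample = range(-10, 11)
# Reflexivity, symmetry and transitivity on the sample:
assert all(related(a, a) for a in sample)
assert all(related(b, a) for a in sample for b in sample if related(a, b))
assert all(related(a, c)
           for a in sample for b in sample for c in sample
           if related(a, b) and related(b, c))
```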

Equivalence classes


Let $\sim$ be an equivalence relation on $X$. Then, for any element $a \in X$ we define the equivalence class of $a$ as the subset $[a] \subseteq X$ given by $[a] = \{x \in X : x \sim a\}$.

Theorem: $a \sim b$ if and only if $[a] = [b]$.

Proof: Assume $a \sim b$. Then by definition, $b \in [a]$.

  • We first prove that $[b] \subseteq [a]$. Let $x$ be an arbitrary element of $[b]$. Then by definition of the equivalence class, $x \sim b$, and $x \sim a$ by transitivity of equivalence relations. Thus, $x \in [a]$ and $[b] \subseteq [a]$.
  • We now prove that $[a] \subseteq [b]$. Let $x$ be an arbitrary element of $[a]$. Then, by definition, $x \sim a$. By transitivity (using $a \sim b$), $x \sim b$, so $x \in [b]$. Thus, $[a] \subseteq [b]$.

As $[b] \subseteq [a]$ and $[a] \subseteq [b]$, we have $[a] = [b]$. Conversely, if $[a] = [b]$, then since $a \in [a] = [b]$, we have $a \sim b$.

Q.E.D.

Partitions of a set


A partition of a set $X$ is a family of nonempty, pairwise disjoint sets $X_i$, $i \in I$, such that $X = \bigcup_{i \in I} X_i$.

Theorem: An equivalence relation $\sim$ on $X$ induces a unique partition of $X$, and likewise, a partition of $X$ induces a unique equivalence relation on $X$, such that these two constructions are mutually inverse.

Proof: (Equivalence relation induces partition): Let $P$ be the set of equivalence classes of $\sim$. Then, since $a \in [a]$ for each $a \in X$, $X = \bigcup_{a \in X} [a]$. Furthermore, by the above theorem, this union is disjoint: distinct equivalence classes do not intersect. Thus the set of equivalence classes of $X$ is a partition of $X$.

(Partition induces equivalence relation): Let $\{X_i\}_{i \in I}$ be a partition of $X$. Then, define $\sim$ on $X$ such that $a \sim b$ if and only if both $a$ and $b$ are elements of the same $X_i$ for some $i \in I$. Reflexivity and symmetry of $\sim$ are immediate. For transitivity, if $a, b \in X_i$ and $b, c \in X_j$, then $b \in X_i \cap X_j$, so we necessarily have $X_i = X_j$, and transitivity follows. Thus, $\sim$ is an equivalence relation with the $X_i$ as the equivalence classes.

Lastly, obtaining a partition from $\sim$ on $X$ and then obtaining an equivalence relation from that partition obviously returns $\sim$ again, so the two are equivalent structures.

Q.E.D.

Quotients


Let $\sim$ be an equivalence relation on a set $X$. Then, define the set $X/\sim$, called the quotient of $X$ by $\sim$, as the set of all equivalence classes of $X$. In order to say anything interesting about this construction we need more theory yet to be developed. However, this is one of the most important constructions we have, and one that will be given much attention throughout the book.
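
For a finite set, the quotient can be computed directly; continuing the congruence example with the assumed modulus $p = 5$ on a finite sample:

```python
p = 5
X = range(-10, 11)               # a finite stand-in for the integers

def cls(a):
    """The equivalence class [a], restricted to the sample X."""
    return frozenset(x for x in X if (x - a) % p == 0)

quotient = {cls(a) for a in X}   # X/~ : the set of equivalence classes
print(len(quotient))             # 5, one class per residue mod 5
```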



Functions

Definition


A function $f$ is a triplet $(A, B, \Gamma)$ such that:

  • $A$ is a set, called the domain of $f$
  • $B$ is a set, called the codomain of $f$
  • $\Gamma$ is a subset of $A \times B$, called the graph of $f$

In addition the following two properties hold:

  1. $\forall a \in A, \ \exists b \in B : (a, b) \in \Gamma$.
  2. $(a, b_1) \in \Gamma \land (a, b_2) \in \Gamma \Rightarrow b_1 = b_2$.

We write $f(a)$ for the unique $b \in B$ such that $(a, b) \in \Gamma$.

We say that $f$ is a function from $A$ to $B$, which we write:

$f : A \to B$

Example


Let's consider the function from the reals to the reals which squares its argument. We could define it like this:

$f = (\mathbb{R}, \mathbb{R}, \Gamma)$ with $\Gamma = \{(x, x^2) : x \in \mathbb{R}\}$; more conventionally, $f : \mathbb{R} \to \mathbb{R}$, $f(x) = x^2$.

Remark


As you see in the definition of a function above, the domain and codomain are an integral part of the definition. In other words, even if the values of $f(x)$ don't change, changing the domain or codomain changes the function.

Let's look at the following four functions.

The function

$f_1 : \mathbb{R} \to \mathbb{R}, \quad f_1(x) = x^2$

is neither injective nor surjective (these terms will be defined later).

The function

$f_2 : \mathbb{R} \to \mathbb{R}_{\geq 0}, \quad f_2(x) = x^2$

is not injective but surjective.

The function

$f_3 : \mathbb{R}_{\geq 0} \to \mathbb{R}, \quad f_3(x) = x^2$

is injective but not surjective.

The function

$f_4 : \mathbb{R}_{\geq 0} \to \mathbb{R}_{\geq 0}, \quad f_4(x) = x^2$

is injective and surjective.

As you see, all four functions have the same mapping but all four are different. That's why just giving the mapping is insufficient; a function is only defined if its domain and codomain are known.

Image and preimage


For a set $A$, we write $\mathcal{P}(A)$ for the set of subsets of $A$.

Let $f : A \to B$. We will now define two related functions.

The image function:

$f : \mathcal{P}(A) \to \mathcal{P}(B), \quad f(X) = \{f(x) : x \in X\}$

The preimage function:

$f^{-1} : \mathcal{P}(B) \to \mathcal{P}(A), \quad f^{-1}(Y) = \{x \in A : f(x) \in Y\}$

Note that the image and preimage functions are written respectively like $f$ and its inverse $f^{-1}$ (if it exists). There is however no ambiguity because the domains are different. Note also that the image and preimage are not necessarily inverse of one another. (See the section on bijective functions below.)

We define $\operatorname{Im}(f) = f(A)$, which we call the image of $f$.

For any $b \in B$, we call $f^{-1}(\{b\})$ the support of $b$.

Proposition: Let $f : A \to B$, $X \subseteq A$ and $Y \subseteq B$. Then $X \subseteq f^{-1}(f(X))$ and $f(f^{-1}(Y)) \subseteq Y$.

Example


Let's take again the function:

$f : \mathbb{R} \to \mathbb{R}, \quad f(x) = x^2$

Let's consider the following examples: $f(\{1, 2\}) = \{1, 4\}$, $f^{-1}(\{4\}) = \{-2, 2\}$, $f^{-1}(\{-1\}) = \varnothing$, and $\operatorname{Im}(f) = [0, \infty)$.
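
For finite sets these can be computed directly. A minimal Python sketch (the helpers image and preimage are our own illustrative names, not library functions):

```python
def image(f, X):
    """f(X) = {f(x) : x in X}."""
    return {f(x) for x in X}

def preimage(f, domain, Y):
    """f^{-1}(Y) = {x in domain : f(x) in Y}."""
    return {x for x in domain if f(x) in Y}

square = lambda x: x * x
D = range(-3, 4)                   # a finite stand-in for the domain

print(image(square, {1, 2}))       # {1, 4}
print(preimage(square, D, {4}))    # {2, -2}
print(preimage(square, D, {-1}))   # set(): nothing squares to -1
```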

Further definitions


Let $f : A \to B$ and $g : B \to C$. We define $g \circ f : A \to C$ by $(g \circ f)(x) = g(f(x))$, which we call the composition of $g$ and $f$.

Let $A$ be a set. We define the identity function on $A$ as $\operatorname{id}_A : A \to A$, $\operatorname{id}_A(x) = x$.

Properties


Definition: A function $f : A \to B$ is injective if $f(x) = f(y) \Rightarrow x = y$ for all $x, y \in A$.

Lemma: Consider a function $f : A \to B$ and suppose $A \neq \varnothing$. Then $f$ is injective if and only if there exists a function $g : B \to A$ with $g \circ f = \operatorname{id}_A$.
Proof:
$\Rightarrow$:
Suppose $f$ is injective. As $A \neq \varnothing$, let's define $a_0$ as an arbitrary element of $A$. We can then define a suitable function $g : B \to A$ as follows: $g(b) = x$ if $b = f(x)$ for some $x \in A$ (such an $x$ is unique by injectivity), and $g(b) = a_0$ otherwise.

It is now easy to verify that $g \circ f = \operatorname{id}_A$.
$\Leftarrow$:
Suppose there is a function $g : B \to A$ with $g \circ f = \operatorname{id}_A$. Then $f(x) = f(y) \Rightarrow g(f(x)) = g(f(y)) \Rightarrow x = y$. $f$ is thus injective.
Q.E.D.

Definition: A function $f : A \to B$ is surjective if for every $b \in B$ there exists an $a \in A$ with $f(a) = b$.

Lemma: Consider a function $f : A \to B$. Then $f$ is surjective if and only if there exists a function $g : B \to A$ with $f \circ g = \operatorname{id}_B$.
Proof:
$\Rightarrow$:
Suppose $f$ is surjective. We can define a suitable function $g : B \to A$ as follows: for each $b \in B$, choose some $a \in A$ with $f(a) = b$ (such an $a$ exists by surjectivity) and set $g(b) = a$.

It is now easy to verify that $f \circ g = \operatorname{id}_B$.
$\Leftarrow$:
Suppose there is a function $g : B \to A$ with $f \circ g = \operatorname{id}_B$. Then for any $b \in B$, $f(g(b)) = b$, so $b$ has the preimage $g(b)$. $f$ is thus surjective.
Q.E.D.

Definition: A function $f : A \to B$ is bijective if it is both injective and surjective.

Lemma: A function $f : A \to B$ is bijective if and only if there exists a function $g : B \to A$ with $g \circ f = \operatorname{id}_A$ and $f \circ g = \operatorname{id}_B$. Furthermore it can be shown that such a $g$ is unique. We write it $f^{-1}$ and call it the inverse of $f$.
Proof:
Left as an exercise.

Proposition: Consider a function $f : A \to B$. Then

  1. $f$ is injective iff $f^{-1}(f(X)) = X$ for all $X \subseteq A$
  2. $f$ is surjective iff $f(f^{-1}(Y)) = Y$ for all $Y \subseteq B$
  3. $f$ is bijective iff the image and preimage functions of $f$ are inverse of each other
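
For functions on finite sets, injectivity and surjectivity can be tested mechanically; a small sketch under the obvious definitions:

```python
def is_injective(f, domain):
    values = [f(x) for x in domain]
    return len(values) == len(set(values))   # no value hit twice

def is_surjective(f, domain, codomain):
    return {f(x) for x in domain} == set(codomain)

D = [-2, -1, 0, 1, 2]
square = lambda x: x * x

print(is_injective(square, D))                    # False: f(-1) == f(1)
print(is_surjective(square, D, [0, 1, 4]))        # True
print(is_surjective(square, D, [0, 1, 2, 3, 4]))  # False: 2 has no preimage
```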

Example: If $A$ and $B$ are sets such that $A \subseteq B$, there exists an obviously injective function $i : A \to B$, called the inclusion $A \hookrightarrow B$, such that $i(a) = a$ for all $a \in A$.

Example: If $\sim$ is an equivalence relation on a set $A$, there is an obviously surjective function $\pi : A \to A/\sim$, called the canonical projection onto $A/\sim$, such that $\pi(a) = [a]$ for all $a \in A$.

Theorem: Define the equivalence relation $\sim$ on $A$ such that $a \sim b$ if and only if $f(a) = f(b)$. Then, if $f : A \to B$ is any function, $f$ decomposes into the composition

$f = i \circ \bar{f} \circ \pi$,

where $\pi : A \to A/\sim$ is the canonical projection, $i : \operatorname{Im}(f) \to B$ is the inclusion $\operatorname{Im}(f) \hookrightarrow B$, and $\bar{f} : A/\sim \to \operatorname{Im}(f)$ is the bijection $\bar{f}([a]) = f(a)$ for all $a \in A$.

Proof: The definition of $\bar{f}$ immediately implies that $f = i \circ \bar{f} \circ \pi$, so we only have to prove that $\bar{f}$ is well defined and a bijection. Let $[a] = [b]$. Then $a \sim b$, so $f(a) = f(b)$ and $\bar{f}([a]) = \bar{f}([b])$. This shows that the value of $\bar{f}$ is independent of the representative chosen from $[a]$, and so it is well-defined.

For injectivity, we have $\bar{f}([a]) = \bar{f}([b]) \Rightarrow f(a) = f(b) \Rightarrow a \sim b \Rightarrow [a] = [b]$, so $\bar{f}$ is injective.

For surjectivity, let $y \in \operatorname{Im}(f)$. Then there exists an $a \in A$ such that $f(a) = y$, and so $\bar{f}([a]) = y$ by definition of $\bar{f}$. Since $y$ is arbitrary in $\operatorname{Im}(f)$, this proves that $\bar{f}$ is surjective.

Q.E.D.

Definition: Given a function $f : B \to C$, $f$ is a

(i) Monomorphism if given any two functions $g, h : A \to B$ such that $f \circ g = f \circ h$, then $g = h$.

(ii) Epimorphism if given any two functions $g, h : C \to D$ such that $g \circ f = h \circ f$, then $g = h$.

Theorem: A function $f : A \to B$ between sets is

(i) a monomorphism if and only if it is injective.

(ii) an epimorphism if and only if it is surjective.

Proof: (i) Let $f$ be a monomorphism. Then, for any two functions $g, h : X \to A$, $f(g(x)) = f(h(x))$ for all $x \in X$ implies $g(x) = h(x)$ for all $x \in X$. Applying this to constant functions $g, h$ shows that $f(a) = f(a')$ implies $a = a'$; this is the definition of injectivity. For the converse, if $f$ is injective, it has a left inverse $g'$ with $g' \circ f = \operatorname{id}_A$. Thus, if $f \circ g = f \circ h$, compose with $g'$ on the left side to obtain $g = h$, such that $f$ is a monomorphism.

(ii) Let $f$ be an epimorphism, so that $g \circ f = h \circ f$ implies $g = h$ for any $g, h : B \to C$. Assume $\operatorname{Im}(f) \neq B$, that is, that $f$ is not surjective. Then there exists at least one $b \in B$ not in $\operatorname{Im}(f)$. For this $b$, choose two functions $g, h : B \to C$ which coincide on $\operatorname{Im}(f)$ but disagree at $b$. However, we still have $g \circ f = h \circ f$. This violates our assumption that $f$ is an epimorphism. Consequently, $f$ is surjective. For the converse, assume $f$ is surjective and $g \circ f = h \circ f$. Every $b \in B$ equals $f(a)$ for some $a \in A$, so $g(b) = g(f(a)) = h(f(a)) = h(b)$, and $g = h$; the epimorphism property follows.

Q.E.D.

Remark: The equivalence between monomorphism and injectivity, and between epimorphism and surjectivity, is a special property of functions between sets. This is not the case in general, and we will see examples of this when discussing structure-preserving functions between groups or rings in later sections.

Example: Given any two sets $A$ and $B$, we have the canonical projections $\pi_A : A \times B \to A$ sending $(a, b)$ to $a$, and $\pi_B : A \times B \to B$ sending $(a, b)$ to $b$. These maps are obviously surjective.

In addition, we have the natural inclusions $i_A : A \to A \sqcup B$ and $i_B : B \to A \sqcup B$, which are obviously injective as stated above.

Universal properties


The projections and inclusions described above are special, in that they satisfy what are called universal properties. We will give the theorem below. The proof is left to the reader.

Theorem: Let $A, B, X$ be any sets.

(i) Let $f : X \to A$ and $g : X \to B$. Then there exists a unique function $u : X \to A \times B$ such that $\pi_A \circ u = f$ and $\pi_B \circ u = g$ are simultaneously satisfied. $u$ is sometimes denoted $(f, g)$.

(ii) Let $f : A \to X$ and $g : B \to X$. Then there exists a unique function $u : A \sqcup B \to X$ such that $u \circ i_A = f$ and $u \circ i_B = g$ are simultaneously satisfied.

The canonical projections onto quotients also satisfy a universal property.

Theorem: Let $\sim$ be an equivalence relation on $A$ and let $f : A \to B$ be any function such that $a \sim b \Rightarrow f(a) = f(b)$ for all $a, b \in A$. Then there exists a unique function $\bar{f} : A/\sim \to B$ such that $f = \bar{f} \circ \pi$, where $\pi : A \to A/\sim$ is the canonical projection.



Binary Operations

A binary operation $*$ on a set $S$ is a function $* : S \times S \to S$. For $a, b \in S$, we usually write $*(a, b)$ as $a * b$.

Properties


The property that $a * b \in S$ for all $a, b \in S$ is called closure under $*$.

Example: Addition between two integers produces an integer result. Therefore addition is a binary operation on the integers. Division of integers, on the other hand, is not a binary operation on the integers: $1/2$ is not an integer, so the integers are not closed under division.

To indicate that a set $S$ has a binary operation $*$ defined on it, we can compactly write $(S, *)$. Such a pair of a set and a binary operation on that set is collectively called a binary structure. A binary structure may have several interesting properties. The main ones we will be interested in are outlined below.

Definition: A binary operation $*$ on $S$ is associative if for all $a, b, c \in S$, $(a * b) * c = a * (b * c)$.

Example: Addition of integers is associative: $(1 + 2) + 3 = 6 = 1 + (2 + 3)$. Notice however, that subtraction is not associative. Indeed, $(1 - 2) - 3 = -4$ while $1 - (2 - 3) = 2$.

Definition: A binary operation $*$ on $S$ is commutative if for all $a, b \in S$, $a * b = b * a$.

Example: Multiplication of rational numbers is commutative: $\frac{2}{3} \cdot \frac{5}{7} = \frac{10}{21} = \frac{5}{7} \cdot \frac{2}{3}$. Notice that division is not commutative: $2/1 = 2$ while $1/2 \neq 2$. Notice also that commutativity of multiplication of rational numbers depends on the fact that multiplication of integers is commutative as well.
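
On a finite sample, these properties can be checked by brute force. Note that a finite check can only refute associativity or commutativity conclusively; confirming them in general requires a proof. A minimal sketch:

```python
def is_associative(op, S):
    return all(op(op(a, b), c) == op(a, op(b, c))
               for a in S for b in S for c in S)

def is_commutative(op, S):
    return all(op(a, b) == op(b, a) for a in S for b in S)

S = range(-5, 6)   # a finite sample of integers

print(is_associative(lambda a, b: a + b, S))   # True
print(is_associative(lambda a, b: a - b, S))   # False
print(is_commutative(lambda a, b: a * b, S))   # True
print(is_commutative(lambda a, b: a - b, S))   # False
```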

Exercise

  • Of the four arithmetic operations (addition, subtraction, multiplication, and division), which are associative? Which are commutative? Which have an identity element?

Answer

operation       associative  commutative
Addition        yes          yes
Multiplication  yes          yes
Subtraction     no           no
Division        no           no

Algebraic structures


Binary operations are the working parts of algebraic structures:

Augustus De Morgan expressed the venture into the categories of algebraic structures as follows: It is "inventing a distinct system of unit-symbols, and investigating or assigning relations which define their mode of action on each other." De Morgan 1844, quoted by A.N. Whitehead, A Treatise on Universal Algebra, 1898, page 131.

One binary operation

A structure with one binary operation may be one of several types, depending on the conditions satisfied by the binary operation.

A set A together with a closed binary operation o is called a magma (A, o).

If the binary operation respects the associative law a o (b o c) = (a o b) o c, then the magma (A, o ) is a semigroup.

If a magma has an element e satisfying e o x = x = x o e for every x in it, then it is a unital magma. The element e is called the identity with respect to o. If a unital magma has elements x and y such that x o y = e, then x and y are inverses with respect to each other.

A magma for which every equation a o x = b has a solution x, and every equation y o c = d has a solution y, is a quasigroup. A unital quasigroup is a loop.

A unital semigroup is called a monoid. A monoid for which every element has an inverse is a group. A group for which x o y = y o x for all its elements x and y is called a commutative group. Alternatively, it is called an abelian group.
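
The chain of conditions above can be tested directly for a finite binary structure. The following sketch checks each layer in turn (closure, associativity, identity, inverses):

```python
def is_group(S, op):
    """Decide whether the finite structure (S, op) is a group."""
    if any(op(a, b) not in S for a in S for b in S):
        return False                      # not closed: not even a magma
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a in S for b in S for c in S):
        return False                      # not associative: not a semigroup
    ids = [e for e in S if all(op(e, x) == x == op(x, e) for x in S)]
    if not ids:
        return False                      # no identity: not a monoid
    e = ids[0]
    return all(any(op(x, y) == e == op(y, x) for y in S) for x in S)

Z4 = {0, 1, 2, 3}
print(is_group(Z4, lambda a, b: (a + b) % 4))   # True
print(is_group(Z4, lambda a, b: (a * b) % 4))   # False: 0 and 2 lack inverses
```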

Two binary operations


A pair of structures, each with one operation, can be used to build those with two: Take (A, o) as a commutative group with identity e. Let A_ denote A with e removed, and suppose (A_, *) is a monoid with binary operation * that distributes over o:

a * (b o c) = (a * b) o (a * c). Then (A, o, *) is a ring.

In this construction of rings, when the monoid (A_ , * ) is a group, then (A, o, * ) is a division ring or skew field. And when (A_ , * ) is a commutative group, then (A, o, * ) is a field.

The two operations sup (v) and inf (^) are presumed commutative and associative. In addition, the absorption property requires: a ^ (a v b) = a, and a v (a ^ b) = a. Then (A, v, ^ ) is called a lattice.

In a lattice, the modular identity is (a ^ b) v (x ^ b) = ((a ^ b) v x ) ^ b. A lattice satisfying the modular identity is a modular lattice.

Three binary operations


A module is a combination of a ring and a commutative group (A, B), together with a binary function A × B → B. When A is a field, then the module is a vector space. In that case A consists of scalars and B of vectors. The binary operation on B is termed addition.

Four binary operations


Suppose (A, B) is a vector space, and that B has a second operation called multiplication. Then the structure is an algebra over the field A.



Linear Algebra

The reader is expected to have some familiarity with linear algebra. For example, statements such as

Given vector spaces $V$ and $W$ with bases $\mathcal{B}_V$ and $\mathcal{B}_W$ and dimensions $n$ and $m$, respectively, a linear map $f : V \to W$ corresponds to a unique $m \times n$ matrix, dependent on the particular choice of basis.

should be familiar. It is impossible to give a summary of the relevant topics of linear algebra in one section, so the reader is advised to take a look at the linear algebra book.

In any case, the core of linear algebra is the study of linear functions, that is, functions with the property $f(\alpha x + \beta y) = \alpha f(x) + \beta f(y)$, where Greek letters are scalars and Roman letters are vectors.

The core of the theory of finitely generated vector spaces is the following:

Every finite-dimensional vector space $V$ is isomorphic to $F^n$ for some field $F$ and some $n \in \mathbb{N}$, called the dimension of $V$. Specifying such an isomorphism is equivalent to choosing a basis for $V$. Thus, any linear map $f : V \to W$ between vector spaces with dimensions $n$ and $m$ and given bases $\mathcal{B}_V$ and $\mathcal{B}_W$ induces a unique linear map $F^n \to F^m$. These maps are precisely the $m \times n$ matrices, and the matrix in question is called the matrix representation of $f$ relative to the bases $\mathcal{B}_V, \mathcal{B}_W$.

Remark: The idea of identifying a basis of a vector space with an isomorphism to $F^n$ may be new to the reader, but the basic principle is the same. An alternative term for vector space is coordinate space, since any point in the space may be expressed, on some particular basis, as a sequence of field elements. (All bases are equivalent under some non-singular linear transformation.) The name vector, associated with pointy things like arrows, spears, or daggers, is distasteful to peace-loving people who don't imagine taking up such a weapon. The orientation or direction associated with a point in coordinate space is implicit in the positive orientation of the real line (if that is the field) or in an orientation instituted in a polar expression of the multiplicative group of the field.

A coordinate space V with basis $e_1, \ldots, e_n$ has vectors $v = x_1 e_1 + x_2 e_2 + \cdots + x_n e_n$, where $e_i$ is all zeros except a 1 at index $i$.

As an algebraic structure, V is an amalgam of an abelian group (addition and subtraction of vectors), a scalar field F (the source of the $x_i$'s), its multiplicative group F*, and a group action F* × V → V, given by

$(\lambda, v) \mapsto \lambda v = (\lambda x_1, \lambda x_2, \ldots, \lambda x_n)$.

The group action is scalar-vector multiplication.

Linear transformations are mappings from one coordinate space V to another W corresponding to a matrix $(a_{ij})$. Suppose W has basis $f_1, \ldots, f_m$, so that the image of $v$ has coordinates $y_1, \ldots, y_m$. Then the elements of the matrix $(a_{ij})$ are given by the rate of change of $y_j$ depending on $x_i$:

$a_{ij} = \dfrac{\partial y_j}{\partial x_i} =$ constant.

A common case involves V = W with n small, such as n = 2. When F = {real numbers} = R, the set of matrices is denoted M(2,R). As an algebraic structure, M(2,R) has two binary operations that make it a ring: component-wise addition and matrix multiplication. See the chapter on 2x2 real matrices for a deconstruction of M(2,R) into a pencil of planar algebras.
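
As a small numerical illustration of this ring structure, the sketch below checks associativity of multiplication and both distributive laws on arbitrary sample matrices:

```python
def mat_add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [0, 2]]

assert mat_mul(A, mat_mul(B, C)) == mat_mul(mat_mul(A, B), C)
assert mat_mul(A, mat_add(B, C)) == mat_add(mat_mul(A, B), mat_mul(A, C))
assert mat_mul(mat_add(A, B), C) == mat_add(mat_mul(A, C), mat_mul(B, C))
```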

More generally, when dim V = dim W = n, $(a_{ij})$ is a square matrix, an element of M(n, F), which is a ring with the + and × binary operations. These benchmarks in algebra serve as representations. In particular, when the rows or columns of such a matrix are linearly independent, there is a matrix $(b_{ij})$ acting as a multiplicative inverse with respect to the identity matrix. The subset of invertible matrices is called the general linear group, GL(n, F). This group and its subgroups carry the burden of demonstrating the physical symmetries associated with them.

The pioneers in this field included Sophus Lie, who viewed the continuous groups as evolving out of 1 in all directions according to an "algebra" now named after him. Hermann Weyl, spurred on by Eduard Study, explored and named GL(n, F) and its subgroups, calling them the classical groups.


Number Theory

As numbers of various number systems form basic units with which one must work when studying abstract algebra, we will now define the natural numbers and the rational integers as well as the basic operations of addition and multiplication. Using these definitions, we will also derive important properties of these number sets and operations. Following this, we will discuss important concepts in number theory; this will lead us to discussion of the properties of the integers modulo n.

The Peano postulates and the natural numbers


Definition: Using the undefined notions "1" and "successor" (denoted by $'$), we define the set of natural numbers without zero, $\mathbb{N}$, hereafter referred to simply as the natural numbers, with the following axioms, which we call the Peano postulates:

Axiom 1. $1 \in \mathbb{N}$.
Axiom 2. If $a \in \mathbb{N}$, then $a' \in \mathbb{N}$.
Axiom 3. There is no $a \in \mathbb{N}$ such that $a' = 1$.
Axiom 4. If $a' = b'$, then $a = b$.
Axiom 5. If $S \subseteq \mathbb{N}$ is such that $1 \in S$ and $a \in S \Rightarrow a' \in S$, then $S = \mathbb{N}$.

We can prove theorems for natural numbers using mathematical induction as a consequence of the fifth Peano Postulate.

Addition


Definition: We recursively define addition for the natural numbers using two more axioms; the other properties of addition can subsequently be derived from these axioms. We denote addition with the infix operator +.

Axiom 6. $a + 1 = a'$.
Axiom 7. $a + b' = (a + b)'$.

Axiom 6 above relies on the first Peano postulate (for the existence of 1) as well as the second (for the existence of a successor for every number).

Henceforth, we will assume that proven theorems hold for all $a, b, c, \ldots \in \mathbb{N}$.

Multiplication


Definition: We similarly define multiplication for the natural numbers recursively, again using two axioms:

Axiom 8. $a \cdot 1 = a$.
Axiom 9. $a \cdot b' = a \cdot b + a$.
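
The recursive flavor of these definitions is easy to see in code. A minimal sketch, encoding the natural number $n$ as the machine integer n and the successor $n'$ as n + 1:

```python
def succ(a):
    return a + 1

def add(a, b):
    if b == 1:                    # axiom 6: a + 1 = a'
        return succ(a)
    return succ(add(a, b - 1))    # axiom 7: a + b' = (a + b)'

def mul(a, b):
    if b == 1:                    # axiom 8: a * 1 = a
        return a
    return add(mul(a, b - 1), a)  # axiom 9: a * b' = a * b + a

print(add(3, 4), mul(3, 4))       # 7 12
```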

Properties of addition


We start by proving that addition is associative.

Theorem 1: Associativity of addition: $(a + b) + c = a + (b + c)$.

Proof: Base case: By axioms 6 and 7, $(a + b) + 1 = (a + b)' = a + b'$.

By axiom 6, $a + b' = a + (b + 1)$, so $(a + b) + 1 = a + (b + 1)$.
Inductive hypothesis: Suppose that, for $c$, $(a + b) + c = a + (b + c)$.
Inductive step: By axiom 7, $(a + b) + c' = ((a + b) + c)'$.
By the inductive hypothesis, $((a + b) + c)' = (a + (b + c))'$.
By axiom 7, $(a + (b + c))' = a + (b + c)'$.
By axiom 7, $a + (b + c)' = a + (b + c')$.
By induction, $(a + b) + c = a + (b + c)$ for all $c$. QED.

Lemma 1: $1 + a = a + 1$.

Proof: Base case: $1 + 1 = 1 + 1$.

Inductive hypothesis: Suppose that, for $a$, $1 + a = a + 1$.
Inductive step: By axiom 6, $1 + a' = 1 + (a + 1)$.
By theorem 1, $1 + (a + 1) = (1 + a) + 1$.
By the inductive hypothesis, $(1 + a) + 1 = (a + 1) + 1$.
By axiom 6, $(a + 1) + 1 = a' + 1$.
By induction, $1 + a = a + 1$ for all $a$. QED.

Theorem 2: Commutativity of addition: $a + b = b + a$.

Proof: Base case: By lemma 1, $a + 1 = 1 + a$.

Inductive hypothesis: Suppose that, for $b$, $a + b = b + a$.
Inductive step: By axiom 6, $a + b' = a + (b + 1)$.
By theorem 1, $a + (b + 1) = (a + b) + 1$.
By the inductive hypothesis, $(a + b) + 1 = (b + a) + 1$.
By theorem 1, $(b + a) + 1 = b + (a + 1)$.
By lemma 1, $b + (a + 1) = b + (1 + a)$.
By theorem 1, $b + (1 + a) = (b + 1) + a$.
By axiom 6, $(b + 1) + a = b' + a$.
By induction, $a + b = b + a$ for all $b$. QED.

Theorem 3: $a + c = b + c \Rightarrow a = b$.

Proof: Base case: Suppose $a + 1 = b + 1$.

By axiom 6, $a' = b'$.
By axiom 4, $a = b$.
Inductive hypothesis: Suppose that, for $c$, $a + c = b + c \Rightarrow a = b$.
Inductive step: Suppose $a + c' = b + c'$.
By axiom 7, $(a + c)' = (b + c)'$.
By axiom 4, $a + c = b + c$.
By the inductive hypothesis, $a = b$.
By induction, $a + c = b + c \Rightarrow a = b$ for all $c$. QED.

Properties of multiplication


Theorem 4: Left-distributivity of multiplication over addition: $a(b + c) = ab + ac$.

Proof: Base case: By axioms 6 and 9, $a(b + 1) = ab' = ab + a$.

By axiom 8, $ab + a = ab + a \cdot 1$, so $a(b + 1) = ab + a \cdot 1$.
Inductive hypothesis: Suppose that, for $c$, $a(b + c) = ab + ac$.
Inductive step: By axiom 7, $a(b + c') = a((b + c)')$.
By axiom 9, $a((b + c)') = a(b + c) + a$.
By the inductive hypothesis, $a(b + c) + a = (ab + ac) + a$.
By theorem 1, $(ab + ac) + a = ab + (ac + a)$.
By axiom 9, $ab + (ac + a) = ab + ac'$.
By induction, $a(b + c) = ab + ac$ for all $c$. QED.

Theorem 5: $1 \cdot a = a$.

Proof: Base case: By axiom 8, $1 \cdot 1 = 1$.

Inductive hypothesis: Suppose that, for $a$, $1 \cdot a = a$.
Inductive step: By axiom 6, $1 \cdot a' = 1 \cdot (a + 1)$.
By theorem 4, $1 \cdot (a + 1) = 1 \cdot a + 1 \cdot 1$.
By the base case, $1 \cdot a + 1 \cdot 1 = 1 \cdot a + 1$.
By the inductive hypothesis, $1 \cdot a + 1 = a + 1$.
By axiom 6, $a + 1 = a'$.
By induction, $1 \cdot a = a$ for all $a$. QED.

Theorem 6: $a'b = ab + b$.

Proof: Base case: By axiom 8, $a' \cdot 1 = a'$.

By axiom 6, $a' = a + 1$.
By axiom 8, $a + 1 = a \cdot 1 + 1$, so $a' \cdot 1 = a \cdot 1 + 1$.
Inductive hypothesis: Suppose that, for $b$, $a'b = ab + b$.
Inductive step: By axiom 9, $a'b' = a'b + a'$.
By the inductive hypothesis, $a'b + a' = (ab + b) + a'$.
By axiom 6, $(ab + b) + a' = (ab + b) + (a + 1)$.
By theorem 1, $(ab + b) + (a + 1) = ab + (b + (a + 1))$.
By theorem 2, $ab + (b + (a + 1)) = ab + ((a + 1) + b)$.
By theorem 1 (applied twice), $ab + ((a + 1) + b) = (ab + a) + (1 + b)$.
By axiom 9, $(ab + a) + (1 + b) = ab' + (1 + b)$.
By lemma 1, $ab' + (1 + b) = ab' + (b + 1)$.
By axiom 6, $ab' + (b + 1) = ab' + b'$.
By induction, $a'b = ab + b$ for all $b$. QED.

Theorem 7: Associativity of multiplication: $(ab)c = a(bc)$.

Proof: Base case: By axiom 8, $(ab) \cdot 1 = ab = a(b \cdot 1)$.

Inductive hypothesis: Suppose that, for $c$, $(ab)c = a(bc)$.
Inductive step: By axiom 9, $(ab)c' = (ab)c + ab$.
By the inductive hypothesis, $(ab)c + ab = a(bc) + ab$.
By theorem 4, $a(bc) + ab = a(bc + b)$.
By axiom 9, $a(bc + b) = a(bc')$.
By induction, $(ab)c = a(bc)$ for all $c$. QED.

Theorem 8: Commutativity of multiplication: $ab = ba$.

Proof: Base case: By axiom 8 and theorem 5, $a \cdot 1 = a = 1 \cdot a$.

Inductive hypothesis: Suppose that, for $b$, $ab = ba$.
Inductive step: By axiom 9, $ab' = ab + a$.
By the inductive hypothesis, $ab + a = ba + a$.
By theorem 6, $ba + a = b'a$.
By induction, $ab = ba$ for all $b$. QED.

Theorem 9: Right-distributivity of multiplication over addition: $(a + b)c = ac + bc$.

Proof: By theorem 8, $(a + b)c = c(a + b)$. By theorem 4, $c(a + b) = ca + cb$.

By theorem 8, $ca + cb = ac + bc$. QED.

The integers


The set of integers can be constructed from ordered pairs of natural numbers (a, b). We define an equivalence relation on the set of all such ordered pairs such that

$(a, b) \sim (c, d) \iff a + d = b + c$.

Then the set of rational integers $\mathbb{Z}$ is the set of all equivalence classes of such ordered pairs. We will denote the equivalence class of which some pair (a, b) is a member with the notation [(a, b)]. Then, for any natural numbers a and b, [(a, b)] represents a rational integer; intuitively, [(a, b)] represents the difference $a - b$.

Integer addition


Definition: We define addition for the integers as follows:

$[(a, b)] + [(c, d)] = [(a + c, b + d)]$.

Using this definition and the properties for the natural numbers, one can prove that integer addition is both associative and commutative.

Integer multiplication


Definition: Multiplication for the integers, like addition, can be defined with a single axiom:

$[(a, b)] \cdot [(c, d)] = [(ac + bd, ad + bc)]$.

Again, using this definition and the previously-proven properties of natural numbers, it can be shown that integer multiplication is commutative and associative, and furthermore that it is both left- and right-distributive with respect to integer addition.
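
A minimal sketch of this construction, representing each integer by a canonical pair from its equivalence class (the representation and helper names are our own choices for illustration):

```python
def canon(p):
    """Smallest representative of [(a, b)] with both entries >= 1."""
    a, b = p
    m = min(a, b)
    return (a - m + 1, b - m + 1)

def add(p, q):
    (a, b), (c, d) = p, q
    return canon((a + c, b + d))

def mul(p, q):
    (a, b), (c, d) = p, q
    return canon((a * c + b * d, a * d + b * c))

three = (4, 1)       # [(4, 1)] represents 4 - 1 = 3
minus_two = (1, 3)   # [(1, 3)] represents 1 - 3 = -2

print(add(three, minus_two))   # (2, 1), representing 1
print(mul(three, minus_two))   # (1, 7), representing -6
```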



Group Theory/Group

In this section we will begin to make use of the definitions we made in the section about binary operations. In the next few sections, we will study a specific type of binary structure called a group. First, however, we need some preliminary work involving a less restrictive type of binary structure.

Monoids


Definition 1: A monoid is a binary structure $(M, *)$ satisfying the following properties:

(i) $(x * y) * z = x * (y * z)$ for all $x, y, z \in M$. This property is called associativity.
(ii) There exists an identity element $e \in M$ such that $e * x = x * e = x$ for all $x \in M$.

Now that we have our axioms in place, we are faced with a pressing question: what is our first theorem going to be? Since the first few theorems are not dependent on one another, we simply have to make an arbitrary choice. We choose the following:

Theorem 2: The identity element $e$ of a monoid $(M, *)$ is unique.

Proof: Assume $e$ and $e'$ are both identity elements of $(M, *)$. Then both satisfy condition (ii) in the definition above. In particular, $e = e * e' = e'$, proving the theorem.

This theorem will turn out to be of fundamental importance later when we define groups.

Theorem 3: If $x_1, x_2, \ldots, x_n$ are elements of $M$ for some $n \in \mathbb{N}$, then the product $x_1 * x_2 * \cdots * x_n$ is unambiguous.

Proof: We can prove this by induction. The cases for $n = 1$ and $n = 2$ are trivially true. Assume that the statement is true for all products of fewer than $n$ elements. For $n$ elements, the product, inserting parentheses, can be "partitioned" into $(x_1 * \cdots * x_k) * (x_{k+1} * \cdots * x_n)$. Both parts of the product have a number of elements less than $n$ and are thus unambiguous. The same is true if we consider a different "partition", $(x_1 * \cdots * x_l) * (x_{l+1} * \cdots * x_n)$, where $k < l$. Thus, we can unambiguously compute the products $u = x_1 * \cdots * x_k$, $v = x_{k+1} * \cdots * x_l$, and $w = x_{l+1} * \cdots * x_n$, and rewrite the two "partitions" as $u * (v * w)$ and $(u * v) * w$. These equal each other by the definition of a monoid.

This is about as far as we are going to take the idea of a monoid. We now proceed to groups.

Groups


Definition 4: A group $(G, *)$ is a monoid that also satisfies the property

(iii) For each $x \in G$, there exists an element $x^{-1} \in G$ such that $x * x^{-1} = x^{-1} * x = e$.

Such an element $x^{-1}$ is called an inverse of $x$. When the operation on the group is understood, we will conveniently refer to $(G, *)$ as $G$. In addition, we will gradually stop using the symbol $*$ for multiplication when we are dealing with only one group, or when it is understood which operation is meant, instead writing products by juxtaposition, $xy = x * y$.

Remark 5: Notice how this definition depends on Theorem 2 to be well defined. Therefore, we could not state this definition before at least proving uniqueness of the identity element. Alternatively, we could have included the existence of a distinguished identity element in the definition. In the end, the two approaches are logically equivalent.

Also note that to show that a monoid is a group, it is sufficient to show that each element has either a left-inverse or a right-inverse. Let $x \in G$, let $y$ be a right-inverse of $x$, and let $z$ be a right-inverse of $y$, so that $xy = e$ and $yz = e$. Then, $yx = (yx)e = (yx)(yz) = y(xy)z = yez = yz = e$. Thus, any right-inverse is also a left-inverse, or $xy = yx = e$. A similar argument can be made for left-inverses.

Theorem 6: The inverse of any element is unique.

Proof: Let $x \in G$ and let $y$ and $z$ both be inverses of $x$. Then, $y = ye = y(xz) = (yx)z = ez = z$.

Thus, we can speak of the inverse of an element, and we will denote this element by $x^{-1}$. We also observe this nice property:

Corollary 7: $(x^{-1})^{-1} = x$.

Proof: This follows immediately since $x^{-1}x = xx^{-1} = e$.

The next couple of theorems may appear obvious, but in the interest of keeping matters fairly rigorous, we allow ourselves to state and prove seemingly trivial statements.

Theorem 8: Let $G$ be a group and $x, y \in G$. Then $(xy)^{-1} = y^{-1}x^{-1}$.

Proof: The result follows by direct computation: $(xy)(y^{-1}x^{-1}) = x(yy^{-1})x^{-1} = xex^{-1} = xx^{-1} = e$.

Theorem 9: Let $x, y, z \in G$. Then, $xy = xz$ if and only if $y = z$. Also, $yx = zx$ if and only if $y = z$.

Proof: We will prove the first assertion. The second is identical. Assume $xy = xz$. Then, multiply on the left by $x^{-1}$ to obtain $y = z$. Secondly, assume $y = z$. Then, multiply on the left by $x$ to obtain $xy = xz$.

Theorem 10: The equation $ax = b$ has a unique solution in $G$ for any $a, b \in G$.

Proof: We must show existence and uniqueness. For existence, observe that $x = a^{-1}b$ is a solution in $G$. For uniqueness, multiply both sides of the equation on the left by $a^{-1}$ to show that this is the only solution.

Notation: Let $G$ be a group and $x \in G$. We will often encounter a situation where we have a product $xx \cdots x$ of $n$ factors. For these situations, we introduce the shorthand notation $x^n$ if $n$ is positive, $x^0 = e$, and $x^n = (x^{-1})^{-n}$ if $n$ is negative. Under these rules, it is straightforward to show $x^m x^n = x^{m+n}$, $(x^m)^n = x^{mn}$ and $(x^n)^{-1} = x^{-n}$ for all $m, n \in \mathbb{Z}$.

Definition 11: (i) The order of a group $G$, denoted $|G|$ or $\operatorname{ord}(G)$, is the number of elements of $G$ if $G$ is finite. Otherwise $|G|$ is said to be infinite.

(ii) The order of an element $x$ of $G$, similarly denoted $|x|$ or $\operatorname{ord}(x)$, is defined as the lowest positive integer $n$ such that $x^n = e$, if such an integer exists. Otherwise $|x|$ is said to be infinite.

Theorem 12: Let $G$ be a group and $x \in G$. Then $\operatorname{ord}(x) = \operatorname{ord}(x^{-1})$.

Proof: Let the order of $x$ be $n$. Then, $x^n = e$, with $n$ being the smallest positive integer such that this is true. Now, $(x^{-1})^n = (x^n)^{-1} = e^{-1} = e$, implying $\operatorname{ord}(x^{-1}) \leq n$. Thus, we have shown that $\operatorname{ord}(x^{-1}) \leq \operatorname{ord}(x)$. A similar argument in the other direction shows that $\operatorname{ord}(x) \leq \operatorname{ord}(x^{-1})$. Thus, we must have $\operatorname{ord}(x) = \operatorname{ord}(x^{-1})$, proving the theorem.

Corollary 13: Let $G$ be a group with $x, y \in G$. Then, $\operatorname{ord}(xy) = \operatorname{ord}(yx)$.

Proof: Since $(yx)^n = x^{-1}(xy)^n x$ for every positive integer $n$, we have $(xy)^n = e$ if and only if $(yx)^n = e$, and the orders coincide.

Theorem 14: An element of a group not equal to the identity has order 2 if and only if it is its own inverse.

Proof: Let $x \neq e$ have order 2 in the group $G$. Then, $x^2 = xx = e$, so by definition, $x = x^{-1}$. Now, assume $x = x^{-1}$ and $x \neq e$. Then $x^2 = xx = xx^{-1} = e$. Since $x \neq e$, 2 is the smallest positive integer $n$ satisfying $x^n = e$, so $x$ has order 2.

Definition 15: Let $G$ be a group such that for all $x, y \in G$, $xy = yx$. Then, $G$ is said to be commutative or abelian.

When we are dealing with an abelian group, we will sometimes use so-called additive notation, writing $x + y$ for our binary operation and replacing $x^n$ with $nx$. In such cases, we only need to keep track of the fact that $n$ is an integer while $x$ is a group element. We will also talk about the sum of elements rather than their product.

Abelian groups are in many ways nicer objects than general groups. They also admit more structure where ordinary groups do not. We will see more about this later when we talk about structure-preserving maps between groups.

Definition 16: Let $G$ be a group. A subset $S \subseteq G$ is called a generating set for $G$ if every element in $G$ can be written in terms of elements in $S$ and their inverses. We write $G = \langle S \rangle$.

Now that we have our definitions in place and have a small arsenal of theorems, let us look at three (really, two and a half) important families of groups.

Multiplication tables


We will now show a convenient way of representing a group structure, or more precisely, the multiplication rule on a set. This notion will not be limited to groups only, but can be used for any structure with any number of operations. As an example, we give the group multiplication table for the Klein 4-group $V = \{e, a, b, c\}$. The multiplication table is structured such that $xy$ is represented by the element in the "$xy$-position", that is, in the intersection of the $x$-row and the $y$-column.

    e  a  b  c
e   e  a  b  c
a   a  e  c  b
b   b  c  e  a
c   c  b  a  e

This next table is for the group of integers under addition modulo 4, called $\mathbb{Z}_4$. We will learn more about this group later.

    0  1  2  3
0   0  1  2  3
1   1  2  3  0
2   2  3  0  1
3   3  0  1  2

We can clearly see that $V$ and $\mathbb{Z}_4$ are "different" groups. There is no way to relabel the elements such that the multiplication tables coincide: in $V$ every element is its own inverse, while in $\mathbb{Z}_4$, for example, $1 + 1 = 2 \neq 0$. There is a notion of "equality" of groups that we have not yet made precise. We will get back to this in the section about group homomorphisms.

The reader might have noticed that each row in the group table features each element of the group exactly once. Indeed, assume that an element $y$ appeared twice in the row of $x$ in the multiplication table for $G$. Then there would exist $a \neq b$ such that $xa = y = xb$, implying $a = b$ by cancellation and contradicting the assumption of $y$ appearing twice. We state this as a theorem:

Theorem 17: Let $G$ be a group and $x \in G$. Then $xG = \{xg : g \in G\} = G$.

Using this, the reader can use a multiplication table to find all groups of order 3. He/she will find that there is only one possibility.
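
Multiplication tables for small groups are also easy to generate by machine. A sketch, with the Klein 4-group realized concretely as {0, 1, 2, 3} under bitwise XOR (one standard way to encode it; 0 plays the role of e):

```python
def cayley_table(elements, op):
    """Print the table: the entry in row x, column y is x * y."""
    print('    ' + '  '.join(str(y) for y in elements))
    for x in elements:
        print(str(x) + '   ' + '  '.join(str(op(x, y)) for y in elements))

# Z_4: integers under addition modulo 4.
cayley_table([0, 1, 2, 3], lambda a, b: (a + b) % 4)

# Klein 4-group: {0, 1, 2, 3} under bitwise XOR; every element is
# its own inverse, unlike in Z_4.
cayley_table([0, 1, 2, 3], lambda a, b: a ^ b)
```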

Problems


Problem 1: Show that $M_{m \times n}(\mathbb{R})$, the set of $m \times n$ matrices with real entries, forms a group under the operation of matrix addition.

Problem 2: Let $V, W$ be vector spaces and $\operatorname{Hom}(V, W)$ be the set of linear maps from $V$ to $W$. Show that $\operatorname{Hom}(V, W)$ forms an abelian group by defining $(f + g)(v) = f(v) + g(v)$.

Problem 3: Let $G = \{e, m, i, mi, j, mj, k, mk\}$ be generated by the elements $i, j, k, m$ such that $m^2 = e$, $i^2 = j^2 = k^2 = m$, and $ijk = m$. Show that $G$ forms a group. Are any of the conditions above redundant? When the identity $e$ is written 1 and $m = -1$, then $G$ is called the quaternion group. The elements $i, j, k$ are imaginary units. Using 1 and one of these as a basis for a number plane results in the complex number plane.

Problem 4: Let $S$ be any nonempty set and consider the set $\mathbb{Z}_2^S$. Show that $\mathbb{Z}_2^S$ has a natural group structure.

Answer

$\mathbb{Z}_2^S$ is the set of functions $f : S \to \mathbb{Z}_2$. Let $f, g \in \mathbb{Z}_2^S$ and define the binary operation $(f + g)(s) = f(s) + g(s)$ for all $s \in S$. Then $\mathbb{Z}_2^S$ is a group with identity element $0$ such that $0(s) = 0$ for all $s \in S$, and inverses $-f = f$ for all $f \in \mathbb{Z}_2^S$.

Problem 5: Let $G$ be a group with two distinct elements $a$ and $b$, both of order 2. Show that $G$ has a third element of order 2.

Answer

We consider first the case where $ab = ba$. Then $(ab)^2 = abab = a^2b^2 = e$, and $ab$ is distinct from $a$ and $b$ (either would force $a = e$ or $b = e$) and from $e$ (since $a \neq b = b^{-1}$), so $ab$ is a third element of order 2. If $ab \neq ba$, then $(aba^{-1})^2 = ab^2a^{-1} = e$, and $aba^{-1}$ is distinct from $a$, from $b$ (else $ab = ba$) and from $e$, so $aba^{-1}$ is a third element of order 2.

Problem 6: Let $G = \{e, x_1, x_2, \ldots, x_n\}$ be a finite group with one and only one element $x$ of order 2. Show that $x_1 x_2 \cdots x_n = x$.

Answer

Since the product of two elements generally depends on the order in which we multiply them, the stated product is not necessarily well-defined. However, it works out in this case.

Since every element of $G$ appears once in the product, for every element $x_i$, the inverse of $x_i$ must appear somewhere in the product. That is, the factors cancel in pairs unless $x_i = x_i^{-1}$, in which case $x_i$ has order 2 by Theorem 14. Now, applying Corollary 13 to the product shows that its order is the same as the order of the product of all elements of order 2 in $G$. But there is only one such element, $x$, so the order of the product is 2. Since the only element in $G$ having order 2 is $x$, the equality follows.


Group Theory/Subgroup

Subgroups


We are about to witness a universal aspect of mathematics. That is, whenever we have any sort of structure, we ask ourselves: does it admit substructures? In the case of groups, the answer is yes, as we will immediately see.

Definition 1: Let $G$ be a group. Then, if $H$ is a subset of $G$ which is a group in its own right under the same operation as $G$, we call $H$ a subgroup of $G$ and write $H \leq G$.

Example 2: Any group $G$ has at least 2 subgroups: $G$ itself and the trivial group $\{e\}$. These are called the improper and trivial subgroups of $G$, respectively.

Naturally, we would like to have a method of determining whether a given subset of a group is a subgroup. The following two theorems provide this. Since a subset naturally inherits the associativity property from $G$, only the closure properties below need to be checked.

Theorem 3: A nonempty subset $H$ of a group $G$ is a subgroup if and only if

(i) $H$ is closed under the operation on $G$. That is, if $x, y \in H$, then $xy \in H$,
(ii) $e \in H$,
(iii) $H$ is closed under the taking of inverses. That is, if $x \in H$, then $x^{-1} \in H$.

Proof: The left implication follows directly from the group axioms and the definition of subgroup. For the right implication, we have to verify each group axiom for $H$. Firstly, since $H$ is closed, it is a binary structure, as required, and as mentioned, $H$ inherits associativity from $G$. In addition, $H$ has the identity element and inverses, so $H$ is a group, and we are done.

There is, however, a more effective method. Each of the three criteria listed above can be condensed into a single one.

Theorem 4: Let $G$ be a group. Then a nonempty subset $H \subseteq G$ is a subgroup if and only if $x, y \in H$ implies $xy^{-1} \in H$.

Proof: Again, the left implication is immediate. For the right implication, we have to verify (i)-(iii) in the previous theorem. First, assume $x \in H$. Then, letting $y = x$, we obtain $xx^{-1} = e \in H$, taking care of (ii). Now, since $e, x \in H$, we have $ex^{-1} = x^{-1} \in H$, so $H$ is closed under taking of inverses, satisfying (iii). Lastly, assume $x, y \in H$. Then, since $y^{-1} \in H$, we obtain $x(y^{-1})^{-1} = xy \in H$, so $H$ is closed under the operation of $G$, satisfying (i), and we are done.

All right, so now we know how to recognize a subgroup when we are presented with one. Let's take a look at how to find subgroups of a given group. The next theorem essentially solves this problem.

Theorem 5: Let $G$ be a group and $x \in G$. Then the subset $\langle x \rangle = \{x^n : n \in \mathbb{Z}\}$ is a subgroup of $G$, denoted $\langle x \rangle$ and called the subgroup generated by $x$. In addition, this is the smallest subgroup containing $x$, in the sense that if $H$ is a subgroup and $x \in H$, then $\langle x \rangle \subseteq H$.

Proof: First we prove that $\langle x \rangle$ is a subgroup. To see this, note that if $a, b \in \langle x \rangle$, then there exist integers $m, n$ such that $a = x^m$ and $b = x^n$. Then, we observe that $ab^{-1} = x^{m-n} \in \langle x \rangle$, so by Theorem 4, $\langle x \rangle$ is a subgroup of $G$, as claimed. To show that it is the smallest subgroup containing $x$, observe that if $H$ is a subgroup containing $x$, then by closure under products and inverses, $x^n \in H$ for all $n \in \mathbb{Z}$. In other words, $\langle x \rangle \subseteq H$. Then automatically $\langle x \rangle \leq H$ since $\langle x \rangle$ is a subgroup of $G$.

Theorem 6: Let $H$ and $K$ be subgroups of a group $G$. Then $H \cap K$ is also a subgroup of $G$.

Proof: Since both $H$ and $K$ contain the identity element, their intersection is nonempty. Let $x, y \in H \cap K$. Then $x, y \in H$ and $x, y \in K$. Since both $H$ and $K$ are subgroups, we have $xy^{-1} \in H$ and $xy^{-1} \in K$. But then $xy^{-1} \in H \cap K$. Thus $H \cap K$ is a subgroup of $G$ by Theorem 4.

Theorem 6 can easily be generalized to apply for any arbitrary intersection $\bigcap_{i \in I} H_i$, where $H_i$ is a subgroup of $G$ for every $i$ in an arbitrary index set $I$. The reasoning is identical, and the proof of this generalization is left to the reader to formalize.

Definition 7: Let $G$ be a group and $H$ be a subgroup of $G$. For $g \in G$, the set $gH = \{gh : h \in H\}$ is called a left coset of $H$. The set of all left cosets of $H$ in $G$ is denoted $G/H$. Likewise, $Hg = \{hg : h \in H\}$ is called a right coset, and the set of all right cosets of $H$ in $G$ is denoted $H\backslash G$.

Lemma 8: Let $G$ be a group and $H$ be a subgroup of $G$. Then every left coset $gH$ has the same number of elements.

Proof: Let $g \in G$ and define the function $f : H \to gH$ by $f(h) = gh$. We show that $f$ is a bijection. Firstly, $f(h) = f(h')$ implies $gh = gh'$, and so $h = h'$ by left cancellation; thus $f$ is injective. Secondly, let $x \in gH$. Then $x = gh$ for some $h \in H$, and $f(h) = x$, so $f$ is surjective and a bijection. It follows that $|gH| = |H|$ for every $g \in G$, as was to be shown.

Lemma 9: The relation $\sim$ defined by $a \sim b \iff a \in bH$ (equivalently, $b^{-1}a \in H$) is an equivalence relation.

Proof: Reflexivity and symmetry are immediate. For transitivity, let $a \sim b$ and $b \sim c$. Then $b^{-1}a \in H$ and $c^{-1}b \in H$, so $c^{-1}a = (c^{-1}b)(b^{-1}a) \in H$, so $a \sim c$ and we are done.

Lemma 10: Let $G$ be a group and $H$ be a subgroup of $G$. Then the left cosets of $H$ partition $G$.

Proof: Note that every $g \in G$ lies in the coset $gH$, so the cosets cover $G$. Since $\sim$ is an equivalence relation and the equivalence classes are the left cosets of $H$, these automatically partition $G$.

Theorem 11 (Lagrange's theorem): Let $G$ be a finite group and $H$ be a subgroup of $G$. Then $|G| = |G/H| \cdot |H|$; in particular, $|H|$ divides $|G|$.

Proof: By the previous lemmas, each left coset has the same number of elements, $|H|$, and every $g \in G$ is included in a unique left coset $gH$. In other words, $G$ is partitioned by its $|G/H|$ left cosets, each contributing an equal number of elements $|H|$. The theorem follows.
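
A concrete check of Lagrange's theorem for one small example (the subgroup {0, 3} of Z_6 under addition mod 6):

```python
G = set(range(6))   # Z_6 under addition mod 6
H = {0, 3}          # a subgroup of Z_6

cosets = {frozenset((g + h) % 6 for h in H) for g in G}
print(cosets)       # the three cosets {0,3}, {1,4}, {2,5}

assert len(G) == len(cosets) * len(H)   # Lagrange: |G| = |G/H| * |H|
```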

Note 12: Each of the previous theorems has an analogous version for right cosets, the proofs of which use identical reasoning. Stating these theorems and writing out their proofs are left as an exercise to the reader.

Corollary 13: Let $G$ be a group and $H$ be a subgroup of $G$. Then right and left cosets of $H$ have the same number of elements.

Proof: Since $H = eH = He$ is both a left and a right coset, we immediately have $|gH| = |H| = |Hg|$ for all $g \in G$.

Corollary 14: Let $G$ be a finite group and $H$ be a subgroup of $G$. Then the number of left cosets of $H$ in $G$ and the number of right cosets of $H$ in $G$ are equal.

Proof: By Lagrange's theorem and its right coset counterpart, we have $|G| = |G/H| \cdot |H| = |H\backslash G| \cdot |H|$. We immediately obtain $|G/H| = |H\backslash G|$, as was to be shown.

Now that we have developed a reasonable body of theory, let us look at our first important family of groups, namely the cyclic groups.

Problems


Problem 1 (Matrix groups): Show that:

i) The set $GL_n(\mathbb{R})$ of invertible $n \times n$ real matrices forms a group under matrix multiplication. This group is called the general linear group of order $n$.
ii) The group $O(n)$ of orthogonal matrices is a subgroup of $GL_n(\mathbb{R})$. This group is called the orthogonal group of order $n$.
iii) The group $SO(n) = \{A \in O(n) : \det A = 1\}$ is a subgroup of $O(n)$. This group is called the special orthogonal group of order $n$.
iv) The group $U(n)$ of unitary matrices is a subgroup of $GL_n(\mathbb{C})$. This is called the unitary group of order $n$.
v) The group $SU(n) = \{A \in U(n) : \det A = 1\}$ is a subgroup of $U(n)$. This is called the special unitary group of order $n$.

Problem 2: Show that if $H$ and $K$ are subgroups of $G$, then $H \cup K$ is a subgroup of $G$ if and only if $H \subseteq K$ or $K \subseteq H$.


Group Theory/Cyclic groups

  • A cyclic group generated by $g$ is $\langle g \rangle = \{g^n : n \in \mathbb{Z}\}$,
  • where $g^0 = e$, $g^{n+1} = g^n g$, and $g^{-n} = (g^n)^{-1}$.
  • Induction shows: $g^m g^n = g^{m+n}$ and $(g^m)^n = g^{mn}$.

A cyclic group of order n is isomorphic to the integers modulo n with addition


Theorem


Let $C_m$ be a cyclic group of order $m$ generated by $g$, with $C_m = \{e, g, g^2, \ldots, g^{m-1}\}$.

Let $(\mathbb{Z}_m, +)$ be the group of integers modulo $m$ with addition.

$C_m$ is isomorphic to $(\mathbb{Z}_m, +)$.

Lemma


Let $n$ be the minimal positive integer such that $g^n = e$. Then $g^i = g^j$ if and only if $i \equiv j \pmod{n}$.

Proof of Lemma
Let $i > j$. Let $i - j = sn + r$, where $0 \leq r < n$ and $s, r, n$ are all integers.
1. $g^i = g^j \iff g^{i-j} = e$.
2. $g^{i-j} = g^{sn+r} = (g^n)^s g^r = g^r$, as $i - j = sn + r$ and $g^n = e$.
3. Hence $g^i = g^j \iff g^r = e$.
4. $g^r = e \iff r = 0$, as $n$ is the minimal positive integer such that $g^n = e$
and $0 \leq r < n$.
5. $r = 0 \iff n \mid (i - j) \iff i \equiv j \pmod{n}$.
6. Therefore $g^i = g^j \iff i \equiv j \pmod{n}$.

Proof

0. Define $f : C_m \to \mathbb{Z}_m$ by $f(g^i) = i \bmod m$.
The lemma shows $f$ is well defined (it only has one output for each input), since $g^i = g^j$ implies $i \equiv j \pmod{m}$.
$f$ is a homomorphism: $f(g^i g^j) = f(g^{i+j}) = (i + j) \bmod m = f(g^i) + f(g^j)$.
$f$ is injective by the lemma: $f(g^i) = f(g^j)$ implies $i \equiv j \pmod{m}$, hence $g^i = g^j$.
$f$ is surjective as both $C_m$ and $\mathbb{Z}_m$ have $m$ elements and $f$ is injective.

Cyclic groups


In the previous section about subgroups we saw that if $G$ is a group with $x \in G$, then the set of powers of $x$, $\langle x \rangle = \{x^n : n \in \mathbb{Z}\}$, constituted a subgroup of $G$, called the cyclic subgroup generated by $x$. In this section, we will generalize this concept, and in the process, obtain an important family of groups which is very rich in structure.

Definition 1: Let $G$ be a group with an element $g$ such that $G = \langle g \rangle$. Then $G$ is called a cyclic group, and $g$ is called a generator of $G$. Alternatively, $g$ is said to generate $G$. If there exists an integer $n$ such that $g^n = e$, and $n$ is the smallest positive such integer, $G$ is denoted $C_n$, the cyclic group of order $n$. If no such integer exists, $G$ is denoted $C_\infty$, the infinite cyclic group.

The infinite cyclic group can also be denoted $F_1$, the free group with one generator. This is foreshadowing for a future section and can be ignored for now.

Theorem 2: Any cyclic group is abelian.

Proof: Let $G$ be a cyclic group with generator $g$. Then if $x, y \in G$, we have $x = g^m$ and $y = g^n$ for some $m, n \in \mathbb{Z}$. To show commutativity, observe that $xy = g^m g^n = g^{m+n} = g^n g^m = yx$ and we are done.

Theorem 3: Any subgroup of a cyclic group is cyclic.

Proof: Let $G$ be a cyclic group with generator $g$, and let $H \leq G$. Since $H \subseteq G$, in particular every element of $H$ equals $g^m$ for some $m \in \mathbb{Z}$. We claim that if $n$ is the lowest positive integer such that $g^n \in H$, then $H = \langle g^n \rangle$. To see this, let $g^m \in H$. Then $m = qn + r$ for unique $q, r \in \mathbb{Z}$ with $0 \leq r < n$. Since $H$ is a subgroup and $g^m, g^{qn} \in H$, we must have $g^r = g^m (g^{qn})^{-1} \in H$. Now, assume that $r \neq 0$. Then $0 < r < n$ contradicts our assumption that $n$ is the least positive integer such that $g^n \in H$. Therefore, $r = 0$. Consequently, $g^m \in H$ only if $n \mid m$, and $H = \langle g^n \rangle$ is cyclic, as was to be shown.

As the alert reader will have noticed, the preceding proof invoked the notion of division with remainder which should be familiar from number theory. Our treatment of cyclic groups will have close ties with notions from number theory. This is no coincidence, as the next few statements will show. Indeed, an alternative title for this section could have been "Modular arithmetic and integer ideals". The notion of an ideal may not yet be familiar to the reader, who is asked to wait patiently until the chapter about rings.

Theorem 4: Let $\mathbb{Z}_n = \{0, 1, \ldots, n - 1\}$ with addition defined modulo $n$. That is, $a +_n b = r$, where $r$ is the remainder of $a + b$ upon division by $n$. We denote this operation by $+_n$. Then $(\mathbb{Z}_n, +_n)$ is a cyclic group.

Proof: We must first show that $(\mathbb{Z}_n, +_n)$ is a group, then find a generator. We verify the group axioms. Associativity is inherited from the integers. The element $0$ is an identity element with respect to $+_n$. An inverse of $a \neq 0$ is an element $b$ such that $a +_n b = 0$. Thus $b = n - a$, and since $0$ is its own inverse, every element has an inverse and $(\mathbb{Z}_n, +_n)$ is a group. Now, since every $a \in \mathbb{Z}_n$ equals $1 +_n 1 +_n \cdots +_n 1$ ($a$ terms), $1$ generates $\mathbb{Z}_n$ and so $\mathbb{Z}_n$ is cyclic.

Unless we explicitly state otherwise, by $\mathbb{Z}_n$ we will always refer to the cyclic group $(\mathbb{Z}_n, +_n)$. Since the argument above for the generator $1$ of $\mathbb{Z}_n$ remains valid for the integers themselves, this shows that $(\mathbb{Z}, +)$ also is cyclic, with generator $1$.

Theorem 5: An element $m \in \mathbb{Z}_n$ is a generator if and only if $\gcd(m, n) = 1$.

Proof: We will need the following theorem from number theory: If $m, n$ are integers, then there exist integers $a, b$ such that $am + bn = 1$, if and only if $\gcd(m, n) = 1$. We will not prove this here. A proof can be found in the number theory section.

For the right implication, assume that $\langle m \rangle = \mathbb{Z}_n$. Then for all $c \in \mathbb{Z}_n$, $c \equiv am \pmod{n}$ for some integer $a$. In particular, there exists an integer $a$ such that $am \equiv 1 \pmod{n}$. This implies that there exists another integer $b$ such that $am + bn = 1$. By the above-mentioned theorem from number theory, we then have $\gcd(m, n) = 1$. For the left implication, assume $\gcd(m, n) = 1$. Then there exist integers $a, b$ such that $am + bn = 1$, implying that $am \equiv 1 \pmod{n}$, so that $1 \in \langle m \rangle$. Since $1$ generates $\mathbb{Z}_n$, it must be true that $m$ is also a generator, proving the theorem.
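
Theorem 5 (and Theorem 6 below) are easy to verify computationally; a short sketch using the arbitrary modulus n = 12:

```python
from math import gcd

def generators(n):
    """Elements m of Z_n with <m> = Z_n, i.e. gcd(m, n) = 1."""
    return [m for m in range(1, n) if gcd(m, n) == 1]

def generated(m, n):
    """The cyclic subgroup <m> of Z_n."""
    return sorted({(k * m) % n for k in range(n)})

print(generators(12))    # [1, 5, 7, 11]
print(generated(8, 12))  # [0, 4, 8]: ord(8) = 12/gcd(8, 12) = 3
```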

We can generalize Theorem 5 a bit by looking at the orders of the elements in cyclic groups.

Theorem 6: Let $m \in \mathbb{Z}_n$. Then, $\operatorname{ord}(m) = \dfrac{n}{\gcd(m, n)}$.

Proof: Recall that the order of $m$ is defined as the lowest positive integer $k$ such that $km \equiv 0 \pmod{n}$. In other words, $km$ is the lowest positive multiple of $m$ that is also a multiple of $n$. This is the definition of the least common multiple: $km = \operatorname{lcm}(m, n)$. Recall from number theory that $\operatorname{lcm}(m, n) = \frac{mn}{\gcd(m, n)}$. Thus, $\operatorname{ord}(m) = k = \frac{\operatorname{lcm}(m, n)}{m} = \frac{n}{\gcd(m, n)}$, as was to be proven.

Theorem 7: Every subgroup of $(\mathbb{Z}, +)$ is of the form $n\mathbb{Z} = \langle n \rangle = \{nk : k \in \mathbb{Z}\}$.

Proof: The fact that any subgroup of $\mathbb{Z}$ is cyclic follows from Theorem 3. Therefore, let $n$ generate the subgroup $H$. Then we see immediately that $H = \langle n \rangle = n\mathbb{Z}$.

Theorem 8: Let $m, n \in \mathbb{Z}$ be fixed, and let $H = \{am + bn : a, b \in \mathbb{Z}\}$. Then $H$ is a subgroup of $\mathbb{Z}$ generated by $\gcd(m, n)$.

Proof: We must first show that $H$ is a subgroup. This is immediate since $(am + bn) - (a'm + b'n) = (a - a')m + (b - b')n \in H$. From the proof of Theorem 3, we see that any subgroup of $\mathbb{Z}$ is generated by its lowest positive element. It is a theorem of number theory that the lowest positive integer of the form $am + bn$ for fixed integers $m$ and $n$ equals the greatest common divisor of $m$ and $n$, $\gcd(m, n)$. Thus $\gcd(m, n)$ generates $H$.

Theorem 9: Let $m\mathbb{Z}$ and $n\mathbb{Z}$ be subgroups of $\mathbb{Z}$. Then $m\mathbb{Z} \cap n\mathbb{Z}$ is the subgroup generated by $\operatorname{lcm}(m, n)$.

Proof: The fact that $m\mathbb{Z} \cap n\mathbb{Z}$ is a subgroup is obvious since $m\mathbb{Z}$ and $n\mathbb{Z}$ are subgroups and intersections of subgroups are subgroups. To find a generator of $m\mathbb{Z} \cap n\mathbb{Z}$, we must find its lowest positive element. That is, the lowest positive integer which is both a multiple of $m$ and of $n$. This is the definition of the least common multiple of $m$ and $n$, $\operatorname{lcm}(m, n)$, and the result follows.

It should be obvious by now that $C_n$ and $\mathbb{Z}_n$, and $C_\infty$ and $\mathbb{Z}$, are the same groups. This will be made precise in a later section but can be visualized by denoting any generator of $C_n$ or $C_\infty$ by $1$.

We will have more to say about cyclic groups later, when we have more tools at our disposal.


Group Theory/Permutation groups

Permutation Groups


For any finite non-empty set $S$, the set $A(S)$ of all 1-1 transformations (mappings) of $S$ onto $S$ forms a group, called a permutation group, and any element of $A(S)$, i.e., a mapping from $S$ onto itself, is called a permutation.

Symmetric groups


Theorem 1: Let $S$ be any set. Then the set of bijections from $S$ to itself, $\operatorname{Sym}(S)$, forms a group under composition of functions.

Proof: We have to verify the group axioms. Associativity is fulfilled since composition of functions is always associative: $(f \circ g) \circ h = f \circ (g \circ h)$ where the composition is defined. The identity element is the identity function $\operatorname{id}_S$ given by $\operatorname{id}_S(s) = s$ for all $s \in S$. Finally, the inverse of a function $f$ is the function $f^{-1}$ taking $f(s)$ to $s$ for all $s \in S$. This function exists and is unique since $f$ is a bijection. Thus $\operatorname{Sym}(S)$ is a group, as stated.

$\operatorname{Sym}(S)$ is called the symmetric group on $S$. When $S = \{1, 2, \ldots, n\}$, we write its symmetric group as $S_n$, and we call this group the symmetric group on $n$ letters. It is also called the group of permutations on $n$ letters. As we will see shortly, this is an appropriate name.

Instead of , we will use a different symbol, namely , for the identity function in .

When , we can specify by specifying where it sends each element. There are many ways to encode this information mathematically. One obvious way is to identify as the unique matrix with value in the entries and elsewhere. Composition of functions then corresponds to multiplication of matrices. Indeed, the matrix corresponding to has value in the entries , which is the same as , so the product has value in the entries . This notation may seem cumbersome. Luckily, there exists a more convenient notation, which we will make use of.

We can represent any by a matrix . We obviously lose the correspondence between function composition and matrix multiplication, but we gain a more readable notation. For the time being, we will use this.

Remark 2: Let . Then the product is the function obtained by first acting with , and then by . That is, . This point is important to keep in mind when computing products in . Some textbooks try to remedy the frequent confusion by writing functions like , that is, writing arguments on the left of functions. We will not do this, as it is not standard. The reader should use the next example and theorem to get a feeling for products in .

Example 3: We will show the multiplication table for . We introduce the special notation for : , , , , and . The multiplication table for is then

Theorem 4: has order .

Proof: This follows from a counting argument. We can specify a unique element in by specifying where each is sent. Also, any permutation can be specified this way. Let . In choosing we are completely free and have choices. Then, when choosing we must choose from , giving a total of choices. Continuing in this fashion, we see that for we must choose from , giving a total of choices. The total number of ways in which we can specify an element, and thus the number of elements in is then , as was to be shown.

Theorem 5: is non-abelian for all .

Proof: Let be the function only interchanging 1 and 2, and be the function only interchanging 2 and 3. Then and . Since , is not abelian.

Definition 6: Let such that for some . Then is called an -cycle, where is the smallest positive such integer. Let be the set of integers such that . Two cycles are called disjoint if . Also, a 2-cycle is called a transposition.

Remark 3: It's important to realize that if , then so is . If , then if we have that is not 1–1.

Theorem 7: Let . If , then .

Proof: For any integer such that but we have . A similar argument holds for but . If , we must have . Since , we have now exhausted every , and we are done.

Theorem 8: Any permutation can be represented as a composition of disjoint cycles.

Proof: Let . Choose an element and compute . Since is finite of order , we know that exists and . We have now found a -cycle including . Since , this cycle may be factored out from to obtain . Repeat this process, which terminates since is finite, and we have constructed a composition of disjoint cycles that equals .

Now that we have shown that all permuations are just compositions of disjoint cycles, we can introduce the ultimate shorthand notation for permutations. For an -cycle , we can show its action by choosing any element and writing .

Theorem 9: Any -cycle can be represented as a composition of transpositions.

Proof: Let . Then, (check this!), omitting the composition sign . Interate this process to obtain .

Note 10: This way of representing as a product of transpositions is not unique. However, as we will see now, the "parity" of such a representation is well defined.

Definition 11: The parity of a permutation is even if it can be expressed as a product of an even number of transpositions. Otherwise, it is odd. We define the function if is even and if is odd.

Lemma 12: The identity has even parity.

Proof: Observe first that for . Thus the minimum number of transpositions necessary to represent is 2: . Now, assume that for any representation using less than transpositions must be even. Thus, let . Now, since in particular , we must have for some . Since disjoint transpositions commute, and where , it is always possible to configure the transpositions such that the first two transpositions are either , reducing the number of transposition by two, or . In this case we have reduced the number of transpositions involving by 1. We restart the same process as above. with the new representation. Since only a finite number of transpositions move , we will eventually be able to cancel two permutations and be left with transpositions in the product. Then, by the induction hypothesis, must be even and so is even as well, proving the lemma.

Theorem 13: The parity of a permutation, and thus the function, is well-defined.

Proof: Let and write as a product of transposition in two different ways: . Then, since has even parity by Lemma 11 and . Thus, , and , so has a uniquely defined parity, and consequentially is well-defined.

Theorem 14: Let . Then, .

Proof: Decompose and into transpositions: , . Then has parity given by . If both are even or odd, is even and indeed . If one is odd and one is even, is odd and again , proving the theorem.

Lemma 15: The number of even permutations in equals the number of odd permutations.

Proof: Let be any even permutation and a transposition. Then has odd parity by Theorem 14. Let be the set of even permutations and the set of odd permutations. Then the function given by for any and a fixed transposition , is a bijection. (Indeed, it is a transposition in !) Thus and have the same number of elements, as stated.

Definition 16: Let the set of all even permutations in be denoted by . is called the alternating group on letters.

Theorem 17: is a group, and is a subgroup of of order .

Proof: We first show that is a group under composition. Then it is automatically a subgroup of . That is closed under composition follows from Theorem 14 and associativity is inherited from . Also, the identity permutation is even, so . Thus is a group and a subgroup of . Since the number of even and odd permutations are equal by Lemma 14, we then have that , proving the theorem.

Theorem 18: Let . Then is generated by the 3-cycles in .

Proof: We must show that any even permutation can be decomposed into 3-cycles. It is sufficient to show that this is the case for pairs of transpositions. Let be distinct. Then, by some casework,

i) ,
ii) , and
iii) ,

proving the theorem.

In a previous section we proved Lagrange's Theorem: The order of any subgroup divides the order of the parent group. However, the converse statement, that a group has a subgroup for every divisor of its order, is false! The smallest group providing a counterexample is the alternating group , which has order 12 but no subgroup of order 6. It has subgroups of orders 3 and 4, corresponding respectively to the cyclic group of order 3 and the Klein 4-group. However, if we add any other element to the subgroup corresponding to , it generates the whole group . We leave it to the reader to show this.

Dihedral Groups

[edit | edit source]
The lines represent the reflection symmetries of a regular hexagon
Illustration of the elements of the dihedral group as rotations and reflections of a stop sign.

The dihedral groups are the symmetry groups of regular polygons. As such, they are subgroups of the symmetric groups. In general, a regular -gon has rotational symmetries and reflection symmetries. The dihedral groups capture these by consisting of the associated rotations and reflections.

Definition 19: The dihedral group of order , denoted , is the group of rotations and reflections of a regular -gon.

Theorem 20: The order of is precisely .

Proof: Let be a rotation that generates a subgroup of order in . Obviously, then captures all the pure rotations of a regular -gon. Now let be any rotation in The rest of the elements can then be found by composing each element in with . We get a list of elements . Thus, the order of is , justifying its notation and proving the theorem.

Remark 21: From this proof we can also see that is a generating set for , and all elements can be obtained by writing arbitrary products of and and simplifying the expression according to the rules , and . Indeed, as can be seen from the figure, a rotation composed with a reflection is new reflection.


Group Theory/Homomorphism

We are finally making our way into the meat of the theory. In this section we will study structure-preserving maps between groups. This study will open new doors and provide us with a multitude of new theorems.

Up until now we have studied groups at the "element level". Since we are now about to take a step back and study groups at the "homomorphism level", readers should expect a sudden increase in abstraction starting from this section. We will try to ease the reader into this increase by keeping one foot at the "element level" throughout this section.

From here on out the notation will denote the identity element in the group unless otherwise specified.

Group homomorphisms

[edit | edit source]

Definition 1: Let and be groups. A homomorphism from to is a function such that for all ,

.

Thus, a homomorphism preserves the group structure. We have included the multiplication symbols here to make explicit that multiplication on the left side occurs in , and multiplication on the right side occurs in .

Already we see that this section is different from the previous ones. Up until now we have, excluding subgroups, only dealt with one group at a time. No more! Let us start by deriving some elementary and immediate consequences of the definition.

Theorem 2: Let be groups and a homomorphism. Then . In other words, the identity is mapped to the identity.

Proof: Let . Then, , implying that is the identity in , proving the theorem.

Theorem 3: Let be groups and a homomorphism. Then for any , . In other words, inverses are mapped to inverses.

Proof: Let . Then implying that , as was to be shown.

Theorem 4: Let be groups, a homomorphism and let be a subgroup of . Then is a subgroup of .

Proof: Let . Then and . Since , , and so is a subgroup of .

Theorem 5: Let be groups, a homomorphism and let be a subgroup of . Then is a subgroup of .

Proof: Let . Then , and since is a subgroup, . But then, , and so is a subgroup of .

From Theorem 4 and Theorem 5 we see that homomorphisms preserve subgroups. Thus we can expect to learn a lot about the subgroup structure of a group by finding suitable homomorphisms into .

In particular, every homomorphism has associated with it two important subgroups.

Definition 6: A homomorphism is called an isomorphism if it is bijective and its inverse is a homomorphism. Two groups are called isomorphic if there exists an isomorphism between them, and we write to denote " is isomorphic to ".

Theorem 7: A bijective homomorphism is an isomorphism.

Proof: Let be groups and let be a bijective homomorphism. We must show that the inverse is a homomorphism. Let . then there exist unique such that and . Then we have since is a homomorphism. Now apply to all equations. We obtain , and , so is a homomorphism and thus is an isomorphism.

Definition 8: Let be groups. A homomorphism that maps every element in to is called a trivial homomorphism (or zero homomorphism), and is denoted by

Definition 9: Let be a subgroup of a group . Then the homomorphism given by is called the inclusion of into . Let be a group isomorphic to a subgroup of a group . Then the isomorphism induces an injective homomorphism given by , called an imbedding of into . Obviously, .

Definition 10: Let be groups and a homomorphism. Then we define the following subgroups:

i) , called the kernel of , and
ii) , called the image of .

Theorem 11: The composition of homomorphisms is a homomorphism.

Proof: Let be groups and and homomorphisms. Then is a function. We must show it is a homomorphism. Let . Then , so is indeed a homomorphisms.

Theorem 12: Composition of homomorphisms is associative.

Proof: This is evident since homomorphisms are functions, and composition of functions is associative.

Corollary 13: The composition of isomorphisms is an isomorphism.

Proof: This is evident from Theorem 11 and since the composition of bijections is a bijection.

Theorem 14: Let be groups and a homomorphism. Then is injective if and only if .

Proof: Assume and . Then , implying that . But by assumption then , so is injective. Assume now that and . Then there exists another element such that . But then . Since both and map to , is not injective, proving the theorem.

Corollary 15: Inclusions are injective.

Proof: The result is immediate. Since for all , we have .

The kernel can be seen to satisfy a universal property. The following theorem explains this, but it is unusually abstract for an elementary treatment of groups, and the reader should not worry if he/she cannot understand it immediately.

Commutative diagram showing the universal property of kernels.

Theorem 16: Let be groups and a group homomorphism. Also let be a group and a homomorphism such that . Also let is the inclusion of into . Then there exists a unique homomorphism such that.

Proof: Since , by definition we must have , so exists. The commutativity then forces , so is unique.

Definition 17: A commutative diagram is a pictorial presentation of a network of functions. Commutativity means that when several routes of function composition from one object lead to the same destination, the two compositions are equal as functions. As an example, the commutative diagram on the right describes the situation in Theorem 16. In the commutative diagrams (or diagrams for short, we will not show diagrams which no not commute) shown in this chapter on groups, all functions are implicitly assumed to be group homomorphisms. Monomorphisms in diagrams are often emphasized by hooked arrows. In addition, epimorphims are often emphasized by double headed arrows. That an inclusion is a monomorphism will be proven shortly.

Remark 18: From the commutative diagram on the right, the kernel can be defined completely without reference to elements. Indeed, Theorem 16 would become the definition, and our Definition 10 i) would become a theorem. We will not entertain this line of thought in this book, but the advanced reader is welcome to work it out for him/herself.

Automorphism Groups

[edit | edit source]

In this subsection we will take a look at the homomorphisms from a group to itself.

Definition 19: A homomorphism from a group to itself is called an endomorphism of . An endomorphism which is also an isomorphism is called an automorphism. The set of all endomorphisms of is denoted , while the set of all automorphisms of is denoted .

Theorem 20: is a monoid under composition of homomorphisms. Also, is a submonoid which is also a group.

Proof: We only have to confirm that is closed and has an identity, which we know is true. For , the identity homomorphism is an isomorphism and the composition of isomorphisms is an isomorphism. Thus is a submonoid. To show it is a group, note that the inverse of an automorphism is an automorphism, so is indeed a group.

Groups with Operators

[edit | edit source]

An endomorphism of a group can be thought of as a unary operator on that group. This motivates the following definition:

Definition 21: Let be a group and . Then the pair is called a group with operators. is called the operator domain and its elements are called the homotheties of . For any , we introduce the shorthand for all . Thus the fact that the homotheties of are endomorphisms can be expressed thus: for all and , .

Example 22: For any group , the pair is trivially a group with operators.

Lemma 23: Let be a group with operators. Then can be extended to a submonoid of such that the structure of is identical to .

Proof: Let include the identity endomorphism and let be a generating set. Then is closed under compositions and is a monoid. Since any element of is expressible as a (possibly empty) composition of elements in , the structures are identical.

In the following, we assume that the operator domain is always a monoid. If it is not, we can extend it to one by Lemma 23.

Definition 24: Let and be groups with operators with the same operator domain. Then a homomorphism is a group homomorphism such that for all and , we have .

Definition 25: Let be a group with operators and a subgroup of . Then is called a stable subgroup (or a -invariant subgroup) if for all and , . We say that respects the homotheties of . In this case is a sub-group with operators.

Example 26: Let be a vector space over the field . If we by denote the underlying abelian group under addition, then is a group with operators, where for any and , we define . Then the stable subgroups are precisely the linear subspaces of (show this).

Problems

[edit | edit source]

Problem 1: Show that there is no nontrivial homomorphism from to .


Group Theory/Normal subgroups and Quotient groups

In the preliminary chapter we discussed equivalence classes on sets. If the reader has not yet mastered this notion, he/she is advised to do so before starting this section.

Normal Subgroups

[edit | edit source]

Recall the definition of kernel in the previous section. We will exhibit an interesting feature it possesses. Namely, let be in the coset . Then there exists a such that for all . This is easy to see because a coset of the kernel includes all elements in that are mapped to a particular element. The kernel inspires us to look for what are called normal subgroups.

Definition 1: A subgroup is called normal if for all . We may sometimes write to emphasize that is normal in .

Theorem 2: A subgroup is normal if and only if for all .

Proof: By the definition, a subgroup is normal if and only if since conjugation is a bijection. The theorem follows by multiplying on the right by .

We stated that the kernel is a normal subgroup in the introduction, so we had better well prove it!

Theorem 3: Let be any homomorphism. Then is normal.

Proof: Let and . Then , so , proving the theorem.

Theorem 4: Let be groups and a group homomorphism. Then if is a normal subgroup of , then is normal in .

Proof: Let and . Then since is normal in , and so , proving the theorem.

Theorem 5: Let be groups and a group homomorphism. Then if is a normal subgroup of , is normal in .

Proof: Let And . Then if such that , we have for some since is normal. Thus for all and so is normal in .

Corollary 6: Let be groups and a surjective group homomorphism. Then if is a normal subgroup of , is normal in .

Proof: Replace with in the proof of Theorem 5.

Remark 7: If is a normal subgroup of and is a normal subgroup of , it does not necessarily imply that is a normal subgroup of . The reader is invited to display a counterexample of this.

Theorem 8: Let be a group and be subgroups. Then

i) If is normal, then is a subgroup of .
ii) If both and are normal, then is a normal subgroup of .
iii) If and are normal, then is a normal subgroup of .

Proof: i) Let be normal. First, since for each , there exists such that , so . To show is a subgroup, let . Then for some since is normal, and so is a subgroup.

ii) Let and . Then since both and are normal, there exists such that . It follows that and so is normal.

iii) Let and . Then since H is normal, and similarly . Thus and it follows that is normal.

Examples of Normal Subgroups

[edit | edit source]

In the following, let be any group. Then has associated with it the following normal subgroups.

i) The center of , denoted , is the subgroup of elements which commute with all others. . That is a normal subgroup is easy to verify and is left to the reader.
ii) The commutator subgroup of , denoted or , is the subgroup generated by the subset where for all . For , we introduce the shorthand . Then we have , such that for any product of commutators where all elements are in , we have , and so is normal.

Remark 9: We can iterate the commutator subgroup construction and define and for all . We will not use the commutator subgroup in future results in this book, so for us it is merely a curiosity.

Equivalence Relations on Groups

[edit | edit source]

Why are normal subgroups important? In the preliminary chapter we discussed equivalence relations and the associated set of equivalence classes. If is a group and is an equivalence relation, when does admit a group structure? Of course we have to specify the multiplication on . We will do so now.

Definition 10: Let be a group and is an equivalence relation on , we define multiplication on the equivalence classes in such that for all ,

This is indeed the only natural way to do it. Take the two equivalence classes, choose representatives, compute their product and take its equivalence class. The alert reader will have only one thing on his/her mind: is this well defined? For a general equivalence relation, the answer is no. The reader is invited to come up with an example. What is more interesting is when is it well defined? By the definition above, we obviously need the projection map defined by to be a homomorphism. We can in fact condense the requirements down to two, both having to do with cancellation laws.

Theorem 11: Let be a group and an equivalence relation on . Then is a group under the natural multiplication if and only if for all

.

Proof: Assume is a group. Since , the property follows from the cancellation laws in . Assume now that the property holds. Then its multiplication rule is well defined, and must verify that is a group. Let , then associativity is inherited from :

.

The identity in is the equivalence class of , :

.

Finally, the inverse of is :

.

So really defines a group structure, proving the theorem.

We will call an equivalence relation compatible with if is a group. Then, is called the quotient group of by . Also, as an immediate consequence, this makes into a homomorphism, but not just any homomorphism! It satisfies a universal property!

Commutative diagram showing the universal property satisfied by the projection homomorphism.

Theorem 12: Let be en equivalence relation compatible with , and a group homomorphism such that . Then there exists a unique homomorphism such that .

Proof: In the preliminary chapter on set theory, we showed the corresponding statement for sets, so we know that exists as a function between sets. We have to show that it is a homomorphism. This follows immediately: since by commutativity, we have . As stated already, shows uniqueness, proving the theorem.

Lemma 13: Let be an equivalence relation on a group such that . Then is a subgroup of and .

Proof: First off, is nonempty since . Let . Then by multiplying on the left by . Then since we have by the same argument. Applying transitivity gives . Finally, multiplying on the left by gives , giving and so is a subgroup.

Assume for . Then implying . Thus . Now assume . Then and so and finally .

Assume . Then since is a subgroup, we have and so . Finally, assume . Then . Since in particular , this implies , completing the proof.

The mirror version using right cosets and the equivalence relation and is completely analogous. Stating the theorem and writing out the proof is left to the reader as an exercise.

We have showed how an equivalence relation defines a subgroup of . In fact the equivalence classes are all cosets of this subgroup. We will now go the other way, and show how a subgroup defines an equivalence relation on .

Lemma 14: Let be a subgroup of a group . Then,

i) is an equivalence relation such that for all .
ii) is an equivalence relation such that for all .

Proof: We will prove i). The proof for ii) is similar and is left as an exercise for the reader.

The fact that is an equivalence relation and that was proven in the section on subgroups. Assume . Then for all , such that . Now assume , Then such that , completing the proof.

Theorem 15: For every equivalence relation on G such that , there exists a unique subgroup of such that are precisely the left cosets of .

Proof: This follows from Lemma 13 and Lemma 14.

Again, the mirror statement is completely analogous. Stating the theorem is left to the reader as an exercise.

Quotients with respect to Normal Subgroups

[edit | edit source]

Lemma 16: Let be the equivalence relation given by , where is a subgroup of G. Then we know that is compatible if and only if is a normal subgroup.

Proof: Assume is compatible, and . Then , and compatibility gives us , and so . Since is arbitrary, we obtain for all and so is normal. Assume now that is normal. Then , and for all . Using this, we obtain and similarly for the right hand case, so is compatible with .

Definition 17: When an equivalence relation is given by specifying a normal subgroup , the quotient group with respect to this equivalence relation is denoted . We then refer to as the quotient of with respect to , or modulo . Note that this complies with previous definitions of this notation.

Multiplication in is given as before as , with identity and for all .

Definition 18: Let be a normal subgroup of . Then we define the projection homomorphism by for all .

Theorem 19: A subgroup is normal if and only if it is the kernel of some homomorphism.

Proof: We have already covered the left implication. For the right implication, assume is normal. Then is a group and we have the projection homomorphism as defined above. Since for all we have , and so is the kernel of a homomorphism.

Theorem 20: Let be groups, a homomorphism and a normal subgroup of such that . Then there exists a unique homomorphism such that .

Proof: This follows from Theorem 12 by letting .

The Isomorphism Theorems

[edit | edit source]
Commutative diagram showing the first isomorphism theorem. is an isomorphism.

Theorem 21 (First Isomorphism Theorem): Let be groups and a homomorphism. Then .

Proof: From Theorem 20 we have that there exists a unique homomorphism such that . We have to show that is an isomorphism when we corestrict to . This is immediate, since by Lemma 13, so that is injective, and for any there is a such that so that it is surjective and therefore an isomorphism.

Lemma 22: Let be a group, a subgroup and a normal subgroup of . Then is a normal subgroup of .

Proof: Let and . Then since and is a subgroup and since , and is normal in . Thus and is normal in .

Theorem 23 (Second Isomorphism Theorem): Let be a group, a subgroup and a normal subgroup of . Then .

Proof: Define by for all . is surjective since any element in can be written as with and , so . We also have that and so by the first isomorphism theorem.

Lemma 24: Let be a group, and let be normal subgroups of such that . Then is a normal subgroup of .

Proof: Let and . Then for some since is normal. Thus , showing that is normal in .

Theorem 25 (Third Isomorphism Theorem) Let be a group, and let be normal subgroups of such that . Then .

Proof: Let be given by . This is well defined and surjective since , and is a homomorphism. Its kernel is given by , so by the first isomorphism theorem, .

Theorem 26 (The Correspondence Theorem): Let be a group and be a normal subgroup. Now let and . Then is an order-preserving bijection from to .

Proof: We must show injectivity and surjectivity. For injectivity, note that if , then , so if such that , then , proving injectivity. For surjectivity, let . Then , so that , and , proving surjectivity. Lastly, since implies that , the bijection is order-preserving.

Note 27: The correspondence Theorem is sometimes called The Forth Isomorphism Theorem.

Theorem 28: Let from Theorem 26. Then is normal if and only if is normal in , and then .

Proof: Since is surjective, normal implies normal. Assume that is normal. Then and so is normal since it is the preimage of a normal subgroup. To show the isomorphism, let be given by a composition of projections: . Then , so by the first isomorphism teorem, .

Corollary 29: Let be a group and a normal subgroup. Then for any there exists a unique subgroup such that and . Also, is normal in if and only if is normal in .

Proof: From Theorem 26 we have that the projection is a bijection, and since for all , we have . The second part follows from Theorem 28.

Theorem 2.6.? (Baumslag):

Let be a group, let be subgroups of such that , and let be a subgroup of such that . Then:

Proof:

Due to theorem 2.6.?, and are subgroups of . Further, theorem 2.6.? implies that . Therefore, the function

is a homomorphism.

Further, since is a subgroup of , for all we have:

And thus:

Therefore, . Thus, the first isomorphism theorem implies

Simple Groups

[edit | edit source]

Definition 30: A group is called simple if it has no non-trivial proper normal subgroups.

Example 31: Every cyclic group , where is prime, is simple.

Definition 32: Let be a group and a normal subgroup. is called a maximal normal subgroup if for any normal subgroup of , we have .

Theorem 33: Let be a group and a normal subgroup. Then is a maximal normal subgroup if and only if the quotient is simple.

Proof: By Theorem 26 and Theorem 28, has a nontrivial normal subgroup if and only if there exists a proper normal subgroup of such that . That is, is not maximal if and only if is not simple. The theorem follows.

Problems

[edit | edit source]

Problem 1: Recall the unitary and special unitary groups from the section about subgroups. Define the projective unitary group of order as the group . Similarly, define the projective special unitary group of order as .

i) Show that
ii) Using the second isomorphism theorem, show that .


Group Theory/Products and Free Groups

During the preliminary sections we introduced two important constructions on sets: the direct product and the disjoint union. In this section we will construct the analogous constructions for groups.

Product Groups

[edit | edit source]

Definition 1: Let and be groups. Then we can define a group structure on the direct product of the sets and as follows. Let . Then we define the multiplication componentwise: . This structure is called the direct product of and .

Remark 2: The product group is a group, with identity and inverses . The order of is .

Theorem 3: Let and be groups. Then we have homomorphisms and such that and for all . These are called the projections on the first and second factor, respectively.

Proof: The projections are obviously homomorphisms since they are the identity on one factor and the trivial homomorphism on the other.

Corollary 4: Let and be groups. Then and .

Proof: This follows immediately from plying the first isomorphism theorem to Theorem 3 and using that and .

Theorem 5: Let and be groups. Then and are normal subgroups of .

Proof: We prove the theorem for . The case for is similar. Let and . Then .

Commutative diagram showing the universal property satisfied by the direct product.

We stated that this is an analogous construction to the direct product of sets. By that we mean that it satiesfies the same universal property as the direct product. Indeed, to be called a "product", a construction should have to satisfy this universal property.

Theorem 6: Let and be groups. Then if is a group with homomorphisms and , then there exists a unique homomorphism such that and .

Proof: By the construction of the direct product, is a homomorphism if and only if and are homomorphisms. Thus defined by is one homomorphism satisfying the theorem, proving existence. By the commutativity condition this is the only such homomorphism, proving uniqueness.

Products of Cyclic Groups

[edit | edit source]

Theorem 7: The order of an element is .

Proof: The lowest positive number such that is the smallest number such that and for integers . It follows that divides both and and is the smallest such number. This is the definition of the least common divider.

Theorem 8: is isomorphic to if and only if and are relatively prime.

Proof: We begin with the left implication. Assume . Then is cyclic, and so there must exist an element with order . By Theorem 7 we there must then exist a generator in such that . Since each factor of the generator must generate its group, this implies , and so , meaning that and are relatively prime. Now assume that and are relatively prime and that we have generators of and of . Then since , we have and so . this implies that generates , which must then be isomorphic to a cyclic group of order , im particular .

Theorem 9 (Characterization of finite abelian groups): Let be an abelian group. Then there exists prime numbers and positive integers , unique up to order, such that

Proof: A proof of this theorem is currenly beyond our reach. However, we will address it during the chapter on modules.

Subdirect Products and Fibered Products

[edit | edit source]

Definition 10: A subdirect product of two groups and is a proper subgroup of such that the projection homomorphisms are surjective. That is, and .

Example 11: Let be a group. Then the diagonal is a subdirect product of with itself.

Definition 12: Let , and be groups, and let the homomorphisms and be epimorphisms. The fiber product of and over , denoted , is the subgroup of given by .

In this subsection, we will prove the equivalence between subdirect products and fiber products. Specifically, every subdirect product is a fiber product and vice versa. For this we need Goursat's lemma.

Theorem 13 (Goursat's lemma): Let and be groups, and a subdirect product of and . Now let and . Then can be identified with a normal subgroup of , and with a normal subgroup of , and the image of when projecting on is the graph of an isomorphism .

Proof:

Semidirect Products

[edit | edit source]

Further reading

[edit | edit source]

More on the automorphism groups of finite abelian groups. Some results require theory of group actions and ring theory, which is developed in a later section.

http://arxiv.org/pdf/math/0605185v1.pdf

Free Groups

[edit | edit source]

In order to properly define the free group, and thereafter the free product, we need some preliminary definitions.

Definition 10: Let be a set. Then a word of elements in is a finite sequence of elements of , where the positive integer is the word length.

Definition 11: Let and be two words of elements in . Define the concatenation of the two words as the word .

Now, we want to make a group consisting of the words of a given set , and we want this group to be the most general group of this kind. However, if we are to use the concatenation operation, which is the only obvious operation on two words, we are immediately faced with a problem. Namely, deciding when two words are equal. According to the above, the length of a product is the sum of the lengths of the factors. In other words, the length cannot decrease. Thus, a word of length multiplied with its inverse has length at least , while the identity word, which is the empty word, has length . The solution is an algorithm to reduce words into irreducible ones. These terms are defined below.

Definition 12: Let be any set. Define the set as the set of words of powers of elements of . That is, if and , then .

Definition 13: Let . Then we define a reduction of as follows. Scan the word from the left until the first pair of indices such that is encountered, if such a pair exists. Then replace with . Thus, the resulting word is . If no such pair exists, then and the word is called irreducible.

It should be obvious if with length , then will be irreducible. The details of the proof is left to the reader.

Definition 14: Define the free group on a set as follows. For each word of length , let the reduced word . Thus is the subset of irreducible words. As for the binary operation on , if have lengths and respectively, define as the completely reduced concatenation .

Theorem 15: is a group.

Proof:

Example 16: We will concider free groups on 1 and 2 letters. Let and . Then

with .
such that for any and for any . Example product: .

Group Presentations

[edit | edit source]

In this subsection we will briefly introduce another method used for defining groups. This is by prescribing a group presentation.

Definition 17: Let be a group and a subgroup. Then define the normal closure of in as the intersection of all normal subgroups in containing H. That is, if is the normal closure of , then

.

Definition 18: Let be a set and . Let be the normal closure of in and define the group . The elements of are called generators and the elements of are called relators. If is a group such that , then is said to be a presentation of .

The Free Product

[edit | edit source]

Using the previously defined notion of a group presentation, we can now define another type of group product.

Definition : Let and be groups with presentations and . Define the free product of and , denoted , as the group with the presentation .

Remark : Depending on the context, spesifically if we only deal with abelian groups, we may require the free product of abelian groups to be abelian. In that case, the free product equals the direct product. This is another example of abelian groups being better behaved than nonabelian groups.

Lemma : The free product includes the component groups as subgroups.

Remark : The free product is not a product in the sense discussed previously. It does not satifsy the universal property other products do. Instead, it satisfies the "opposide", or dual property, obtained by reversing the direction of all the arrows in the commutative diagram. We usually call a construction satisfying this universal property a coproduct.

Problems

[edit | edit source]

Problem 1: Let and be groups of relatively prime orders. Show that any subgroup of is the product of a subgroup of with a subgroup of .

Answer

Coming soon.


Group Theory/Group actions on sets

Interesting in it's own right, group actions are a useful tool in algebra and will permit us to prove the Sylow theorems, which in turn will give us a toolkit to describe certain groups in greater detail.

Basics

[edit | edit source]

Definition 1.8.1:

Let be an arbitrary set, and let be a group. A function

is called group action by on if and only if ( denoting the identity of )

  1. and
  2. .

When a certain group action is given in a context, we follow the prevalent convention to write simply for . In this notation, the requirements for a group action translate into

  1. and
  2. .

There is a one-to-one correspondence between group actions of on and homomorphisms .

Definition 1.8.2:

Let be a group and a set. Given a homomorphism , we may define a corresponding group action by

.

If we are given a group action , then

is a homomorphism. The thus defined correspondence between homomorphisms and group actions is a bijective one.

Proof:

1.

Indeed, if is a homomorphism, then

and
.

2.

is bijective for all , since

.

Let also . Then

.

3.

We note that the constructions treated here are inverse to each other; indeed, if we transform a homomorphism to an action via

and then turn this into a homomorphism via

,

we note that since .

On the other hand, if we start with a group action , turn that into a homomorphism

and turn that back into a group action

,

then we ended up with the same group action as in the beginning due to .

Examples 1.8.3:

  1. acts on via .
  2. acts on via matrix multiplication: , where the first juxtaposition stands for the group action definition and the second for matrix multiplication.

Types of actions

[edit | edit source]

Definitions 1.8.4:

A group action is called

  1. faithful iff ('identity on all elements of enforces identity on ')
  2. free iff ('different group elements map an to different elements of '), and
  3. transitive iff for all there exists such that .

Subtle analogies to real life become apparent if we note that an action is faithful if and only if for two distinct there exist such that , and it is free if and only if the elements are all different for all .

Theorem 1.8.5:

A free operation on a nonempty set is faithful.

Proof: .

We now attempt to characterise these three definitions; i.e. we try to find conditions equivalent to each.

Theorem 1.8.6:

A group action is faithful if and only if the induced homomorphism is injective.

Proof:

Let first a faithful action be given. Assume . Then for all and hence . Let now be injective. Then .

An important consequence is the following

Corollary 1.8.7 (Cayley):

Every group is isomorphic to some subgroup of a symmetric group.

Proof:

A group acts on itself faithfully via left multiplication. Hence, by the previous theorem, there is a monomorphism .

For the characterisation of the other two definitions, we need more terminology.

Orbit and stabilizer

[edit | edit source]

Definitions 1.8.8:

Let be a group action, and let . Then

  • is called the orbit of and
  • is called the stabilizer of . More generally, for a subset we define as the stabilizer of .

Using this terminology, we obtain a new characterisation of free operations.

Theorem 1.8.9:

An operation is free if and only if is trivial for each .

Proof: Let the operation be free and let . Then

.

Since the operation is free, .

Assume that for each , is trivial, and let such that . The latter is equivalent to . Hence .

We also have a new characterisation of transitive operations using the orbit:

Theorem 1.8.10:

An operation is transitive if and only if for all .

Proof:

Assume for all , and let . Since transitivity follows.

Assume transitivity, and let . Then for all there exists with and hence .

Regarding the stabilizers we have the following two theorems:

Theorem 1.8.11:

Let be a group action and . Then .

Proof:

First of all, . Let . Then and hence . Further and hence .

Theorem 1.8.12:

Let . If we write for each , then

.

Proof:

Cardinality formulas

[edit | edit source]

The following theorem will imply formulas for the cardinalities of , , or respectively.

Theorem 1.8.13:

Let an action be given. The relation is an equivalence relation, whose equivalence classes are given by the orbits of the action. Furthermore, for each the function

is a well-defined, bijective function.

Proof:

1.

  • Reflexiveness:
  • Symmetry:
  • Transitivity: .

2.

Let be the equivalence class of . Then

.

3.

Let . Since , . Hence, . Hence well-definedness. Surjectivity follows from the definition. Let . Then and thus . Hence injectivity.

Corollary 1.8.14 (the orbit-stabilizer theorem):

Let an action be given, and let . Then

, or equivalently .

Proof: By the previous theorem, the function is a bijection. Hence, . Further, by Lagrange's theorem .

Corollary 1.8.15 (the orbit equation):

Let an action be given, and let be a complete and unambiguous list of the orbits. Then

.

Proof: The first equation follows immediately from the equivalence classes of the relation from theorem 1.8.13 partitioning , and the second follows from Corollary 1.8.14.

Corollary 1.8.16:

Let an action be given, let , and let be a complete and unabiguous list of all nontrivial orbits (where the orbit of is said to be trivial iff ). Then

.

Proof: This follows from the previous Corollary and the fact that equals the sum of the cardinalities the trivial orbits.

The following lemma, which is commonly known as Burnside's lemma, is actually due to Cauchy:

Corollary 1.8.17 (Cauchy's lemma):

Let an action be given, where are finite. For each , we denote .

The class equation

[edit | edit source]

Definition 1.8.18:

Let a group act on itself by conjugation, i. e. for all . For each , the centraliser of is defined to be the set

.

Using the machinery we developed above, we may now set up a formula for the cardinality of . In order to do so, we need a preliminary lemma though.

Lemma 1.8.19:

Let act on itself by conjugation, and let . Then the orbit of is trivial if and only if .

Proof: .

Corollary 1.8.20 (the class equation):

Let be a group acting on itself by conjugation, and let be a complete and unambiguous list of the non-trivial orbits of that action. Then

.

Proof: This follows from lemma 1.8.19 and Corollary 1.8.16.

Special topics

[edit | edit source]

Equivariant functions

[edit | edit source]

A set together with a group acting on it is an algebraic structure. Hence, we may define some sort of morphisms for those structures.

Definition 1.8.21:

Let a group act on the sets and . A function is called equivariant iff

.

Lemma 1.8.22:

p-groups

[edit | edit source]

We shall now study the following thing:

Definition 1.8.24:

Let be a prime number. If is a group such that for some , then is called a -group.

Corollary 23: Let be a -group acting on a set . Then .

Proof: Since is a -group, divides for each with defined as in Lemma 21. Thus .

Group Representations

[edit | edit source]

Linear group actions on vector spaces are especially interesting. These have a special name and comprise a subfield of group theory on their own, called group representation theory. We will only touch slightly upon it here.

Definition 24: Let be a group and be a vector space over a field . Then a representation of on is a map such that

i) given by , , is linear in over .
ii)
iii) for all , .

V is called the representation space and the dimension of , if it is finite, is called the dimension or degree of the representation.

Remark 25: Equivalently, a representation of on is a homomorphism . A representation can be given by listing and , .

As a representation is a special kind of group action, all the concepts we have introduced for actions apply for representations.

Definition 26: A representation of a group on a vector space is called faithful or effective if is injective.

Exercises

[edit | edit source]


Composition series

Definitions 2.7.1:

Let be a group. A normal series of are finitely many subgroups of such that

Two normal series and of are equivalent if and only if and there exists a bijective function such that for all :

A normal series of is a composition series of if and only if for each the group

is simple.

Theorem 2.7.2:

Let be a finite group. Then there exists a composition series of .

Proof:

We prove the theorem by induction over .

1. . In this case, is the trivial group, and with is a composition series of .

2. Assume the theorem is true for all , .

Since the trivial subgroup is a normal subgroup of , the set of proper normal subgroups of is not empty. Therefore, we may choose a proper normal subgroup of maximum cardinality. This must also be a maximal proper normal subgroup, since any group in which it is contained must have at least equal cardinality, and thus, if is normal such that

, then

, which is why is not a proper normal subgroup of maximal cardinality.

Due to theorem 2.6.?, is simple. Further, since , the induction hypothesis implies that there exists a composition series of , which we shall denote by , where

. But then we have

, and further for each :

is simple.

Thus, is a composition sequence of .

Our next goal is to prove that given two normal sequences of a group, we can find two 'refinements' of these normal sequences which are equivalent. Let us first define what we mean by a refinement of a normal sequence.

Definition 2.7.3:

Let be a group and let be a normal sequence of . A refinement of is a normal sequence such that

Theorem 2.7.4 (Schreier):

Let be a group and let , be two normal series of . Then there exist refinements of and of such that and are equivalent.

Proof:

Theorem 2.7.5 (Jordan-Hölder):

Let be a group and let and be two composition series of . Then and are equivalent.

Proof:

Due to theorem 2.6.?, all the elements of must be pairwise different, and the same holds for the elements of .

Due to theorem 2.7.4, there exist refinements of and of such that and are equivalent.

But these refinements satisfy

and

, since if this were not the case, we would obtain a contradiction to theorem 2.6.?.

We now choose a bijection such that for all :


Group Theory/The Sylow Theorems

In this section, we will have a look at the Sylow theorems and their applications. The Sylow theorems are three powerful theorems in group theory which allow us for example to show that groups of a certain order are not simple.

The proofs are a bit difficult but nonetheless interesting. Important remark: Wikipedia also has proofs of the Sylow theorems, see Wikipedia article on the Sylow theorems, which are shorter and more elegant. But here you can find other proofs. This is because the author wanted to avoid redundancy. So you can choose the proof you like, or read both :-)

Remark: The proofs below also teach a lot about how to apply group actions, so they might give you also an idea how to do this kind of stuff :-)

The Sylow theorems

[edit | edit source]

Definition 1: Let be a finite group of order , where is a prime, and is coprime to . We say that a subgroup of is a Sylow -subgroup iff it has order .

Definition 2: Let H be a subgroup of a group G. We define the normalizer N[H] of H as follows:

due to Fermat

Theorem 3 (Cauchy's theorem) Let G be a group and be a prime number such that divides . Then there exists an element of G which has order p. In particular, there is a subgroup of order p of G, namely .

Proof: Let X be the set of all tuples for which . The cyclic group acts on X with the action . An example is , for which the orbit contains only this same element ().

We also have that since we can choose the first p-1 elements arbitrarily and , and therefore |X| is a multiple of p, because |G| is also divisible by p. Furthermore, we know by the orbit-stabilizer theorem (theorem 19 from the section about group actions), that . Since p is a prime number, we have for all that either or . But since the orbits partition X (due to lemma 11 from the section about group actions), and is divisible by p, we need at least one (in the case p = 2, else even more elements) other element of X such that . Let . We have , because otherwise x' would not be fixed by the action we defined. Then . QED.

Theorem 4 (Sylow I): Let be a finite group of order , where is a prime, and is coprime to . For every , there is a subgroup of order of G. In particular, there exists a Sylow -subgroup of G.

Proof: For this proof, we use induction. Let H be a p-subgroup of G, i. e. for some natural i. H acts on the sets of left cosets G/H by left multiplication. By corollary 23 from the section about group actions, we obtain that , where . But also the following equivalences are true:

because

But from this we can conclude that . Therefore, (*) becomes . From this we can conclude, that if , and therefore p divides by the theorem of Lagrange, that then also p divides . And also: Since is a normal subgroup of , we know that is a group. Therefore we can apply Cauchy's theorem: has a subgroup of order p. But if we set , then is a subgroup of of order , because
a) the intersection of two different cosets is the empty set, and
b) is a subgroup of G because for for some , because , the normaliser. QED.

Lemma 5 (order of the conjugate): Let G be a group with identity , and an element of that group. Then

Proof: First, we observe that by induction: For n = 1, the claim is obviously true, and the calculation

shows the induction step.

Therefore, , which shows that .

Let furthermore . Then , where the first implication is true because the inverse is uniquely determined, and the second implication is true because the identity is uniquely determined. Therefore , implying with the former inequality that and finishing the proof. QED.

Lemma 6: Let , and let G act on X by conjugation. Then is a subgroup of G, and any p-subgroup of is contained in P.

Proof: Conjugation of G on X is a transitive action: If are arbitrary, by choosing . By the section about group actions, transitivity implies is really a group.

By the definition of X, we have that P is even a normal subgroup of . Let now Q be an arbitrary p-subgroup of . Then is a subgroup due to the section about normal subgroups. Due to the second isomorphism theorem, we have that . Therefore, we also have by Lagrange's theorem, that for some , because Q is a Sylow p-subgroup. Furthermore, Lagrange's theorem also assures that . Since P is a Sylow p-subgroup and QP is a subgroup of G and therefore divides |G|, we know that p does not divide . Therefore, must be the trivial subgroup, and therefore also , which implies because due to the section about subgroups, QED.

Theorem 7 (Sylow II): If P is a Sylow p-subgroup of G, and Q is an arbitrary p-group of G, then , so Q is contained in a Sylow p-group, since for arbitrary groups G if is an arbitrary subgroup of G, then also . In particular, all Sylow -subgroup of are conjugate.

Proof: Let's choose . P acts on X by conjugation. By the orbit-stabilizer theorem (corollary 19 of the section on group actions), we have that . But since P is a Sylow p-group, we know that or . Since , we furthermore have and therefore , because P is a single element in X.

But P is also the only element with trivial orbit: Let . That has trivial orbit means translated into the language of our group action, that . If we multiply this equation by on the right and on the left, we obtain that . Because of Lemma 5 we know that . Therefore is a p-subgroup of . Due to Lemma 6, we know that therefore must be a subgroup of P, and since both sets contain the same number of elements, they must be equal.

We now recall the above formula and note that since , all the other elements must have the property , since their orbits are not trivial. Since the orbits partition X, we have that .

Now we let Q act on X by conjugation, instead of P. Since , we know that there is at least one orbit of length 1. So what does this mean?:

As before:

, and, by Lemma 6:

, and therefore . QED.

Lemma 8: The normalizer of a subgroup is a subgroup.

Proof: Let H be a subgroup of G, and let G act on H by conjugation. Then the normalizer of H is the stabilizer of H in this action. Therefore, it is a subgroup due to Lemma 14 of the section about group actions, QED.

Theorem 9 (Sylow III*): Let again be the number of Sylow -groups of . Then , where is any Sylow -group.

Proof: This follows from the proof of Sylow II and the thm. Sylow II itself: Choose as in the proof of Sylow II. Then by the theorem itself follows that , and if we consider the group action of G on X of conjugation, then we have that . But since due to the orbit-stabilizer theorem (thm. 19 in the section about group actions) due to the definition of the normalizer, and due to the theorem of Lagrange (which is applicable since N[P] is a subgroup due to Lemma 8), the theorem follows. QED.

Theorem 10 (Sylow III): Let be a finite group of order , where is a prime, and is coprime to . If is the number of Sylow -subgroups of , then and .

Proof: Choose as in the proof of Sylow II. The proof for Sylow II shows that , and the theorem Sylow II itself shows that . This proves the second part. The first part follows from Sylow III* and the fact that due to Lagrange (which we can apply here because N[P] is a subgroup due to Lemma 8): Since P is a subgroup of N[P], we know that divides . This means that is not divisible by p. But since divides , it must divide m. QED.

How to show that groups of a certain order aren't simple

[edit | edit source]

In this section, it will be shown how to show that groups of a certain order can not be simple using the Sylow theorems. This is a useful application of the Sylow theorems.

Example 11: Groups of order 340 are not simple.

Proof: Let |G| = 340 = 22 ⋅ 5 ⋅ 17. Due to Sylow III, we have that , because and has only this solution (this can be seen by computing all possible solutions, which are finitely many since the last condition implies that ). But since, due to Sylow II, the conjugate of the only Sylow 5-group is again a Sylow 5-group, it is itself. This is the definition of normal subgroups. Therefore, by the definition of simple groups, groups of order 340 are not simple. QED.

Example 12: Groups of order 48 are not simple.

Proof: Let |G| = 48 = 24 ⋅ 3. Sylow III tells us that and . From this follows that either or . If now , then, in the same way as example 11, the (then only) Sylow 2-group is normal. In the other case, through conjugation on the set of the three Sylow 2-groups, we generate a homomorphism . This is due to theorem 2 from the section about group actions. The image can't be trivial, because all Sylow 2-groups are conjugate because of Sylow II. But since the kernel is a normal subgroup, and , we have that due to the first isomorphism theorem and the theorem of Lagrange, that the kernel is a proper, non-trivial normal subgroup, which is why the group is not simple.

Example 13: Groups of order 108 are not simple.

Proof: Let |G| = 108 = 2² ⋅ 3³ and let S be a Sylow 3-group. We let G act on the cosets of S by left multiplication. Due to Theorem 2 from the section about group actions, we know that this action gives a homomorphism . Due to the first isomorphism theorem and Lagrange, we have that , and therefore must divide 108. Since is a subgroup of and , must divide 24. From this follows that , and therefore, again due to the formula from the first isomorphism theorem and Lagrange, we have that . If the kernel were G, then the action would be trivial and therefore , which is a contradiction, since G is not a Sylow 3-group. Therefore, the kernel is a proper, non-trivial normal subgroup, which is why the group is not simple. QED.



Rings

This section builds upon and expands the theory covered in the previous chapter on groups. The reader is strongly advised to master the material presented in the sections up to and including Products and Free Groups before continuing.

Motivation

[edit | edit source]

The standard motivation for the study of rings is as a generalization of the set of integers with addition and multiplication, in order to study integer-like structures in a more general and less restrictive setting. However, we will also present the following motivation for the study of rings, based on the theory of Abelian groups.

Let and be Abelian groups. Then the set (Please don't pay much attention to the subscript for now.) of group homomorphisms naturally forms an abelian group in the following way. If , define for all . It should be obvious where each addition is taking place. In particular, we can consider the set of endomorphisms of . That is, the set of homomorphisms from to itself. This set is obviously a group from the above discussion, but it is also closed under composition. By endowing the set with the operations of addition, , and composition, , we note that it has the following properties:

i) It is an Abelian group under addition.
ii) It is a monoid under multiplication.
iii) Addition distributes over composition.

Indeed, for the third property, note that if and , then and . The following material is a generalization of this situation.

Introduction to Rings

[edit | edit source]

Definition 1: A ring is a set with two binary operations and that satisfies the following properties:

For all

i) is an abelian group.
ii) is a monoid.


iii) is distributive over :
1)
2)

We will denote the additive identity in a ring by or if the ring is understood. Similarly, we denote the multiplicative identity by or when the ring is understood. We'll often use juxtaposition in place of , i.e., for .

Remark 2: Some authors do not require their rings to have a multiplicative identity element. We will call a ring without an identity a rng. Pseudo-ring is another term used for a ring without unity. Authors who do not require a multiplicative identity usually call a ring with identity a ring with unity. Unless otherwise stated, we will assume that in our rings. A major part of noncommutative ring theory was developed without assuming that every ring has an identity element.

Example 3: The reader is already familiar with several examples of rings. For instance and with the usual addition and multiplication operations. We have a family of finite rings given by the sets for integer , with addition and multiplication defined modulo . Finally, we have an example of a rng given by the sets for integer , with the usual addition and multiplication. The reader is invited to confirm the ring axioms for these examples.
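
For the finite examples these axioms can even be checked exhaustively. The following sketch verifies distributivity and the multiplicative identity in Z_6 by brute force; nothing here is from the text, it is only a machine confirmation:

n = 6
Zn = range(n)
add = lambda a, b: (a + b) % n
mul = lambda a, b: (a * b) % n

# distributivity: a(b + c) = ab + ac for all triples
assert all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
           for a in Zn for b in Zn for c in Zn)
# 1 is a two-sided multiplicative identity
assert all(mul(1, a) == a == mul(a, 1) for a in Zn)

By contrast, no element of 2Z can serve as a multiplicative identity (e · 2 = 2 would force e = 1, which is odd), which is why 2Z is merely a rng.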

Let us now prove some very basic properties about rings. This is analogous to what we did for groups when we first introduced them.

Theorem 4: Let be a ring, and let . Then the following are true:

  1. If a + b = a + c, then b = c.
  2. The equation a + x = b has a unique solution.
  3. −(−a) = a.
  4. 0a = a0 = 0.
  5. (−a)b = a(−b) = −(ab).
  6. (−a)(−b) = ab.

Proof: (1), (2), and (3) all strictly concern addition, and are all previous results from being a group. The other three parts concern both addition and multiplication (since 0 and − are additive concepts), so as a proof strategy we expect to use the distributive law in some way to link the two operations. For (4), observe that 0a = (0 + 0)a = 0a + 0a; but then by (1), 0a = 0, and a0 = 0 follows in the same way. For (5), note that 0 = 0b = (a + (−a))b = ab + (−a)b, so (−a)b = −(ab), and similarly a(−b) = −(ab). For (6), note that (−a)(−b) = −(a(−b)) = −(−(ab)) = ab.

Remark 5: Take another look at the examples in Example 3. Notice that for all those rings, multiplication is a commutative operation. However, the axioms say nothing about this, so we should expect to find rings in which multiplication is not commutative.

Definition 6: A ring is called commutative if multiplication is commutative.

Example 7: An example of a non-commutative ring is the set of square matrices with real coefficients under standard addition and multiplication of matrices, where is an integer. The reader can easily check this for and conclude that it holds for all other (why?).
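
A single pair of 2×2 integer matrices already witnesses non-commutativity; the following sketch uses plain nested lists rather than any matrix library:

def matmul(A, B):
    # product of two 2x2 matrices given as nested lists
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]
print(matmul(A, B))  # [[1, 0], [0, 0]]
print(matmul(B, A))  # [[0, 0], [0, 1]], so AB != BA

Embedding these two matrices as the upper-left block of larger square matrices (zeros elsewhere) transfers the same counterexample to every larger matrix ring, which is one way to answer the "why?" above.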

Theorem 8: A ring has a unique multiplicative identity.

Proof: During our brief discussion of monoids earlier, we showed that in any monoid the identity is unique. Since a ring, with its addition forgotten, is a monoid under multiplication, this applies here.

Example 9: The singleton set with addition and multiplication defined by and is a ring, called the trivial ring or the zero ring. Note that in the trivial ring, . The reader is invited to show that in a ring if and only if it is the trivial ring.

If the reader has tried to construct some of the rings , he/she may have realised that certain non-zero elements have product zero. We formalize this concept as follows.

Definition 10: Let be a ring and . is called a left (resp. right) zero-divisor if there exists a such that .

Lemma 11: Let be a ring with . Define the function given by for all . Then is injective if and only if is not a left-zero-divisor.

Proof: Assume is not a left-zero-divisor, and assume we have for some . This implies , giving since is not a left-zero-divisor, so is injective. Conversely, assume is a left-zero-divisor. Then there exists a such that and , so is not injective.

Remark 12: Thus, multiplication by is left-cancellative if and only if is not a zero-divisor. The reader is invited to state and prove the equivalent lemma for right-zero-divisors.
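
Lemma 11 is easy to see in a finite example. In Z_6, left multiplication by each nonzero element either permutes the ring or collapses it; the sketch below tabulates this (purely illustrative):

n = 6
for a in range(1, n):
    image = sorted({(a * x) % n for x in range(n)})
    status = "injective" if len(image) == n else "zero-divisor"
    print(a, image, status)
# 1 and 5 act injectively; 2, 3, 4 are zero-divisors whose maps collapse Z_6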

Example 13: are all examples of commutative rings without zero divisors. These rings motivate the next definition.

Definition 14: Let be a commutative ring without zero divisors. Then is called an integral domain.

Just like Definition 14, the majority of special types of rings will be motivated by properties of .

Example 15:

  1. The set of functions on with pointwise addition and multiplication is a ring.
  2. More generally, if is a ring, the set of functions from to itself is also a ring.
  3. The set with function composition for multiplication is not a ring since the statement is not true in general.
  4. The set of integrable functions on the real numbers, , is a rng under pointwise addition and multiplication given by convolution: . This rng is important to the study of linear systems and differential equations. If the reader has enough calculus under their belt, they are invited to show that it does not have an identity, and that it is commutative.
  5. The set of Gaussian integers with standard addition and multiplication is a ring.

Definition 16: Let be a ring. An element is a unit, or invertible, if there is an element such that . The set of all units is denoted by .

Exercise 17: Prove that is a group under multiplication.

Exercise 18: Show that a zero-divisor is not a unit.
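
For Z_n the units are exactly the residues coprime to n, and Exercise 17 can be spot-checked by brute force for a small modulus. A sketch, assuming only math.gcd from the standard library:

from math import gcd

n = 12
units = [a for a in range(1, n) if gcd(a, n) == 1]  # units of Z_12: [1, 5, 7, 11]
# closed under multiplication mod n
assert all((a * b) % n in units for a in units for b in units)
# every unit has a multiplicative inverse among the units
assert all(any((a * b) % n == 1 for b in units) for a in units)
print(units)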

Theorem 19: (Cancellation Law for Integral Domains): Let be an integral domain, and let be nonzero. Then if and only if .

Proof: Evidently if . To see the other direction, we rearrange the equality as . But then . Since is nonzero, and contains no zero divisors, it must be the case that , which is to say that .

Definition 20: A ring is a division ring or skew field if all non-zero elements are units, i.e. if it forms a group under multiplication with its nonzero elements.

Definition 21: A field is a commutative division ring. Alternatively, a field is a ring where is an abelian group under multiplication. As another alternative, a field is an integral domain where all non-zero elements are invertible.

As stated before, integral domains are easy to work with because they are so close to being fields. In fact, the next theorem shows just how close the two are:

Theorem 22: Let be a finite integral domain. Then is a field.

Proof: Let be nonzero and let . Clearly is a subset of . From the cancellation law, we can see that (since if two elements and are equal, then ). But then . So then there must be some such that . So is a unit.

Of course, proving that a set with two operations satisfies all of the ring axioms can be tedious. So, just as we did for groups, we note that if we're considering a subset of something that's already a ring, then our job is easier.

Definition 23: A subring of a ring is a subset of that is also a ring (under the same two operations as for ) and . We denote " is a subring of " by . Note many mathematicians do not require rings or subrings to have an identity.

Theorem 24: Let be a subset of a ring . Then if and only if for all ,

  1. ,
  2. ,
  3. .

Example 25:

  1. .
  2. The trivial ring is a subring of every ring.
  3. The set of Gaussian integers is a subring of the complex numbers .


Ring Homomorphisms

Just as with groups, we can study homomorphisms to understand the similarities between different rings.

Homomorphisms

[edit | edit source]

Definition

[edit | edit source]

Let R and S be two rings. Then a function is called a ring homomorphism or simply homomorphism if for every , the following properties hold:

In other words, f is a ring homomorphism if it preserves additive and multiplicative structure.

Furthermore, if R and S are rings with unity and , then f is called a unital ring homomorphism.

Examples

[edit | edit source]
  1. Let be the function mapping . Then one can easily check that is a homomorphism, but not a unital ring homomorphism.
  2. If we define , then we can see that is a unital homomorphism.
  3. The zero homomorphism is the homomorphism which maps every element to the zero element of its codomain.

Theorem: Let and be integral domains, and let be a nonzero homomorphism. Then is unital.

Proof: . But then by cancellation, .

In fact, we could have weakened our requirement for R a small amount (How?).

Theorem: Let be rings and a homomorphism. Let be a subring of and a subring of . Then is a subring of and is a subring of . That is, the kernel and image of a homomorphism are subrings.

Proof: Omitted.

Theorem: Let be rings and be a homomorphism. Then is injective if and only if .

Proof: Consider as a group homomorphism of the additive group of .

Theorem: Let be fields, and be a nonzero homomorphism. Then is injective, and .

Proof: We know since fields are integral domains. Let be nonzero. Then . So . So (recall you were asked to prove units are nonzero as an exercise). So .

Isomorphisms

[edit | edit source]

Definition

[edit | edit source]

Let be rings. An isomorphism between and is an invertible homomorphism. If an isomorphism exists, and are said to be isomorphic, denoted . Just as with groups, an isomorphism tells us that two objects are algebraically the same.

Examples

[edit | edit source]
  1. The function defined above is an isomorphism between and the set of integer scalar matrices of size 2, .
  2. Similarly, the function mapping where is an isomorphism. This is called the matrix representation of a complex number.
  3. The Fourier transform defined by is an isomorphism mapping integrable functions with pointwise multiplication to integrable functions with convolution multiplication.

Exercise: An isomorphism from a ring to itself is called an automorphism. Prove that the following functions are automorphisms:

  1. Define the set , and let



Ideals

Motivation

[edit | edit source]

In Rings we saw that the set of even integers was a subring of the integers.

We can also see very easily that the integers are a subring of the rational numbers under the usual operations of addition and multiplication.

The even integers, when taken as a subring of the integers, have a property that the integers, when taken as a subring of the rationals, do not. The even integers taken as a subring of the rationals also lack this property.

The property is that the even integers, taken as a subring of the integers, absorb multiplication. Let's call the even integers for ease of notation.

Consider the following: For all , we can see by the definition of that for some .

For all see that .

In other words, regardless of which even integer is chosen, multiplying it by any integer gives us another even integer.

Definition of an Ideal

[edit | edit source]

Definition: Given a ring , a subset is said to be a left ideal of if it absorbs multiplication from the left; that is, if .

Definition: Given a ring , a subset is said to be a right ideal of if it absorbs multiplication from the right; that is, if .

Definition: We define an ideal to be a subset that is both a left ideal and a right ideal. We also require that is a subgroup of .

We write as shorthand for this.

To verify that a subset of a ring is an ideal, it is only necessary to check that it is closed under subtraction and that it absorbs multiplication; this is because of the subgroup criterion from Abstract Algebra/Group Theory/Subgroup.

Definition: An ideal is proper if .

Definition: An ideal is trivial if .

Lemma: An ideal is proper if and only if .

Proof: If then so .

The converse is obvious.

Theorem: In a division ring, the only proper ideal is trivial.

Proof: Suppose we have an ideal in a division ring with a nonzero element a. Take any element b in the division ring. Then a⁻¹b is in the division ring as well, and a(a⁻¹b) = b is in the ideal. Therefore the ideal is the whole ring, so it is not a proper ideal.

Definition: Let S be a nonempty subset of a ring R. Then the ideal generated by S is defined to be the smallest ideal in R containing S, which is the intersection of all such ideals. We can characterize this ideal as the collection of all finite sums

One can easily verify that this is an ideal, and that every ideal containing S must contain this ideal. If the ring is commutative, then one can simply characterize it as

The ideal generated by a single element a is called a principal ideal. If the ring is commutative, it consists of all elements of the ring of the form ra where r is any element in the ring.

Example: Let be the ring of integers. The principal ideal is the subset of consisting of positive and negative multiples of . For example is the subset of even integers. Then one can view the factor ring simply as the set under addition and multiplication modulo .
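
The factor ring Z/nZ can be tabulated directly from coset representatives. A small sketch for n = 4 (the variable names are ours):

n = 4
reps = range(n)  # one representative k for each coset k + nZ
mul_table = [[(a * b) % n for b in reps] for a in reps]
for a, row in zip(reps, mul_table):
    print(a, row)
# the row for 2 reads [0, 2, 0, 2]: the coset 2 + 4Z is a zero-divisor in Z/4Z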

Operations on Ideals

[edit | edit source]

Given a collection of ideals we can generate other ideals. For instance it is easy to check that the intersection of any family of ideals is again an ideal. We write this simply as .

Given any set we can construct the smallest ideal of containing which we denote by . It is determined by , though often we can be more explicit than this.

If is a collection of ideals we can determine the sum, written , as the smallest ideal containing all the ideals . One can check explicitly that its elements are finite sums of the form .

Finally if are two ideals in one can determine the ideal-theoretic product as the smallest ideal containing the set-theoretic product . Note that the ideal-theoretic product is in general strictly larger than the set-theoretic product, and that it simply consists of finite sums of the form where

Example: Let and be the principal ideals in just given. Then one can check explicitly that , where r is the lcm of m and n. Moreover , and , where s is the hcf of m and n. Observe that r = mn if and only if m and n are co-prime, if and only if s = 1.
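
For principal ideals in Z these operations can be confirmed numerically: the sum (m) + (n) consists of the multiples of the hcf of m and n, and the intersection of the multiples of the lcm. A sketch over a finite window of integers (the bound N is arbitrary):

from math import gcd

m, n, N = 4, 6, 60
window = range(-N, N + 1)

# (m) + (n) contains every multiple of gcd(m, n) = 2 in the window
sums = {a * m + b * n for a in window for b in window}
assert all(x in sums for x in window if x % gcd(m, n) == 0)

# the intersection of (m) and (n) is exactly the multiples of lcm(m, n) = 12
lcm = m * n // gcd(m, n)
assert all((x % m == 0 and x % n == 0) == (x % lcm == 0) for x in window)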

Homomorphisms and Ideals

[edit | edit source]

Rings, like groups, have factor objects, and as with groups these arise from kernels of homomorphisms. Let be a ring homomorphism. Let us determine the structure of the kernel of f, which is defined to be the set of all elements which map to the zero element.

If a and b are in the kernel of f, i.e. , and r is any element of R, then

,
,
.

Therefore is an ideal of R.

Also note that the homomorphism is a monomorphism, i.e. injective or one-to-one, precisely when the kernel consists only of the zero element.

We also have the following
Theorem: Suppose the only proper ideal of R is the trivial ideal {0}. If f is a homomorphism from R to S which does not map every element of R to zero, then f is injective.
Proof: The kernel of f must be an ideal, and since the only ideals of R are R itself and the trivial ideal, one of these two must be the kernel. Since not every element of R maps to zero, R is not the kernel, so the trivial ideal must be.

Since division rings satisfy this condition, every homomorphism out of a division ring that is not identically zero is injective.

The construction of factor rings in the next section will prove that there exists a homomorphism with I as its kernel for any ideal I.

Factor Rings

[edit | edit source]
Definition

Given a ring and an ideal , since is a subgroup of the Abelian group (and thus a normal subgroup), by the quotient group construction, the quotient group is well-defined and Abelian, where means the additive group of . The elements of are the cosets of . We define multiplication on by , for each . The Abelian group together with the multiplication defined above is called the quotient ring or factor ring of modulo .

To show that this is independent of the choice of a and b (that is, that the operations are well-defined), suppose that a' and b' are elements of the same respective cosets. Then a' = a + j and b' = b + k for some elements j, k in I. Then a'b' = ab + ak + jb + jk, and since ak, jb, and jk are elements of I, a'b' and ab must belong to the same coset, so ab + I = a'b' + I. The cosets form an abelian group under addition by what was proved earlier about factor groups, together with the fact that the ring itself is an abelian group under addition. Since multiplication of cosets is inherited from the multiplication in the ring, it satisfies the remaining ring axioms. Furthermore, if the ring is commutative, then the factor ring is also commutative.

Observe that there is a canonical ring homomorphism determined by , called the projection map. We record some properties of this homomorphism in the next section on the isomorphism theorems.

Ring Isomorphism Theorems

[edit | edit source]

We have already proved the isomorphism theorems for groups. Now we can use analogous arguments to prove the isomorphism theorems for rings, substituting the notion of "normal subgroups" with that of ideals.

Factor Theorem

[edit | edit source]

Let I be an ideal of a ring R, and let π be the usual projection homomorphism from R to R/I. Now let f be a homomorphism from R to S. Observe that if is a ring homomorphism, then the composition is a ring homomorphism such that (because ). This characterizes all such morphisms in the following sense:

Factor Theorem: Let be a ring homomorphism such that . Then there is a unique homomorphism such that . Furthermore, is an epimorphism if and only if is an epimorphism, and is a monomorphism if and only if its kernel is exactly I.

Proof: We prove it the same way we did for groups. Define to be . To see that this is well-defined, let a + I = b + I; then a − b is an element of I, which lies in the kernel of f, so . Now is a homomorphism, implying that is also. The proofs of the additional statements can be carried over from the proofs of the additional statements of the factor theorem for groups.

First Isomorphism Theorem

[edit | edit source]

Let R be a ring, and let f be a homomorphism from R to S with kernel K. Then the image of f is isomorphic to R/K.

Proof
Using the factor theorem, we can find a homomorphism from R/K to S, and since the kernel is the same as the ideal used in forming the quotient ring, and since f is an epimorphism onto its image, this homomorphism is an isomorphism.

Second Isomorphism Theorem

[edit | edit source]

Let R be a ring, let I be an ideal, and let S be a subring.

  1. S+I, the set of all s+i with s within S and i within I, is a subring of R.
  2. I is an ideal of S+I.
  3. The intersection of S and I is an ideal of S.
  4. (S+I)/I is isomorphic to S/(S ∩ I).

Proof

  1. It can be verified that it contains 1, and is closed under subtraction and multiplication.
  2. Of course, since I is an ideal of R, then it must be an ideal under any subring.
  3. S ∩ I consists of the elements of I that lie in S; since I absorbs multiplication by all elements of R, in particular by elements of S, it is an ideal of S.
  4. Let φ be the projection map from R to R/I, restricted to the domain S, so that φ(s) = s + I. It is easy to check that its kernel is S ∩ I and that its image is (S+I)/I, so the result follows from the first isomorphism theorem.

Third Isomorphism Theorem

[edit | edit source]

Let I be an ideal of a ring R, and let J be an ideal of the same ring R that contains I. J/I is an ideal of R/I, and R/J is isomorphic to (R/I)/(J/I).

Proof: Define the function from R/I to R/J by a + I ↦ a + J, which is well-defined because I is an ideal contained in J. This is also obviously a homomorphism. Its kernel is the set of all elements that map onto J, and is thus all a + I such that a is within J, that is, J/I. Moreover, its image is R/J, and so we can use the first isomorphism theorem to prove the result.

Correspondence Theorem

[edit | edit source]

Let I be an ideal of a ring R. Define the function to map the set of subrings and ideals of R containing I to the set of subrings and ideals of R/I, where is the set of all cosets x + I with x an element of X. This function is one-to-one, and the images of subrings or ideals containing I are subrings or ideals of R/I.

Proof: Define the function f from subrings or ideals containing I to the subrings or ideals of R/I by f(A) = A/I. We have already proved the correspondence for addition, because rings form abelian groups under addition. Thus, we need only check multiplication. Suppose S is a subring of R containing I. S/I is obviously closed under addition and subtraction. For multiplication, suppose that x and y are elements of S. Then (x+I)(y+I) = xy+I, which is also an element of S/I, proving that it is closed under multiplication. The identity 1 is within S, so 1+I is within S/I. Thus, S/I is a subring of R/I. Conversely, suppose that S/I is a subring of R/I. Then it is easy to see that S is closed under addition, subtraction, and multiplication, proving that S is a subring of R. Now suppose that J is an ideal of R containing I. Then by the third isomorphism theorem, J/I is an ideal of R/I. Finally, suppose that J/I is an ideal of R/I. Let r be any element of R, and let j be any element of J. Then since J/I is an ideal of R/I, (r+I)(j+I) = rj+I must be an element of J/I. This means that rj must be an element of J, proving that J is an ideal of R.

Chinese Remainder Theorem

[edit | edit source]

Definitions

[edit | edit source]

Definition: Two elements are said to be congruent modulo an ideal if and only if they belong to the same coset in R/I. This is true exactly when a − b is within I. Write to mean that is congruent to modulo .

Lemma: Given an ideal , a subset of a ring , the congruence class modulo of an element is if and only if . To see this, simply note that means ; plugging in gives .

Definition: Two natural numbers are relatively prime when ax + by = 1 for some integers x and y. We do the same for rings: two ideals I and J are relatively prime when ai + bj = 1 for some ring elements a, b, some element i within I, and some element j within J. In other words, two ideals are relatively prime if their sum contains the identity element, i.e. if I + J is the whole ring R.

We will now prove the

Chinese Remainder Theorem

[edit | edit source]

Let R be a ring, and let be n pairwise relatively prime ideals (i.e. any two of them are relatively prime).

  1. Let a be a number from 1 to n. There exists an element r within R that lies in every ideal I_i with i ≠ a, and such that r ≡ 1 (mod I_a).
  2. Let be elements of R. Then there exists an element r within R such that for all i=1,2,3,...,n.
  3. Let I be the intersection of the ideals. Another element of R, s satisfies for all i=1,2,3,...,n if and only if .
  4. R/I is isomorphic to the product ring

Proof

  1. Since and (i>1) are relatively prime, there exist elements and (i>1) such that . This implies that . Now we expand this product on the left side. All terms of the product other than belong to , while itself belongs to the set S of all finite sums of products with . Thus, 1 can be written in the form b + a, where b is an element of and a is an element of S. Then and for i > 1.
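
In the ring Z, taking the ideals to be (n_i) for pairwise coprime moduli n_i, part 2 of the theorem is the classical Chinese remainder theorem, and the element r can be produced explicitly with the extended Euclidean algorithm. A self-contained sketch (the function names crt and ext_gcd are ours):

def ext_gcd(a, b):
    # return (g, x, y) with a*x + b*y = g = gcd(a, b)
    if b == 0:
        return a, 1, 0
    g, x, y = ext_gcd(b, a % b)
    return g, y, x - (a // b) * y

def crt(residues, moduli):
    # find r with r = residues[i] (mod moduli[i]), moduli pairwise coprime
    r, M = 0, 1
    for a, n in zip(residues, moduli):
        g, x, _ = ext_gcd(M, n)     # x is an inverse of M modulo n
        r += (a - r) * x % n * M    # adjust r without disturbing earlier congruences
        M *= n
    return r % M

print(crt([2, 3, 2], [3, 5, 7]))  # 23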

Prime and Maximal Ideals

[edit | edit source]

There are two important classes of ideals in a ring: prime ideals and maximal ideals.

Definition: An ideal is prime if it satisfies:

for any ideals A and B such that AB is a subset of I, either A is a subset of I or B is a subset of I.

Definition: An ideal is maximal if it is proper (i.e. ) and it satisfies:

That is, there are no proper ideals strictly between and .

The following lemma is important for many results, and it makes essential use of Zorn's Lemma (or, equivalently, the Axiom of Choice).

Lemma: Every non-invertible element of a ring is contained in some maximal ideal.

Proof: Suppose is the non-invertible element. The first observation is that is a proper ideal, for if , then in particular , contradicting the assumption. Let be the set of proper ideals in containing , ordered by inclusion. The first observation implies that is non-empty, so to apply Zorn's Lemma we need only show that every increasing chain of ideals has an upper bound. Suppose is such a chain; then the least upper bound is , as this is the smallest ideal containing each ideal in the chain. One checks that the union is indeed an ideal, so this must be the least upper bound. To show it is proper, we need only show for all . But this follows precisely because each is proper.

Therefore, by Zorn's Lemma, there is a maximal element of . It is clearly a maximal ideal, for if were any ideal satisfying , then would be an element of , and by maximality of we would have , whence .

Properties of rings may be naturally restated in terms of the ideal structure. For instance

Proposition: A commutative ring is an Integral Domain if and only if is a prime ideal.

Proof: This follows simply because .

This explains why an Integral Domain is also referred to as a Prime Ring. Similarly, we may give a necessary and sufficient condition for a ring to be a field :

Proposition: A commutative ring is a field if and only if is a maximal ideal (that is, there are no proper ideals other than the trivial one).

Proof: We only need to show that every non-zero element is invertible. Suppose not; then by the preceding Lemma, such an element is contained in some (proper) maximal ideal, a contradiction.

Corollary: An ideal is maximal if and only if is a field.

Proof: By the previous Proposition we know is a field if and only if its only proper ideal is . By the correspondence theorem (...) this happens if and only if there are no proper ideals containing .

Corollary: The kernel of a homomorphism f from R to S is a maximal ideal when S is a field. The proof of this follows from the first isomorphism theorem because S is isomorphic to R/ker f.

It's also clear that

Lemma: An ideal is prime if and only if is an integral domain.

Proof: Write for the element of corresponding to the equivalence class . Clearly every element of can be written in this form.

where the second equivalence follows directly because is prime.

This follows in exactly the same way.

Corollary: The kernel of a homomorphism f from R to S is a prime ideal when S is an integral domain. The proof of this follows from the first isomorphism theorem because S is isomorphic to R/ker f.

Lemma: A maximal ideal is also prime.

Proof: Suppose is a maximal ideal, and . Suppose further that . Then the ideal is an ideal containing and , so is strictly larger than . By maximality . So .
Alternatively, we can use the above two results, and the fact that all fields are integral domains to prove this.

Glossary

[edit | edit source]

Please see the extensive Wikipedia:Glossary of ring theory.


Integral domains

Integral Domains

[edit | edit source]

Motivation: The concept of divisibility is central to the study of ring theory. Integral domains are a useful tool for studying the conditions under which concepts like divisibility and unique factorization are well-behaved. In fact, they are very important for polynomial rings as well.

The integral domain was already defined before on the page on rings. We provide the definition again for reference.

Definition An integral domain is a commutative ring with such that for all , the statement implies either or .

An equivalent definition is as follows:

Definition Given a ring , a zero-divisor is an element for which there exists a non-zero element such that .

Definition An integral domain is a commutative ring with and with no non-zero zero-divisors.

Remark An integral domain has a useful cancellation property: Let be an integral domain and let with . Then implies . For this reason an integral domain is sometimes called a cancellation ring.

Examples:

  1. The set of integers under addition and multiplication is an integral domain. However, it is not a field since the element has no multiplicative inverse.
  2. The trivial ring {0} is not an integral domain since it does not satisfy .
  3. The set of congruence classes of the integers modulo 6 is not an integral domain because in .

Theorem: Any field is an integral domain.

Proof: Suppose that is a field and let . If for some in , then multiply by to see that . cannot, therefore, contain any zero divisors. Thus, is an integral domain.

Definition If is a ring, then the set of polynomials in powers of with coefficients from is also a ring, called the polynomial ring of and written . Each such polynomial is a finite sum of terms, each term being of the form where and represents the -th power of . The leading term of a polynomial is defined as that term of the polynomial which contains the highest power of in the polynomial.

Remark A polynomial equals if and only if each of its coefficients equals .

Theorem: Let be a commutative ring and let be the ring of polynomials in powers of whose coefficients are elements of . Then is an integral domain if and only if is.

Proof If the commutative ring is not an integral domain, it contains two non-zero elements and such that . Then the polynomials and are non-zero elements of and . Thus if is not an integral domain, neither is .

Now let be an integral domain and let and be polynomials in . If the polynomials are both non-zero, then each one has a non-zero leading term, call them and . That these are the leading terms of polynomials and means that the leading term of the product of these polynomials is . Since is an integral domain and , . This means, by the Remark above, that the product is not zero either. This means that is an integral domain.

Unique Factorization Domains, Principal Ideal Domains, and Euclidean Domains

[edit | edit source]

Unique factorization domains, principal ideal domains, and Euclidean domains are concepts that are defined only for integral domains.

Some definitions

[edit | edit source]
  • Two ring elements a and b are associates if a = ub for some unit u; we write a ~ b.
  • A nonzero nonunit a is irreducible if a = bc (with b, c in the domain) implies a ~ b or a ~ c.
  • a divides b if b = ar for some r within R. When this happens, we write a|b.
  • A nonzero nonunit a is prime when a|bc implies that a|b or a|c.

Theorem: If a is prime, then a is irreducible.

Let a be prime, and let a = bc. Then a|bc, so either a|b or a|c. Without loss of generality, assume that a|b, so that b = ad for some element d. Then we can rewrite a = bc as a = adc, implying that dc = 1 (by cancellation), so that c is a unit and hence a ~ b.

Now that we have proven that all prime elements are irreducible, is the converse true? The answer to that is no, for we can easily obtain counterexamples to it. However, we will prove a sufficient and necessary condition for all irreducible elements to be prime.

Unique Factorization Domains

[edit | edit source]

Definition: Let R be an integral domain. If the following two conditions hold:

  1. If a is nonzero, then a = up1p2...pn where u is a unit, and the pi are irreducible.
  2. Let a = vq1q2...qm be another factorization into irreducibles, with v a unit. Then n = m and, after a suitable re-ordering, each pi and qi are associates.

Then we call (the integral domain) R a unique factorization domain (UFD).

The converse to the above theorem holds true in a UFD.

Theorem: In a UFD, all irreducibles are prime.

Proof
Let a|bc, where a is irreducible. Then ad = bc for some element d. Factor d, b, and c into irreducibles: d = ud1d2...dl, b = vb1b2...bm, and c = wc1c2...cn, where u, v, and w are units. Then a(ud1d2...dl) and (vb1b2...bm)(wc1c2...cn) are two factorizations of the same element, so because we are in a UFD, a must be an associate of some bi or ci, implying that a|b or a|c.

The following theorem provides a necessary and sufficient condition for an integral domain R to be a unique factorization domain.

Theorem:

  1. Let R be a UFD. R satisfies the following ascending chain condition on principal ideals: let be a sequence of elements of R such that the principal ideals satisfy the condition that . Then there exists an N such that for all n>N, all the are the same.
  2. If an integral domain R satisfies the ascending chain condition, then every nonzero element can be factored into irreducible elements, meaning that it satisfies the first condition for being a UFD.
  3. If, in addition to satisfying the ascending chain condition, all irreducible elements are prime, then the integral domain is a UFD.

Proof

  1. Consider a sequence of elements of R such that . Then obviously for all natural numbers n, since . Then due to unique factorization, all the factors of are associates of the factors , counting multiplicity of factors. Therefore, the number of non-unit factors is a decreasing sequence on the whole numbers. However, has finitely many factors, so there is an N such that for all n>N, all the factors are associates, meaning that all the are also the same.
  2. Clearly any nonzero irreducible can be factored into irreducibles, which is itself. Otherwise, let be a product of nonunits. If this is not a product of irreducibles, then suppose that one of them is not irreducible, say . Then obviously so the principal ideals satisfy the relations . We can factor in the same way, to obtain as a product of nonunits. Thus, if cannot be factored into irreducibles, we can get an increasing chain of principal ideals, meaning that it does not satisfy the ascending chain condition.
  3. Let where r and s are units and each and are irreducible, and thus prime. Since divides a, it divides one of the factors, and after suitably re-arranging the second factorization, can divide . However, is irreducible, so they must be associates, and thus can be factored out and replaced by a unit. We can continue this process until there are no factors left, at which point we conclude that all the factors are associates.

Principal Ideal Domains

[edit | edit source]

Definition: a principal ideal domain (PID) is an integral domain such that every ideal can be generated by a single element (i. e. every ideal is a principal ideal).

Theorem: All PIDs are UFDs.

Proof:
Suppose we have an ascending chain of principal ideals and let I be the union . Obviously I is an ideal, and it is a principal ideal because it lies in a PID. Therefore, it is generated by a single element, . Since , we have for some N. Then if , we have , so the ascending chain condition on principal ideals is satisfied.

Let an element be irreducible. If , then would be a unit, so (a) must be a proper ideal. If there were no maximal proper ideal containing (a), then the ascending chain condition would not be satisfied, so we can conclude that there is a maximal proper ideal I containing (a). (Note: this does not require Zorn's lemma or the axiom of choice, since we did not use the theorem on maximal ideals.) This ideal must be a principal ideal (b), but since , b|a, and since is irreducible, b must either be a unit or an associate of a. Since (b) is a proper ideal, b is not a unit, so it must be an associate of . Therefore, (a) = (b), so (a) is maximal. However, all maximal ideals are clearly prime, so (a) is a prime ideal, which implies that is prime.

Theorem: A UFD is a PID if and only if every nontrivial prime ideal is maximal.

Proof:
Suppose R is a PID, so that consequently, it is a UFD. Let (a) be an ideal of R, which in turn must be contained in a maximal proper ideal (b) due to the ascending chain condition (Note: again, this does not make use of Zorn's lemma). Since , b|a. Since is irreducible, b must either be a unit or an associate of . However, since (b) is a proper ideal, it must not be a unit, so it must be an associate of . Therefore, (a)=(b), so (a) is maximal. Conversely,

Euclidean Domains

[edit | edit source]

Definition: An integral domain R is a Euclidean domain (ED) if there is a function f from the nonzero elements of R to the whole numbers such that for any element a and any nonzero element b, we can write a = bq + r for some q and r with either f(r) < f(b) or r = 0.

Note: In an ED, the Euclidean algorithm to find the greatest common divisor is applicable.
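
For R = Z with f(x) = |x|, the division with remainder in the definition is ordinary integer division, and iterating it gives the Euclidean algorithm. A minimal sketch:

def euclid_gcd(a, b):
    # greatest common divisor by repeated division with remainder
    while b != 0:
        a, b = b, a % b  # replace (a, b) by (b, r), where a = qb + r
    return abs(a)

print(euclid_gcd(252, 198))  # 18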

Theorem: All EDs are PIDs.

Proof:
Suppose we have an ideal I of R. If it contains only 0, then it is principal. Otherwise, it contains elements other than 0. Then f(I), the image of I under f, is a nonempty set of nonnegative integers. Choose the minimum x of this set, and consider an element b within I which maps to this x. Let a be another element of I; then there exist q and r such that a = qb + r and such that either f(r) < f(b) or r = 0. Since both a and b belong to I, r must also belong to I, since r = a − qb. However, f(b) is the minimum on I, so if r were nonzero we would have f(b) ≤ f(r), a contradiction. Thus, r must be 0, so a = qb, proving that b generates I as the principal ideal (b).


Fraction Fields

We know from experience that we arrive at the idea of fractions by merely considering the idea of the quotient of two integers. The motivation behind this is simply to provide a multiplicative inverse for every non-zero element. Thus, we can consider an integral domain R and construct its field of fractions. However, we can also try to make this work for any commutative ring, even if it has zero divisors other than 0. A slight alteration is then required, because the usual rules for fractions break down when a denominator product bd can be 0. Thus, we must restrict the allowed denominators in case b and d are zero divisors. The resulting construction is called the localization of a ring.

Definitions

[edit | edit source]

A multiplicative subset of a commutative ring R is a subset that does not contain 0, does contain 1, and is closed under multiplication. Some examples of multiplicative sets are the set of nonzero elements of an integral domain, the set of elements of a commutative ring that are not zero divisors, and R\P where P is a prime ideal of the commutative ring R.

Let S be a multiplicative subset. We will consider the Cartesian product R×S. Define the following equivalence relation on this product: (a,b) ~ (c,d) whenever there exists an s in S such that s(ad − bc) = 0.

If R is an integral domain, then (a,b) can be regarded as the fraction a/b. To check that this is an equivalence relation: it is obvious that it is reflexive and symmetric. To prove that it is transitive, let (a,b) ~ (c,d) and let (c,d) ~ (e,f). Then there are elements s and t within S such that s(ad − bc) = 0 and such that t(cf − de) = 0. This implies that stfad − stfbc = 0 and that sbtcf − sbtde = 0. Adding the two, we get stfad − sbtde = 0, or std(af − be) = 0, implying that (a,b) ~ (e,f).

We can thus use these equivalence classes to define the fraction: is the equivalence class containing (a,b).

Now we make this set into a ring. First, we define addition to be and multiplication to be . The additive identity is and the additive inverse is . The multiplicative identity is simply .
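
For R = Z and S = Z \ {0} this construction recovers the rational numbers; in an integral domain the relation simplifies to (a,b) ~ (c,d) exactly when ad = bc, and the operations can be coded directly. A sketch (the class name Frac is illustrative only):

class Frac:
    # a fraction a/b over Z, stored as an unreduced pair (a, b) with b != 0
    def __init__(self, a, b):
        assert b != 0
        self.a, self.b = a, b

    def __eq__(self, other):   # (a,b) ~ (c,d)  iff  ad = bc  (integral domain case)
        return self.a * other.b == other.a * self.b

    def __add__(self, other):  # a/b + c/d = (ad + cb)/(bd)
        return Frac(self.a * other.b + other.a * self.b, self.b * other.b)

    def __mul__(self, other):  # (a/b)(c/d) = (ac)/(bd)
        return Frac(self.a * other.a, self.b * other.b)

assert Frac(1, 2) + Frac(1, 3) == Frac(5, 6)
assert Frac(2, 4) == Frac(1, 2)  # equivalent pairs represent the same fraction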

Now we prove below that it is indeed a ring:

Theorem

[edit | edit source]

The set of fractions with addition and multiplication as defined is a commutative ring, and if R is an integral domain, then the ring of fractions is an integral domain as well. If, additionally, S is R \ {0}, then the set of fractions is a field.

Proof

[edit | edit source]

First, we note that

  1. and therefore

, from which it follows that is a group.

It is abelian because of the definition of the sum and because addition in R is commutative.

Furthermore, is a monoid because

  1. and , where two (not difficult) intermediate steps are left to the reader.

And, also the distributive laws hold, because

and

, which shows that we have indeed found a ring.

The ring is commutative because of the definition of the product and because R is commutative.

Let now R be an integral domain, and let . Then, because of and since , . But since R was assumed to be an integral domain, and since , the last statement is exactly equivalent to , which is in turn equivalent to , and this is equivalent to , which shows that the fraction set is an integral domain if R is one.

Let's assume now that S = R \ {0}, and let , where the last equivalence is due to (*) and , where the last equivalence is in turn due to the fact that R is an integral domain and S does not contain zero. Then due to the fact that R is an integral domain, and therefore since S = R \ {0}, and . But since , we have and and therefore, by noting that we have assumed R to be commutative, we have that every element of R\{0} is invertible.

From this it follows that the set of fractions is indeed a field, because we have already checked all the field axioms. QED.



Polynomial Rings

The degree of a polynomial is defined to be . If is a field, and and are polynomials of , then we can divide by to get . However, we can also do this for any arbitrary ring if the leading coefficient of is 1.

Take the real numbers R for the ring and adjoin two indeterminates X and Y. The free algebra R<X,Y> over R is the collection of sums and products involving X, Y, and real numbers. The polynomial ring R[X,Y] is this algebra reduced by XY = YX, commutativity of the two indeterminates. In terms of quotient rings, with ideal generated by XY − YX, R[X,Y] = R<X,Y>/(XY − YX). The polynomial ring is central to commutative algebra, as in the discussion of bibinarions and tessarines to follow.

In non-commutative algebra, the anti-commutative property XY=−YX is illustrated in the quaternions. So-called "imaginary units" correspond to irreducible binomials XX+1 and YY+1. The elements of the quotient algebra R<X,Y>/(XY+YX, XX+1, YY+1) multiply as quaternions.

Exercises: What are the common names of these quotients?

  • Polynomial quotient R[X]/(XX−1)
  • Free algebra quotient R<X,Y>/(XY+YX, XX+1, YY−1).

Application: Bibinarions & Tessarines

[edit | edit source]

Commutative four-dimensional hypercomplex number systems were put forward in the nineteenth century. James Cockle presented the tessarines and Corrado Segre the bicomplex numbers, called Bibinarions in the study of Associative Composition Algebra. The isomorphism of these systems can be demonstrated through quotients of polynomial rings:

Consider the polynomial ring R[X,Y], where XY = YX. The ideal then provides a quotient ring representing tessarines. In this quotient ring approach, elements of the tessarines correspond to cosets with respect to the ideal A. Similarly, the ideal produces a quotient representing bicomplex numbers.

A generalization of this approach uses the free algebra R⟨X,Y⟩ in two non-commuting indeterminates X and Y. Consider these three second degree polynomials . Let A be the ideal generated by them. Then the quotient ring R⟨X,Y⟩/A is isomorphic to the ring of tessarines.

To see that  note that

so that
But then
as required.

Now consider the alternative ideal B generated by . In this case one can prove . The ring isomorphism R⟨X,Y⟩/A ≅ R⟨X,Y⟩/B involves a change of basis exchanging .

Alternatively, suppose the field C of ordinary complex numbers is presumed given, and C[X] is the ring of polynomials in X with complex coefficients. Then the quotient C[X]/(X² + 1) is another presentation of the bicomplex numbers.



Modules

Motivation

[edit | edit source]

Let G be an abelian group under addition. We can define a sort of multiplication on G by elements of by writing for and . We can extend this to the case where n is negative by writing . We would, however, like to be able to define a sort of multiplication of a group by an arbitrary ring.
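
This integer scalar multiplication is nothing more than iterated addition, and it can be written generically for any abelian group given by its operations. A sketch (the function names are ours), tried on the additive group Z_5:

def z_scalar(n, g, add, zero, neg):
    # n*g in an abelian group, computed by repeated addition; n may be negative
    if n < 0:
        return z_scalar(-n, neg(g), add, zero, neg)
    total = zero
    for _ in range(n):
        total = add(total, g)
    return total

m = 5  # the group Z_5 under addition mod 5
print(z_scalar(7, 2, lambda a, b: (a + b) % m, 0, lambda a: (-a) % m))  # 4, since 14 = 4 (mod 5)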

Definition

[edit | edit source]
Definition 1 (Module)
Let R be a ring and M an abelian group. We call M a left R-module if there is a function , called a scalar multiplication, satisfying
  1. ,
  2. , and
for all .
We call R the ring of scalars of M.

Note: We can also define a right R-module analogously by using a function . In particular the third property then reads:

Note that the two notions coincide if R is a commutative ring, and in this case we can simply say that M is an R-module.

Definition 2: Given any ring R, we can define its opposite ring, , having the same elements and addition operation as R, but with the opposite multiplication. Their multiplication rules are related by . In contrast to group theory, there is no reason in general for a ring to be isomorphic to its opposite ring.

The observant reader will have noticed that the scalar multiplication in a left R-module M is simply a ring homomorphism such that for all . We leave it as an exercise to verify that the scalar multiplication in a right R-module is a ring homomorphism . Thus a right R-module is simply a left R^op-module. As a consequence, all the results we will formulate for left R-modules are automatically true for right R-modules as well. There is no assumption that the module is unital, namely that 1m = m for all m in M.

Examples of Modules

[edit | edit source]
  1. Any ring R is trivially an R-module over itself. More interestingly, any left ideal I of R is also a left R-module with the obvious scalar multiplication. In addition, if I is a two-sided ideal of R, then the quotient ring is an R-module with the induced scalar multiplication .
  2. If R is a ring, then the set of matrices with entries in R is an R-module under componentwise addition and scalar multiplication. More generally, for any set X, the set of functions from X to R, with or without finite support, is an R-module in an obvious way.
  3. The k-modules over a field k are simply the k-vector spaces.
  4. As was shown in the introduction of this chapter, any abelian group is a -module in a natural way. ("Natural" here has a rigorous mathematical meaning which will be explained later.)
  5. Let S be a subring of a ring R. Then R is an S-module in a natural way. We can extend this as follows. Let S,R be rings and a ring homomorphism. Then R is an S-module with scalar multiplication and for all .
  6. Any matrix ring of a ring R is an R-module under componentwise scalar multiplication.
  7. If S is a subring of a ring R, then any left R-module is also a left S-module with the restricted scalar multiplication. We will treat this more generally later.

Submodules

[edit | edit source]

Definition 3: (Submodule)

Given a left -module a submodule of is a subset satisfying
  1. N is a subgroup of M, and
  2. for all and all we have .

The second condition above states that submodules are closed under left multiplication by elements of ; it is implicit that they inherit their multiplication from their containing module; must be the restriction of .

Example 4: Any module M is a submodule of itself, called the improper submodule, and the zero submodule consisting only of the additive identity of M, called the trivial submodule.

Example 5: A left ideal I is a submodule of R viewed as an S-module, where S is any (not necessarily proper) subring of R.

Lemma 6: Let M be a left R-module. Then the following are equivalent.

i) N is a submodule of M
ii) If and for all , then .
iii) If and , then .

Proof: i) => iii): and are in by the second property, then by the first property of Definition 3.

iii) => ii): Follows by induction on .

ii) => i): Let , ; then for arbitrary we have , proving is a subgroup. Now let ; then for arbitrary , , proving property 2 in Definition 3.

The lemma gives an alternative characterisation of submodules as those subsets closed under linear combinations of elements.

Analogously to the case of vector spaces, we have ways of creating new submodules from old ones. The rest of this subsection will be concerned with this.

Lemma 7: Let M be a left R-module, and let N and L be submodules of M. Then is a submodule of M, and it is the largest submodule contained in both N and L.

Proof: Let and . Then and since N and L are submodules, so and is a submodule of M. Now, assume that S is a submodule of M contained in N and L. Then any must be in both N and L and therefore in their intersection, so , proving the lemma.

Now, as the reader should expect at this point, given submodules N and L of M, the union is in general not a submodule. In fact, we have the following lemma:

Lemma 8: Let M be a left R-module and let N and L be submodules. Then is a submodule if and only if or .

Proof: The left implication is obvious. For the right implication, assume is a submodule of M. Then if and , then , which implies that or . Assume without loss of generality that . Then, since N is a submodule, we must have , proving .

Definition 9: Let M be a left R-module, and let be submodules for . Then define their sum, .

Definition 9 has a straightforward extension to sums over arbitrary index sets. This definition is left for the reader to state. We will only need the finite case in this chapter.

Lemma 10: Let M be a left R-module and let N and L be submodules. Then is a submodule of M, and it is the smallest submodule containing both N and L.

Proof: It is straightforward to see that is a submodule. To see that it is the smallest submodule containing both N and L, let S be a submodule containing both N and L. Then for any and , we must have . But this is the same as saying that , proving the lemma.

With Lemma 7 and Lemma 10 established, we can state the main result of this subsection.

Definition 11: Let M be a left R-module. Then let be the set of submodules ordered by set inclusion.

Lemma 12: Let M be a left R-module. Then forms a lattice, the join of being given by and their meet by .

Proof: Most of the work is already done. All that remains is to check associativity, the absorption axioms, and the idempotency axioms. Associativity is trivially satisfied: and for all . As for absorption, we have to check and for all , but this is also trivially true. Lastly, we obviously have and for all , so we are done.

Corollary 13: Let M be a left R-module. Then is a modular lattice.

Note: Recall that is modular if and only if whenever such that , we have .

Proof: Let such that . Since , we have for some , such that . Thus and . On the other hand, we have and , so .

Definition 14: Let M be a left R-module. A submodule N is called maximal if whenever L is a submodule satisfying , then or .

Theorem 15: Every proper submodule of a finitely generated left R-module is contained in a maximal submodule.

Proof: Let N be a proper submodule, and let . Then S is a poset under set inclusion. Let be a chain in S, and note that is a submodule containing each , so that U is an upper bound for the chain. Since each chain in S has an upper bound, by Zorn's Lemma S has a maximal element, P say. P is obviously a submodule containing N. By the definition of S, P is also a maximal submodule of M, proving the theorem.

Generating Modules

[edit | edit source]

Given a subset of a left -module , we define the left submodule generated by to be the smallest submodule (w.r.t. set containment) of that contains . It is denoted by for a reason which will become clear in a moment.

The existence of such a submodule comes from the fact that an intersection of -modules is again an -module: Consider the set of all submodules of containing . Since contains , we see that is non-empty. The intersection of the modules in clearly contains and is a submodule of . Further, any submodule of containing also contains the intersection. Thus .

Assuming that is unitary, the elements of have a simple description:

.

That is, every element of can be written as a finite left linear combination of elements of . This equality can be justified by double inclusion: First, any submodule containing must contain all left -linear combinations of elements of since modules are closed under addition and left multiplication by elements of . Thus, . Secondly, the set of all such linear combinations forms a submodule of containing (use and ) and hence it contains .

Generating Submodules by Ideals

[edit | edit source]

Consider any ring , left ideal , and left -module . One can think of as a subring of (non-unitary when ) and hence is an -module using the regular multiplication by elements of .

If we consider the set we obtain a submodule of . This follows from our discussion of generated submodules. However, since is not unitary, it is not necessary that .

Thus, we may consider the quotient module . Clearly this is an -module, but it is also an -module under the obvious action.

Proposition
Given an -module and ideal of , the module is an -module with multiplication .
Proof:
To show that this is well defined, we observe that if then and hence
since . Thus,
which proves that the action of on is well defined. It follows now that is an -module simply because it is an -module.

Quotient Modules

[edit | edit source]

Recall that any subgroup of an abelian group allows one to construct an equivalence relation; for ,

.

Cosets of , equivalence classes under the relation above, can then be endowed with a group structure, derived from the original group, and is given the name M/N. The sum of two cosets and is simply .

Lemma 16 Let M be a left R-module and N be a submodule. Then M/N, defined above, is a left R-module.

Proof: M/N is obviously an abelian group, so we just have to check that it has a well-defined R-action. Let and . Then we define . The distributivity and associativity properties of the action are inherited from M, so we just need well-definedness. Let with . Then since N is a submodule, and we are done.

Module Homomorphisms

[edit | edit source]

Like all algebraic structures, we can define maps between modules that preserve their algebraic operations.

Definition (Module Homomorphism)
An -module homomorphism is a function from to satisfying
  1. (it is a group homomorphism), and

When a map between two algebraic structures satisfies these two properties, it is called an -linear map.

Definition (Kernel, Image)
Given a module homomorphism the kernel of is the set
and the image of is the set
.

The kernel of is the set of elements in the domain that are sent to zero by . In fact, the kernel of any module homomorphism is a submodule of . It is clearly a subgroup, from group theory, and it is also closed under multiplication by elements of : for .

Similarly, one can show that the image of is a submodule of .


Projective line

Projective Line over a Ring

For a ring A, let be the division of the ring into units U and non-units N, so that

Pairs of ring elements a and b are found in A × A. Another pair c and d is related to the first pair when there is a unit u such that ua = c and ub = d. Using the group properties of U, one can show that this relation is an equivalence relation. The equivalence classes of this relation are the points of the projective line, provided that the pair a, b generates the improper ideal, A itself.

where [a:b] denotes the equivalence class of (a,b).

Note that when ab = 1, then [a : 1] = [1 : b], so for elements of U, exchanging coordinates produces the multiplicative inverse in the opposite component.

The projective line receives two embeddings of A: z → [z : 1] and z → [1 : z]. On embedded U, the exchange in P(A) involves multiplicative inverse, while on embedded N the exchange brings up the identical non-unit in the opposite embedding.
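
For a small finite ring the whole construction can be enumerated. The sketch below lists P(Z_6): it keeps the pairs (a, b) that generate the improper ideal (some combination xa + yb equals 1) and identifies pairs that differ by a unit factor:

n = 6
ring = range(n)
units = [u for u in ring if any((u * v) % n == 1 for v in ring)]  # units of Z_6: 1, 5

def admissible(a, b):
    # (a, b) defines a point iff the ideal it generates is all of Z_6
    return any((x * a + y * b) % n == 1 for x in ring for y in ring)

points = set()
for a in ring:
    for b in ring:
        if admissible(a, b):
            # canonical representative of the orbit under unit scaling
            points.add(min(((u * a) % n, (u * b) % n) for u in units))
print(len(points))  # 12 points on the projective line over Z_6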

  • Lemma: m + n in U implies mn in U.
proof: am + bn = 1 = am + (−b)(−n) implies mn is in U.

When A is a commutative ring there is a relation holding for certain pairs p and q in P(A):

  • Definition: Points p = [a:b] and q = [c:d] are point parallel when ad − bc is in N, the non-units. (Benz, page 84; note the formula error there.)

This relation is always reflexive and symmetric.

  • Exercise: Show that in case A is a field, then is the equality relation.

For to be an equivalence, transitivity can be demonstrated for rings which have a unique maximal ideal (known as local rings).

Proof: For instance, with m, n in N, [n:1] and [m:1] are parallel, but [n:1] and [1:m] are not. In A, if the principal ideals (m) and (n) are always the same, then A is a local ring, and vice versa. In this case the parallel relation is transitive.

The ring of dual numbers is an example of a local ring.

Homographies


Definition: For a ring A, M(2,A) represents the 2x2 matrices with entries from A. Using the operations of A, and matrix addition and multiplication, M(2,A) is itself a ring.

The exchange is an example of a homography on P(A) and can be represented by a matrix in M(2,A). The action of matrices in M(2,A) represents transformations of P(A). Row pairs on the left and elements of M(2,A) on the right are the two factors in a multiplicative transformation. When the determinant of such a matrix is a unit in the ring, then the matrix has an inverse in M(2,A).

For example, one such matrix has determinant pq. When p and q are in N, and p + q is in N also, then the matrix is singular (has no inverse) and p and q are point-parallel. If the above determinant is a unit, then the matrix is in a homography group on P(A). Looking at the first embedding, it maps [p:1] to zero and [q:1] to infinity. According to Walter Benz, p and q are point-parallel when there is no homography connecting them by a "chain", that is, a projective line P(Q) where A is an algebra over Q. The homography moves p and q to two points common to all projective lines.

The operations of the original ring A are represented by elements of M(2,A) acting on an embedding. With row pairs acting on the left as above, multiplication by a unit u corresponds to the matrix with rows ((u, 0), (0, 1)) and addition of t corresponds to the matrix with rows ((1, 0), (t, 1)). In addition, M(2,A) contains matrices corresponding to addition in the second embedding, or "translation at infinity" with respect to the first embedding.


Fields

Fields and Homomorphisms


Definition

Definition (Field)

A field F is a commutative unital ring such that every non-zero element has a multiplicative inverse. In other words, for every a ∈ F with a ≠ 0 there exists some b ∈ F such that ab = 1.

Essentially, a field is a commutative division ring.

Examples

  1. Q, R and C (rational, real and complex numbers) with the standard + and × operations have field structure. These are examples with infinite cardinality.
  2. Z/pZ, the integers modulo p where p is a prime, with + and × taken mod p, is a family of finite fields.
  3. If F is a field, then F(x), the set of rational functions (i.e. quotients of polynomials) with coefficients in F, also forms a field.
  4. A non-example is Z/nZ where n is not prime. For example, 2 in Z/4Z has no multiplicative inverse, hence Z/4Z is not a field (see the sketch below).
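A brute-force check of examples 2 and 4 above can be carried out in a few lines of Python; the helper is_field is our own and assumes n ≥ 2:

  # Sketch: decide whether Z/nZ is a field by searching for a
  # multiplicative inverse of every non-zero element.
  def is_field(n):
      return all(any((a * b) % n == 1 for b in range(1, n))
                 for a in range(1, n))

  print(is_field(5))   # True: 5 is prime
  print(is_field(4))   # False: 2 has no inverse mod 4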

Homomorphisms

Definition (Field Homomorphism)

If F and K are fields, then a map φ : F → K is a field homomorphism if it satisfies
φ(a + b) = φ(a) + φ(b), φ(ab) = φ(a)φ(b) and φ(1) = 1.

Therefore a field homomorphism is exactly a unital ring homomorphism.

Lemma 4.1.1

Every field homomorphism is injective.

Proof. This is a simple consequence of the ideal structure of fields. Suppose φ : F → K is a field homomorphism. In particular it is a ring homomorphism, so we know that ker φ is an ideal of F. Since F is a field, it only has the trivial ideals, so ker φ = {0} or ker φ = F. We can eliminate the second case since φ(1) = 1, so the map cannot be trivial. Therefore we are in the first case, which means exactly that φ is injective.

The above lemma means that every field homomorphism can also be thought of as an embedding of fields.

As happens so often in mathematics, a map between objects induces further maps between related objects. For example, a continuous map between topological spaces induces a map between the sets of closed curves on the spaces, and a linear map between vector spaces induces a linear map between the dual spaces (albeit in the opposite direction). In this case, a homomorphism between fields induces a homomorphism between the corresponding rings of polynomials. To be precise, suppose φ : F → K is a field homomorphism. This induces a map φ* : F[x] → K[x] given by applying φ to each coefficient.

It is easy to see that φ* is a (unital) ring homomorphism. Moreover, if φ is an isomorphism then so is φ*.

Characteristic of Fields


An important property of fields is their characteristic. We first need to consider the canonical homomorphism from Z into a field F. Of course this is defined by mapping the unit to the unit. Since Z is generated by 1, this is sufficient to define the entire homomorphism. From the First Isomorphism Theorem, we know that the image is isomorphic to Z modulo the kernel. In particular, this means the image is a subring of the field F, so it is an integral domain. Hence the kernel is a prime ideal of Z. There is a unique non-negative integer generating this ideal. We call this integer the characteristic of F. Notice by the above argument that the characteristic must be prime if it is non-zero.

Intuitively, the characteristic of a field F is the smallest positive integer n, if one exists, such that 1 + 1 + ⋯ + 1 (n times) equals 0. If no such positive integer exists, then F has characteristic 0. So for example, the fields Z/pZ all have characteristic p, while Q, R and C have characteristic 0.

Sometimes, one calls the smallest subfield containing the image of the above canonical homomorphism the prime subfield of F. Hence the prime subfield of a finite field is (isomorphic to) Z/pZ (where p is the characteristic of F) and the prime subfield of a field of characteristic 0 is (isomorphic to) Q.
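The intuitive description of the characteristic suggests a direct computation; the following sketch (our own helper, for the rings Z/nZ with n ≥ 2) counts how many times 1 must be added to itself to reach 0:

  # Sketch: find the characteristic of Z/nZ by repeatedly adding 1.
  def characteristic(n):
      s, k = 1 % n, 1
      while s != 0:
          s, k = (s + 1) % n, k + 1
      return k

  print(characteristic(7))   # 7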

Field Extensions

Definition (Field Extensions)

Let F and K be fields. If F ⊆ K, or more generally if there is an embedding from F into K, then K is a field extension of F.

Let K be an extension of F. Since we can scale elements of K by elements of F via the multiplication on K, K forms a vector space over F (one can verify all the axioms for vector spaces hold). The dimension of this vector space is the degree of the extension, written [K : F]. If the degree is finite, then K is a finite extension of F, and K is of degree [K : F] over F.

Examples

  • The complex numbers C are a field extension of the real numbers R. The extension is of degree 2.
  • Similarly, one can add the imaginary number i to the field of rational numbers Q to form Q(i), the field of Gaussian rationals. This is also a degree 2 extension (see the sketch after this list).
  • The real numbers form a field extension over Q, but this is not a finite extension, since the real numbers do not form a finite dimensional (or even a countably infinite dimensional) vector space over Q.
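To see the vector space structure concretely, here is a sketch of arithmetic in the Gaussian rationals Q(i), stored in coordinates with respect to the basis {1, i}; the helpers add and mul are our own encoding:

  # Sketch: Q(i) as a 2-dimensional Q-vector space, with an element
  # a + b*i stored as the coordinate pair (a, b).
  from fractions import Fraction as Q

  def add(x, y):
      return (x[0] + y[0], x[1] + y[1])

  def mul(x, y):   # (a + bi)(c + di) = (ac - bd) + (ad + bc)i
      a, b = x
      c, d = y
      return (a * c - b * d, a * d + b * c)

  x = (Q(1, 2), Q(3))          # the element 1/2 + 3i
  print(mul(x, (Q(0), Q(1))))  # multiply by i: (-3, 1/2), i.e. -3 + i/2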

Algebraic Extensions

Definition (Algebraic Extensions)

Let K be an extension of F. Then α ∈ K is algebraic over F if there exists a non-zero polynomial p ∈ F[x] such that p(α) = 0. K is an algebraic extension of F if K is an extension of F such that every element of K is algebraic over F.

For example, C is an algebraic extension over R (if a + bi is any element of C then it is a root of x² − 2ax + a² + b²), but R is not algebraic over Q because, for example, π is not the root of any rational polynomial (this is a very difficult statement to prove).

Definition (Minimal Polynomial)

If α is algebraic over F then the set of polynomials in F[x] which have α as a root is an ideal of F[x]. F[x] is a principal ideal domain, and so this ideal is generated by a unique monic non-zero polynomial m_α. We define m_α to be the minimal polynomial of α.

For example, the minimal polynomial of √2 is x² − 2 and the minimal polynomial of ∛2 is x³ − 2, both over Q. Note the minimal polynomial is heavily reliant on the field it is being viewed over. The minimal polynomial of √2 over R is simply x − √2.
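If SymPy is available, minimal polynomials over Q can be computed directly; the calls below assume the top-level function sympy.minimal_polynomial behaves as in recent SymPy releases:

  # Sketch: computing minimal polynomials over Q with SymPy.
  from sympy import sqrt, Rational, minimal_polynomial
  from sympy.abc import x

  print(minimal_polynomial(sqrt(2), x))            # x**2 - 2
  print(minimal_polynomial(2**Rational(1, 3), x))  # x**3 - 2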

Splitting Fields


Our primary goal in this study is to find the roots of a given polynomial. The brilliant insight of Galois and Galois theory is to (try to) answer this question by looking at field extensions. The following two lemmas might help motivate this reasoning.

Lemma 4.1.2

Suppose F is a field and p ∈ F[x] is a polynomial. Then there exists a (finite) field extension K of F such that K contains a root of p.

Proof. Suppose first that p is irreducible. Then we can take K = F[x]/(p). We know that K is indeed a field because p is irreducible. Moreover it contains an isomorphic copy of F as the (equivalence classes of the) constant polynomials. Finally x̄, the equivalence class of the linear polynomial x, is a root of p, since p(x̄) is the class of p itself, which is 0 in the quotient. Moreover, the degree of K over F is exactly the degree of the polynomial p (which hopefully motivates the terminology). This is due to the division algorithm. Suppose g is any polynomial in F[x]. Then we know by the division algorithm that there exist unique polynomials q and r such that g = qp + r, where deg r < deg p. In particular, this means every equivalence class contains a unique representative whose degree is less than deg p. Therefore K is spanned by 1, x̄, …, x̄^(d−1), where d = deg p. If p is not irreducible, then it can be written as a product of irreducibles, and applying the above process to any of these produces an extension which contains a root of at least one of these irreducible polynomials and hence contains a root of p.

We know x² + 1 is irreducible over R, therefore R[x]/(x² + 1) is a field, and one can verify that this field is isomorphic to C. In fact, sometimes one defines the complex numbers as this quotient R[x]/(x² + 1).
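The same construction works over finite fields. As a sketch (our own bit-mask encoding), the quotient F2[x]/(x² + x + 1) is a field with four elements in which the class of x really is a root of x² + x + 1:

  # Sketch: arithmetic in F2[x]/(x^2 + x + 1), a field with 4 elements.
  # A polynomial over F2 is encoded as a bit mask: bit k is the x^k term.
  MOD = 0b111                # x^2 + x + 1

  def mul(a, b):             # multiply degree <= 1 classes, then reduce
      p = 0
      for k in range(2):
          if (b >> k) & 1:
              p ^= a << k
      if (p >> 2) & 1:       # reduce the x^2 term via x^2 = x + 1
          p ^= MOD
      return p

  alpha = 0b10               # the class of x
  # alpha is a root: alpha^2 + alpha + 1 = 0 in the quotient
  print(mul(alpha, alpha) ^ alpha ^ 0b01)   # 0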

Lemma 4.1.3

Suppose K is an extension over F. Let p be an irreducible polynomial in F[x] such that K contains α, a root of p. Let F(α) be the smallest subfield of K containing F and α. Then F(α) ≅ F[x]/(p).

Proof. By the smallest subfield containing F and α, we mean the intersection of all subfields of K that contain them. This collection of fields is non-empty since it contains, for instance, K itself, and it is easy to see that the intersection of subfields is again a subfield.

If p is of degree 1, then we are done, since that would mean α ∈ F, so F(α) = F and, by the argument towards the end of Lemma 4.1.2, F[x]/(p) ≅ F. Thus we can assume that deg p ≥ 2.

In order to show the isomorphism, we define a ring homomorphism φ : F[x] → K by φ(f) = f(α). In other words, φ acts on polynomials by simply evaluating them at α. By definition, we know that p ∈ ker φ since p(α) = 0. Since p is irreducible by assumption, it must also then generate the kernel (otherwise it would be a non-trivial multiple of the generator of the kernel). Then by the First Isomorphism Theorem, we know that F[x]/(p) is isomorphic to a subfield of K, namely the image of φ. Notice that the image contains F as the image of the constant polynomials, and it contains α as the image of x. By assumption, F(α) is the smallest subfield containing these two things, so we must have F[x]/(p) ≅ F(α).

The first lemma above tells us that we can always find a field extension containing a root of an irreducible polynomial by modding out by the polynomial. The second lemma tells us that any field extension containing such a root contains a field of this form (up to isomorphism). Thus we will spend considerable time looking at the ring of polynomials over a field and studying its quotients.

One often thinks of F(α) as 'adjoining' the root α to the field F. Roughly speaking, we add α to the field F and then close it under the field operations by also adding in all the possible sums, products, inverses, etc., together with the further condition that α satisfies the given polynomial. In fact, this is precisely what the construction in the previous lemma does.

An important consequence of Lemma 4.1.3 is that the roots of an irreducible polynomial are algebraically indistinguishable (this is made precise in Theorem 4.1.4 and in particular by its Corollary 4.1.5). For example, we know that √2 and −√2 are both solutions of x² − 2. There is no algebraic distinction between the two roots; to differentiate them we need topological information like the fact that √2 > 0 and −√2 < 0. Similarly, i and −i are both solutions to x² + 1. Interchanging these roots is exactly what leads to complex conjugation. The fact that the roots of (irreducible) polynomials are all equivalent to one another is one of the key ideas of Galois Theory.

Theorem 4.1.4

Let F and F′ be fields and p ∈ F[x] an irreducible polynomial. Let φ : F → F′ be an isomorphism of fields. Let p′ be the polynomial φ*(p). Let α be a root of p (in some extension of F) and let α′ be a root of p′ (in some extension of F′). Then there exists an isomorphism F(α) → F′(α′) that agrees with φ on F.

Proof. Since φ is an isomorphism and p is irreducible, p′ must also be irreducible (since if we had p′ = f′g′ then p = φ*⁻¹(f′)φ*⁻¹(g′), which would contradict the irreducibility of p). Then p and p′ generate maximal ideals in their respective rings, and the ring isomorphism φ* descends to an isomorphism (of fields) of the quotients F[x]/(p) → F′[x]/(p′). We know by the previous lemma that the domain is isomorphic to F(α) and the codomain is isomorphic to F′(α′), and this map agrees with φ on F by construction.

Corollary 4.1.5

Let F be a field and p ∈ F[x] an irreducible polynomial. Suppose α, β are roots of p in some (potentially different) extensions of F. Then F(α) ≅ F(β).

Proof. Apply the previous theorem to the case with F = F′ and the identity map as φ.

Definition (Splitting Field)

Let F be a field, p ∈ F[x] a polynomial, and let α₁, …, αₖ be the roots of p (in some extension). Then the smallest field extension of F which contains α₁, …, αₖ is called a splitting field of p over F. In other words, no proper subfield of the splitting field contains F and all of the roots α₁, …, αₖ.

Existence and Uniqueness of Splitting Fields


We will see that rather than looking at arbitrary field extensions, splitting fields will be the things to consider. First we need to know that they always exist.

Theorem 4.1.6

Let F be a field and p ∈ F[x] a polynomial. Then there exists a field extension of F that is a splitting field of p.

Proof. This is a largely uninteresting case of proof by induction. We will induct on the degree of p. If p is linear, then clearly its root (in fact just the one root) is contained in F, so F itself is a splitting field. Suppose deg p = n ≥ 2. If p splits into the product of linear terms, then again all the roots are contained in F, so we already have a splitting field. So suppose p has an irreducible factor f of degree at least 2. Then there exists a field extension F₁ of F containing a root α of f. Then in F₁[x], we can factorise the polynomial into p = (x − α)q, where q is a polynomial of degree n − 1. Then by induction there exists a field extension E of F₁ that is a splitting field of q. Therefore E is a field extension of F that contains all the roots of p. Taking the intersection of all subfields of E containing F and the roots of p gives us a splitting field of p.


Above we were careful to say a splitting field of . In fact, this was an unnecessary precaution since the splitting field of a polynomial is unique up to isomorphism. This follows from a generalisation of Theorem 4.1.4, where we claim the statement of the theorem holds even if we adjoin all the roots of the polynomial, instead of just one.

Theorem 4.1.7

Let F and F′ be fields and p ∈ F[x] a polynomial. Let φ : F → F′ be an isomorphism of fields. Let p′ be the polynomial φ*(p). Let E be a splitting field of p and E′ a splitting field of p′. Then there exists an isomorphism E → E′ that agrees with φ on F.

Proof. This is once again a proof by induction on the degree of p. If p is of degree 1, or indeed splits into factors of degree 1, then the splitting field of p is F itself, so we can take the isomorphism to be φ. Thus suppose p has an irreducible factor q of degree at least 2, so q′ = φ*(q) is an irreducible factor of p′. Then by the previous theorem we know φ extends to an isomorphism F(α) → F′(α′), where α is a root of q and α′ is a root of q′. Therefore over F(α) and F′(α′) respectively we can write p = (x − α)g and p′ = (x − α′)g′. Notice that E is a splitting field of g over F(α). Indeed if a splitting field was strictly contained within E, then it would contain all the roots of g and α, and hence would contain all the roots of p. But this would contradict E being a splitting field of p. Of course the same holds true for E′ over F′(α′). Since g and g′ have degree strictly less than p, by induction we can assume that the statement of the theorem holds for them. In particular, the isomorphism F(α) → F′(α′) extends to an isomorphism E → E′. But since the intermediate map was an extension of φ, the final map must also be an extension of φ, concluding the proof.

Corollary 4.1.8

Let F be a field and p ∈ F[x] a polynomial. If E, E′ are splitting fields of p, then they are isomorphic.

Proof. Apply Theorem 4.1.7 to the case with F = F′ and the identity map as φ.

Classification of Finite Fields

Theorem 4.1.9

If F is a finite field, then |F| = p^n for some prime p and natural number n.

Proof. Since F is a finite field, we know its prime subfield is Z/pZ for some prime p. The prime subfield is in particular a subfield of F, and hence F forms a vector space over it. Since F is finite, it must be a finite dimensional vector space, and in particular we must have F ≅ (Z/pZ)^n for some n (as vector spaces), so |F| = p^n.

Theorem (every member of F is a root of x^q − x)


Let F be a field such that |F| = q. Then every member a ∈ F is a root of the polynomial x^q − x.

proof: Consider F \ {0} as a multiplicative group; it has order q − 1. Then by Lagrange's theorem a^(q−1) = 1 for every non-zero a. Multiplying by a gives a^q = a, which is true for all a, including a = 0.

Theorem (roots of x^q − x are distinct)


Consider x^q − x as a polynomial over its prime subfield, and let E be a splitting field; then the q roots of x^q − x in E are distinct.
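Both statements can be checked numerically for a small field; the sketch below verifies them for Z/5Z:

  # Sketch: in the field Z/5Z every element is a root of x^5 - x,
  # and those roots are distinct (they are exactly 0, 1, 2, 3, 4).
  q = 5
  roots = [a for a in range(q) if (a**q - a) % q == 0]
  print(roots == list(range(q)))   # True: q distinct roots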


Factorization

One of the main motivations of this study is to determine the roots of a polynomial over a field. It is obvious that the set of roots of a product of polynomials is just the union of the sets of roots of the factors (and in fact the multiplicities add). Thus a good first step is to determine whether or not the given polynomial is a product of lower degree polynomials.

Recall we say a non-constant polynomial p is reducible if there exist non-constant polynomials f and g such that p = fg. Otherwise, the polynomial is said to be irreducible. It is immediate that linear, i.e. degree 1, polynomials are irreducible. For low degree polynomials, it is easy to determine whether or not they are irreducible.

Lemma 4.2.1

If p is a degree 2 or degree 3 polynomial, then it is reducible if and only if it has a root.

Proof. This simply amounts to noting that if p is of degree at most 3, then any decomposition of the form p = fg with f and g non-constant must have at least one of f or g be linear, and a linear factor produces a root.

Note that the statement does not hold for higher degree polynomials. For example, x^4 + 2x^2 + 1 = (x^2 + 1)^2 has no roots in the rationals, but it is reducible.
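Over a finite field, Lemma 4.2.1 gives an effective irreducibility test for quadratics and cubics by exhaustive root search; the helper has_root below is our own:

  # Sketch: over F_p, a degree-2 or degree-3 polynomial is irreducible
  # exactly when it has no root (Lemma 4.2.1).
  def has_root(coeffs, p):   # coeffs[k] is the coefficient of x^k
      return any(sum(c * a**k for k, c in enumerate(coeffs)) % p == 0
                 for a in range(p))

  print(has_root([1, 1, 0, 1], 2))   # x^3 + x + 1 over F2: False, so irreducible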

A case we are particularly interested in is polynomials over the rationals. A very useful theorem for this is the Rational Root Theorem which is a consequence of Gauss' Lemma.

Theorem 4.2.2 (Gauss' Lemma)

Let f ∈ Z[x] be a primitive polynomial (that is, one whose coefficients share no common factor). Then f is irreducible over Z if and only if it is irreducible over Q.

Proof. First we show that if f is reducible over Q then it must be reducible over Z. Suppose we have f = gh, where g and h are non-constant polynomials in Q[x]. Let n be the lowest common multiple of the denominators of the coefficients on the right hand side. Then nf = g₁h₁ with g₁, h₁ ∈ Z[x]. If n = 1, we are done. So suppose n ≠ 1. We can write n as a product of primes p₁p₂⋯pₖ. Modding out by p₁ we get 0 = ḡ₁h̄₁, where ḡ₁ and h̄₁ are the corresponding polynomials in Z/p₁Z[x] (in other words, we mod each of the coefficients by p₁). Since Z/p₁Z[x] is an integral domain, this means that at least one of the factors is 0; without loss of generality we can assume it is ḡ₁. But this means that all of the coefficients of g₁ are multiples of p₁. Therefore we can cancel p₁ out from both sides of the equation nf = g₁h₁. This leaves k − 1 primes on the left hand side. We can apply the same argument and conclude via induction that f is reducible over Z.

The converse is easy to see, since a decomposition over Z is in particular a decomposition over Q. Since the coefficients don't share a factor, this means that the decomposition is into a product of non-constant polynomials (in particular we avoid cases like 2x + 2 = 2(x + 1), which is a non-trivial decomposition in Z[x] but a trivial one in Q[x], since 2 is only a unit in the latter ring).

In particular this makes it easy to determine when a polynomial over the rationals has a rational root. Given f ∈ Z[x], suppose it is monic. Then if f has a rational root a, we can write f = (x − a)g, where both factors have integer coefficients by Gauss' Lemma. Therefore in particular a is an integer and must be a factor of the constant term. If f is not monic and it has a rational root r/s (written in lowest terms), then we would be able to write

f = (sx − r)g.

In particular s is a factor of the leading coefficient of f and r is a factor of the constant term. This is known as the Rational Root Theorem. By trying all these possibilities, one can immediately determine whether or not there exist rational roots of a given polynomial over the rationals. (Even if the coefficients are rational, one can multiply the polynomial by an integer to obtain a polynomial with integer coefficients and then work with this scaled polynomial, which has the same roots as the original.)
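The Rational Root Theorem translates directly into a finite search. The sketch below (our own helper, assuming integer coefficients with nonzero constant term) enumerates the candidates r/s and keeps those that are actually roots:

  # Sketch: all rational roots of an integer polynomial, found by
  # trying every candidate r/s allowed by the Rational Root Theorem.
  from fractions import Fraction

  def rational_roots(coeffs):          # coeffs[k] = coefficient of x^k
      lead, const = coeffs[-1], coeffs[0]
      divs = lambda m: [d for d in range(1, abs(m) + 1) if abs(m) % d == 0]
      cands = {s * Fraction(r, q)
               for r in divs(const) for q in divs(lead) for s in (1, -1)}
      return sorted(c for c in cands
                    if sum(a * c**k for k, a in enumerate(coeffs)) == 0)

  print(rational_roots([-2, 1, 1]))    # x^2 + x - 2 has roots -2 and 1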

Now that we know trying to reduce polynomials over Q is (essentially) equivalent to reducing them over Z, it's useful to have some irreducibility criteria for the latter case. A very useful result for this is Eisenstein's criterion.

Lemma 4.2.3 (Eisenstein's Criterion)

Let p be a prime in Z and let f = x^n + a_(n−1)x^(n−1) + ⋯ + a₁x + a₀ be a monic polynomial in Z[x]. Suppose p divides all the coefficients a₀, …, a_(n−1), but p² does not divide the constant term a₀. Then f is irreducible over Z and over Q.

Proof. Suppose f were reducible. Then there exist non-constant, monic polynomials g and h such that f = gh. Consider the equation in the quotient Z/pZ[x]. Since p divides every non-leading coefficient of f, we find that

x̄^n = ḡh̄,

where ḡ and h̄ are the respective polynomials modulo p (in other words, the images under the canonical projection map Z[x] → Z/pZ[x]). Since g and h were taken to be monic, we know their reductions ḡ and h̄ are also monic and hence non-constant. In particular then ḡh̄ is a non-trivial decomposition of x̄^n.

By comparing coefficients above, we see that the product ḡh̄ has no constant term. Therefore at least one of ḡ and h̄ has no constant term (this is where we use the fact that p is prime, so that Z/pZ[x] is an integral domain). Suppose only one of them had no constant term. Then the product would contain non-zero terms of degree lower than n, but we know the product is exactly x̄^n. Therefore both ḡ and h̄ have no constant term. But this means the constant terms of g and h are both multiples of p, so in particular their product, the constant term a₀ of f, is a multiple of p², leading to a contradiction.

Example: From Eisenstein's criterion, it is immediate that x^n − p is irreducible over Z, and hence over Q (by Gauss' Lemma), for any prime p and n ≥ 2. This is one way to show that the roots p^(1/n) are all irrational for n ≥ 2.

Example: Here is a more sophisticated example. Consider the polynomial Φ_p(x) = x^(p−1) + x^(p−2) + ⋯ + x + 1 = (x^p − 1)/(x − 1), where p is a prime. We observe that Φ_p(x + 1) = ((x + 1)^p − 1)/x. In particular then Φ_p(x + 1) is a monic polynomial where every non-leading coefficient is a multiple of p (by the binomial theorem) and the constant term is exactly p. Therefore by Eisenstein's criterion Φ_p(x + 1) is irreducible, which in turn means that Φ_p(x) is irreducible.
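For integer polynomials the criterion is mechanical to check at a given prime; the helper eisenstein below is our own and assumes the coefficient list is ordered from the constant term up, with the polynomial monic or at least with leading coefficient prime to p:

  # Sketch: a direct check of Eisenstein's criterion at a prime p.
  def eisenstein(coeffs, p):           # coeffs[k] = coefficient of x^k
      *lower, lead = coeffs
      return (lead % p != 0
              and all(c % p == 0 for c in lower)
              and lower[0] % p**2 != 0)

  print(eisenstein([-2, 0, 0, 0, 0, 1], 2))   # x^5 - 2 at p = 2: True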

Once we have an irreducible polynomial, we know we can't decompose it any further so we need to start working with field extensions and splitting fields.

Exercises

  1. Show that is irreducible over .
  2. Find all irreducible polynomials of degree at most 3 over and .
  3. Show that is irreducible over .
  4. Show that if is irreducible then is prime. Hint: Prove the contrapositive.


Splitting Fields and Algebraic Closures

Splitting Fields


Let F be a field and p(x) be a nonconstant polynomial in F[x]. We already know that we can find a field extension of F that contains a root of p(x). However, we would like to know whether an extension E of F containing all of the roots of p(x) exists. In other words, can we find a field extension of F such that p(x) factors into a product of linear polynomials? What is the "smallest" extension containing all the roots of p(x)?

Let F be a field and p(x) be a nonconstant polynomial in F[x]. An extension field E of F is a splitting field of p(x) if there exist elements α₁, …, αₙ in E such that E = F(α₁, …, αₙ) and

p(x) = (x − α₁)(x − α₂)⋯(x − αₙ)

in E[x].

A polynomial splits in E if it is the product of linear factors in E[x].

Example 1: Let p(x) = x^4 + 2x^2 − 8 be in Q[x]. Then p(x) has irreducible factors x^2 − 2 and x^2 + 4. Therefore, the field Q(√2, i) is a splitting field for p(x).

Example 2: Let p(x) = x^3 − 3 be in Q[x]. Then p(x) has a root in the field Q(∛3). However, this field is not a splitting field for p(x), since the complex cube roots of 3, namely ∛3 ω and ∛3 ω² with ω = (−1 + √3 i)/2, are not in Q(∛3).

Theorem Let p(x) ∈ F[x] be a nonconstant polynomial. Then there exists a splitting field E for p(x).

Proof. We will use mathematical induction on the degree of p(x). If deg p(x) = 1, then p(x) is a linear polynomial and E = F. Assume that the theorem is true for all polynomials of degree k with 1 ≤ k < n and let deg p(x) = n. We can assume that p(x) is irreducible; otherwise, by our induction hypothesis, we are done. There exists a field K such that p(x) has a zero α₁ in K. Hence, p(x) = (x − α₁)q(x), where q(x) ∈ K[x]. Since deg q(x) = n − 1, there exists a splitting field E ⊃ K of q(x) that contains the zeros α₂, …, αₙ of p(x) by our induction hypothesis. Consequently,

E = K(α₂, …, αₙ) = F(α₁, …, αₙ)

is a splitting field of p(x).

The question of uniqueness now arises for splitting fields. This question is answered in the affirmative: given two splitting fields K and L of a polynomial p(x) ∈ F[x], there exists a field isomorphism φ : K → L that preserves F. In order to prove this result, we must first prove a lemma.

Lemma Let φ : E → F be an isomorphism of fields. Let K be an extension field of E and α ∈ K be algebraic over E with minimal polynomial p(x). Suppose that L is an extension field of F such that β is a root of the polynomial q(x) in F[x] obtained from p(x) under the image of φ. Then φ extends to a unique isomorphism ψ : E(α) → F(β) such that ψ(α) = β and ψ agrees with φ on E.

Proof. If p(x) has degree n, then we can write any element in E(α) as a linear combination of 1, α, …, α^(n−1). Therefore, the isomorphism that we are seeking must be

ψ(a₀ + a₁α + ⋯ + a_(n−1)α^(n−1)) = φ(a₀) + φ(a₁)β + ⋯ + φ(a_(n−1))β^(n−1),

where

a₀ + a₁α + ⋯ + a_(n−1)α^(n−1)

is an element in E(α). The fact that ψ is an isomorphism can be checked by direct computation; however, it is easier to observe that ψ is a composition of maps that we already know to be isomorphisms.

We can extend φ to be an isomorphism from E[x] to F[x], which we will also denote by φ, by letting

φ(a₀ + a₁x + ⋯ + aₙx^n) = φ(a₀) + φ(a₁)x + ⋯ + φ(aₙ)x^n.

This extension agrees with the original isomorphism φ : E → F, since constant polynomials get mapped to constant polynomials. By assumption, φ(p(x)) = q(x); hence, φ maps ⟨p(x)⟩ onto ⟨q(x)⟩. Consequently, we have an isomorphism φ̄ : E[x]/⟨p(x)⟩ → F[x]/⟨q(x)⟩. We have isomorphisms σ : E[x]/⟨p(x)⟩ → E(α) and τ : F[x]/⟨q(x)⟩ → F(β), defined by evaluation at α and β, respectively. Therefore, ψ = τφ̄σ⁻¹ is the required isomorphism.

Now write p(x) = (x − α)f(x) and q(x) = (x − β)g(x), where the degrees of f(x) and g(x) are less than the degrees of p(x) and q(x), respectively. The field extension K is a splitting field for f(x) over E(α), and L is a splitting field for g(x) over F(β). By our induction hypothesis there exists an isomorphism σ : K → L such that σ agrees with ψ on E(α). Hence, there exists an isomorphism σ : K → L such that σ agrees with φ on E.

Corollary Let p(x) be a polynomial in F[x]. Then there exists a splitting field K of p(x) that is unique up to isomorphism.

Algebraic Closures


Given a field F, the question arises as to whether or not we can find a field E such that every polynomial p(x) in F[x] has a root in E. This leads us to the following theorem.

Theorem 21.11 Let E be an extension field of F. The set of elements in E that are algebraic over F forms a field.

Proof. Let α, β ∈ E be algebraic over F. Then F(α, β) is a finite extension of F. Since every element of F(α, β) is algebraic over F, α ± β, αβ and α/β (with β ≠ 0) are all algebraic over F. Consequently, the set of elements in E that are algebraic over F forms a field.


Corollary 21.12 The set of all algebraic numbers forms a field; that is, the set of all complex numbers that are algebraic over Q makes up a field.

Let E be a field extension of a field F. We define the algebraic closure of a field F in E to be the field consisting of all elements in E that are algebraic over F. A field F is algebraically closed if every nonconstant polynomial in F[x] has a root in F.

Theorem 21.13 A field F is algebraically closed if and only if every nonconstant polynomial in F[x] factors into linear factors over F[x].

Proof. Let F be an algebraically closed field. If p(x) ∈ F[x] is a nonconstant polynomial, then p(x) has a zero in F, say α₁. Therefore, x − α₁ must be a factor of p(x) and so p(x) = (x − α₁)q₁(x), where deg q₁(x) = deg p(x) − 1. Continue this process with q₁(x) to find a factorization

p(x) = c(x − α₁)(x − α₂)⋯(x − αₙ),

where c ∈ F. The process must eventually stop since the degree of p(x) is finite.

Conversely, suppose that every nonconstant polynomial p(x) in F[x] factors into linear factors. Let x − α be such a factor. Then p(α) = 0. Consequently, F is algebraically closed.


Corollary 21.14 An algebraically closed field F has no proper algebraic extension E.

Proof. Let E be an algebraic extension of F; then F ⊂ E. For α ∈ E, the minimal polynomial of α is x − α. Therefore, α ∈ F and F = E.


Theorem 21.15 Every field F has a unique algebraic closure.

It is a nontrivial fact that every field has a unique algebraic closure. The proof is not extremely difficult, but requires some rather sophisticated set theory. We refer the reader to [3], [4], or [8] for a proof of this result.

We now state the Fundamental Theorem of Algebra, first proven by Gauss at the age of 22 in his doctoral thesis. This theorem states that every polynomial with coefficients in the complex numbers has a root in the complex numbers. The proof of this theorem will be given in Abstract Algebra/Galois Theory.

Theorem 21.16 (Fundamental Theorem of Algebra) The field of complex numbers is algebraically closed.


Vector Spaces

Definition (Vector Space)
Let F be a field. A set V with two operations, + (vector addition) and · (scalar multiplication), is called a Vector Space over F if it has the following properties:
  1. (V, +) forms an abelian group,
  2. a(u + v) = au + av and (a + b)v = av + bv for a, b ∈ F and u, v ∈ V,
  3. (ab)v = a(bv) and 1v = v for a, b ∈ F and v ∈ V.

The scalar multiplication is formally a map F × V → V, written (a, v) ↦ av, where a ∈ F and v ∈ V.

Elements in F are called scalars, while elements in V are called vectors.

Some Properties of Vector Spaces
Proofs:
  1. We want to show that 0v = 0, but 0v = (0 + 0)v = 0v + 0v, and adding −(0v) to both sides gives 0v = 0.
  2. Suppose w, w′ ∈ V are such that v + w = 0 = v + w′; then w = w + 0 = w + (v + w′) = (w + v) + w′ = 0 + w′ = w′, so additive inverses are unique.



Algebras

In this section we will talk about structures with three operations. These are called algebras. We will start by defining an algebra over a field, which is a vector space with a bilinear vector product. After giving some examples, we will then move to a discussion of quivers and their path algebras.

Algebras over a Field


Definition 1: Let F be a field, and let A be an F-vector space on which we define a vector product · : A × A → A. Then A is called an algebra over F provided that (A, +, ·) is a ring, where + is the vector space addition, and if for all a, b, c ∈ A and r ∈ F,

  1. (a + b)c = ac + bc,
  2. a(b + c) = ab + ac, and
  3. r(ab) = (ra)b = a(rb).

The dimension of an algebra is the dimension of A as a vector space.

Remark 2: The appropriate definition of a subalgebra is clear from Definition 1. We leave its formal statement to the reader.

Definition 2: If is a commutative ring, is called a commutative algebra. If it is a division ring, is called a division algebra. We reserve the terms real and complex algebra for algebras over and , respectively.

The reader is invited to check that the following examples really are examples of algebras.

Example 3: Let F be a field. The vector space F^n forms a commutative F-algebra under componentwise multiplication.

Example 4: The quaternions H form a 4-dimensional real algebra. We leave it to the reader to show that it is not a 2-dimensional complex algebra.

Example 5: Given a field F, the vector space of polynomials F[x] is a commutative F-algebra in a natural way.

Example 6: Let F be a field. Then any matrix ring over F, for example the ring of n × n matrices over F, gives rise to an F-algebra in a natural way.

Quivers and Path Algebras


Naively, a quiver can be understood as a directed graph where we allow loops and parallel edges. Formally, we have the following.

Definition 7: A quiver is a collection of four pieces of data, Q = (Q₀, Q₁, s, t), where

  1. Q₀ is the set of vertices of the quiver,
  2. Q₁ is the set of edges, and
  3. s, t : Q₁ → Q₀ are functions associating with each edge a source vertex and a target vertex, respectively.

We will always assume that Q₀ is nonempty and that Q₀ and Q₁ are finite sets.

Example 8: The following are the simplest examples of quivers:

  1. The quiver with one point and no edges, represented by .
  2. The quiver with point and no edges, .
  3. The linear quiver with points, .
  4. The simplest quiver with a nontrivial loop, .

Definition 9: Let Q be a quiver. A path in Q is a sequence of edges p = a₁a₂⋯aₙ where t(aᵢ) = s(aᵢ₊₁) for all i. We extend the domains of s and t and define s(p) = s(a₁) and t(p) = t(aₙ). We define the length of the path to be the number of edges it contains and write ℓ(p) = n. With each vertex i of a quiver we associate the trivial path eᵢ with s(eᵢ) = t(eᵢ) = i and ℓ(eᵢ) = 0. A nontrivial path p with s(p) = t(p) is called an oriented loop at s(p).

The reason quivers are interesting for us is that they provide a concrete way of constructing a certain family of algebras, called path algebras.

Definition 10: Let Q be a quiver and k a field. Let kQ denote the free vector space generated by all the paths of Q. On this vector space, we define a vector product in the obvious way: if p and q are paths with t(p) = s(q), define their product pq by concatenation. If t(p) ≠ s(q), define their product to be 0. This product turns kQ into a k-algebra, called the path algebra of Q.
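A minimal computational sketch of this product on basis paths (our own encoding, storing a path as a (source, edges, target) triple and using None for the zero element):

  # Sketch: the product of basis paths in a path algebra kQ.
  # Two paths compose when the target of the first matches the source
  # of the second; otherwise the product is zero.
  def compose(p, q):
      s1, e1, t1 = p
      s2, e2, t2 = q
      if t1 != s2:
          return None           # the zero element of kQ
      return (s1, e1 + e2, t2)

  a = (1, ("a",), 2)            # an edge a: 1 -> 2
  b = (2, ("b",), 3)            # an edge b: 2 -> 3
  print(compose(a, b))          # (1, ('a', 'b'), 3)
  print(compose(b, a))          # None: target and source do not match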

Lemma 11: Let Q be a quiver and k a field. If Q contains a path of length |Q₀| or greater, then kQ is infinite dimensional.

Proof: By a counting argument, such a path must visit some vertex twice and therefore contains an oriented loop, c say. Evidently {c, c², c³, …} is a linearly independent set, so that kQ is infinite dimensional.

Lemma 12: Let Q be a quiver and k a field. Then kQ is infinite dimensional if and only if Q contains an oriented loop.

Proof: Let c be an oriented loop in Q. Then kQ is infinite dimensional by the above argument. Conversely, assume Q has no oriented loops. Then the vertices of the quiver can be ordered such that edges always go from a lower to a higher vertex, and since the length of any given path is bounded above by |Q₀| − 1, the dimension of kQ is bounded above by the total number of paths, which is finite.

Lemma 13: Let Q be a quiver and k a field. Then the trivial paths {eᵢ} form an orthogonal idempotent set.

Proof: This is immediate from the definitions: eᵢeⱼ = 0 if i ≠ j, and eᵢeᵢ = eᵢ.

Corollary 14: The element Σᵢ eᵢ, the sum of the trivial paths over all vertices i ∈ Q₀, is the identity element in kQ.

Proof: It suffices to show this on the generators of kQ. Let p be a path in Q with s(p) = i and t(p) = j. Then (Σₖ eₖ)p = eᵢp = p. Similarly, p(Σₖ eₖ) = peⱼ = p.

To be covered:

- General R-algebras


Boolean algebra

Boolean Algebra


Boolean algebra is a deductive mathematical system closed over the values zero and one (false and true). A binary operator defined over this set of values accepts two boolean inputs and produces a single boolean output.

For any given algebra system, there are some initial assumptions, or postulates that the system follows. You can deduce additional rules, theorems, and other properties of the system from this basic set of postulates:

  • Closure: The boolean system is closed with respect to a binary operator if for every pair of boolean values it produces a boolean result. For example, logical AND is closed in the boolean system because it accepts only boolean operands and produces only boolean results.
  • Commutativity: A binary operator "#" is said to be commutative if A # B = B # A for all possible boolean values A and B.
  • Associativity: A binary operator "#" is said to be associative if (A # B) # C = A # (B # C) for all boolean values A, B, and C.
  • Distribution: Two binary operators "#" and "%" are distributive if A # (B % C) = (A # B) % (A # C) for all boolean values A, B, and C.
  • Identity: A boolean value I is said to be the identity element with respect to some binary operator "#" if A # I = A
  • Inverse: A boolean value I is said to be the inverse element with respect to some binary operator "#" if A # I = B and B ≠ A (i.e., B is the opposite value of A).

For our purposes, we will base boolean algebra on the following set of operators and values:

  • The two possible values in the boolean system are zero and one. Often we will call these values false and true (respectively).
  • The symbol "∧" represents the logical AND operation (conjunction); e.g., A ∧ B is the result of logically ANDing the boolean values A and B. When using single letter variable names, this text will drop the "∧" symbol; Therefore, AB also represents the logical AND of the variables A and B (we will also call this the product of A and B).
  • The symbol "∨" represents the logical OR operation (disjunction); e.g., A ∨ B is the result of logical ORing the boolean values A and B. (We will also call this the sum of A and B.)
  • Logical complement, negation, or NOT, is a unary operator. This text will use the prime symbol (′) to denote logical negation. For example, A′ denotes the logical NOT of A.
  • If several different operators appear in a single boolean expression, the result of the expression depends on the precedence of the operators. We'll use the following precedences (from highest to lowest) for the boolean operators: parentheses, logical NOT, logical AND, then logical OR. The logical AND and OR operators are left associative. The logical NOT operation is right associative, although it would produce the same result using left or right associativity since it is a unary operator.

We will also use the following set of postulates:

  1. Boolean algebra is closed under the AND, OR, and NOT operations.
  2. The identity element with respect to ∧ is one and ∨ is zero. There is no identity element with respect to logical NOT.
  3. The ∧ and ∨ operators are commutative.
  4. ∧ and ∨ are distributive with respect to one another. That is, A ∧ (B ∨ C) = (A ∧ B) ∨ (A ∧ C) and A ∨ (B ∧ C) = (A ∨ B) ∧ (A ∨ C).
  5. For every value A there exists a value A′ such that A ∧ A′ = 0 and A ∨ A′ = 1. This value is the logical complement (or NOT) of A.
  6. ∧ and ∨ are both associative. That is, (A ∧ B) ∧ C = A ∧ (B ∧ C) and (A ∨ B) ∨ C = A ∨ (B ∨ C).

You can prove all other theorems in boolean algebra using these postulates. This text will not go into the formal proofs of these theorems; however, it is a good idea to familiarize yourself with some important theorems in boolean algebra. A sampling includes:

  1. A ∨ A = A
  2. A ∧ A = A
  3. A ∨ 0 = A
  4. A ∧ 1 = A
  5. A ∧ 0 = 0
  6. A ∨ 1 = 1
  7. (A ∨ B)′ = A′ ∧ B′
  8. (A ∧ B)′ = A′ ∨ B′
  9. A ∨ A ∧ B = A
  10. A ∧ (A ∨ B) = A
  11. A ∨ A′B = A ∨ B
  12. A′ ∧ (A ∨ B′) = A′ B′
  13. AB ∨ AB′ = A
  14. (A′ ∨ B′) ∧ (A′ ∨ B) = A′
  15. A ∨ A′ = 1
  16. A ∧ A′ = 0

Theorems seven and eight above are known as DeMorgan's Theorems after the mathematician who discovered them.
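Since the boolean system has only two values, identities such as DeMorgan's Theorems can be verified by checking every input combination, as this short sketch does:

  # Sketch: verify DeMorgan's Theorems (7 and 8 above) by exhausting
  # all four combinations of boolean inputs.
  from itertools import product

  for A, B in product((0, 1), repeat=2):
      assert (not (A or B)) == ((not A) and (not B))   # theorem 7
      assert (not (A and B)) == ((not A) or (not B))   # theorem 8
  print("both identities hold")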

The theorems above appear in pairs. Each pair (e.g. 1 & 2, 3 & 4, etc.) forms a dual. An important principle in the boolean algebra system is that of duality. Any valid expression you can create using the postulates and theorems of boolean algebra remains valid if you interchange the operators and constants appearing in the expression. Specifically, if you exchange the ∧ and ∨ operators and swap the 0 and 1 values in an expression, you will wind up with an expression that obeys all the rules of boolean algebra. This does not mean the dual expression computes the same values; it only means that both expressions are legal in the boolean algebra system. Therefore, this is an easy way to generate a second theorem for any fact you prove in the boolean algebra system.

Although we will not be proving any theorems for the sake of boolean algebra in this text, we will use these theorems to show that two boolean equations are identical. This is an important operation when attempting to produce canonical representations of a boolean expression or when simplifying a boolean expression.

Boolean Functions and Truth Tables


A boolean expression is a sequence of zeros, ones, and literals separated by boolean operators. A literal is a primed (negated) or unprimed variable name. For our purposes, all variable names will be a single alphabetic character. A boolean function is a specific boolean expression; we will generally give boolean functions the name "F" with a possible subscript. For example, consider the following function:

F0 = AB ∨ C

This function computes the logical AND of A and B and then logically ORs this result with C. If A=1, B=0, and C=1, then F0 returns the value one (1 ∧ 0 ∨ 1 = 1).

Another way to represent a boolean function is via a truth table. The previous chapter used truth tables to represent the AND and OR functions. Those truth tables took the forms:

AND Truth Table

AND 0 1
0 0 0
1 0 1

OR Truth Table

OR 0 1
0 0 1
1 1 1

For binary operators with two input variables, this form of a truth table is very natural and convenient. However, reconsider the boolean function F0 above. That function has three input variables, not two. Therefore, one cannot use the truth table format given above. Fortunately, it is still very easy to construct truth tables for three or more variables. The following example shows one way to do this for functions of three variables:

A B C AB AB ∨ C
0 0 0 0 0
0 0 1 0 1
0 1 0 0 0
0 1 1 0 1
1 0 0 0 0
1 0 1 0 1
1 1 0 1 1
1 1 1 1 1
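The same table can be generated mechanically; this sketch reproduces the rows above using Python's bitwise operators for AND and OR:

  # Sketch: generate the truth table of F0 = AB v C.
  from itertools import product

  print("A B C | AB | AB v C")
  for A, B, C in product((0, 1), repeat=3):
      print(A, B, C, "|", A & B, " |", (A & B) | C)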


Clifford Algebras

In mathematics, Clifford algebras are a type of associative algebra. They can be thought of as one of the possible generalizations of the complex numbers and quaternions. The theory of Clifford algebras is intimately connected with the theory of quadratic forms and orthogonal transformations. Clifford algebras have important applications in a variety of fields including geometry and theoretical physics. They are named for the English geometer William Clifford.

Some familiarity with the basics of multilinear algebra will be useful in reading this section.

Introduction and basic properties


Specifically, a Clifford algebra is a unital associative algebra which contains and is generated by a vector space V equipped with a quadratic form Q. The Clifford algebra Cℓ(V,Q) is the "freest" algebra generated by V subject to the condition

v² = Q(v)1 for all v ∈ V.

If the characteristic of the ground field K is not 2, then one can rewrite this fundamental identity in the form

uv + vu = ⟨u,v⟩ for all u, v ∈ V,

where ⟨u,v⟩ = Q(u + v) − Q(u) − Q(v) is the symmetric bilinear form associated to Q. This idea of "freest" or "most general" algebra subject to this identity can be formally expressed through the notion of a universal property (see below).

Clifford algebras are closely related to exterior algebras. In fact, if Q = 0 then the Clifford algebra Cℓ(V,Q) is just the exterior algebra Λ(V). For nonzero Q there exists a canonical linear isomorphism between Λ(V) and Cℓ(V,Q) whenever the ground field K does not have characteristic two. That is, they are naturally isomorphic as vector spaces, but with different multiplications (in the case of characteristic two, they are still isomorphic as vector spaces, just not naturally). Clifford multiplication is strictly richer than the exterior product since it makes use of the extra information provided by Q. More precisely, they may be thought of as quantizations of the exterior algebra, in the same way that the Weyl algebra is a quantization of the symmetric algebra.

Quadratic forms and Clifford algebras in characteristic 2 form an exceptional case. In particular, if char K = 2 it is not true that a quadratic form is determined by its symmetric bilinear form, or that every quadratic form admits an orthogonal basis. Many of the statements in this article include the condition that the characteristic is not 2, and are false if this condition is removed.

Universal property and construction


Let V be a vector space over a field K, and let Q : V → K be a quadratic form on V. In most cases of interest the field K is either R or C (which have characteristic 0) or a finite field.

A Clifford algebra Cℓ(V,Q) is a unital associative algebra over K together with a linear map i : V → Cℓ(V,Q) defined by the following universal property: Given any associative algebra A over K and any linear map j : V → A such that

j(v)² = Q(v)1 for all v ∈ V

(where 1 denotes the multiplicative identity of A), there is a unique algebra homomorphism f : Cℓ(V,Q) → A such that the following diagram commutes (i.e. such that f ∘ i = j):

Working with a symmetric bilinear form ⟨·,·⟩ instead of Q (in characteristic not 2), the requirement on j is

j(v)j(w) + j(w)j(v) = ⟨v,w⟩ for all v, w ∈ V.

A Clifford algebra as described above always exists and can be constructed as follows: start with the most general algebra that contains V, namely the tensor algebra T(V), and then enforce the fundamental identity by taking a suitable quotient. In our case we want to take the two-sided ideal IQ in T(V) generated by all elements of the form

v ⊗ v − Q(v)1 for all v ∈ V,

and define Cℓ(V,Q) as the quotient

Cℓ(V,Q) = T(V)/IQ.

It is then straightforward to show that Cℓ(V,Q) contains V and satisfies the above universal property, so that Cℓ is unique up to isomorphism; thus one speaks of "the" Clifford algebra Cℓ(V, Q). It also follows from this construction that i is injective. One usually drops the i and considers V as a linear subspace of Cℓ(V,Q).

The universal characterization of the Clifford algebra shows that the construction of Cℓ(V,Q) is functorial in nature. Namely, Cℓ can be considered as a functor from the category of vector spaces with quadratic forms (whose morphisms are linear maps preserving the quadratic form) to the category of associative algebras. The universal property guarantees that linear maps between vector spaces (preserving the quadratic form) extend uniquely to algebra homomorphisms between the associated Clifford algebras.

Basis and dimension


If the dimension of V is n and {e1,…,en} is a basis of V, then the set

{ e_(i1) e_(i2) ⋯ e_(ik) : 1 ≤ i1 < i2 < ⋯ < ik ≤ n and 0 ≤ k ≤ n }

is a basis for Cℓ(V,Q). The empty product (k = 0) is defined as the multiplicative identity element. For each value of k there are n choose k basis elements, so the total dimension of the Clifford algebra is the sum of the binomial coefficients, namely 2^n.

Since V comes equipped with a quadratic form, there is a set of privileged bases for V: the orthogonal ones. An orthogonal basis is one such that

⟨ei, ej⟩ = 0 for i ≠ j,

where ⟨·,·⟩ is the symmetric bilinear form associated to Q. The fundamental Clifford identity implies that for an orthogonal basis

ei ej = −ej ei for i ≠ j, and ei² = Q(ei).

This makes manipulation of orthogonal basis vectors quite simple. Given a product of distinct orthogonal basis vectors, one can put them into standard order by including an overall sign corresponding to the number of flips needed to correctly order them (i.e. the signature of the ordering permutation).
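This sign bookkeeping is easy to mechanize. The sketch below (our own helper blade_mul, assuming an orthogonal basis with ei ei = Q(ei)) multiplies basis blades represented as sorted tuples of indices; with the signature (0,2) it reproduces the quaternion relations that appear in the examples below:

  # Sketch: Clifford product of basis blades for a diagonal form,
  # using e_i e_j = -e_j e_i (i != j) and e_i e_i = Q(e_i).
  # A blade is a sorted tuple of distinct indices; q[i] holds Q(e_i).
  def blade_mul(a, b, q):
      sign, out = 1, list(a)
      for i in b:
          k = len(out)
          while k > 0 and out[k - 1] > i:   # move e_i past larger indices
              k -= 1
              sign = -sign
          if k > 0 and out[k - 1] == i:     # e_i e_i = Q(e_i)
              sign *= q[i]
              out.pop(k - 1)
          else:
              out.insert(k, i)
      return sign, tuple(out)

  q = {1: -1, 2: -1}                        # signature (0,2): quaternion-like
  print(blade_mul((1,), (1,), q))           # e1*e1 = -1 -> (-1, ())
  print(blade_mul((1,), (2,), q))           # e1*e2 = e12 -> (1, (1, 2))
  print(blade_mul((1, 2), (1, 2), q))       # (e1 e2)^2 = -1 -> (-1, ())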

If the characteristic is not 2 then an orthogonal basis for V exists, and one can easily extend the quadratic form on V to a quadratic form on all of Cℓ(V,Q) by requiring that distinct basis products are orthogonal to one another whenever the {ei}'s are orthogonal. Additionally, one sets

Q(e_(i1) e_(i2) ⋯ e_(ik)) = Q(e_(i1)) Q(e_(i2)) ⋯ Q(e_(ik)).

The quadratic form on a scalar is just Q(λ) = λ². Thus, orthogonal bases for V extend to orthogonal bases for Cℓ(V,Q). The quadratic form defined in this way is actually independent of the orthogonal basis chosen (a basis-independent formulation will be given later).

Examples: Real and complex Clifford algebras


The most important Clifford algebras are those over real and complex vector spaces equipped with nondegenerate quadratic forms.

Every nondegenerate quadratic form on a finite-dimensional real vector space is equivalent to the standard diagonal form:

Q(x) = x1² + ⋯ + xp² − x(p+1)² − ⋯ − xn²,

where n = p + q is the dimension of the vector space. The pair of integers (p, q) is called the signature of the quadratic form. The real vector space with this quadratic form is often denoted Rp,q. The Clifford algebra on Rp,q is denoted Cp,q(R). The symbol Cn(R) means either Cn,0(R) or C0,n(R) depending on whether the author prefers positive definite or negative definite spaces.

A standard orthonormal basis {ei} for Rp,q consists of n = p + q mutually orthogonal vectors, p of which have norm +1 and q of which have norm −1. The algebra Cp,q(R) will therefore have p vectors which square to +1 and q vectors which square to −1.

Note that C0,0(R) is naturally isomorphic to R since there are no nonzero vectors. C0,1(R) is a two-dimensional algebra generated by a single vector e1 which squares to −1, and therefore is isomorphic to C, the field of complex numbers. The algebra C0,2(R) is a four-dimensional algebra spanned by {1, e1, e2, e1e2}. The latter three elements square to −1 and all anticommute, and so the algebra is isomorphic to the quaternions H. The next algebra in the sequence, C0,3(R), is an 8-dimensional algebra isomorphic to the direct sum H ⊕ H, called the Clifford biquaternions.

One can also study Clifford algebras on complex vector spaces. Every nondegenerate quadratic form on a complex vector space is equivalent to the standard diagonal form

Q(z) = z1² + z2² + ⋯ + zn²,

where n = dim V, so there is essentially only one Clifford algebra in each dimension. We will denote the Clifford algebra on Cn with the standard quadratic form by Cn(C). One can show that the algebra Cn(C) may be obtained as the complexification of the algebra Cp,q(R), where n = p + q:

Cn(C) = Cp,q(R) ⊗ C.

Here Q is the real quadratic form of signature (p,q). Note that the complexification does not depend on the signature. The first few cases are not hard to compute. One finds that

C0(C) = C
C1(C) = C ⊕ C
C2(C) = M2(C)

where M2(C) denotes the algebra of 2×2 matrices over C.

It turns out that every one of the algebras Cp,q(R) and Cn(C) is isomorphic to a matrix algebra over R, C, or H or to a direct sum of two such algebras. For a complete classification of these algebras see classification of Clifford algebras.

Properties


Relation to the exterior algebra


Given a vector space V one can construct the exterior algebra Λ(V), whose definition is independent of any quadratic form on V. It turns out that if K does not have characteristic 2 then there is a natural isomorphism between Λ(V) and Cℓ(V,Q) considered as vector spaces (and there exists an isomorphism in characteristic two, which may not be natural). This is an algebra isomorphism if and only if Q = 0. One can thus consider the Clifford algebra Cℓ(V,Q) as an enrichment (or more precisely, a quantization, cf. the Introduction) of the exterior algebra on V with a multiplication that depends on Q (one can still define the exterior product independent of Q).

The easiest way to establish the isomorphism is to choose an orthogonal basis {ei} for V and extend it to an orthogonal basis for Cℓ(V,Q) as described above. The map Cℓ(V,Q) → Λ(V) is determined by

e_(i1) e_(i2) ⋯ e_(ik) ↦ e_(i1) ∧ e_(i2) ∧ ⋯ ∧ e_(ik).

Note that this only works if the basis {ei} is orthogonal. One can show that this map is independent of the choice of orthogonal basis and so gives a natural isomorphism.

If the characteristic of K is 0, one can also establish the isomorphism by antisymmetrizing. Define functions fk : V × ⋯ × V → Cℓ(V,Q) by

fk(v1, …, vk) = (1/k!) Σ_σ sgn(σ) v_(σ(1)) v_(σ(2)) ⋯ v_(σ(k)),

where the sum is taken over the symmetric group on k elements. Since fk is alternating it induces a unique linear map Λk(V) → Cℓ(V,Q). The direct sum of these maps gives a linear map between Λ(V) and Cℓ(V,Q). This map can be shown to be a linear isomorphism, and it is natural.

A more sophisticated way to view the relationship is to construct a filtration on Cℓ(V,Q). Recall that the tensor algebra T(V) has a natural filtration: F0 ⊂ F1 ⊂ F2 ⊂ …, where Fk contains sums of tensors with rank ≤ k. Projecting this down to the Clifford algebra gives a filtration on Cℓ(V,Q). The associated graded algebra

Gr Cℓ(V,Q) = ⊕k Fk/F(k−1)

is naturally isomorphic to the exterior algebra Λ(V). Since the associated graded algebra of a filtered algebra is always isomorphic to the filtered algebra as filtered vector spaces (by choosing complements of Fk in F(k+1) for all k), this provides an isomorphism (although not a natural one) in any characteristic, even two.

Grading


The linear map on V defined by v ↦ −v preserves the quadratic form Q and so by the universal property of Clifford algebras extends to an algebra automorphism

α : Cℓ(V,Q) → Cℓ(V,Q).

Since α is an involution (i.e. it squares to the identity) one can decompose Cℓ(V,Q) into positive and negative eigenspaces

Cℓ(V,Q) = C0(V,Q) ⊕ C1(V,Q),

where Ci(V,Q) = {x ∈ Cℓ(V,Q) | α(x) = (−1)^i x}. Since α is an automorphism it follows that

Ci(V,Q) Cj(V,Q) ⊆ C(i+j)(V,Q),

where the superscripts are read modulo 2. This means that Cℓ(V,Q) is a Z2-graded algebra (also known as a superalgebra). Note that C0(V,Q) forms a subalgebra of Cℓ(V,Q), called the even subalgebra. The piece C1(V,Q) is called the odd part of Cℓ(V,Q) (it is not a subalgebra). This Z2-grading plays an important role in the analysis and application of Clifford algebras. The automorphism α is called the main involution or grade involution.

Remark. In characteristic not 2 the algebra Cℓ(V,Q) inherits a Z-grading from the canonical isomorphism with the exterior algebra Λ(V). It is important to note, however, that this is a vector space grading only. That is, Clifford multiplication does not respect the Z-grading, only the Z2-grading: for instance, if Q(v) ≠ 0 then v² = Q(v) has degree 0, not 2. Happily, the gradings are related in the natural way: Z2 = Z/2Z. The degree of a Clifford number usually refers to the degree in the Z-grading. Elements which are pure in the Z2-grading are simply said to be even or odd.

If the characteristic of K is not 2 then the even subalgebra C0(V,Q) of a Clifford algebra is itself a Clifford algebra. If V is the orthogonal direct sum of a vector a of norm Q(a) and a subspace U, then C0(V,Q) is isomorphic to Cℓ(U,−Q(a)Q), where −Q(a)Q is the form Q restricted to U and multiplied by −Q(a). In particular over the reals this implies that the even subalgebra of Cp,q(R) is isomorphic to

Cp,q−1(R) for q > 0, and
Cq,p−1(R) for p > 0.

In the negative-definite case this gives an inclusion C0,n−1(R) ⊂ C0, n(R) which extends the sequence

R ⊂ C ⊂ H ⊂ H ⊕ H ⊂ …

Likewise, in the complex case, one can show that the even subalgebra of Cn(C) is isomorphic to Cn−1(C).

Antiautomorphisms


In addition to the automorphism α, there are two antiautomorphisms which play an important role in the analysis of Clifford algebras. Recall that the tensor algebra T(V) comes with an antiautomorphism that reverses the order in all products:

v1 ⊗ v2 ⊗ ⋯ ⊗ vk ↦ vk ⊗ ⋯ ⊗ v2 ⊗ v1.

Since the ideal IQ is invariant under this reversal, this operation descends to an antiautomorphism of Cℓ(V,Q) called the transpose or reversal operation, denoted by xt. The transpose is an antiautomorphism: (xy)t = yt xt. The transpose operation makes no use of the Z2-grading, so we define a second antiautomorphism by composing α and the transpose. We call this operation Clifford conjugation, denoted x̄ = α(xt) = α(x)t.

Of the two antiautomorphisms, the transpose is the more fundamental.

Note that all of these operations are involutions. One can show that they act as ±1 on elements which are pure in the Z-grading. In fact, all three operations depend only on the degree modulo 4. That is, if x is pure with degree k, then

α(x) = ±x, xt = ±x, x̄ = ±x,

where the signs are given by the following table:

k mod 4   0   1   2   3
α(x)      +   −   +   −   (−1)^k
xt        +   +   −   −   (−1)^(k(k−1)/2)
x̄         +   −   −   +   (−1)^(k(k+1)/2)

The Clifford scalar product


When the characteristic is not 2 the quadratic form Q on V can be extended to a quadratic form on all of Cℓ(V,Q) as explained earlier (which we also denoted by Q). A basis independent definition is

Q(x) = ⟨xt x⟩,

where ⟨a⟩ denotes the scalar part of a (the grade 0 part in the Z-grading). One can show that

Q(v1 v2 ⋯ vk) = Q(v1) Q(v2) ⋯ Q(vk),

where the vi are elements of V; this identity is not true for arbitrary elements of Cℓ(V,Q).

The associated symmetric bilinear form on Cℓ(V,Q) is given by

⟨x, y⟩ = ⟨xt y⟩.

One can check that this reduces to the original bilinear form when restricted to V. The bilinear form on all of Cℓ(V,Q) is nondegenerate if and only if it is nondegenerate on V.

It is not hard to verify that the transpose is the adjoint of left/right Clifford multiplication with respect to this inner product. That is,

⟨xy, z⟩ = ⟨y, xt z⟩

and

⟨yx, z⟩ = ⟨y, z xt⟩.

Structure of Clifford algebras


In this section we assume that the vector space V is finite dimensional and that the bilinear form of Q is non-singular. A central simple algebra over K is a matrix algebra over a (finite dimensional) division algebra with center K. For example, the central simple algebras over the reals are matrix algebras over either the reals or the quaternions.

  • If V has even dimension then Cℓ(V,Q) is a central simple algebra over K.
  • If V has even dimension then C0(V,Q) is a central simple algebra over a quadratic extension of K or a sum of two isomorphic central simple algebras over K.
  • If V has odd dimension then Cℓ(V,Q) is a central simple algebra over a quadratic extension of K or a sum of two isomorphic central simple algebras over K.
  • If V has odd dimension then C0(V,Q) is a central simple algebra over K.

The structure of Clifford algebras can be worked out explicitly using the following result. Suppose that U has even dimension and a non-singular bilinear form with discriminant d, and suppose that V is another vector space with a quadratic form. The Clifford algebra of U + V is isomorphic to the tensor product of the Clifford algebras of U and (−1)^(dim(U)/2) d V, which is the space V with its quadratic form multiplied by (−1)^(dim(U)/2) d. Over the reals, this implies in particular that

Cp+2,q(R) = M2(R) ⊗ Cq,p(R),
Cp+1,q+1(R) = M2(R) ⊗ Cp,q(R),
Cp,q+2(R) = H ⊗ Cq,p(R).

These formulas can be used to find the structure of all real Clifford algebras.

The Clifford group Γ


In this section we assume that V is finite dimensional and the bilinear form of Q is non-singular.

The Clifford group Γ is defined to be the set of invertible elements x of the Clifford algebra such that

xvα(x)⁻¹ ∈ V for all v in V.

This formula also defines an action of the Clifford group on the vector space V that preserves the norm Q, and so gives a homomorphism from the Clifford group to the orthogonal group. The Clifford group contains all elements r of V of nonzero norm, and these act on V by the corresponding reflections that take v to v − ⟨v,r⟩r/Q(r). (In characteristic 2 these are called orthogonal transvections rather than reflections.)

Many authors define the Clifford group slightly differently, by replacing the action v ↦ xvα(x)⁻¹ by v ↦ xvx⁻¹. This produces the same Clifford group, but the action of the Clifford group on V is changed slightly: the action of the odd elements Γ1 of the Clifford group is multiplied by an extra factor of −1. The action used here has several minor advantages: it is consistent with the usual superalgebra sign conventions, elements of V correspond to reflections, and in odd dimensions the map from the Clifford group to the orthogonal group is onto, and the kernel is no larger than K*. Using the action v ↦ α(x)vx⁻¹ instead of v ↦ xvα(x)⁻¹ makes no difference: it produces the same Clifford group with the same action on V.

The Clifford group Γ is the disjoint union of two subsets Γ0 and Γ1, where Γi is the subset of elements of degree i. The subset Γ0 is a subgroup of index 2 in Γ.

If V is finite dimensional with nondegenerate bilinear form then the Clifford group maps onto the orthogonal group of V and the kernel consists of the nonzero elements of the field K. This leads to exact sequences

1 → K* → Γ → O(V) → 1,
1 → K* → Γ0 → SO(V) → 1.
In arbitrary characteristic, the spinor norm Q is defined on the Clifford group by

Q(x) = xt x.

It is a homomorphism from the Clifford group to the group K* of non-zero elements of K. It coincides with the quadratic form Q of V when V is identified with a subspace of the Clifford algebra. Several authors define the spinor norm slightly differently, so that it differs from the one here by a factor of −1, 2, or −2 on Γ1. The difference is not very important.

The nonzero elements of K have spinor norm in the group K*^2 of squares of nonzero elements of the field K. So when V is finite dimensional and non-singular we get an induced map from the orthogonal group of V to the group K*/K*^2, also called the spinor norm. The spinor norm of the reflection in a vector r has image Q(r) in K*/K*^2, and this property uniquely defines it on the orthogonal group. This gives exact sequences:

1 → {±1} → Pin_V(K) → O_V(K) → K*/K*^2
1 → {±1} → Spin_V(K) → SO_V(K) → K*/K*^2
Note that in characteristic 2 the group {±1} has just one element.

Spin and Pin groups


In this section we assume that V is finite dimensional and its bilinear form is non-singular. (If K has characteristic 2 this implies that the dimension of V is even.)

The Pin group PinV(K) is the subgroup of the Clifford group Γ of elements of spinor norm 1, and similarly the Spin group SpinV(K) is the subgroup of elements of Dickson invariant 0 in PinV(K). When the characteristic is not 2, these are the elements of determinant 1. The Spin group usually has index 2 in the Pin group.

Recall from the previous section that there is a homomorphism from the Clifford group onto the orthogonal group. We define the special orthogonal group to be the image of Γ0. If K does not have characteristic 2 this is just the group of elements of the orthogonal group of determinant 1. If K does have characteristic 2, then all elements of the orthogonal group have determinant 1, and the special orthogonal group is the set of elements of Dickson invariant 0.

There is a homomorphism from the Pin group to the orthogonal group. The image consists of the elements of spinor norm 1 ∈ K*/K*2. The kernel consists of the elements +1 and −1, and has order 2 unless K has characteristic 2. Similarly there is a homomorphism from the Spin group to the special orthogonal group of V.

In the common case when V is a positive or negative definite space over the reals, the spin group maps onto the special orthogonal group, and is simply connected when V has dimension at least 3. Warning: this is not true in general: if V is R^{p,q} for p and q both at least 2 then the spin group is not simply connected and does not map onto the special orthogonal group. In this case the algebraic group Spin_{p,q} is simply connected as an algebraic group, even though its group of real valued points Spin_{p,q}(R) is not simply connected. This is a rather subtle point, which completely confused the authors of at least one standard book about spin groups.

Spinors


Suppose that p + q = 2n is even. Then the Clifford algebra Cℓ_{p,q}(C) is a matrix algebra, and so has a complex representation of dimension 2^n. By restricting to the group Pin_{p,q}(R) we get a complex representation of the Pin group of the same dimension, called the spinor representation. If we restrict this to the spin group Spin_{p,q}(R) then it splits as the sum of two half spin representations (or Weyl representations) of dimension 2^{n−1}.

If p + q = 2n + 1 is odd then the Clifford algebra Cℓ_{p,q}(C) is a sum of two matrix algebras, each of which has a representation of dimension 2^n, and these are also both representations of the Pin group Pin_{p,q}(R). On restriction to the spin group Spin_{p,q}(R) these become isomorphic, so the spin group has a complex spinor representation of dimension 2^n.

More generally, spinor groups and pin groups over any field have similar representations whose exact structure depends on the structure of the corresponding Clifford algebras: whenever a Clifford algebra has a factor that is a matrix algebra over some division algebra, we get a corresponding representation of the pin and spin groups over that division algebra. For examples over the reals see the article on spinors.

Applications


Differential geometry


One of the principal applications of the exterior algebra is in differential geometry where it is used to define the bundle of differential forms on a smooth manifold. In the case of a (pseudo-)Riemannian manifold, the tangent spaces come equipped with a natural quadratic form induced by the metric. Thus, one can define a Clifford bundle in analogy with the exterior bundle. This has a number of important applications in Riemannian geometry.

Physics


Clifford algebras have numerous important applications in physics. Physicists usually consider a Clifford algebra to be an algebra spanned by matrices γ_1, …, γ_n called Dirac matrices which have the property that

γ_i γ_j + γ_j γ_i = 2η_{ij},

where η is the matrix of a quadratic form of signature (p,q), typically (1,3) when working in Minkowski space. These are exactly the defining relations for the Clifford algebra Cℓ_{1,3}(C) (up to an unimportant factor of 2), which by the classification of Clifford algebras is isomorphic to the algebra of 4 by 4 complex matrices.

The Dirac matrices were first written down by Paul Dirac when he was trying to write a relativistic first-order wave equation for the electron, and give an explicit isomorphism from the Clifford algebra to the algebra of complex matrices. The result was used to define the Dirac equation. The entire Clifford algebra shows up in quantum field theory in the form of Dirac field bilinears.
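
The defining relations can be checked concretely. The following sketch, in Python with NumPy, builds one particular choice, the standard Dirac representation assembled from Pauli matrices (this specific representation is an assumption for illustration; the text only requires that some matrices satisfy the relations), and verifies the anticommutators:

import numpy as np

# Pauli matrices (assumed standard form)
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# Dirac representation: gamma_0 = diag(I, -I), gamma_k = [[0, s_k], [-s_k, 0]]
g0 = np.block([[I2, Z2], [Z2, -I2]])
gammas = [g0] + [np.block([[Z2, s], [-s, Z2]]) for s in (s1, s2, s3)]

eta = np.diag([1, -1, -1, -1])  # quadratic form of signature (1,3)

# Check gamma_i gamma_j + gamma_j gamma_i = 2 eta_ij I
for i in range(4):
    for j in range(4):
        anticomm = gammas[i] @ gammas[j] + gammas[j] @ gammas[i]
        assert np.allclose(anticomm, 2 * eta[i, j] * np.eye(4))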

Footnotes

  1. Mathematicians who work with real Clifford algebras and prefer positive definite quadratic forms (especially those working in index theory) sometimes use a different choice of sign in the fundamental Clifford identity. That is, they take v² = −Q(v). One must replace Q with −Q in going from one convention to the other.
  2. The opposite is true when one uses the alternate (−) sign convention for Clifford algebras: it is the conjugate which is more important. In general, the meanings of conjugation and transpose are interchanged when passing from one sign convention to the other. For example, in the convention used here the inverse of a vector v is given by v^{-1} = v/Q(v), while in the (−) convention it is given by v^{-1} = −v/Q(v).



Shear and Slope

The first terms needed are triangle, base, vertex, and area. Consider, for instance, the proposition that for a triangle of given base and area, the locus of the vertex is a line parallel to the base. Imagine that the vertex is dragged along this line, deforming the triangle. Imagine also that the whole plane is similarly deformed by a transformation taking lines to lines. This transformation is a shear mapping.

The shear mapping is expressed as a linear transformation:

(t, x) ↦ (t, x)(1, v; 0, 1) = (t, vt + x), acting on row vectors.

Here it is written in the kinetic interpretation with a vertical (x) space axis as time (t) evolves horizontally, such as used in time series studies.

At t = 1 the shear has transformed (1, 0) to (1, v), the point where a line of slope v intersects t = 1. Thus the parameter v in the shear transformation can be called slope.

The rectangles given by constant t and x are transformed by the shear to parallelograms, but the area of one of these parallelograms equals the area of the rectangle before transformation. Thus shear transformations preserve area.

Let e = (0, 1; 0, 0) and note that e² = 0, the zero matrix, and that the shear matrix is ve plus the identity matrix. Dual numbers are used in abstract algebra to provide a short-hand for this matrix subalgebra.

Definition: D = {a + be : a, b ∈ R, e² = 0} is the set of dual numbers. The basis {1, e} characterizes it as a 2-algebra over R. If z = a + be, let z* = a − be, a conjugate. Then

zz* = (a + be)(a − be) = a²

since e² = 0.

Note that zz* = 1 implies z = ± 1 + be for some b in R. Furthermore, exp(be) = 1 + be since the exponential series is truncated after two terms when applied to the e-axis. Consequently the logarithm of 1 + ve is v. Thus v can be considered the angle of 1+ve in the same way that the logarithm of a point on the unit circle is the radian angle of the point, as in Euler’s formula (exp and log are inverses).

The shear mappings acting on the plane form a multiplicative group that is isomorphic to the additive group of real numbers.
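
These claims are easy to check numerically. The following minimal sketch (Python with NumPy; the helper name is illustrative) verifies that e is nilpotent, that the exponential series truncates to exp(ve) = 1 + ve, that shears preserve area (determinant one), and that slopes add under composition, exhibiting the isomorphism with the additive reals:

import numpy as np

e = np.array([[0., 1.], [0., 0.]])   # nilpotent: e @ e is the zero matrix
I = np.eye(2)

def shear(v):
    return I + v * e                  # the shear with slope v

v, u = 0.7, -1.3
# e squares to zero, so the exponential series truncates after two terms
exp_ve = I + v * e + (v * e) @ (v * e) / 2    # quadratic term vanishes
assert np.allclose(exp_ve, shear(v))
assert np.isclose(np.linalg.det(shear(v)), 1.0)       # area preserved
assert np.allclose(shear(u) @ shear(v), shear(u + v)) # slopes add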

The three angles


In Euclidean plane geometry there is the trichotomy right angle, acute angle, obtuse angle. Here a trichotomy of linear motions distinguishes three species of angle.

Planar rotations with complex, hyperbolic and dual numbers.

Each of the angles pivots on its peculiar motion: rotation for circular angle, squeeze mapping for hyperbolic angle, and shear for slope. Furthermore, each motion has its peculiar planar algebra: the dual numbers for shear, the split-binarions for hyperbolic angle, while rotation and circular angle correspond to the plane of division binarions, called "complex numbers" by some. In fact, in the sense of a real 2-algebra, "complex" is ambiguous: each of the division binarions, split-binarions, and dual numbers forms a plane of "complex numbers".

A property of arc length on a circle is that it stays the same under rotation: it is said that "arc length is an invariant of rotation." A segment on t = 1 that is transformed by a shear has the same length after the shear as before. Similarly, a hyperbolic angle is invariant under a squeeze. These three invariances can be seen together as consequences of the area-invariance of the three motions. The hyperbolic angle is the area of the corresponding hyperbolic sector of xy = 1, which has minimal radius √2 to (1,1). A circular angle corresponds to the area of its sector in a circle of radius √2. Finally, the slope is equal to the area of the triangle with base on t = √2 and hypotenuse corresponding to the slope. Since squeeze, shear, and rotation are all area-preserving, their motions in their corresponding planes preserve the central angles there. The traditional term for the study of angle-preservation is conformal mapping, often presuming circular angles.

These three species of angle provide a parameter for polar coordinates in each of the three 2-algebras found as subspaces of 2 × 2 real matrices.


Quaternions

The algebra of quaternions is a structure first studied by the Irish mathematician William Rowan Hamilton which extends the two-dimensional complex numbers to four dimensions. Multiplication of quaternions is non-commutative, a feature which enables the representation of three-dimensional rotations. Hamilton's provocative discovery of quaternions founded the field of hypercomplex numbers. Suggestive methods like the dot products and cross products implicit in quaternion products enabled algebraic description of geometry now widely applied in science and engineering.

Definitions

Quaternion plaque on Broom Bridge, Dublin, which says:
Here as he walked by
on the 16th of October 1843
Sir William Rowan Hamilton
in a flash of genius discovered
the fundamental formula for
quaternion multiplication i² = j² = k² = ijk = −1 & cut it on a stone of this bridge


A quaternion corresponds to an ordered 4-tuple (a, b, c, d), where a, b, c, d ∈ R. A quaternion is denoted q = a + bi + cj + dk. The sum bi + cj + dk is called the vector part of q, and a is the real part. Hamilton coined the term vector in this context. Subsequent developments have extended the usage of the term vector to any element of a linear space. The vectors in H form a 3-dimensional subspace V.

The set of all quaternions is denoted by H. It is straightforward to define component-wise addition and scalar multiplication on H, making it a real vector space.

Multiplication follows the rules of the "quaternion group" Q8 = {1, −1, i, −i, j, −j, k, −k} that Hamilton carved into a stone of Broom Bridge, Dublin:

i² = j² = k² = ijk = −1.

The rules for the pairwise multiplication of i, j, and k are:

ij = k, jk = i, ki = j (positive cyclic products)
ji = −k, kj = −i, ik = −j (negative cyclic products).

Using these, one can define a general rule for multiplication of quaternions. Because quaternion multiplication is not commutative, H is not a field. However, every nonzero quaternion has a multiplicative inverse (see below), so the quaternions are an example of a division ring. It is important to note that the non-commutative nature of quaternion multiplication makes it impossible to define the quotient of two quaternions p and q unambiguously, as the quantities pq^{-1} and q^{-1}p are generally different.
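
The general rule can be sketched in code directly from the multiplication table of the basis {1, i, j, k}; the following is a minimal Python illustration (helper names are ours), which deliberately leaves writing out the closed-form components to Exercise 1 below:

# Table entry TABLE[m][n] = (sign, basis index) encodes the product of
# basis elements, e.g. i*j = +k, j*i = -k, i*i = -1 (basis order 1, i, j, k).
TABLE = [[(1, 0), (1, 1), (1, 2), (1, 3)],
         [(1, 1), (-1, 0), (1, 3), (-1, 2)],
         [(1, 2), (-1, 3), (-1, 0), (1, 1)],
         [(1, 3), (1, 2), (-1, 1), (-1, 0)]]

def qmul(p, q):
    """Product of quaternions given as 4-tuples of coefficients of 1, i, j, k."""
    out = [0.0] * 4
    for m, pm in enumerate(p):
        for n, qn in enumerate(q):
            sign, basis = TABLE[m][n]
            out[basis] += sign * pm * qn
    return tuple(out)

i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
assert qmul(i, j) == (0, 0, 0, 1)    # ij = k
assert qmul(j, i) == (0, 0, 0, -1)   # ji = -k: not commutative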

Like the more familiar complex numbers, the quaternions have a conjugation, often denoted by a superscript star: q*. The conjugate of the quaternion q = a + bi + cj + dk is q* = a − bi − cj − dk. As is the case for the complex numbers, the product qq* is always a nonnegative real number equal to the sum of the squares of the quaternion's components: qq* = a² + b² + c² + d². The norm |q| of a quaternion is the square root of qq*.

If pq is the product of two quaternions, then |pq| = |p||q|, implying that H forms a composition algebra.

The multiplicative inverse of a non-zero quaternion q is given by

q^{-1} = q*/(qq*),

where division by qq* is defined since qq* = a² + b² + c² + d² ≠ 0.

Unlike in the complex case, the conjugate of a quaternion can be computed algebraically:

q* = −(q + iqi + jqj + kqk)/2.

Versors and elliptic space


William Kingdon Clifford used Hamilton’s quaternions to explicate rotation geometry as an elliptic space with its own variety of lines, parallels, and surfaces. The ideas were reviewed in 1948 by Lemaitre and Coxeter and that sketch has these definitions:

A versor is a quaternion of norm one, thus it lies on a 3-dimensional sphere found in the 4-space of quaternions. The versors are given by Euler's formula for complex numbers, where the imaginary unit is taken from the unit sphere in the 3-space of vector quaternions:

v = exp(cr) = cos c + r sin c, where r² = −1.

The distance between two versors u and v is

A right parataxy on elliptic space is effected by multiplying on the right by a versor u: q ↦ qu. Similarly a left parataxy arises from left multiplication. In recognition of his contribution to elliptic geometry, a parataxy is called a Clifford translation.

The general displacement of elliptic space is a combination of two parataxies, one left, one right: q ↦ uqv. Note that if v = u* then the real line in the quaternions is fixed and the displacement is a rotation of the 3-space of quaternion vectors.

The term line is appropriated for elliptic geometry. These lines are not straight, but they are parametrized by real numbers. Each line is associated with a right versor like s when c = π/2 in v. Then is a typical elliptic line. It corresponds to the axis of the rotation

Now for u not on L, there are two Clifford parallels to L through u: {u exp(as) : a ∈ R} and {exp(as) u : a ∈ R}.

For fixed right versors r and s, a Clifford surface can be formed as a union of Clifford parallels or as

To form elliptic space from versors, two versors u and v are equivalent if u + v = 0. Modulo this equivalence, the versors, their algebra and geometry, represent elliptic space.

Linear viewpoint


Quaternions may be represented by 2×2 matrices with complex number entries: the place of 1, i, j, k is taken by these arrays:

1 ↦ (1, 0; 0, 1), i ↦ (i, 0; 0, −i), j ↦ (0, 1; −1, 0), k ↦ (0, i; i, 0).

One uses matrix multiplication to verify that these expressions obey the rules of presentation of Q8.

M(2,C) denotes the full algebra of 2×2 complex matrices, which has eight real dimensions, and sustains a representation of as a four-dimensional subalgebra. The linear properties of and M(2,C) assure the fidelity of the representation once the copy of Q8 has been identified.
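
The presentation of Q8 can be verified numerically with these arrays; a short sketch in Python with NumPy (not part of the original text):

import numpy as np

one = np.eye(2, dtype=complex)
i_m = np.array([[1j, 0], [0, -1j]])
j_m = np.array([[0, 1], [-1, 0]], dtype=complex)
k_m = np.array([[0, 1j], [1j, 0]])

# the presentation of Q8: i^2 = j^2 = k^2 = ijk = -1, and ij = k
for m in (i_m, j_m, k_m):
    assert np.allclose(m @ m, -one)
assert np.allclose(i_m @ j_m @ k_m, -one)
assert np.allclose(i_m @ j_m, k_m)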

Quaternions, like other associative hypercomplex systems of the 19th century, eventually were viewed as matrix algebras in the 20th century. However, in 1853 Hamilton included biquaternions in his book of Lectures on Quaternions.

Biquaternions are quaternions with complex number coefficients, sometimes called complex quaternions. Biquaternions form an algebra isomorphic to M(2,C). If the rows or columns of a matrix are proportional, then the determinant is zero, and there is no inverse. Nevertheless, such matrices have been used in physical science to represent events on a light-path from the origin. Authors Silberstein and Lanczos refer to this algebra as the biquaternions, but other writers have abandoned the label: Élie Cartan used M(2,C) extensively in The Theory of Spinors (1938), and Wolfgang Pauli, through his matrix mechanics of the atom, became closely associated with M(2,C).

Pauli Spin Matrices


Quaternions are closely related to the Pauli spin matrices of Quantum Mechanics. The Pauli matrices are often denoted as

σ₁ = (0, 1; 1, 0), σ₂ = (0, −i; i, 0), σ₃ = (1, 0; 0, −1)

(where i is the well-known imaginary unit of the complex numbers)

The 2×2 identity matrix is sometimes taken as σ₀.

Thus the real linear span of σ₀ and the matrices iσ₁, iσ₂, and iσ₃ is isomorphic to H. For example, take this matrix product:

(iσ₁)(iσ₂) = i²σ₁σ₂ = −σ₁σ₂.

Or, equivalently, since σ₁σ₂ = iσ₃,

(iσ₁)(iσ₂) = −iσ₃.

All three of these matrices square to the negative of the identity matrix. If we take 1 ↦ σ₀, i ↦ iσ₁, j ↦ iσ₂, and k ↦ −iσ₃, it is easy to see that the span of these four matrices is "the same as" (that is, isomorphic to) the set of quaternions H.

Exercises

  1. Using the presentation equations of Q8, write out the full product of two quaternions. In other words, given p = a₁ + b₁i + c₁j + d₁k and q = a₂ + b₂i + c₂j + d₂k, find the components of their product pq.
  2. Show the composition algebra property |pq| = |p||q|. Hint: use Euler's four-square identity.

Axial pencils

The axis of a pencil is the real axis in the 4-algebra

Hamilton's quaternions provide a picture of a pencil of complex number planes that fill out his hyperspace. Another pair of pencils provide alternative descriptions of 4-space as made up of planar algebras. The hyperspace is R⁴ with the first coordinate taken as the real line, and as the axis of the various pencils. The Hamilton case uses the sphere of imaginary units

{r = xi + yj + zk : x² + y² + z² = 1}.

Any pair of antipodal points on this sphere generates a plane isomorphic to the ordinary complex plane.

The second and third pencils derive from the findings of James Cockle and Arthur Cayley. Cayley set up an arithmetic of matrix multiplication which has expedited modern science. For instance, the so-called imaginary unit is represented by (0, −1; 1, 0), which has multiplicative square equal to the negative of the identity matrix. But then there is also (0, 1; 1, 0), which generates an algebraic plane distinct from the ordinary complex numbers. This algebra, the split-binarions, has inverse proportion included as a structural feature, such as found in economics or spacetime. In fact, the Lorentz boost is exhibited by a split-binarion multiplication. The inherent relation was recognized in the 19th century by J. Cockle, W. K. Clifford, and A. Macfarlane in the English world and by some Serbians.

The second pencil is an imaginary one without linear representation. As Hamilton had a sphere of imaginary units, Macfarlane would have a sphere of hyperbolic units u with u² = +1. The full algebra of split-binarions is denoted A. Any pair of elements that are polar opposite on this sphere generate a plane isomorphic to the split-binarions A. The 4-algebra containing this pencil is the hyperbolic quaternion algebra. As Oliver Heaviside and Willard Gibbs advocated a positive dot product for vectors, they have been associated with hyperbolic quaternions. When this algebra drew attention in the 1890s a "great vector debate" ensued in various publications including Nature. When the failure of the algebra to satisfy the associative law of multiplication was noted, it was realized that no matrix representation would be found.

Each plane of the pencil can represent a Lorentz boost. However, rotation of the vector subspace, an operation within the reach of Hamilton's structure, is beyond the means of hyperbolic quaternions, hence the Lorentz group cannot be represented with Macfarlane's algebra.

The third pencil arises from both imaginary units and hyperbolic units as found in the ring of 2×2 real matrices. In this figure there must be noted nilpotent matrices such as (0, 1; 0, 0), which correspond to dual numbers in the matrix algebra. Such planes separate the complex and split-binarion planes, and are included in the pencil. The axis of the pencil is the line of matrices that are real multiples of the identity matrix. Over an alternate basis this ring is known as split-quaternions, and the pencil has three types of planar subrings.



2×2 real matrices

The associative algebra of 2×2 real matrices is denoted by M(2, R). Two matrices p and q in M(2, R) have a sum p + q given by matrix addition. The product matrix pq is formed through matrix multiplication. For

q = (a, b; c, d),

let

q* = (d, −b; −c, a).

Then qq* = q*q = (ad − bc)I, where I is the 2×2 identity matrix. The real number ad − bc is called the determinant of q. When ad − bc ≠ 0, q is an invertible matrix, and then

q^{-1} = q*/(ad − bc).

The collection of all such invertible matrices constitutes the general linear group GL(2, R). In the terms of abstract algebra, M(2, R) with the associated addition and multiplication operations forms a ring, and GL(2, R) is its group of units. M(2, R) is also a four-dimensional vector space, so it is also an associative algebra.
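
The identity qq* = (ad − bc)I is quick to confirm in code; a minimal Python sketch (helper names are illustrative):

def adjugate(q):
    """For q = ((a, b), (c, d)) return q* = ((d, -b), (-c, a))."""
    (a, b), (c, d) = q
    return ((d, -b), (-c, a))

def mat_mul(p, q):
    """Plain 2x2 matrix multiplication."""
    return tuple(tuple(sum(p[i][k] * q[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

q = ((2, 3), (1, 4))
det = 2 * 4 - 3 * 1
assert mat_mul(q, adjugate(q)) == ((det, 0), (0, det))   # q q* = (ad - bc) I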

The 2×2 real matrices are in one-one correspondence with the linear mappings of the two-dimensional Cartesian coordinate system into itself by the rule

(x, y) ↦ (x, y)(a, b; c, d) = (xa + yc, xb + yd).

M(2,R) is where all three types of planar angle come to common expression in terms of area. M(2,R) is described as a pencil on a real line that is shared by three types of 2-algebras appearing as subalgebras of M(2,R). They are the division binarions, split-binarions, and dual numbers which use circular angle, hyperbolic angle, and slope, respectively.

The hyperbolic angle is defined in terms of area under y=1/x. The circular angle equals the area of the corresponding sector of a circle of radius √2. Likewise, the slope equals the area of a triangle with base on a line and apex at a point √2 distance from the line.

The angles have the feature of invariance under motion, whether a rotation, a squeeze, or a shear, according to the type of plane. Generalization of the notion of imaginary unit in M(2,R) is addressed first. It is matrix multiplication that produces the group action on a plane, so the characteristic of matrices that makes them preservers of area is addressed next.

Pencil of planar subalgebras


In synthetic geometry, the term pencil is used for the set of lines on a given point, and axial pencil for the set of planes on a given line. Here the axis is the set of multiples of the identity matrix I by real numbers. Every matrix that is not in this set is contained in a unique planar subalgebra. These subalgebras are division-binarions, split-binarions, or dual numbers.

Given a matrix m with m² in {I, 0, −I}, there is a subalgebra

Pm = {xI + ym : x, y ∈ R}.

Then Pm is a commutative subalgebra and M(2, R) = ⋃ Pm, where the union is over all m such that m² ∈ {−I, 0, I}.

To identify such m, first square the generic matrix:

(a, b; c, d)² = (a² + bc, b(a + d); c(a + d), d² + bc).

When a + d = 0 this square is a diagonal matrix, namely (a² + bc)I.

Thus one assumes d = −a when looking for m to form commutative subalgebras. When m² = −I, then bc = −1 − a², an equation describing a hyperbolic paraboloid in the space of parameters (a, b, c). Such an m serves as an imaginary unit. In this case Pm is isomorphic to the division binarions, also known as the field of (ordinary) complex numbers.

When m² = +I, m is an involutory matrix. Then bc = +1 − a², also giving a hyperbolic paraboloid. If a matrix is an idempotent matrix, it must lie in such a Pm, and in this case Pm is ring isomorphic to the split-binarions.

The case of a nilpotent matrix, m² = 0, arises when only one of b or c is non-zero, and the commutative subalgebra Pm is then a copy of the dual number plane.

When M(2, R) is reconfigured with a change of basis, this pencil is seen in split-quaternions where the sets of square roots of I and −I take a symmetrical shape as hyperboloids.

Equi-areal mapping


First transform one differential vector into another:

(dt, dx) ↦ (dt, dx) g.

Areas are measured with the density dt ∧ dx, an exterior product for which dx ∧ dt = −dt ∧ dx. Under the linear mapping g this differential 2-form is transformed into (det g)(dt ∧ dx).

Area is preserved when the determinant det g is one. Thus the equi-areal mappings are identified with the special linear group SL(2, R). Given the profile above, every such g lies in a commutative subring Pm representing a type of complex plane according to the square of m. Since g g* = (det g)I = I, one of the following three alternatives occurs:

  • m² = −I and g is a Euclidean rotation, or
  • m² = I and g is a hyperbolic rotation, or
  • m² = 0 and g is a shear mapping.

The preservation of area provides a common foundation for the study of conformal mapping in a plane. In fact, there are three species of angle used in analysis: circular angle, hyperbolic angle, and slope as an expression of angle in the dual number plane.
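
The three motions and their shared area-invariance can be checked numerically; the following sketch (Python with NumPy) confirms that a rotation, a hyperbolic rotation, and a shear all have determinant one:

import numpy as np

t = 0.5
rotation = np.array([[np.cos(t), np.sin(t)], [-np.sin(t), np.cos(t)]])
boost    = np.array([[np.cosh(t), np.sinh(t)], [np.sinh(t), np.cosh(t)]])
shear    = np.array([[1., t], [0., 1.]])

for g in (rotation, boost, shear):
    assert np.isclose(np.linalg.det(g), 1.0)   # all three preserve area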

Functions of 2 × 2 real matrices


The commutative subalgebras of M(2, R) determine the function theory; in particular the three types of subalgebras each have their own algebraic structures which set the value of algebraic expressions. Consideration of the square root function and the logarithm function serves to illustrate the constraints implied by the special properties of each type of subalgebra Pm in the above pencil.

First note that the invertible elements, the units, of each plane form a topological group with one, two, or four components. The component that contains 1 is called the component of the identity. The polar coordinates of an element include an angle factor:

  • If m² = −I, then z = ρ exp(θm) where θ is a circular angle.
  • If m² = 0, then z = ρ exp(sm) or z = −ρ exp(sm) where s is a slope.
  • If m² = I, then z = ρ exp(am) or z = −ρ exp(am) or
z = mρ exp(am) or z = −mρ exp(am), where a is a hyperbolic angle.

In the first case exp(θ m) = cos(θ) + m sin(θ), known as Euler's formula.

In the case of the dual numbers exp(s m) = 1 + s m. Finally, in the case of split-binarions there are four components in the group of units. The identity component is parameterized by ρ and exp(a m) = cosh(a) + m sinh(a).
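
These three closed forms of the exponential can be compared against the matrix exponential computed from its power series; a sketch in Python with NumPy (names illustrative):

import numpy as np

def mat_exp(A, terms=30):
    """Matrix exponential by truncated power series (ample for small A)."""
    out, term = np.eye(2), np.eye(2)
    for n in range(1, terms):
        term = term @ A / n
        out = out + term
    return out

theta = 0.8
m_rot   = np.array([[0., -1.], [1., 0.]])   # m^2 = -I
m_shear = np.array([[0., 1.], [0., 0.]])    # m^2 = 0
m_hyp   = np.array([[0., 1.], [1., 0.]])    # m^2 = +I
I = np.eye(2)

assert np.allclose(mat_exp(theta * m_rot), np.cos(theta) * I + np.sin(theta) * m_rot)
assert np.allclose(mat_exp(theta * m_shear), I + theta * m_shear)
assert np.allclose(mat_exp(theta * m_hyp), np.cosh(theta) * I + np.sinh(theta) * m_hyp)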

Now the square root √z = √ρ exp(am/2) can be taken regardless of the subalgebra Pm, but the argument of the function must be taken from the identity component of its group of units. Half the plane is lost in the case of the dual number structure; three-quarters of the plane must be excluded in the case of the split-binarions.

Similarly, if ρ exp(am) is an element of the identity component of the group of units of a plane associated with 2×2 matrix m, then the logarithm function results in a value log ρ + am. The domain of the logarithm function suffers the same constraints as does the square root function described above: half or three-quarters of Pm must be excluded in the cases m² = 0 or m² = I.

2 × 2 real matrices as species of complex numbers


Every 2×2 real matrix can be interpreted as one of three species of (generalized[1]) complex number: a division binarion, a dual number, or a split-binarion. Above, the algebra of 2×2 matrices is profiled as a union of subalgebras Pm, all sharing the same real axis. One can determine to which type of subalgebra a given 2×2 matrix belongs as follows:

Consider the 2×2 matrix

z = (a, b; c, d).

The subalgebra Pm containing z is found by projections.

As noted above, the square of the matrix z is diagonal when a + d = 0. The matrix z must be expressed as the sum of a multiple of the identity matrix I and a matrix in the hyperplane a + d = 0. Projecting z alternately onto these subspaces of R⁴ yields

z = xI + n, with x = (a + d)/2 and n = z − xI.

Furthermore,

n² = pI,

where p = ((a − d)/2)² + bc.

Now z is in one of three species of subalgebra:

  • If p < 0, then z is a division binarion:
    Let q = √(−p) and m = n/q. Then m² = −I and z = xI + qm.
  • If p = 0, then z is a dual number:
    z = xI + n with n² = 0.
  • If p > 0, then z is a split-binarion:
    Let q = √p and m = n/q. Then m² = +I and z = xI + qm.

Similarly, a 2×2 matrix can also be expressed in polar coordinates with the caveat that there are two connected components of the group of units in the dual number plane, and four components in the split-binarion plane.
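
The classification just described can be sketched as a small routine (Python with NumPy; the function name is illustrative):

import numpy as np

def species(z):
    """Split z = xI + n with n traceless; n squares to p times the identity,
    and the sign of p decides which plane P_m contains z."""
    x = np.trace(z) / 2.0
    n = z - x * np.eye(2)
    p = (n @ n)[0, 0]
    if p < 0:
        return "division binarion (ordinary complex) plane"
    if np.isclose(p, 0):
        return "dual number plane"
    return "split-binarion plane"

assert species(np.array([[1., -1.], [1., 1.]])).startswith("division")
assert species(np.array([[2., 1.], [0., 2.]])) == "dual number plane"
assert species(np.array([[0., 1.], [1., 0.]])) == "split-binarion plane"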

Projective group


A given 2 × 2 real matrix with ad ≠ bc acts on projective coordinates [x : y] of the real projective line P(R) as a linear fractional transformation:

[x : y] ↦ [ax + by : cx + dy].

When cx + dy = 0, the image point is the point at infinity; otherwise the image is [(ax + by)/(cx + dy) : 1].
Rather than acting on the plane as in the section above, a matrix acts on the projective line P(R), and all proportional matrices act the same way.

Let p = ad − bc ≠ 0. Then

(a, b; c, d)(d, −b; −c, a) = (p, 0; 0, p) = pI.

The action of this matrix on the real projective line is

[x : y] ↦ [px : py] = [x : y]

because of projective coordinates, so that the action is that of the identity mapping on the real projective line. Therefore,

(a, b; c, d) and (d, −b; −c, a)

act as multiplicative inverses.

The projective group starts with the group of units GL(2,R) of M(2,R), and then relates two elements if they are proportional, since proportional actions on P(R) are identical: PGL(2,R) = GL(2,R)/~ where ~ relates proportional matrices. Every element of the projective linear group PGL(2,R) is an equivalence class under ~ of proportional 2 × 2 real matrices.
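
A sketch of the projective action (Python; integer coordinates avoid rounding, and the helper names are illustrative), showing that proportional matrices act identically on P(R):

def act(M, p):
    """Action [x : y] -> [ax + by : cx + dy] of M = ((a, b), (c, d)) on P(R)."""
    (a, b), (c, d) = M
    x, y = p
    return (a * x + b * y, c * x + d * y)

def same_point(p, q):
    """Projective equality: [x : y] = [x' : y'] iff x y' = y x'."""
    return p[0] * q[1] == p[1] * q[0]

M = ((2, 3), (1, 4))
kM = ((4, 6), (2, 8))    # a proportional matrix, 2M
pt = (5, 7)
assert same_point(act(M, pt), act(kM, pt))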

Autonomous differential equation


The differential equation x′ = ax has solution x(t) = C exp(at), where a is a given constant and C is an arbitrary constant.

Using division binarions, the equation may be interpreted as giving the tangent slope to a curve parametrized by t: for i² = −1, the differential equation z′ = iz has solution z(t) = C exp(it), a circle.

Similarly, for j² = +1, the differential equation z′ = jz has solution z(t) = C exp(jt), a branch of the unit hyperbola when C = 1.

In the autonomous differential equation x′ = Ax the matrix A corresponds to a constant binarion that relates the slope of the tangent and the curve. The solution of the matrix differential equation is given by the exponential function, using this constant as a cofactor in the argument. The solution is periodic when the constant is a division binarion, and not when it is a split-binarion. Evidently the constant also determines which subalgebra of M(2,R) contains the solution curve.

Whereas linear algebra is premised on simultaneous linear equations, there is the existence theorem for the solution of a system of differential equations[2]

x_i′(t) = Σ_j a_ij(t) x_j(t), i, j = 1, …, p,

which has a unique solution through a given initial point when the coefficients a_ij(t) are continuous functions.

Here p = 2, the matrix is constant, and the solution is exhibited. However, some notational gymnastics are expected of the mathematical reader. Following tradition, the matrix is written to the left and function notation is employed. So rather than row vectors fed into the matrix in previous sections, the function is read as a column vector, which the reader must reconstruct from a binarion:

References

  1. Anthony A. Harkin & Joseph B. Harkin (2004) Geometry of Generalized Complex Numbers, Mathematics Magazine 77(2):118–29
  2. T. J. Willmore (1959) Introduction to Differential Geometry, page 27, chapter 1, Appendix I, Oxford Clarendon Press
  • Rafael Artzy (1965) Linear Geometry, Chapter 2-6 Subgroups of the Plane Affine Group over the Real Field, p. 94, Addison-Wesley.
  • Helmut Karzel & Gunter Kist (1985) "Kinematic Algebras and their Geometries", found in
    • Rings and Geometry, R. Kaya, P. Plaumann, and K. Strambach, editors, pp. 437–509, esp. 449–450, D. Reidel, ISBN 90-277-2112-2.
  • Svetlana Katok (1992) Fuchsian groups, pp. 113ff, University of Chicago Press ISBN 0-226-42582-7 .
  • Garret Sobczyk (2012). "Chapter 2: Complex and Hyperbolic Numbers". New Foundations in Mathematics: The Geometric Concept of Number. Birkhäuser. ISBN 978-0-8176-8384-9.


Hypercomplex numbers

The terms group theory and ring theory are refinements of algebraic understanding that developed in the era of electronics and aircraft, the 20th century. The term hypercomplex number harkens back to the age of steam. For the most part, the hypercomplex systems have been assimilated through the resolution of vision provided by groups, rings, and fields, and the term has been retired from use other than historic reference. Similarly, the field of complex numbers has an insufficiently descriptive name, and might be better described as division binarions C according to composition algebra theory.

W.R. Hamilton (1805−1865) studied quaternions and biquaternions

Hypercomplex numbers grew out of William Rowan Hamilton's construction of quaternions in the 1840s. The legacy of his vision continues in spatial vector algebra: for vectors u = u₁i + u₂j + u₃k and v = v₁i + v₂j + v₃k the well-known products are

  • Dot: u ⋅ v = u₁v₁ + u₂v₂ + u₃v₃
  • Cross: u × v = (u₂v₃ − u₃v₂)i + (u₃v₁ − u₁v₃)j + (u₁v₂ − u₂v₁)k

These products are the severed remnants of Hamilton's quaternion product:

uv = −u ⋅ v + u × v.
In 1845 John T. Graves and Arthur Cayley described an eight-dimensional hypercomplex system now referred to as octonions or Cayley numbers. They extend quaternions, but associativity of multiplication is lost. James Cockle challenged the presumption of quaternions in four dimensions by presenting the associative hypercomplex systems of tessarines (1848) and coquaternions (1849). Hamilton had his own eight-dimensional system (biquaternions) that was explored in his Lectures on Quaternions (1853), but virtually ignored in Elements of Quaternions (completed by his son in 1865) and in the version edited by Charles Jasper Jolly in 1899.

Quaternions feature the property of anti-commutativity of the basis vectors i, j, k:

ij = −ji, jk = −kj, ki = −ik

(similarly in coquaternions).

Due to anti-commutativity, squaring a vector leaves many cancelled terms:

(bi + cj + dk)² = −(b² + c² + d²),

thus r² = −1 for r = (bi + cj + dk)/√(b² + c² + d²).

For any such r, the plane {x + yr : x, y ∈ R} is a complex number plane, and by Euler's formula the mapping y ↦ exp(yr) = cos y + r sin y takes the ray through r to a wrapping of the unit circle in that plane. The unit sphere in quaternions is composed of these circles, as r varies. According to Hamilton, a unit quaternion is a versor; evidently every versor can be written exp(ar) = cos a + r sin a and known by its parameters a and r.

W.K. Clifford (1845−1879) studied split-biquaternions

When the anti-commutativity axiom is changed to commutativity, then two square roots of minus one, say h and i, have a product hi with square (hi)² = h²i² = +1. James Cockle's tessarines are based on such an imaginary unit, now with plus one for its square. Cockle initiated the use of j, j² = +1, to represent this new imaginary unit that is not a square root of minus one. The tessarines are t = w + zj where w, z are in C. The real tessarines feature a unit hyperbola {cosh a + j sinh a : a ∈ R}, contrasting with the unit circle {cos a + i sin a}. Whereas the circle surrounds the origin, a hyperbola has radii in only half of the directions of the plane and requires a conjugate hyperbola to cover the other half, and even then the asymptotes, which they share, provide even more directions in the plane. In 1873 William Kingdon Clifford exploited the real tessarines to modify Hamilton's biquaternions: where Hamilton had used elements of C (division binarions) for coefficients of a biquaternion q = w + xi + yj + zk, Clifford used real tessarines (now called split-binarions D). Clifford's construction illustrated a process of generating new algebras from given ones in a procedure called tensor products: Hamilton's biquaternions are C ⊗ H, and the split-biquaternions of Clifford are D ⊗ H.

Clifford was precocious, particularly in his anticipation of a geometric model of gravitation as hills and valleys in a temporal plenum. But he lived before set theory, modern logical and mathematical symbology, and before abstract algebra with its firmament of groups, rings and fields. One of the realities of light is its finite speed: a foot per nanosecond, an astronomic unit in 500 seconds, or a light year in a year. When a diagram uses any of these pairs of units as axes, the diagonals through the origin represent the locus of light, one for the left beam, one for the right. The diagonals are asymptotes to hyperbolas, such as a real tessarine. Eventually, over decades of deliberation, physicists realized that this hyperbola was the answer to a linear-velocity problem: How can v + w be the sum of two velocities when such accumulation may run over the speed of light?

The hyperbola lies between the asymptotes and will not run over the speed of light. In the real tessarine system the points of the hyperbola are exp(aj) = cosh a + j sinh a, representing velocities in the one-parameter group {exp(aj) : a ∈ R}, a hyperbola. The sum of two velocities is found by their product exp(aj) exp(bj) = exp((a + b)j), another element of the hyperbola. After 1911, the parameter a was termed rapidity. Evidently this aspect of special relativity was born of real tessarines.
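
The rapidity calculus is easy to check numerically; a short Python sketch of the claim that velocities compose through the hyperbola rather than by naive addition:

import math

v, w = 0.8, 0.9                      # velocities as fractions of the speed of light
a, b = math.atanh(v), math.atanh(w)  # the corresponding rapidities
combined = math.tanh(a + b)          # rapidities add; velocities do not
assert math.isclose(combined, (v + w) / (1 + v * w))
assert combined < 1                  # the sum never runs over the speed of light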

The electromagnetic work of Clerk Maxwell and Heinrich Hertz demanded a fitting context for theorizing with the temporal variable included. Maxwell had used Hamilton's del operator

∇ = i ∂/∂x + j ∂/∂y + k ∂/∂z

in A Treatise on Electricity and Magnetism, but the quaternion algebra is unsuitable: it is implicitly a Euclidean 4-space, since qq* = w² + x² + y² + z² is the square of the Euclidean norm.

Alex Macfarlane (1851−1913) studied hyperbolic quaternions

In the 1890s Alexander Macfarlane advocated Space Analysis with a hypercomplex system that exchanged Hamilton's sphere of imaginary units for a sphere of Cockle's imaginary units that square to +1. He retained the anti-commutative property of quaternions. Then in this system of hyperbolic quaternions, for any r on the sphere, {x + yr : x, y ∈ R} is a plane of split-binarions, including a unit hyperbola suitable to represent motion at any rapidity in direction r. The hyperbolic quaternions looked like an elegant model for electromechanics until the system was found wanting. The problem was that the simple property of associative multiplication broke down in hyperbolic quaternions, and though it was a hypercomplex system with a useful model, loss of this property put it outside the purview of group theory, for instance.

Once the axioms of a vector space were established, hypercomplex systems were included. The axioms require a commutative group of vectors, a scalar field, and rules of operations. Putting the axioms of a vector space together with those for a ring establishes the meaning of an algebra in the study of abstract algebra.

For associative hypercomplex systems, Joseph Wedderburn removed all the mystery in 1907 when he showed that any such system could be represented with matrix rings over a field. For instance, 2×2 real matrices form an algebra M(2,R) isomorphic to the coquaternions, and 2×2 complex matrices form an algebra M(2,C) isomorphic to the biquaternions. These algebras, along with R, C and the tessarines, form the associative composition algebras, which are noted for the multiplicative property N(pq) = N(p)N(q) of their norms.

About 1897 four cooperative efforts changed mathematics for the better. Giuseppe Peano began to assemble his Formulario Mathematico, Felix Klein spearheaded the mathematical encyclopedia project, the quadrennial series of International Congresses of Mathematicians was begun, and the International Association for Promoting the Study of Quaternions and Allied Systems of Mathematics published a bibliography and annual review.

Peano's effort gave mathematicians the symbolic language to compress concepts and proofs using set theory. Klein's encyclopedia upheld German as the primary medium, and the Congresses drew together all nations. The Quaternion Society was the primary arena addressing hypercomplex numbers, and was dissolved after 1913 upon the death of its president, Alexander Macfarlane.

Hypercomplex number systems


The best-known hypercomplex number systems are the 4-dimensional quaternions, 8-dimensional octonions, and 16-dimensional sedenions, as summarized in the table below along with the real and complex number systems.

Name            Dimension  Symbol
real numbers    1 = 2⁰     R
complex numbers 2 = 2¹     C
quaternions     4 = 2²     H
octonions       8 = 2³     O
sedenions       16 = 2⁴    S
2ⁿ-ions         2ⁿ

According to a 2002 paper by American mathematician Robert P. C. de Marrais, after the sedenions are the 32-dimensional pathions, the 64-dimensional chingons, the 128-dimensional routons, the 256-dimensional voudons (coined by Tony Smith), and ad infinitum. Except for the term voudon, these terms were all coined by de Marrais. They are summarized in the table below.[1]

Name     Dimension  Etymology                                                 Other names
pathions 32 = 2⁵    32 paths of wisdom of Kabbalah, from the Sefer Yetzirah   trigintaduonions, 32-ions
chingons 64 = 2⁶    64 hexagrams of the I Ching                               sexagintaquattuornions, 64-ions
routons  128 = 2⁷   Massachusetts Route 128, of the "Massachusetts Miracle"   centumduodetrigintanions, 128-ions
voudons  256 = 2⁸   256 deities of the Ifá pantheon of Voodoo or West African Vodún   ducentiquinquagintasexions, 256-ions

Footnotes

  1. de Marrais, Robert P. C. (2002). "Flying Higher Than a Box-Kite: Kite-Chain Middens, Sand Mandalas, and Zero-Divisor Patterns in the 2n-ions Beyond the Sedenions". arXiv:math/0207003. doi:10.48550/arXiv.math/0207003.


Category theory

Category theory is the study of categories, which are collections of objects and morphisms (or arrows) from one object to another. It generalizes many common notions in algebra, such as different kinds of products, the notion of kernel, etc. See Category Theory for additional information.

Definitions & Notations


Definition 1: A (locally small) category C consists of

A collection Ob(C) of objects.
A collection Hom(C) of morphisms.
For any A, B ∈ Ob(C), Hom(A, B) is the subcollection of Hom(C) of morphisms from A to B, where each Hom(A, B) is required to be a set (hence the term locally small).

These obey the following axioms:

There is a notion of composition. If f ∈ Hom(A, B) and g ∈ Hom(B, C), then f and g are called a composable pair. Their composition is a morphism g∘f ∈ Hom(A, C).
Composition is associative: h∘(g∘f) = (h∘g)∘f whenever the composition is defined.
For any object A, there is an identity morphism id_A ∈ Hom(A, A) such that if B, C are objects, and f ∈ Hom(A, B), g ∈ Hom(C, A), then f∘id_A = f and id_A∘g = g.

Note that we demand neither the collection of objects nor the collection of morphisms to be sets; if they are both in fact sets, then we call our category small.
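
The axioms can be made concrete with a toy encoding of the category 2 of Example 5(iii) below; this Python sketch (the table-based encoding is purely illustrative) checks the identity laws:

# Objects A, B; morphisms id_A, id_B and f : A -> B, stored as (domain, codomain).
MORPHISMS = {"id_A": ("A", "A"), "id_B": ("B", "B"), "f": ("A", "B")}

def compose(g, f):
    """Composition g∘f, defined only for a composable pair (cod f = dom g)."""
    dom_f, cod_f = MORPHISMS[f]
    dom_g, cod_g = MORPHISMS[g]
    assert cod_f == dom_g, "not a composable pair"
    if f.startswith("id_"):
        return g            # g∘id = g
    if g.startswith("id_"):
        return f            # id∘f = f
    raise ValueError("no other composites exist in this category")

assert compose("f", "id_A") == "f"
assert compose("id_B", "f") == "f"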

Definition 2: A morphism f has associated with it two functions dom and cod, called domain and codomain respectively, such that f ∈ Hom(A, B) if and only if dom(f) = A and cod(f) = B. Thus two morphisms f, g are composable (as g∘f) if and only if cod(f) = dom(g).

Remark 3: Unless confusion is possible, we will usually not specify which Hom-set a given morphism belongs to. Also, unless several categories are in play, we will usually not write A ∈ Ob(C), but just "A is an object". We may write f : A → B to implicitly indicate the Hom-set f belongs to. We may also omit the composition symbol, writing simply gf for g∘f.

Basic Properties


Lemma 4: Let A be an object of a category. The identity morphism for A is unique.

Proof: Assume id_A and id′_A are identity morphisms for A. Then id_A = id_A∘id′_A = id′_A. ∎

Example 5: We present some of the simplest categories:

i) 0 is the empty category, with no objects and no morphisms.
ii) 1 is the category containing only a single object and its identity morphism. This is the trivial category.
iii) 2 is the category with two objects, A and B, their identity morphisms, and a single morphism f : A → B.
iv) We can also have a category like 2, but where we have two morphisms f, g : A → B with f ≠ g. Then f and g are called parallel morphisms.
v) 3 is the category with three objects A, B, C. We have f : A → B, g : B → C, and their composite g∘f : A → C.

Initial and Final Objects


Definition An object I in a category is called initial, or cofinal, if for any object A there exists a unique morphism I → A.

Lemma If I and J are initial objects, then they are isomorphic.

Proof: Let f : I → J and g : J → I be the unique morphisms between I and J. Given that both I and J have a unique endomorphism because of their initiality, this morphism must be the identity. Therefore g∘f and f∘g are the respective identity morphisms, making I and J isomorphic. ∎

Definition An object T in a category is called final, or coinitial, if for any object A there exists a unique morphism A → T.

Lemma If T and T′ are final objects, then they are isomorphic.

Proof Pass to the statement about initial objects in the opposite category. ∎

Some examples of categories

  • Set: the category whose objects are sets and whose morphisms are maps between sets.
  • FinSet: the category whose objects are finite sets and whose morphisms are maps between finite sets.
  • The category whose objects are open subsets of Rⁿ and whose morphisms are continuous (differentiable, smooth) maps between them.
  • The category whose objects are smooth (differentiable, topological) manifolds and whose morphisms are smooth (differentiable, continuous) maps.
  • Let K be a field. Then we can define Vect_K: the category whose objects are vector spaces over K and whose morphisms are linear maps between vector spaces over K.
  • Grp: the category whose objects are groups and whose morphisms are homomorphisms between groups.

In all the examples given thus far, the objects have been sets with the morphisms given by set maps between them. This is not always the case. There are some categories where this is not possible, and others where the category doesn't naturally appear in this way. For example:

  • Let C be any category. Then its opposite category C^op is a category with the same objects, and all the arrows reversed. More formally, a morphism in C^op from an object A to an object B is a morphism from B to A in C.
  • Let M be any monoid. Then we can define a category with a single object, with the morphisms from that object to itself given by the elements of M, and with composition given by multiplication in M.
  • Let G be any group. Then we can define a category with a single object, with the morphisms from that object to itself given by the elements of G, and with composition given by multiplication in G.
  • Let C be any small category, and let D be any category. Then we can define a category whose objects are functors from C to D and whose morphisms are natural transformations between the functors from C to D.
  • Cat: the category whose objects are small categories and whose morphisms are functors between small categories.


Lattice theory

A lattice is a poset such that each pair of elements has a unique least upper bound and a unique greatest lower bound.


Matroids

A matroid is an algebraic construct that is related to the notion of independence.

Matroids are an abstraction of several combinatorial objects, among them graphs and matrices. The word matroid was coined by Whitney in 1935 in his landmark paper "On the abstract properties of linear dependence". In defining a matroid Whitney tried to capture the fundamental properties of dependence that are common to graphs and matrices. Almost simultaneously, Birkhoff showed that a matroid can be interpreted as a geometric lattice. Mac Lane showed that matroids have a geometric representation in terms of points, lines, planes, dimension-3 spaces, etc. Often the term combinatorial geometry is used instead of simple matroid. However, combinatorial geometry has another meaning in mathematical literature. Rank-3 combinatorial geometries are frequently called linear spaces. Matroids are a unifying concept in which some problems in graph theory, design theory, coding theory, and combinatorial optimization become simpler to understand.



Authors

Authors:


Manual of Style

This section defines the style rules that should be applied throughout the Abstract Algebra wikibook.

Purpose


This is intended as a University-level textbook for students of mathematics. Readers are therefore expected to be familiar with math fundamentals, and with the material in the Algebra and Linear Algebra wikibooks.

Language


The style of the language should be that of an ordinary math textbook: clear and precise but without talking down to the reader. Personal comments should be avoided.

The book should use standard American English spelling throughout.

Structure


Each page should read like a section of a paper textbook, with external links to further reading where appropriate.(?)

Chapter and section titles


For top-level chapter titles, all words should be capitalised apart from articles and prepositions (e.g. "Equivalence Relations and Congruence Classes"). Sub-headings within a page should only have the first letter of the heading capitalised.

Page length restrictions


There are no additional page length restrictions imposed by this style guide, but do follow the global Wikibooks convention of limiting page lengths at 35k (?)

Templates


With the exception of unimportant comments, each paragraph of text should be preceded with a "type declaration". Common types are Definition, Theorem, Lemma, Example, Note and Remark. A proof should be preceded with the word "Proof" in italics and a colon, and ended with a black square: ∎. Code:

{{Unicode|∎}}

The same counter should be used for all types of paragraphs. Example:

Definition 1: ...

Theorem 2: ...

Proof: ...

Remark 3: ...

etc...


Don't link outside the book (except in the introduction to the book as a whole, which can link to other books that are prerequisites).


There are no navigation templates.

Images


All images should be in Wikimedia commons. Since the vast majority of images will be diagrams to illustrate a point in the surrounding text, thumbnails will probably not be appropriate.

Categories


All book chapters belong under the single category Book:Abstract Algebra

Notation


Left = first usage, right = last usage

  • Groups:
  • Elements of groups:
  • Subgroups:
  • Elements of subgroups:
  • Normal subgroups:
  • Normal series: ,
  • Generic integers:
  • Summation indices:
  • Sequence indices:


The hierarchy of rings

Commutative rings


Definition 11.1:

A ring R with multiplication ⋅ is called commutative if and only if a⋅b = b⋅a for all a, b ∈ R.

Examples 11.2:

  • The whole numbers form a commutative ring.
  • The matrix ring of n-by-n real matrices with matrix multiplication and component-wise addition is not commutative for n ≥ 2.

In commutative rings, a left ideal is also a right ideal and thus a two-sided ideal, and likewise a right ideal is a two-sided ideal.

Integral domains


Definition 11.3:

An integral domain is defined to be a commutative ring (that is, we assume commutativity by definition) such that whenever ab = 0 (a, b ∈ R), then a = 0 or b = 0.

We can characterize integral domains in another way, and this involves the so-called zero-divisors.

Definition 11.4:

Let R be a ring. An element a ∈ R with a ≠ 0 is called a zero-divisor if and only if there exists b ∈ R, b ≠ 0, such that ab = 0 or ba = 0.

Thus, a ring is an integral domain iff it has no zero divisors.

Unique factorisation domains


Theorem 11.?:

Suppose that R is a commutative ring.

Principal ideal domains


Due to its importance in algebra, we'll briefly give the definition of noetherian rings, which is a fairly exhaustive class of rings for which many useful properties hold. The theory of noetherian rings is well-studied, powerful and extensive, and we'll only study it in detail in the wikibook on Commutative Algebra. The reason that we give the definition here is that principal ideal domains are noetherian rings, which will imply that they are, in fact, unique factorisation domains.

Definition 11.?:

Let R be a commutative ring. R is called noetherian if and only if for every sequence of ideals of R such that

I₁ ⊆ I₂ ⊆ I₃ ⊆ ⋯

there exists an N such that I_n = I_N for all n ≥ N.

This condition can be interpreted to state that every ascending chain of ideals stabilizes. Noetherian rings are named in honour of Emmy Noether.

Theorem 11.?:

Every PID is Noetherian.

Proof:

We observed earlier that the set of all ideals of a ring is inductive, with an explicit description of the upper bound of an ascending chain. If therefore we are given an ascending chain of ideals

Theorem 11.?:

Every PID is a UFD.

Proof:

Let R be a PID, and let a ∈ R.

Euclidean domains


Example 11.? (Gaussian integers):

We have already seen that Z is a Euclidean domain. Now consider the ring of Gaussian integers

Z[i] := {a + bi : a, b ∈ Z}

with addition and multiplication induced by that of C. We'll see in the exercises that this is indeed a commutative ring with identity. Furthermore, on it we define a Euclidean function as thus:

N(a + bi) := a² + b².

This is indeed a Euclidean function, the units of Z[i] are ±1 and ±i, and furthermore we may precisely describe the prime elements of Z[i] and set them in relation to the prime elements of Z:

  1. If p is a prime in Z, then either it is already a prime in Z[i], or there is a prime q among the Gaussian primes such that p = q q̄.
  2. If q is a Gaussian prime, then set N(q) = q q̄. Either we have that N(q) is a prime in Z, or N(q) = p², where p is a prime in Z.
  3. In 1., if p is odd, the former case happens if and only if p ≡ 3 mod 4 and the latter if and only if p ≡ 1 mod 4.

Proof:

First, the proof of multiplicativity of N is relegated to the exercises, that is, you'll show in the exercises that

N(zw) = N(z)N(w).

Then we have to prove that division with remainder holds. Let thus z and w ≠ 0 be elements of Z[i].
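
The usual argument rounds the exact quotient in C to the nearest Gaussian integer, which forces N(r) ≤ N(w)/2 < N(w); a numerical sketch (Python, illustrative helper names) of division with remainder:

def gauss_divmod(a, b):
    """Division with remainder in Z[i]: returns (q, r) with a = q*b + r
    and N(r) < N(b), by rounding a/b to the nearest Gaussian integer."""
    z = a / b                              # exact quotient in C
    q = complex(round(z.real), round(z.imag))
    r = a - q * b
    return q, r

a, b = complex(27, 23), complex(8, 1)
q, r = gauss_divmod(a, b)
assert a == q * b + r
assert abs(r) ** 2 < abs(b) ** 2           # Euclidean property: N(r) < N(b)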

Due to N(1) = N(−1) = N(i) = N(−i) = 1, the elements ±1, ±i are units. Any other unit would have to have the form a + bi, where N(a + bi) > 1. Let z be its inverse. Then 1 = N(1) = N((a + bi)z) = N(a + bi)N(z) > 1, a contradiction.

Finally, let's prove the statements about the relation of the Gaussian primes to the integer primes.

  1. Since Z[i] is a Euclidean domain, we have a decomposition of p into prime elements of Z[i], say p = u q₁ ⋯ q_n, where u is a unit in Z[i]. If n = 1, we are done. If n > 1, observe that p² = N(p) = N(q₁) ⋯ N(q_n), and since p is prime, uniqueness of prime factorisation in the integers implies that at most two of the N(q_j) are not one and those that are not are either p or p². If one of them is p², the remaining factors have norm one and are units, so there is exactly one prime factor of p in Z[i] and p is (up to a unit) a Gaussian prime. If two of them equal p, then n = 2 and taking norms gives N(q₁) = p, so that q₁ q̄₁ = p.

Exercises

  1. Prove that the Gaussian integers as defined above do form a commutative ring with identity. Use your knowledge of complex numbers (cf. the corresponding chapter in the wikibook on complex analysis).


Rings, ideals, ring homomorphisms

Basic definitions


Definition 10.1:

A ring is a set R together with two binary operations + and ⋅ and two special elements, the unit 1 and the zero 0, such that:

  1. R is an abelian group with respect to + with neutral element 0.
  2. R is a monoid (that is, a group without inversion) with respect to ⋅ with neutral element 1.
  3. The distributive laws hold: a⋅(b + c) = a⋅b + a⋅c, (a + b)⋅c = a⋅c + b⋅c.

Examples 10.2:

  • The whole numbers with respect to usual addition and multiplication are a ring.
  • Every field is a ring.
  • If R is a ring, then all polynomials over R form a ring. This example will be explained later in the section on polynomial rings.

Definition 10.3:

Let R be a ring. A left ideal of R is a subset I ⊆ R such that the following two things hold:

  1. I is a subgroup of (R, +).
  2. RI ⊆ I, where RI := {ra : r ∈ R, a ∈ I} (closedness by left multiplication).

Replacing closedness by left multiplication by closedness by right multiplication, we can define right ideals, and then both-sided ideals. If I is a both-sided ideal of R, we write I ⊴ R.

We'll now show an important property of the set of all ideals of a given ring, namely that it's inductive. This means:

Definition 10.4:

Let (S, ≤) be a partially ordered set (that is, the usual conditions of transitivity, reflexivity and anti-symmetry are satisfied). S is called inductive if and only if every ascending chain of elements of S (that is, a sequence (s_n) in S such that s₁ ≤ s₂ ≤ s₃ ≤ ⋯) has an upper bound (that is, an element s ∈ S such that s_n ≤ s for all n).

With this definition, we observe:

Theorem 10.5:

If a commutative ring R is given, the set of all ideals of R, partially ordered by inclusion (i.e. ordered within (℘(R), ⊆), where we use the convention of Donald Knuth and denote the power set of a set S by ℘(S)) is inductive.

Proof:

If

I₁ ⊆ I₂ ⊆ I₃ ⊆ ⋯

is an ascending chain of ideals, we set

I := ⋃_n I_n

and claim that I ⊴ R. Indeed, if a, b ∈ I, find n, m such that a ∈ I_n and b ∈ I_m. Then set N := max(n, m), so that a + b ∈ I since I_N ⊴ R. Similarly, if r ∈ R and a ∈ I, pick n such that a ∈ I_n, whence ra ∈ I since I_n ⊴ R. ∎

Residue class rings


Definition and theorem 10.4:

Let R be a ring, and I ⊴ R. Then we define a relation on R as follows:

a ~ b :⇔ a − b ∈ I.

This relation is an equivalence relation, and an equivalence class shall be denoted by a + I for a ∈ R. If we define an addition

(a + I) + (b + I) := (a + b) + I

and a multiplication

(a + I)(b + I) := ab + I,

then these two are well-defined (i.e. independent of the choice of the representatives a and b) and turn the set of equivalence classes R/I into a ring, called the residue class ring with respect to the ideal I.

Proof:

First, we check that ~ is an equivalence relation.

  1. Reflexiveness: a − a = 0 ∈ I since I is an additive subgroup.
  2. Symmetry: if a − b ∈ I, then b − a = −(a − b) ∈ I since inverses are in the subgroup.
  3. Transitivity: Let a − b ∈ I and b − c ∈ I. Then a − c = (a − b) + (b − c) ∈ I, since a subgroup is closed under the group operation.

Then we check that addition and multiplication are well-defined. Let a ~ a′ and b ~ b′. Then

a′ + b′ = a + b + (i + j)

for certain i, j ∈ I.

Furthermore,

a′b′ = (a + i)(b + j) = ab + aj + ib + ij

for these same i, j; the part aj + ib + ij is in I by closedness by left and right multiplication.

The ring axioms directly carry over from the old ring R.
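
For the prototypical example R = Z and I = nZ, the residue class ring is Z/nZ; a minimal Python sketch (the class name is illustrative) of the well-defined operations on representatives:

class Zmod:
    """Residue class ring Z/nZ; classes are represented by 0, ..., n-1."""
    def __init__(self, n):
        self.n = n
    def add(self, a, b):
        return (a + b) % self.n
    def mul(self, a, b):
        return (a * b) % self.n

Z6 = Zmod(6)
assert Z6.add(4, 5) == 3                        # 9 ≡ 3 (mod 6)
assert Z6.mul(2, 3) == 0                        # 2 and 3 are zero-divisors in Z/6Z
assert Z6.add(4 + 6, 5) == Z6.add(4, 5 + 12)    # independent of representatives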

Ring homomorphisms


Definition 10.5:

Let R, S be rings. A ring homomorphism between the two is a map

f : R → S

such that:

  1. For all a, b ∈ R: f(a + b) = f(a) + f(b) and f(ab) = f(a)f(b).
  2. f(1_R) = 1_S (1_R is the unit of R and 1_S that of S).

