Introduction to Graphs

Recall that a relation models a relationship between objects. Of particular note are binary relations which capture relationships between pairs of objects drawn from the same universe. For example, we might consider:

An ordering relationship $(\leq)$ between members of $\mathbb{N}$.
A friendship relationship between friends.
A connectedness relationship between cities by roads or available flights.
A transition relationship between states in an abstract machine. For example, an idealized traffic light is such an abstract machine where the traffic light switches between "green", "yellow", and "red."

Binary relationships are ubiquitous in mathematics and computer science, and they all have a similar structure: a relation $R : U \times U$ . Can we exploit this structure to talk about all these sorts of relationships in a uniform manner? Is there a set of universal definitions and properties that apply to binary relations?

This is precisely the study of graph theory, the next area of discrete mathematics we'll examine in this course. Graph theory is really the study of binary relations, although we more commonly think of a graph as a visual object with nodes and edges.

Basic Definitions

Consider the following abstract binary relation over universe $U = \{a, b, c, d, e, f\}$.

$R = \{(a, b), (b, c), (c, d), (d, e)\}.$

A graph allows us to visualize these relationships. Here is an example of such a graph for this relation:

A graph for relation R

We call the elements $a, \dots, f$ vertices or nodes of the graph. For each related pair of elements, we draw a line called an edge in our graph.

While our graph is simply a graphical representation of our binary relation, we traditionally represent a graph using a slightly different structure. We say that the graph above is $G = (V, E)$ . That is, graph $G$ is a pair of sets:

$V$ is the set of vertices in the graph. Here $V = \{a, b, c, d, e, f\}$.
$E$ is the set of edges in the graph. Here $E = \{(a, b), (b, c), (c, d), (d, e)\}$.

Definition (Graph)

A graph $G$ is a pair of sets:

$V$ , the set of vertices or nodes of the graph.
$E : V \times V$ , the set of edges of the graph.

Because we talk about edges so much, we frequently write the edge $(a, b) \in E$ as $ab \in E$ , i.e., we drop the pair notation and simply write the vertices together.
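
As a minimal sketch, the pair $G = (V, E)$ can be written down directly with Python sets. The names `V`, `E`, `is_graph`, and `shorthand` are illustrative assumptions rather than part of the definition; the data is the example relation $R$ from above.

```python
# Vertices and edges of the example graph, as plain Python sets.
V = {"a", "b", "c", "d", "e", "f"}
E = {("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")}


def is_graph(vertices, edges):
    """Check that every edge is a pair of elements drawn from `vertices`."""
    return all(u in vertices and v in vertices for (u, v) in edges)


def shorthand(edge):
    """Render the pair (a, b) in the shorthand form 'ab'."""
    u, v = edge
    return u + v


print(is_graph(V, E))                    # every endpoint is a vertex
print(sorted(shorthand(e) for e in E))   # edges in the 'ab' shorthand
```

Note that $f$ appears in `V` but in no edge: a vertex does not need to be related to anything to belong to the graph.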

Exercise (Sketchin')

Draw the graph $G = (V, E)$ where:

$V = \{a, b, c, d, e, f, g\}$.
$E = \{ag, bg, cg, dg, eg, fg\}$.

Variations

The fundamental definition of a graph is a simple riff on a binary relation. We call such graphs simple graphs. However, there exist several variations of graphs that accommodate the wide range of scenarios we might find ourselves in.

Directed versus Undirected Graphs

Because individual relationships are encoded as pairs, the order matters between vertices. For example, the pair $(a, b)$ is distinct from the pair $(b, a)$ . In a directed graph or digraph, we acknowledge this fact and distinguish between the two orderings.

For example, consider the following graph $G = (V, E)$ with

$V = \{a, b, c, d\}$.
$E = \{ab, bc, cd, dc, da\}$.

If we consider this graph directed, we would draw it as follows:

A directed graph

Note that the edges are directed edges where the direction is indicated by an arrowhead. If we were to have two vertices be mutually related, i.e., related in both directions, we need two edges, one for each direction. For example, $c$ and $d$ are mutually related, so we connect them with two edges $c d$ and $d c$ .

In contrast, we can consider $G$ to be undirected where we do not distinguish between the two orderings. Effectively, this means relations are unordered sets rather than ordered pairs, but in terms of notation, we still keep $E : V \times V$. If we consider $G$ to be undirected, we would draw it as follows:

An undirected graph

Here, the edges are undirected, i.e., without arrowheads. Effectively, we treat a single edge pair as relating symmetrically by default, so the edge $ab$ implies that $a$ is related to $b$ and $b$ is related to $a$. Because of this, we should not include symmetric pairs in our set of edges. So we should define $E$ for the above graph as $E = \{ab, bc, cd, da\}$ where we removed the symmetric pair $dc$.

When should we employ a directed versus undirected graph? We should employ a directed graph where it is not assumed that our relation is symmetric for every pair of related vertices. For example, a "loves"-style relationship where $a$ loves $b$ is not inherently symmetric since $b$ might not love $a$. A directed graph allows us to represent this distinction. A directed graph can always represent an undirected graph by explicitly including symmetric edges. Therefore, we can think of an undirected graph as a shortcut where we can avoid writing extra edges if we know that our relation is already symmetric. For example, a "friends"-style relationship is symmetric because $a$ being friends with $b$ implies that $b$ is also $a$'s friend.
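
The claim that a directed graph can always represent an undirected one can be sketched in code. The helper name `symmetric_closure` is an assumption made for illustration; it takes the undirected edge set from the example above and adds the reverse of every pair.

```python
def symmetric_closure(edges):
    """Return the directed edge set containing both orderings of every edge."""
    return set(edges) | {(v, u) for (u, v) in edges}


# Undirected edge set E = {ab, bc, cd, da} from the example.
undirected = {("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")}
directed = symmetric_closure(undirected)

print(len(directed))            # 8: each of the 4 edges in both directions
print(("b", "a") in directed)   # True: the reverse of ab is present
```

Going the other way only works when the directed edge set is already symmetric, which is why the undirected form is best seen as a shorthand.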

Self-loops

Like symmetry, we may or may not take reflexivity of a relation for granted. If we do not take this for granted, i.e., some elements are reflexively related but not all of them are, then we might consider introducing self-loops into a graph. For example, consider the following digraph $G = (V, E)$ with $E = \{ab, bc, ca, aa, cc\}$.

A graph with self-loops

In this graph, $a$ and $c$ are related to themselves, but not $b$ .
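
A short sketch of how self-loops show up in the edge set: they are exactly the pairs whose two components coincide. The function name `self_loops` is a hypothetical helper, not notation from the reading.

```python
def self_loops(edges):
    """Return the vertices that are related to themselves."""
    return {u for (u, v) in edges if u == v}


# E = {ab, bc, ca, aa, cc} from the example digraph.
E = {("a", "b"), ("b", "c"), ("c", "a"), ("a", "a"), ("c", "c")}

print(sorted(self_loops(E)))   # ['a', 'c']; b has no self-loop
```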

Weights and Multi-graphs

Edges encode relations between objects in a graph. We can also carry additional information on these edges dependent on context. Most commonly, we will add numeric weights to our edges, e.g., to capture the distance between cities, or the cost of moving from one state to another. Both directed and undirected graphs can be weighted. As an example, consider the digraph $G = (V, E)$ with $E = \{ab, bc, ca, cd\}$.

A weighted graph

We annotate the edges with a weight whose interpretation depends on context. For example, we can see that the edge $ca$ has weight 5. We represent the weights on our graph formally with an additional structure, a function $W : E \to \mathbb{Z}$, that maps each edge to a weight value. The codomain of $W$ can be whatever type is appropriate for the problem at hand; here we choose the integers ($\mathbb{Z}$). For the above graph, we would define our weight function as:

$W(ab) = 3, \quad W(bc) = 1, \quad W(ca) = 5, \quad W(cd) = -2$

We can also extend our graphs further by extending $E$ to be a multiset, a set that tracks duplicate elements. This allows us to express the idea of multiple edges, e.g., with different weights according to $W$ .
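
One way to sketch the weight function in code is as a dictionary keyed by edge, with a `collections.Counter` standing in for a multiset of edges. The weights are read off the weight function above (with $W(ca) = 5$ as stated in the text); the names `W`, `total_weight`, and `multi_E` are assumptions for illustration.

```python
from collections import Counter

# Weight function W : E -> Z as a dictionary keyed by edge.
W = {("a", "b"): 3, ("b", "c"): 1, ("c", "a"): 5, ("c", "d"): -2}
E = set(W)  # the edge set is exactly the domain of W


def total_weight(path):
    """Sum the weights of the consecutive edges along a list of vertices."""
    return sum(W[(u, v)] for u, v in zip(path, path[1:]))


print(W[("c", "a")])                        # 5
print(total_weight(["a", "b", "c", "d"]))   # 3 + 1 + (-2) = 2

# A multiset of edges: Counter tracks how many copies of each edge exist.
multi_E = Counter(E)
multi_E[("a", "b")] += 1    # a second, parallel edge from a to b
print(multi_E[("a", "b")])  # 2
```

With parallel edges, a dictionary keyed by the pair alone can no longer hold one weight per edge, so a multi-graph would need some extra way to tell the copies apart.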

Simple Graphs Revisited

Now that we have introduced various variations on a graph, we can finally come back and formally define a simple graph as a graph with no such variations.

Definition (Simple Graph)

A simple graph is an undirected, unweighted graph with no self-loops.

In closing, we have many variations of a graph that we might consider. In successive readings, we'll consider various analyses over graphs and problems we might try to solve. The beauty of graph theory is that because graphs are so general, by defining and solving problems in terms of graphs, we can apply our solutions to a whole host of problems!

Exercise (What's That?, ‡)

Consider the following formal definition of an abstract graph $G = (V, E)$ with:

$V = \{a, b, c, d, e\}$
$E = \{(a, b), (a, c), (a, d), (a, e), (c, d), (c, e), (b, e)\}$

Draw $G$ .
Instantiate this abstract graph to a real-life scenario. Describe what objects the vertices $V$ represent and what relationship between objects is captured by $E$ .
Observe that $c$, $d$, and $e$ are mutually connected in this graph, i.e., each vertex has an edge to the others. Interpret the fact that they are mutually connected in your real-life scenario. Does the fact that they are mutually connected have special meaning in the scenario you envisioned?

Trees

Frequently, the relations we draw between objects are hierarchical in nature. That is, the objects have a parent-to-child relationship, for example:

A literal parent and their children.
A manager and the employees that report to them.
A folder and the files it contains.

We represent these relationships with a specialized kind of graph called a tree.

Definition (Tree)

A tree is an undirected graph that contains no cycles.

Here is an artificial example of a tree with six nodes, $a$ through $f$:

An example abstract tree with six nodes

We distinguish a vertex of the tree as its root. Here we'll consider $a$ to be the root of the tree although any of the vertices could be considered the root. By convention, we draw trees "upside down" with the root at the top and the tree growing downwards.

The root allows us to categorize the vertices of the tree by their distance from the root. We call a collection of vertices that are the same distance away from the root a level.

Definition (Level (Tree))

Let $T = (V, E)$ be a tree with a distinguished root $r$. Define the $i$th level of a tree, denoted $L_{i}$, to be the set of vertices that are at distance $i$ from $r$:

$L_{i} = \{ v \mid v \in V \text{ and there exists a path of length } i \text{ from } r \text{ to } v \}.$

Definition (Height)

The height of a tree is the maximum level of any of its vertices.

In our above example:

$L_{0} = \{a\}$
$L_{1} = \{b, c\}$
$L_{2} = \{d, e, f\}$

And the tree has height 2. Note that $L_{k}$ for any $k > 2$ is $\emptyset$ since no node is more than distance 2 from the root $a$.
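
The level computation can be sketched as a level-by-level sweep outward from the root. The edge set below is an assumption: the figure itself is not reproduced here, so the edges are inferred from the levels above and from the traversal sequences given later in this reading. The function name `levels` is likewise illustrative.

```python
def levels(vertices, edges, root):
    """Group the vertices of an undirected tree by distance from `root`."""
    # Build an adjacency map; each undirected edge is usable in both directions.
    neighbors = {v: set() for v in vertices}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    result = [{root}]
    seen = {root}
    while True:
        # The next level is every unseen neighbor of the current level.
        nxt = {w for v in result[-1] for w in neighbors[v]} - seen
        if not nxt:
            return result
        seen |= nxt
        result.append(nxt)


V = {"a", "b", "c", "d", "e", "f"}
E = {("a", "b"), ("a", "c"), ("b", "d"), ("b", "e"), ("c", "f")}  # assumed edges

L = levels(V, E, "a")
for i, level in enumerate(L):
    print(i, sorted(level))
print("height:", len(L) - 1)   # height: 2
```

Choosing a different root gives different levels for the same tree, which is why the root has to be fixed before levels or height are meaningful.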

With levels defined, we can now formally define the parent-child relationship that characterizes trees:

Definition (Parent)

The parent of a vertex $v$ at level $i$ of a tree is the node $u$ for which the edge $(u, v)$ is in the tree and $u$ is at level $i - 1$ .

Definition (Children)

The children of a vertex $u$ at level $i$ of a tree are the nodes $v$ for which the edge $(u, v)$ is in the tree and $v$ is at level $i + 1$.

Because a tree contains no cycles and the tree is rooted at a particular node, it follows that every vertex of a tree except the root has exactly one parent. (This is a worthwhile claim to prove yourself for practice!)
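
As a sketch of the parent relationship, the function below walks an undirected tree outward from the root and records, for every other vertex, the neighbor through which it was first reached. The edge set is assumed (inferred from the levels and traversal sequences in this reading), and `parents` is a hypothetical helper name.

```python
def parents(edges, root):
    """Map every non-root vertex of an undirected tree to its parent."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    parent = {}
    frontier = [root]
    while frontier:
        u = frontier.pop()
        for v in neighbors.get(u, ()):
            # Skip the root and any vertex that already has a parent
            # (this includes u's own parent).
            if v != root and v not in parent:
                parent[v] = u
                frontier.append(v)
    return parent


E = {("a", "b"), ("a", "c"), ("b", "d"), ("b", "e"), ("c", "f")}  # assumed edges
parent = parents(E, "a")
print(sorted(parent.items()))
```

Because the result is a dictionary, each non-root vertex is mapped to exactly one parent, mirroring the claim above; this is an illustration of the claim, not a proof of it.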

We can categorize trees by the maximal number of children any single node possesses. We call this value the tree's fan-out:

Definition (Fan-out)

The fan-out $k$ of a tree is the maximum number of children that any one vertex of the tree possesses. We call such a tree a $k$-ary tree or a $k$-tree for short. Notably, a $1$-tree is a sequence or a list, and a $2$-tree is a binary tree.
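
A minimal sketch of the fan-out computation, assuming the tree is given as a dictionary from each vertex to the list of its children. The dictionary `children` encodes the example tree as inferred from the levels and traversal sequences in this reading, and `fan_out` is an illustrative name.

```python
def fan_out(children):
    """Return the largest number of children held by any single vertex."""
    return max((len(kids) for kids in children.values()), default=0)


# Assumed child lists for the example tree rooted at a.
children = {
    "a": ["b", "c"],
    "b": ["d", "e"],
    "c": ["f"],
    "d": [], "e": [], "f": [],
}

print(fan_out(children))                  # 2, so the example is a binary tree
print(fan_out({"x": ["y"], "y": []}))     # 1, a list-like tree
```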

Finally, we've restricted ourselves to connected trees, i.e., trees in which all vertices are mutually reachable. If a graph is not connected, but all of its connected components are themselves trees, then we call the graph a forest:

Definition (Connected Components)

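Call an undirected graph $G = (V, E)$ connected if there exists a path between every pair of vertices in $G$. A connected component $G^{'}$ of $G$ is a maximal sub-graph of $G$ that is itself connected, i.e., one that cannot be enlarged with further vertices or edges of $G$ while remaining connected.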

Definition (Forest)

A graph is a forest if all of its connected components are trees.
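
The two definitions above can be sketched together. The code assumes the undirected edge set contains no symmetric duplicates, and it relies on the standard fact that a connected graph on $n$ vertices is acyclic exactly when it has $n - 1$ edges. The function names and the sample graph are assumptions for illustration.

```python
def connected_components(vertices, edges):
    """Return the connected components of an undirected graph as vertex sets."""
    neighbors = {v: set() for v in vertices}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    components, seen = [], set()
    for start in vertices:
        if start in seen:
            continue
        # Collect everything reachable from `start`.
        component, stack = set(), [start]
        while stack:
            u = stack.pop()
            if u not in component:
                component.add(u)
                stack.extend(neighbors[u] - component)
        seen |= component
        components.append(component)
    return components


def is_forest(vertices, edges):
    """Check that every component with n vertices has exactly n - 1 edges."""
    for component in connected_components(vertices, edges):
        inside = [e for e in edges if e[0] in component]
        if len(inside) != len(component) - 1:
            return False
    return True


V = {"a", "b", "c", "d", "e"}
E = {("a", "b"), ("b", "c"), ("d", "e")}   # hypothetical two-tree forest

print(sorted(sorted(c) for c in connected_components(V, E)))
print(is_forest(V, E))                     # True
print(is_forest(V, E | {("c", "a")}))      # False: a, b, c now form a cycle
```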

Directed Acyclic Graphs

We generally assume that a tree is an undirected graph. However, we can apply the same concepts to a directed graph. This results in a kind of graph called a directed acyclic graph (DAG):

Definition (Directed Acyclic Graph)

A directed acyclic graph (DAG) is a directed graph that contains no cycles.

DAGs are outside the scope of our discussion of the basics of graphs, but be aware that DAGs have their own interesting properties and operations distinct from trees!
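
For readers who want to see the definition in action, here is a small sketch of an acyclicity check for directed graphs. It repeatedly removes vertices with no incoming edges (an approach commonly known as Kahn's algorithm, which is not covered in this reading); the name `is_dag` and the sample graphs are assumptions.

```python
def is_dag(vertices, edges):
    """Return True when the directed graph has no directed cycle."""
    indegree = {v: 0 for v in vertices}
    for _, v in edges:
        indegree[v] += 1

    # Repeatedly remove vertices that have no remaining incoming edges.
    ready = [v for v in vertices if indegree[v] == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for x, y in edges:
            if x == u:
                indegree[y] -= 1
                if indegree[y] == 0:
                    ready.append(y)
    # Any vertex that could not be removed lies on or behind a cycle.
    return removed == len(vertices)


V = {"a", "b", "c", "d"}
print(is_dag(V, {("a", "b"), ("b", "c"), ("a", "c")}))   # True
# The directed example from earlier: cd and dc form a cycle.
print(is_dag(V, {("a", "b"), ("b", "c"), ("c", "d"), ("d", "c"), ("d", "a")}))  # False
```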

Depth-First Tree Traversals

Previously, we learned about depth-first and breadth-first traversals for graphs. Breadth-first traversals remain the same: we traverse the vertices of the tree by order of increasing level. However, the specialized nature of trees lets us specify different sorts of depth-first traversals, in particular, for binary trees where every node possesses at most two children. In such a tree, we call one child the left child and the other child the right child.

With this in mind, we can describe a recursive algorithm for depth-first search specialized to binary trees. In the Python-like code below, we use dot-notation to denote children, e.g., u.l for u's left child and u.r for u's right child.

def preDFS(u):
    if u is None:    # a missing child ends the recursion
        return
    visit(u)         # visit the node before its children
    preDFS(u.l)
    preDFS(u.r)

Note that we visit the node u before visiting its children. Performing this traversal on our example tree from the beginning of this section yields the sequence:

$a, b, d, e, c, f .$

This kind of depth-first traversal of the tree is called a pre-order traversal of the tree. We first visit the current element and then visit its children.

In contrast, a post-order traversal of the tree visits the children first and then the current node last.

def postDFS(u):
    if u is None:    # a missing child ends the recursion
        return
    postDFS(u.l)
    postDFS(u.r)
    visit(u)         # visit the node after both children

A post-order traversal of the tree yields the following sequence:

$d, e, b, f, c, a .$

Finally, we can intermix visiting children and visiting the current node with an in-order traversal:

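def inDFS(u):
    if u is None:    # a missing child ends the recursion
        return
    inDFS(u.l)
    visit(u)         # visit the node between its two children
    inDFS(u.r)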

An in-order traversal yields the following sequence:

$d, b, e, a, c, f .$
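
The three depth-first orders, along with the breadth-first order, can be checked with a small runnable sketch. The `Node` class and function names are assumptions for illustration. The tree shape is also partly assumed: placing $f$ as the right child of $c$ is inferred from the in-order sequence above, since the figure is not reproduced here.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    label: str
    l: Optional["Node"] = None   # left child
    r: Optional["Node"] = None   # right child


def pre_order(u, out):
    if u is not None:
        out.append(u.label)      # current node first
        pre_order(u.l, out)
        pre_order(u.r, out)
    return out


def post_order(u, out):
    if u is not None:
        post_order(u.l, out)
        post_order(u.r, out)
        out.append(u.label)      # current node last
    return out


def in_order(u, out):
    if u is not None:
        in_order(u.l, out)
        out.append(u.label)      # current node between its children
        in_order(u.r, out)
    return out


def breadth_first(root):
    out, queue = [], deque([root])
    while queue:
        u = queue.popleft()
        if u is not None:
            out.append(u.label)
            queue.extend([u.l, u.r])
    return out


# Example tree: a has children b and c; b has d and e; c has right child f.
tree = Node("a", Node("b", Node("d"), Node("e")), Node("c", None, Node("f")))

print(pre_order(tree, []))    # ['a', 'b', 'd', 'e', 'c', 'f']
print(post_order(tree, []))   # ['d', 'e', 'b', 'f', 'c', 'a']
print(in_order(tree, []))     # ['d', 'b', 'e', 'a', 'c', 'f']
print(breadth_first(tree))    # ['a', 'b', 'c', 'd', 'e', 'f']
```

Collecting labels into a list rather than calling a separate `visit` function keeps the sketch self-contained and makes the resulting order easy to compare against the sequences in the text.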

CSC 208: Discrete Structures