Graph theory is my favourite topic in mathematics and computing science, and in this blog post I’ll introduce an algebra of graphs that I’ve been working on for a while. The algebra has become my go-to tool for manipulating graphs, and I hope you will find it useful too.
The roots of this work can be traced back to my CONCUR’09 conference submission that was rightly rejected. I subsequently published a few application-specific papers gradually improving my understanding of the algebra. The most comprehensive description can be found in ACM TECS (a preprint is available here). Here I’ll give a general introduction to the simplest version of the algebra of graphs and show how it can be implemented in Haskell.
Constructing graphs
Let G be a set of graphs whose vertices come from a fixed universe. As an example, we can think of graphs whose vertices are positive integers. A graph g ∈ G can be represented by a pair (V, E) where V is the set of its vertices and E ⊆ V × V is the set of its edges.
The simplest possible graph is the empty graph. I will denote it by ε in formulas and by empty in Haskell code. Hence, ε = (∅, ∅) and ε ∈ G.
A graph with a single vertex v will be denoted simply by v. For example, 1 ∈ G is a graph with a single vertex 1, that is, ({1}, ∅). In Haskell I’ll use vertex to lift a given vertex to the type of graphs.
To construct bigger graphs from the above primitives I’ll use two binary operators overlay and connect, denoted by + and →, respectively. The overlay + of two graphs is defined as:
(V₁, E₁) + (V₂, E₂) = (V₁ ∪ V₂, E₁ ∪ E₂)
In words, the overlay of two graphs is simply the union of their vertices and edges. The definition of connect → is similar:
(V₁, E₁) → (V₂, E₂) = (V₁ ∪ V₂, E₁ ∪ E₂ ∪ V₁ × V₂)
The difference is that when we connect two graphs, we add an edge from each vertex in the left argument to each vertex in the right argument. Here are a few examples:
- 1 + 2 is the graph with two isolated vertices 1 and 2.
- 1 → 2 is the graph with a directed edge from vertex 1 to vertex 2.
- 1 → (2 + 3) is the graph with three vertices {1, 2, 3} and two directed edges (1, 2) and (1, 3). In Haskell we can write connect 1 (overlay 2 3).
- 1 → 1 is the graph with vertex 1 and a self-loop (an edge going from a vertex to itself).
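As a minimal sketch, the set-based definitions above can be transcribed almost literally using Data.Set from the containers package. The names PairGraph, vertexP, overlayP and connectP are hypothetical and exist only for this illustration; vertices are fixed to Int to keep it short.

```haskell
import Data.Set (Set)
import qualified Data.Set as Set

-- A graph as a pair (V, E) of a vertex set and an edge set.
-- PairGraph and the *P functions are illustrative names only.
type PairGraph = (Set Int, Set (Int, Int))

vertexP :: Int -> PairGraph
vertexP v = (Set.singleton v, Set.empty)

-- Overlay: union of the vertex sets and union of the edge sets.
overlayP :: PairGraph -> PairGraph -> PairGraph
overlayP (v1, e1) (v2, e2) = (Set.union v1 v2, Set.union e1 e2)

-- Connect: overlay, plus an edge from every left vertex to every right vertex.
connectP :: PairGraph -> PairGraph -> PairGraph
connectP (v1, e1) (v2, e2) = (Set.union v1 v2, Set.unions [e1, e2, cross])
  where
    cross = Set.fromList [ (a, b) | a <- Set.toList v1, b <- Set.toList v2 ]

-- 1 → (2 + 3): vertices {1, 2, 3}, edges (1, 2) and (1, 3).
oneToTwoThree :: PairGraph
oneToTwoThree = connectP (vertexP 1) (overlayP (vertexP 2) (vertexP 3))

-- 1 → 1: a single vertex with a self-loop.
selfLoop :: PairGraph
selfLoop = connectP (vertexP 1) (vertexP 1)
```

Evaluating oneToTwoThree gives the vertex set {1, 2, 3} with the edges (1, 2) and (1, 3), and selfLoop gives the vertex set {1} with the single edge (1, 1), matching the examples in the list.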
The following type class expresses the above in Haskell:
    {-# LANGUAGE TypeFamilies #-}  -- needed for the associated type Vertex

    class Graph g where
        type Vertex g
        empty   :: g
        vertex  :: Vertex g -> g
        overlay :: g -> g -> g
        connect :: g -> g -> g
Let’s construct some graphs! A graph that contains a given list of unconnected vertices can be constructed as follows:
    vertices :: Graph g => [Vertex g] -> g
    vertices = foldr overlay empty . map vertex
And here is a clique (a fully connected graph) on a given list of vertices:
    clique :: Graph g => [Vertex g] -> g
    clique = foldr connect empty . map vertex
For example, clique [1..] is an infinite clique on all positive integers. We can also construct any graph given its edgelist:
    fromEdgeList :: Graph g => [(Vertex g, Vertex g)] -> g
    fromEdgeList = foldr overlay empty . map edge
      where
        edge (x, y) = vertex x `connect` vertex y
As we will see in the next section, graphs satisfy a few laws and form an algebraic structure that is very similar to a semiring.
Algebraic structure
The structure (G, +, →, ε) introduced above satisfies many usual laws:
- (G, +, ε) is an idempotent commutative monoid
- (G, →, ε) is a monoid
- → distributes over +, e.g. 1 → (2 + 3) = 1 → 2 + 1 → 3
The following decomposition axiom is the only law that makes the algebra of graphs different from a semiring:
x → y → z = x → y + x → z + y → z
Indeed, in a semiring the two operators have different identity elements; let’s denote them ε_+ and ε_→, respectively. By using the decomposition axiom we can prove that they coincide:
    ε_+ = ε_+ → ε_→ → ε_→                      (identity of →)
        = ε_+ → ε_→ + ε_+ → ε_→ + ε_→ → ε_→    (decomposition)
        = ε_+ + ε_+ + ε_→                      (identity of →)
        = ε_→                                  (identity of +)
The idempotence of + also follows from the decomposition axiom.
The following is a minimal set of axioms that describes the graph algebra:
- + is commutative and associative
- (G, →, ε) is a monoid, i.e. → is associative and ε is the identity element
- → distributes over +
- → can be decomposed: x → y → z = x → y + x → z + y → z
An exercise for the reader: prove that ε is the identity of + from the minimal set of axioms above. This is not entirely trivial! Also prove that + is idempotent.
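Testing is no substitute for the proofs asked for above, but the laws can at least be spot-checked on concrete graphs. The sketch below does that for a handful of small graphs represented as (V, E) pairs. All names in it (PairGraph, eps, vtx, the operators .+. and .>., samples, laws) are hypothetical and chosen only for this illustration.

```haskell
import Data.Set (Set)
import qualified Data.Set as Set

-- Graphs as (V, E) pairs; every name in this block is illustrative only.
type PairGraph = (Set Int, Set (Int, Int))

eps :: PairGraph
eps = (Set.empty, Set.empty)

vtx :: Int -> PairGraph
vtx x = (Set.singleton x, Set.empty)

infixl 6 .+.
infixl 7 .>.

-- Overlay (.+.) and connect (.>.), following the set-based definitions.
(.+.), (.>.) :: PairGraph -> PairGraph -> PairGraph
(v1, e1) .+. (v2, e2) = (Set.union v1 v2, Set.union e1 e2)
(v1, e1) .>. (v2, e2) = (Set.union v1 v2, Set.unions [e1, e2, cross])
  where
    cross = Set.fromList [ (a, b) | a <- Set.toList v1, b <- Set.toList v2 ]

-- A few small graphs to test with, including a self-loop.
samples :: [PairGraph]
samples =
    [ eps, vtx 1, vtx 2
    , vtx 1 .>. vtx 2
    , vtx 2 .>. vtx 1 .+. vtx 3
    , vtx 1 .>. vtx 1
    ]

-- Each law as a predicate over three graphs.
laws :: [PairGraph -> PairGraph -> PairGraph -> Bool]
laws =
    [ \x y _ -> x .+. y == y .+. x                                -- overlay is commutative
    , \x y z -> x .+. (y .+. z) == (x .+. y) .+. z                -- overlay is associative
    , \x y z -> x .>. (y .>. z) == (x .>. y) .>. z                -- connect is associative
    , \x _ _ -> x .>. eps == x && eps .>. x == x                  -- eps is the identity of connect
    , \x y z -> x .>. (y .+. z) == x .>. y .+. x .>. z            -- left distributivity
    , \x y z -> (x .+. y) .>. z == x .>. z .+. y .>. z            -- right distributivity
    , \x y z -> x .>. y .>. z == x .>. y .+. x .>. z .+. y .>. z  -- decomposition
    , \x _ _ -> x .+. eps == x && x .+. x == x                    -- derived: identity of overlay, idempotence
    ]

-- True if every law holds for every triple of sample graphs.
allLawsHold :: Bool
allLawsHold = and [ law x y z | law <- laws, x <- samples, y <- samples, z <- samples ]
```

For this set-based model allLawsHold evaluates to True. Commutativity of connect is deliberately not in the list: vtx 1 .>. vtx 2 and vtx 2 .>. vtx 1 are different directed graphs.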
Note that to switch from directed to undirected graphs, it is sufficient to add the axiom of commutativity of →. We will explore this in a future blog post.
Examples
Let’s look at two basic instances of the Graph type class that satisfy the laws from the previous section. The first one, called Relation, adopts our set-based definitions for the overlay and connect operators and is therefore a free instance (i.e. it doesn’t satisfy any other laws):
    import Data.Set (Set)
    import qualified Data.Set as Set

    data Relation a = Relation { domain :: Set a, relation :: Set (a, a) }
        deriving (Eq, Show)

    instance Ord a => Graph (Relation a) where
        type Vertex (Relation a) = a
        empty       = Relation Set.empty Set.empty
        vertex  x   = Relation (Set.singleton x) Set.empty
        overlay x y = Relation (domain   x `Set.union` domain   y)
                               (relation x `Set.union` relation y)
        connect x y = Relation (domain   x `Set.union` domain   y)
                               (relation x `Set.union` relation y `Set.union`
                                Set.fromDistinctAscList
                                    [ (a, b) | a <- Set.elems (domain x)
                                             , b <- Set.elems (domain y) ])
Let’s also make Relation an instance of the Num type class so we can use the + and * operators for convenience.
    instance (Ord a, Num a) => Num (Relation a) where
        fromInteger = vertex . fromInteger
        (+)         = overlay
        (*)         = connect
        signum      = const empty
        abs         = id
        negate      = id
Note: the Num law abs x * signum x == x is satisfied since x → ε = x. In fact, any Graph instance can be made a Num instance if need be. We can now play with graphs using interactive GHC:
    λ> 1 * (2 + 3) :: Relation Int
    Relation {domain = fromList [1,2,3], relation = fromList [(1,2),(1,3)]}
    λ> 1 * (2 + 3) + 2 * 3 == (clique [1..3] :: Relation Int)
    True
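The helper functions from the first section work with Relation as well. The following self-contained sketch repeats the definitions from this post so that it compiles on its own; only the names triangle and isolated are new and purely illustrative.

```haskell
{-# LANGUAGE TypeFamilies #-}
import Data.Set (Set)
import qualified Data.Set as Set

class Graph g where
    type Vertex g
    empty   :: g
    vertex  :: Vertex g -> g
    overlay :: g -> g -> g
    connect :: g -> g -> g

vertices :: Graph g => [Vertex g] -> g
vertices = foldr overlay empty . map vertex

clique :: Graph g => [Vertex g] -> g
clique = foldr connect empty . map vertex

fromEdgeList :: Graph g => [(Vertex g, Vertex g)] -> g
fromEdgeList = foldr overlay empty . map edge
  where
    edge (x, y) = vertex x `connect` vertex y

data Relation a = Relation { domain :: Set a, relation :: Set (a, a) }
    deriving (Eq, Show)

instance Ord a => Graph (Relation a) where
    type Vertex (Relation a) = a
    empty       = Relation Set.empty Set.empty
    vertex  x   = Relation (Set.singleton x) Set.empty
    overlay x y = Relation (domain   x `Set.union` domain   y)
                           (relation x `Set.union` relation y)
    connect x y = Relation (domain   x `Set.union` domain   y)
                           (relation x `Set.union` relation y `Set.union`
                            Set.fromDistinctAscList
                                [ (a, b) | a <- Set.elems (domain x)
                                         , b <- Set.elems (domain y) ])

-- A triangle given by its edge list (illustrative name).
triangle :: Relation Int
triangle = fromEdgeList [(1, 2), (1, 3), (2, 3)]

-- Three vertices and no edges (illustrative name).
isolated :: Relation Int
isolated = vertices [1, 2, 3]
```

Here triangle == clique [1..3] holds, and overlay isolated triangle == triangle because overlaying vertices that are already present changes nothing. One caveat of this sketch: an edge list alone cannot mention a vertex that has no edges, so such vertices have to be overlaid separately with vertices.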
Another simple instance can be obtained by embedding all graph constructors into a basic algebraic datatype:
    data Basic a = Empty
                 | Vertex a
                 | Overlay (Basic a) (Basic a)
                 | Connect (Basic a) (Basic a)
                 deriving Show

    instance Graph (Basic a) where
        type Vertex (Basic a) = a
        empty   = Empty
        vertex  = Vertex
        overlay = Overlay
        connect = Connect
We cannot use the derived Eq instance here, because it would clearly violate the laws of the algebra, e.g. Overlay Empty Empty is structurally different from Empty. However, we can implement a custom Eq instance as follows:
    instance Ord a => Eq (Basic a) where
        x == y = toRelation x == toRelation y
          where
            toRelation :: Ord a => Basic a -> Relation a
            toRelation = foldBasic

    foldBasic :: (Vertex g ~ a, Graph g) => Basic a -> g
    foldBasic Empty         = empty
    foldBasic (Vertex  x  ) = vertex x
    foldBasic (Overlay x y) = overlay (foldBasic x) (foldBasic y)
    foldBasic (Connect x y) = connect (foldBasic x) (foldBasic y)
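To see the custom Eq instance at work, here is a self-contained sketch that repeats the definitions from this post and compares a few structurally different expressions. The Num instance for Basic is my assumption of how the earlier remark about Num carries over to this datatype, and the names lhs and rhs are illustrative only.

```haskell
{-# LANGUAGE TypeFamilies #-}
import Data.Set (Set)
import qualified Data.Set as Set

class Graph g where
    type Vertex g
    empty   :: g
    vertex  :: Vertex g -> g
    overlay :: g -> g -> g
    connect :: g -> g -> g

data Relation a = Relation { domain :: Set a, relation :: Set (a, a) }
    deriving (Eq, Show)

instance Ord a => Graph (Relation a) where
    type Vertex (Relation a) = a
    empty       = Relation Set.empty Set.empty
    vertex  x   = Relation (Set.singleton x) Set.empty
    overlay x y = Relation (domain   x `Set.union` domain   y)
                           (relation x `Set.union` relation y)
    connect x y = Relation (domain   x `Set.union` domain   y)
                           (relation x `Set.union` relation y `Set.union`
                            Set.fromDistinctAscList
                                [ (a, b) | a <- Set.elems (domain x)
                                         , b <- Set.elems (domain y) ])

data Basic a = Empty
             | Vertex a
             | Overlay (Basic a) (Basic a)
             | Connect (Basic a) (Basic a)
             deriving Show

instance Graph (Basic a) where
    type Vertex (Basic a) = a
    empty   = Empty
    vertex  = Vertex
    overlay = Overlay
    connect = Connect

-- Interpret a Basic expression in any Graph instance with the same vertex type.
foldBasic :: (Vertex g ~ a, Graph g) => Basic a -> g
foldBasic Empty         = empty
foldBasic (Vertex  x  ) = vertex x
foldBasic (Overlay x y) = overlay (foldBasic x) (foldBasic y)
foldBasic (Connect x y) = connect (foldBasic x) (foldBasic y)

-- Two expressions are equal if they denote the same Relation.
instance Ord a => Eq (Basic a) where
    x == y = toRelation x == toRelation y
      where
        toRelation :: Ord a => Basic a -> Relation a
        toRelation = foldBasic

-- Assumed Num instance, mirroring the one given for Relation.
instance Num a => Num (Basic a) where
    fromInteger = vertex . fromInteger
    (+)         = overlay
    (*)         = connect
    signum      = const empty
    abs         = id
    negate      = id

-- Both sides of the distributivity law, as different expression trees.
lhs, rhs :: Basic Int
lhs = 1 * (2 + 3)
rhs = 1 * 2 + 1 * 3
```

With these definitions lhs == rhs is True even though show lhs and show rhs differ, and Overlay Empty Empty == (Empty :: Basic Int) is True as well. Directed edges are still told apart: (1 * 2 :: Basic Int) == 2 * 1 is False.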
The Basic instance is useful because it allows us to represent densely connected graphs more compactly. For example, clique [1..n] :: Basic Int has a linear-size representation in memory, while clique [1..n] :: Relation Int stores each edge separately and therefore takes O(n²) memory. As I will demonstrate in future blog posts, we can exploit compact graph representations to derive algorithms that are asymptotically faster on dense graphs than existing graph algorithms operating on edgelists.
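The difference in size is easy to observe by counting. The sketch below repeats the definitions from this post and adds three hypothetical helpers: size counts the constructors of a Basic expression, basicCliqueSize applies it to a clique on n vertices, and relationCliqueEdges counts the edges stored by Relation for the same clique. Constructor and edge counts are only a rough proxy for actual memory use.

```haskell
{-# LANGUAGE TypeFamilies #-}
import Data.Set (Set)
import qualified Data.Set as Set

class Graph g where
    type Vertex g
    empty   :: g
    vertex  :: Vertex g -> g
    overlay :: g -> g -> g
    connect :: g -> g -> g

clique :: Graph g => [Vertex g] -> g
clique = foldr connect empty . map vertex

data Relation a = Relation { domain :: Set a, relation :: Set (a, a) }
    deriving (Eq, Show)

instance Ord a => Graph (Relation a) where
    type Vertex (Relation a) = a
    empty       = Relation Set.empty Set.empty
    vertex  x   = Relation (Set.singleton x) Set.empty
    overlay x y = Relation (domain   x `Set.union` domain   y)
                           (relation x `Set.union` relation y)
    connect x y = Relation (domain   x `Set.union` domain   y)
                           (relation x `Set.union` relation y `Set.union`
                            Set.fromDistinctAscList
                                [ (a, b) | a <- Set.elems (domain x)
                                         , b <- Set.elems (domain y) ])

data Basic a = Empty
             | Vertex a
             | Overlay (Basic a) (Basic a)
             | Connect (Basic a) (Basic a)
             deriving Show

instance Graph (Basic a) where
    type Vertex (Basic a) = a
    empty   = Empty
    vertex  = Vertex
    overlay = Overlay
    connect = Connect

-- Number of constructors in a Basic expression (hypothetical size measure).
size :: Basic a -> Int
size Empty         = 1
size (Vertex _)    = 1
size (Overlay x y) = 1 + size x + size y
size (Connect x y) = 1 + size x + size y

-- Constructors used by a clique on n vertices in the Basic representation.
basicCliqueSize :: Int -> Int
basicCliqueSize n = size (clique [1 .. n] :: Basic Int)

-- Edges stored for the same clique in the Relation representation.
relationCliqueEdges :: Int -> Int
relationCliqueEdges n = Set.size (relation (clique [1 .. n] :: Relation Int))
```

For n = 100 the Basic expression has 2n + 1 = 201 constructors (n Connect nodes, n Vertex leaves and one Empty), while the Relation stores n(n - 1)/2 = 4950 edges.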
Summary
I’ve been using the algebra of graphs presented above for several years in a number of different projects and have found it very useful. There are a few flavours of the algebra that I will introduce in follow-up blog posts. They make it possible to work with undirected graphs, transitively closed graphs (also known as partial orders or dependency graphs), graph families, and their various combinations. All these flavours of the algebra can be obtained by extending the set of axioms.
I am working on a Haskell library alga implementing the algebra of graphs and intend to release it soon. Let me know if you have any suggestions on how to improve the above code snippets.