Eugene Lawler was an American computer scientist, professor of UC Berkeley and was one of the founders of the field of Combinatorial Optimization.
Lawler made important contributions on branch and bound algorithms, dynamic programming, and was also the first one to observe that matroid intersection could be done in polynomial time.
He also proved that two of Karp’s 21 NP-Complete problems, The Directed Hamiltonian Cycle and 3-Dimensional Matching were NP-Complete.
This is the first post in our series of Matroids in Combinatorial Optimization context. They will be mainly based on the Matroids and the Greedy Algorithm chapter from Combinatorial Optimization – Networks and Matroids, by Eugene Lawler.
Hassler Whitney developed Matroids in 1935 in the context of algebraic theory and it was further applied by Jack Edmonds in the context of combinatorial optimization.
Matroids are a structure that allows us to solve problems by always taking local optimal steps and by doing that there’s the guarantee we’ll reach the global optimum. These types of algorithms are known as greedy algorithms.
First we’ll define Matroids and then give some examples of problems to modeled as matroids. Next, we’ll introduce weighted matroids and describe a generic algorithm to solve them and how such algorithm applied to matroids in graphs is actually the famous Kruskal algorithm.
Let be a set of elements and a family of subsets of (family here has the meaning of a set of elements that share some properties and it’s also clearer than using ‘set of subsets’). Let and consider the following properties:
1) and if a subset , then all subsets of belong to as well.
2) If with and with , then there is an element such that . (Henceforth, by abuse of notation, when we say we mean ).
If satisfies both (1) and (2), we say that is a matroid.
Terminology. An independent set is any set belonging to the family . If there’s no other set containing in we say it’s a maximal independent set. Note that by property (2), all maximal independent sets have the same size.
The rank of a set , denoted by is the largest independent subset of . A minimal dependent set is called circuit.
Special types of matroids
We can model some structures as matroids and take advantage of matroids properties to find solutions to problems involving these structures. In this section we’ll present a few examples of such modelings.
The modelling process consists in defining the set , the family (usually by defining the property that the subsets of must have to be in there) and then proving that satisfies (1) and (2).
A matric matroid is a matroid in the context of matrices. Given a matrix with the set of columns as , if is the family of sets containing only linear independent columns, then is a matric matroid.
Proof. It’s easy to see that any subset of a linear independent (LI) set of columns is also LI, so (1) is straightforward. For (2), we need to prove that given subsets and of and LI columns, respectively, then there must be a column from such that is still LI. Well, if it’s not the case, it means that each column in can be written as linear combination of the LI columns from , which contradicts the fact that is LI.
A graphic matroid is a matroid in the context of graphs. Given a graph , if is the family of arcs that do not form a cycle, then is a graphic matroid.
Proof. It’s possible to show that subsets of edges that are cycle free, correspond to LI columns in the incidence matrix of , so in this case we can also view as a matric matroid of the incidence matrix of .
A matching matroid is a matroid in the context of matchings in graphs.
Let be an undirected graph and let be the family of subsets of nodes such that there’s a matching in that covers all nodes in (we say that an node in is covered if there’s an edge in the matching incident to it, even if the other end of the edge is not in ). Then is a matching matroid.
Sketch of the Proof. It’s easy to verify that satisfies (1). The proof of why it satisfies (2) consists in getting the symmetric difference between matchings covering and vertices respectively and showing that in that difference there will be at least one alternating path such that if we change the alternation will increase the number of matched vertices.
Let be a matroid and a weight function for elements in . The problem we want to solve is to find the independent set that has maximum weight, where the weight of a set is the sum of the weight of its elements.
Let’s assume the elements on each independent set are listed in the non-increasing order of weight, so given sets and , we list them as
such that and
We say is lexicographically greater than if for the first position their components differ, say at index , or in case listing is a prefix of listing (that is, the same way we sort strings).
A set that is not lexicographically less than any other set is said to be lexicographically maximum. We can show such set is also a maximum independent set, because otherwise, by property (2), we can always add more elements to it, making it lexicographically greater.
The following Theorem from Rado and Edmonds states an important property of weighted matroids:
Theorem 1. Let be a matroid. Then
3) For any negative weight function on , a lexicographically maximum set in is also the set with the maximum weight.
Conversely, given and , if (1) and (3) are satisfied, is a matroid.
We say that a set is Gale optimal if all its components are not less than the corresponding components of any other set when these components are listed in non-increasing order. More formally, there exists a one-to-one mapping such that for all .
Note that a Gale optimal set is clearly a lexicographically maximum and thus has optimal weight.
Given that, Gale’s theorem provides another a stronger result than Theorem 1 regarding weighted matroids:
Theorem 2. Let be a matroid. Then
4) For any weight function of elements on , there exists a Gale optimal set .
Conversely, given and , if (1) and (4) are satisfied, is a matroid.
Weighted Matroid Greedy Algorithm. Property (4) allows to use a simple greedy algorithm to find the set of the largest weight in .
In the first step, we look for the single-element set in with the largest weight. By property (2) and (4), we can show that the Gale optimal set contains . Next, we look for the largest element such that . Again, we can show that such elements are contained in the Gale optimal set. We repeat this until we get a maximum independent set, which is also the Gale optimal set.
More generically, we have the following algorithm:
Let be our current solution, which starts as the empty set. At each step, we look for the element with maximum weight not in such that belongs to .
The Maximum Spanning Tree problem and the Kruskal Algorithm.
In the maximum spanning tree problem we are given a connected, undirected graph with non-negative weights on and we want to find a spanning tree (subset of that forms a tree) with the lowest cost.
Recall that a tree is a connected graph without cycles. The graphic matroid represents cycle-free subsets of arcs (or a forest) and thus the maximum independent set of a connected graph is a tree. If we assign non-negative weights to arcs, we can find the optimal Gale independent set which corresponds to the maximum spanning tree.
The Kruskal algorithm is then basically an application of the weighted matroid greedy algorithm for graphic matroids: It first sorts the edges by non-increasing order of weight and then adds an edge to the current solution if it doesn’t form a cycle.
In this post we learned the basics of matroids and weighted matroids. In future posts we’ll study Matroid Intersection, Matroid Partition and other types of matroids like Oriented Matroids.