In the Set Cover problem, we are given a universal set \(U\) and a family of subsets \(S_1, \ldots, S_k \subseteq U\). A set cover is a collection of subsets \(C\) from \(S_1, \ldots, S_k\) whose union is the universal set \(U\). We would like to minimise \(\left\vert{C}\right\vert\).
The problem of finding the optimum \(C\) is NP-Complete, but a greedy algorithm can give an \(O(log_e n)\) approximation to optimal solution.
The greedy algorithm selects the set \(S_i\) containing the largest number of uncovered points at each step, until all of the points have been covered.
Below is an implementation in Python:
def set_cover(universe, subsets): """Find a family of subsets that covers the universal set""" elements = set(e for s in subsets for e in s) # Check the subsets cover the universe if elements != universe: return None covered = set() cover = [] # Greedily add the subsets with the most uncovered points while covered != elements: subset = max(subsets, key=lambda s: len(s - covered)) cover.append(subset) covered |= subset return cover def main(): universe = set(range(1, 11)) subsets = [set([1, 2, 3, 8, 9, 10]), set([1, 2, 3, 4, 5]), set([4, 5, 7]), set([5, 6, 7]), set([6, 7, 8, 9, 10])] cover = set_cover(universe, subsets) print(cover) if __name__ == '__main__': main()
Output:
[set([1, 2, 3, 8, 9, 10]), set([4, 5, 7]), set([5, 6, 7])]