# NP-Completeness

$$NP$$-completeness is a property of some computational problems, which makes them amongst the hardest computational problems, and also at least as hard as each other.

## Definition

A decision problem is a problem with a yes/no answer (or 0/1, true/false).

Polynomial time is a running time for an algorithm which is some polynomial multiplied by the size of the algorithm’s
inputs $$N$$, such as $$N$$, $$log\ N$$, $$N\ log\ N$$, $$N^2$$, or $$N^3$$, but not $$2^N$$ or $$N!$$.

A deterministic algorithm is one which, when it has to pursue several options, needs to pursue them one at a time.
A nondeterministic algorithm is one which, when it has to pursue several options, can pursue them simultaneously.
A nondeterministic algorithm is faster than a deterministic algorithm, but it is no more powerful in the sense of
being able to solve problems that a deterministic algorithm can’t.

The class $$P$$ is the class of decision problems that can be solved in polynomial time by a deterministic algorithm.
The class $$NP$$ is the class of decision problems that can be solved in polynomial time by a nondeterministic algorithm.
It follows from this that $$P$$ is a subset of $$NP$$.

A polynomial reduction is an polynomial time algorithm that can transform an instance of a problem $$A$$
(decision or otherwise) into an instance of another problem $$B$$, such that the solution is preserved, i.e., for every
instance of $$A$$, if the solution of $$A$$ is yes, then the instance of $$B$$ produced by the polynomial transformation will also
be yes, and the same for no, or any other value.

A problem (decision or otherwise) is $$NP$$-hard if every problem in $$NP$$ can be reduced to it by a polynomial reduction.
So $$NP$$-hard can be read as “at least as hard as $$NP$$“.
A problem is $$NP$$-complete if it is $$NP$$-hard and it is in $$NP$$.

So a problem is $$NP$$-hard but not $$NP$$-complete if it isn’t in $$NP$$.
There are two two reasons for this:
1. It isn’t a decision problem, which means it isn’t in $$NP$$
2. It isn’t decidable, e.g., the Turing machine Halting Problem is $$NP$$-hard, but not in $$NP$$ because the definition of $$NP$$
requires there is an algorithm, but by definition no algorithm exists for undecidable problems.

Since every problem in $$NP$$ can be reduced to any $$NP$$-complete problem, but the converse is not true, the $$NP$$-complete
problems can be seen as the hardest problems in $$NP$$.

Most scientists believe that there is no polynomial time algorithm that can solve an $$NP$$-hard problem.
This means that there are problems in $$NP$$ that are not in $$P$$, so $$P \neq NP$$.
This is the $$P-NP$$ Problem.

## Karp’s 21 Problems

Richard Karp’s 1972 paper “Reducibility Among Combinatorial Problems” demonstrated the $$NP$$-completeness of 20 problems by reducing the known $$NP$$-complete problem SATISFIABILITY directly or indirectly to them as in the following diagram:

### Problem definitions

#### SATISFIABILITY

INPUT: Clauses $$C_1, C_2, \dots, C_p$$

PROPERTY: The conjunction of the given clauses is satisfiable;
i.e., there is a set $$S \subseteq \{x_1, x_2, \dots, x_n;\overline{x_1}, \overline{x_2}, \dots, \overline{x_n}\}$$ such
that
a) $$S$$ does not contain a complementary pair of literals and
b) $$S \cap C_k \neq \emptyset, k = 1, 2, \dots, p$$

#### 2. 0-1 INTEGER PROGRAMMING

INPUT: Integer matrix $$C$$ and integer vector $$d$$

PROPERTY: There exists a $$0-1$$ vector $$x$$ such that $$C_x$$ = $$d$$.

#### 3. CLIQUE

INPUT: Graph $$G$$, positive integer $$k$$

PROPERTY: $$G$$ has a set of $$k$$ mutually adjacent nodes.

#### 4. SET PACKING

INPUT: Family of sets $$\{S_j\}$$, positive integer $$\ell$$

PROPERTY: $$\{S_j\}$$ contains $$\ell$$ mutually disjoint sets.

#### 5. NODE COVER

INPUT: Graph $$G^\prime$$, positive integer $$\ell$$

PROPERTY: There is a set $$R \subseteq N^\prime$$ such that $$|R| \le \ell$$ and
every arc is incident with some node in $$R$$.

#### 6. SET COVERING

INPUT: Finite family of finite sets $$\{S_j\}$$

PROPERTY: There is a subfamily $$\{T_h\} \subseteq \{S_j\}$$ containing $$\le k$$
sets such that $$\bigcup T_h = \bigcup S_j$$.

#### 7. FEEDBACK NODE SET

INPUT: Digraph $$H$$, positive integer $$k$$

PROPERTY: There is a set $$R \subseteq V$$ such that every (directed)
cycle of $$H$$ contains a node in $$R$$.

#### 8. FEEDBACK ARC SET

INPUT: Digraph $$H$$, positive integer $$k$$

PROPERTY: There is a set $$S \subseteq E$$ such that every (directed)
cycle of $$H$$ contains an arc in $$S$$.

#### 9. DIRECTED HAMILTON CIRCUIT

INPUT: Digraph $$H$$

PROPERTY: $$H$$ has a directed cycle which includes each node exactly once.

#### 10. UNDIRECTED HAMILTON CIRCUIT

INPUT: Graph $$G$$

PROPERTY: $$G$$ has a cycle which includes each node exactly once.

#### 11. SATISFIABILITY WITH AT MOST 3 LITERALS PER CLAUSE

INPUT: Clauses $$D_1, D_2, \ldots, D_r$$, each consisting of at most 3
literals from the set $$\{u_1, u_2, \ldots, u_m\} \cup \{\overline{u_1}, \overline{u_2}, \ldots, \overline{u_m}\}$$

PROPERTY: The set {$$D_1, D_2, \ldots, D_r$$} is satisfiable.

#### 12. CHROMATIC NUMBER

INPUT: Graph $$G$$, positive integer $$k$$

PROPERTY: There is a function $$\phi: N \to \mathbb{Z_k}$$ such that, if $$u$$
and $$v$$ are adjacent, then $$\phi(u) \ne \phi(v)$$.

#### 13. CLIQUE COVER

INPUT: Graph $$G$$, positive integer $$\ell$$

PROPERTY: $$N^\prime$$ is the union of $$\ell$$ or fewer cliques.

#### 14. EXACT COVER

INPUT: Family $$\{S_j\}$$ of subsets of a set $$\{u_i, i = 1, 2, \ldots, t\}$$

PROPERTY: There is a subfamily $$\{T_h\} \subseteq \{S_j\}$$ such that the
sets $$T_h$$ are disjoint and $$\bigcup T_h = \bigcup S_j = \{u_i, i = 1, 2, \ldots, t\}$$.

#### 15. HITTING SET

INPUT: Family $$\{U_i\}$$ of subsets of $$\{S_j, j = 1, 2, \ldots, r\}$$

PROPERTY: There is a set $$W$$ such that, for each $$i$$, $$|W \cap U_i| = 1$$.

#### 16. STEINER TREE

INPUT: Graph $$G$$, $$R \subseteq N$$, weighting function $$w: A \to \mathbb{Z}$$,
positive integer k

PROPERTY: $$G$$ has a subtree of weight $$\leq k$$ containing the set of nodes in $$R$$.

#### 17. 3-DIMENSIONAL MATCHING

INPUT: Set $$U \subseteq T \times T \times T$$, where $$T$$ is a finite set

PROPERTY: There is a set $$W \subseteq U$$ such that $$|W| = |T|$$ and
no two elements of $$W$$ agree in any coordinate.

#### 18. KNAPSACK

INPUT: $$(a_1, a_2, \ldots, a_r, b) \in \mathbb{Z}^{n+1}$$

PROPERTY: $$\sum a_j x_j = b$$ has a $$0-1$$ solution.

#### 19. JOB SEQUENCING

INPUT:
“execution time vector” $$(T_1, \ldots, T_P) \in \mathbb{Z}^P$$
“deadline vector” $$(D_1, \ldots, D_P) \in \mathbb{Z}^P$$
“penalty vector” $$(P_1, \ldots, P_P) \in \mathbb{Z}^P$$
positive integer $$k$$

PROPERTY: There is a permutation $$\pi$$ of $$\{1, 2, \ldots, P$$} such that

$$\left( \sum_{j=1}^P [\text{if } T_{\pi(1)} + \dots + T_{\pi(j)} \gt D_{\pi(j)} \text{ then } P_{\pi(j)} \text{ else } 0]\right) \le k .$$

#### 20. PARTITION

INPUT: $$(c_1, c_2, \ldots, c_s) \in \mathbb{Z}^S$$

PROPERTY: There is a set $$I \subseteq \{1, 2, \ldots, s\}$$ such that

$$\sum_{h \in I} c_h = \sum_{h \notin I} c_h$$

#### 21. MAX CUT

INPUT: graph $$G$$, weighting function $$w: A \to \mathbb{Z}$$, positive
integer $$W$$

PROPERTY: There is a set $$S \subseteq N$$ such that

$$\sum_{\{u, v\} \in A, u \in S v \notin S} w(\{u, v\}) \ge W.$$

### Reductions

#### SATISIFIABILITY $$\propto$$ 0,1 INTEGER PROGRAMMING

$$c_{ij} = \begin{cases} 1 & if\; x_j \in C_i\quad i=1, 2, \dots, p \\ -1 & if\; \overline{x_j} \in C_i\quad i=1, 2, \dots, n \\ 0 & otherwise\; \end{cases}$$
The vector $$b$$ is dimension $$p$$ with:
$$b_i$$ = 1 – (the number of complemented variables in $$C_i$$),
$$i = 1, 2, \ldots, p$$.

#### SATISFIABILITY $$\propto$$ CLIQUE

$$N = \{\langle \sigma, i\rangle | \sigma$$ is a literal and occurs in $$C_i\}$$
$$A = \{\{\langle \sigma, i\rangle, \langle\delta, j\rangle\} | i \neq j$$ and $$\sigma \neq \overline{\delta}\}$$
$$k = p$$, the number of clauses

#### CLIQUE $$\propto$$ SET PACKING

Assume $$N = \{1, 2, \ldots, n\}$$.
The elements of the sets $$S_1, S_2, \ldots, S_n$$ are those two-element sets of nodes $$\{i, j\}$$ not in $$A$$.
$$S_i = \{\{i, j\}| \{i, j\} \notin A\}, i = 1, 2, \ldots, n$$.
$$\ell = k$$.

#### CLIQUE $$\propto$$ NODE COVER

$$G^\prime$$ is the complement of $$G$$.
$$\ell = |N| – k$$.

#### NODE COVER $$\propto$$ SET COVERING

Assume $$N^\prime = \{0, 1, \ldots, n\}$$.
The elements are the arcs of $$G^\prime$$.
$$S_j$$ is the set of arcs incident with node $$j$$.
$$k = \ell$$.

#### NODE COVER $$\propto$$ FEEDBACK NODE SET

$$V = N^\prime$$.
$$E = \{\langle u, v\rangle | \{u,v\} \in A^\prime\}$$.
$$k = \ell$$.

#### NODE COVER $$\propto$$ FEEDBACK ARC SET

$$V = N^\prime \times \{0, 1\}$$.
$$E = \{\langle \langle u, 0 \rangle, \langle u, 1 \rangle \rangle| u \in N^\prime\} \cup \{\langle \langle u, 1\rangle, \langle v, 0\rangle\rangle| \{u, v\} \in A^\prime\}$$.
$$k = \ell$$.

#### NODE COVER $$\propto$$ DIRECTED HAMILTONIAN CIRCUIT

Without loss of generality assume $$A^\prime = Z_m$$.
$$V = \{a_1, a_2, \dots, a_\ell\} \cup \{\langle u, i, \alpha \rangle\ | u \in N^\prime$$ is incident with $$i \in A^\prime$$ and $$\alpha \in \{0, 1\}\}$$
$$E = \{\langle \langle u, i, 0 \rangle, \langle u, i, 1 \rangle, \rangle | \langle u, i, 0 \rangle \in V\}$$
$$\;\cup\;\{\langle \langle u, i, \alpha \rangle, \langle v, i, \alpha \rangle \rangle | i \in A^\prime, u$$ and $$v$$ are incident with $$i$$, $$\alpha \in \{0, 1\}\}$$
$$\;\cup\;\{\langle \langle u, i, 1 \rangle, \langle u, j, 0 \rangle \rangle | u$$ is incident with $$i$$ and $$j$$ and $$!\exists h, i \lt h \lt j,$$ such that $$u$$ is incident with $$h\}$$
$$\;\cup\;\{\langle \langle u, i, 1 \rangle, a_f \rangle | 1 \leq f \leq \ell$$ and $$!\exists h \gt i$$ such that $$u$$ is incident with $$h\}$$
$$\;\cup\;\{\langle a_f,\langle u, i, 0 \rangle \rangle | 1 \leq f \leq \ell$$ and $$!\exists h \lt i$$ such that $$u$$ is incident with $$h\}$$

#### DIRECTED HAMILTONIAN CIRCUIT $$\propto$$ UNDIRECTED HAMILTONIAN CIRCUIT

$$N = V \times \{0, 1, 2\}$$
$$A = \{\{\langle u, 0 \rangle, \langle u, 1\}, \{\langle u, 1 \rangle, \langle u, 2 \rangle\}| u \in V\}$$
$$\; \cup \{\{\langle u, 2 \rangle, \langle v, 0 \rangle\} | \langle u, v \rangle \in E\}$$

#### SATISFIABILITY $$\propto$$ SATISFIABILITY WITH AT MOST 3 LITERALS PER CLAUSE

Replace a clause $$\sigma_1 \cup \sigma_2 \cup \ldots \cup \sigma_m$$, where the $$\sigma_i$$ are literals and $$m \gt$$ 3, by
$$(\sigma_1 \cup \sigma_2 \cup u_1)(\sigma_3 \cup \ldots \cup \sigma_m \cup \overline{u_1})(\overline{\sigma_3} \cup u_1) \ldots (\overline{\sigma_m} \cup u_1)$$,
where $$u_1$$ is a new variable.
Repeat this transformation until no clause has more than three literals.

#### SATISFIABILITY WITH AT MOST 3 LITERALS PER CLAUSE $$\propto$$ CHROMATIC NUMBER

Assume without loss of generality that $$m \ge 4$$.
$$N = \{u_1, u_2, \ldots, u_m\} \cup \{\overline{u1}, \overline{u2}, \ldots, \overline{um}\} \cup {v1, v2, \ldots, v_m}$$
$$\;\cup \{D_1, D_2, \dots, D_r\}$$
$$A = \{\{u_i, \overline{ui}\}| i = 1, 2, \ldots, n\} \cup \{\{v_i, v_j\} | i \neq j\} \cup \{\{v_i, x_j\}| i != j\} \cup \{\{v_i, \overline{x_j}\}| i \neq j\}$$
$$\;\cup \{\{u_i, D_f\} | u_i \notin D_f\} \cup \{\{\overline{u_i}, D_f\}| \overline{u_i} \in D_f\}$$
$$k = r + 1$$

#### CHROMATIC NUMBER $$\propto$$ CLIQUE COVER

$$G^\prime$$ is the complement of $$G$$
$$\ell = k$$

#### CHROMATIC NUMBER $$\propto$$ EXACT COVER

The set of elements is
$$N \cup A \cup \{ \langle u, e, f\rangle | u$$ is incident with $$e$$ and $$1 \leq f \leq k\}$$.
The sets $$S_j$$ are the following:
for each $$f, 1 \leq f \leq k$$, and each $$u \in N$$,
$$\quad\{u\} \cup \{\langle u, e, f \rangle | e$$ is incident with $$u\}$$;
for each $$e \in A$$ and each pair $$f_1, f_2$$ such that
$$\quad 1 \leq f_1 \leq k, 1 \leq f_2 \leq k$$ and $$f_1 \neq f_2$$
$$\quad\{e\} \cup \{\langle u, e, f \rangle, f \neq f_1\} \cup \{\langle v, e, g \rangle | g \neq f_2\}$$,
where $$u$$ and $$v$$ are the two nodes incident with $$e$$.

#### EXACT COVER $$\propto$$ HITTING SET

The hitting set problem has sets $$U_i$$ and elements $$s_j$$ such that
$$s_j \in U_i \iff u_i \in S_j$$.

#### EXACT COVER $$\propto$$ STEINER TREE

$$N = \{n_0\} \cup \{S_j\} \cup \{u_1\}$$
$$R = \{n_0\} \cup \{u_i\}$$
$$A = \{\{n_0, S_j\}\} \cup \{\{S_j, u_i\}| u_i \in S_j\}$$
$$w(\{n_0, S_j\}) = |S_j|$$
$$w(\{S_j, u_i\}) = 0$$
$$k = |\{u_i\}|$$.

#### EXACT COVER $$\propto$$ 3-DIMENSIONAL MATCHING

Without loss of generality, assume $$|S_j| \geq 2$$ for each $$j$$.
Let $$T = \{\langle i, j \rangle | u_i \in S_j\}$$.
Let $$\alpha$$ be an arbitrary one-one function from $$\{u_i\}$$ into $$T$$.
Let $$pi: T \to T$$ be a permutation such that,
for each fixed $$j, \{\langle i, j\rangle | u_i \in S_j\}$$ is a cycle of $$\pi$$.
$$U = \{\langle \alpha(u_i), \langle i, j\rangle , \langle i, j\rangle \rangle | \langle i, j\rangle \in T\}$$
$$\quad\cup \langle \beta, \sigma, \pi(\sigma)\rangle | \text{for all } i, \beta \ne \alpha(u_i)\}$$.

#### EXACT COVER $$\propto$$ KNAPSACK

Let $$d = |\{S_j\}|+1.$$
Let $$\epsilon_ji = \begin{cases} 1 & if\; u_i \in S_j \\ 0 & if\; u_i \notin S_j \end{cases}$$
Let $$r = |\{S_j\}|, a_j = \sum \epsilon_{ji}d^{i-1}\;and\;b = \frac{d^t-1}{d-1}.$$

#### KNAPSACK $$\propto$$ SEQUENCING

$$p = r, T_i = P_i = a_i, D_i = b$$.

#### KNAPSACK $$\propto$$ PARTITION

$$s = r + 2$$
$$c_i = a_i i = 1, 2, \ldots r$$
$$c_{r+1} = b + 1$$
$$c_{r+2} = (\sum_{i = 1}^r a_i) + 1 – b$$

#### PARTITION $$\propto$$ MAX CUT

$$N = \{1, 2, \dots, s\}$$
$$A = \{\{i, j\} | i \in N, j \in N, i \ne j\}$$
$$w(\{i, j\}) = c_i \cdot c_j$$
$$W = [\frac{1}{4} \sum c_i^2]$$

# Greedy partition in C

In a previous post, I showed a dynamic programming algorithm that gives an exact solution to the partition problem. It’s also possible to get an approximate solution using a greedy algorithm.

The algorithm is:

1. Sort the elements of the set in decreasing order
2. For each element in the set:
1. If the sum of the first subset is greater, put the element in the second set
2. Else, put it in the first set

You can see intuitively that this is going to make the sets as close as possible to having the same sum without using an exact algorithm. Below is an implementation in C. The function partition() takes an array of unsigned integers and its size, and returns an array of unsigned integers which are 0 if the corresponding element is in the first set of the partition, and 1 if it is in the second set. Note that the indices of this set correspond to the array of numbers after having been sorted.

#include <stdlib.h>

int compare_unsigned_ints_decreasing(const unsigned int *p1, const unsigned int *p2)
{
if (*p1 > *p2) {
return -1;
}
if (*p1 < *p2) {
return 1;
}
return 0;
}

unsigned int *partition(unsigned int *numbers, size_t n)
{
unsigned int i, sum0 = 0, sum1 = 0;
unsigned int *solution = calloc(n, sizeof(unsigned int));
if (solution == NULL) {
return NULL;
}
qsort(numbers, n, sizeof(unsigned int),
(int(*)(const void *, const void *))compare_unsigned_ints_decreasing);
for (i = 0; i < n; i++) {
if (sum1 < sum0) {
solution[i] = 1;
sum1 += numbers[i];
}
else {
sum0 += numbers[i];
}
}
return solution;
}


Here is an example program that partitions the set of numbers from 1 to 7:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
unsigned int numbers[] = {1, 2, 3, 4, 5, 6, 7};
const size_t len = sizeof(numbers) / sizeof(numbers[0]);
unsigned int i;
unsigned int *solution = partition(numbers, len);
printf("First set:\n");
for (i = 0; i < len; i++) {
if (solution[i] == 0) {
printf("%u\n", numbers[i]);
}
}
printf("Second set:\n");
for (i = 0; i < len; i++) {
if (solution[i] == 1) {
printf("%u\n", numbers[i]);
}
}
putchar('\n');
free(solution);
return 0;
}


Here is the output. In this case, the partition divides the set perfectly into two sets of sum 14, but that won’t generally be the case for this algorithm as it is an approximate one.

First set:
7
4
3
Second set:
6
5
2
1


# Subset-Sum using dynamic programming in C

The Subset-Sum problem is to determine, given a set of integers, whether there is a subset that sums to a given value. The problem is NP-complete, but can be solved in pseudo-polynomial time using dynamic programming.

Below is an implementation in C. The algorithm works by filling in a table. The rows of the table correspond to values from 0 to the target value, while the columns correspond to elements of the set. Each cell in the table repesents a solution for that row’s total using the elements chosen from the columns to the left. When the table has been filled, the truth value in the bottom-right-hand cell will indicate whether there is a successful solution. I have used chains of pointers within the table to allow the elements in the solution to be read back.

struct entry {
unsigned int truth;
int element;
unsigned int count;
struct entry *prev;
};
typedef struct entry entry;

unsigned int subset_sum(const unsigned int *weights, size_t len, unsigned int target,
unsigned int **solution)
{
entry **table;
unsigned int i, j;
unsigned int count = 0;

table = malloc((target + 1) * sizeof(entry *));
for (i = 0; i <= target; i++) {
table[i] = malloc((len + 1 ) * sizeof(entry));
}

for (i = 0; i <= target; i++) {
for (j = 0; j <= len; j++) {
/* First row */
if (i == 0) {
table[i][j].truth = 1;
table[i][j].element = -1;
table[i][j].count = 0;
table[i][j].prev = NULL;
}
else if (j == 0) {
/* First column */
table[i][j].truth = 0;
table[i][j].element = -1;
table[i][j].count = 0;
table[i][j].prev = NULL;
}
else {
/* Initialise for don't add element case */
table[i][j].truth = table[i][j-1].truth;
table[i][j].prev = &table[i][j - 1];
table[i][j].element = -1;
table[i][j].count = table[i][j - 1].count;
if (i >= weights[j-1]) {
if (!table[i][j].truth) {
table[i][j].truth = table[i - weights[j-1]][j-1].truth;
table[i][j].element = j - 1;
table[i][j].count = table[i - weights[j-1]][j-1].count + 1;
table[i][j].prev = &table[i - weights[j-1]][j-1];
}
}
}
}
}

if (!table[target][len].truth) {
/* No solution */
*solution = NULL;
}
else {
/* Solution */
*solution = calloc(len, sizeof(unsigned int));
}
if (*solution) {
count = table[target][len].count;
}
}
}

for (i = 0; i <= target; i++) {
free(table[i]);
}
free(table);

return count;
}


Here is an example program:

int main(void)
{
unsigned int weights[] = {5, 7, 1, 3, 11, 9};
unsigned int target = 14;
const size_t len = sizeof(weights) / sizeof(weights[0]);
unsigned int *solution;
const unsigned int count = subset_sum(weights, len, target, &solution);
if (count) {
unsigned int i;
printf("Found a solution\n");
printf("Elements:\n");
for (i = 0; i < len; i++) {
if (solution[i]) {
printf("%u ", weights[i]);
}
}
putchar('\n');
}
else {
printf("No solution\n");
}
free(solution);

return 0;
}


The output:

Found a solution
Elements:
3 11


# Making change using dynamic programming in C

A coin system is canonical if the greedy algorithm for making change is optimal for all values. All coin systems in the world are canonical for obvious reasons. If a coin system is not canonical, the change-making problem becomes NP-hard. We can use dynamic programming to solve the change-making problem for abitrary coin systems.

This is another table-filling algorithm. The table has a column for every value from 0 to the target value, and a column for every coin. Each cell will be filled with the minimum number of the coins from that row upwards that are needed to make that value. When the table has been filled, the minimum number of coins for the target value can be read from the bottom-right-hand cell.

Below is an implemention in C. As with other dynamic programming algorithms, I have used pointers within the table to link each cell to the cell from which its value was derived. These pointers can then be used to read back how many of each coin was used in producing the solution. The function make_change() takes an array of coin values, its length, a target value, and the address of a pointer to which to assign the solution array. The solution array has an entry for each coin and indicates how many of each coin were used in the solution.

#include <stdlib.h>

struct change_entry {
unsigned int count;
int coin;
struct change_entry *prev;
};
typedef struct change_entry change_entry;

unsigned int make_change(const unsigned int *coins, size_t len, unsigned int value,
unsigned int **solution)
{
unsigned int i, j;
change_entry **table;
unsigned int result;

/* Allocate the table */
table = malloc((len + 1) * sizeof(change_entry *));
for (i = 0; i <= len; i++) {
table[i] = malloc((value + 1) * sizeof(change_entry));
}

/* Calculate the values and build chains */
for (i = 0; i <= len; i++) {
for (j = 0; j <= value; j++) {
if (i == 0) {
/* Initialising the first row */
table[i][j].count = j;
table[i][j].coin = -1;
table[i][j].prev = NULL;
}
else if (j == 0) {
/* Initialising the first column */
table[i][j].count = 0;
table[i][j].coin = -1;
table[i][j].prev = NULL;
}
else if (coins[i - 1] == j) {
/* Can use coin */
table[i][j].count = 1;
table[i][j].coin = i - 1;
table[i][j].prev = NULL;
}
else if (coins[i - 1] > j) {
/* Can't use coin */
table[i][j].count = table[i - 1][j].count;
table[i][j].coin = -1;
table[i][j].prev = &table[i - 1][j];
}
else {
/* Can use coin - choose best solution */
if (table[i - 1][j].count < table[i][j - coins[i - 1]].count + 1) {
/* Don't use coin */
table[i][j].count = table[i - 1][j].count;
table[i][j].coin = -1;
table[i][j].prev = &table[i - 1][j];
}
else {
/* Use coin */
table[i][j].count = table[i][j - coins[i - 1]].count + 1;
table[i][j].coin = i - 1;
table[i][j].prev = &table[i][j - coins[i - 1]];
}
}
}
}
result = table[len][value].count;
/* Read back the solution */
*solution = calloc(len, sizeof(unsigned int));
if (*solution) {
}
}
}
else {
result = 0;
}

for (i = 0; i <= len; i++) {
free(table[i]);
}
free(table);

return result;
}


Here is an example program:

int main(void)
{
unsigned int coins[] = {1, 2, 5, 10, 20, 50, 100};
const size_t len = sizeof(coins) / sizeof(coins[0]);
const unsigned int value = 252;
unsigned int *solution;
unsigned int result = make_change(coins, len, value, &solution);
unsigned int i;
printf("Number of coins: %u\n", result);
printf("Coins used:\n");
for (i = 0; i < len; i++) {
if (solution[i] > 0) {
printf("%u x %u\n", solution[i], coins[i]);
}
}
free(solution);
return 0;
}


The output:

Number of coins: 4
Coins used:
1 x 2
1 x 50
2 x 100


# Partition Problem using dynamic programming in C

The Partition Problem is to determine if, given a set of positive integers, there is a subset that sums to precisely half of the sum of the set. For example, given the numbers from 1 to 7, their sum is 28, half of which is 14, and the subset (2, 3, 4, 5} sums to 14, so this is a positive instance. The partition problem is NP-complete, but can be solved in pseudo-polynomial time using dynamic programming.

The algorithm works by constructing an array with entries for every number from 0 to the sum of the set, and filling in the cells that are the sum of some subset. Once complete, the solution can be read back from the middle cell of the array, which corresponds to a subset that sums to half of the total sum. At first, only the 0 cell is filled in, because it is always possible to find a subset that sums to 0 — the empty set. After that, the cells to fill in are found by simulating repeated additions of the numbers from the set to the already-filled-in cells. For example, if the cell for some number $$m$$ is already filled in, and the set contains the number $$n$$, then the cell $$m + n$$ can be filled in. When a cell is filled in, a pointer is set to point back to the number from which it has been derived so that the solution can be read back.

Below is the code in C. The function partition() takes an array of numbers, its length, and the address of a pointer to which to attach the solution array. It returns the number of elements in the solution subset if there is a partition, or 0 otherwise.

struct partition_entry {
unsigned int value;
unsigned int truth;
unsigned int count;
struct partition_entry *prev;
};
typedef struct partition_entry partition_entry;

unsigned int sum(const unsigned int *numbers, size_t len)
{
unsigned int i, total = 0;
for (i = 0; i < len; i++) {
total += numbers[i];
}
}

int compare_unsigned_ints(const unsigned int *p1, const unsigned int *p2)
{
if (*p1 > *p2) {
return 1;
}
if (*p1 < *p2) {
return -1;
}
return 0;
}

unsigned int min(unsigned int first, unsigned int second)
{
if (first < second) {
return first;
}
return second;
}

unsigned int partition(unsigned int *numbers, size_t n, unsigned int **solution)
{
unsigned int total = sum(numbers, n);
int i, j, rightmost = 0, size;
partition_entry *partitions;
if (total % 2 == 1) {
/* Total is odd */
*solution = NULL;
return 0;
}
/* Initialise array */
partitions = malloc((total + 1) * sizeof(partition_entry));
if (partitions == NULL) {
*solution = NULL;
return 0;
}
for (i = 0; i <= total; i++) {
partitions[i].value = i;
partitions[i].truth = 0;
partitions[i].count = 0;
partitions[i].prev = NULL;
}
partitions[0].truth = 1;
/* Sort the numbers */
qsort(numbers, n,
sizeof(unsigned int), (int(*)(const void *, const void *))compare_unsigned_ints);
/* Fill in the sums and build chains */
for (i = 0; i < n; i++) {
for (j = rightmost; j >= 0; j--) {
if (partitions[j].truth && !partitions[j + numbers[i]].truth) {
partitions[j + numbers[i]].truth = 1;
partitions[j + numbers[i]].count = partitions[j].count + 1;
partitions[j + numbers[i]].prev = &partitions[j];
}
}
rightmost = min(total / 2, rightmost + numbers[i]);
}
if (partitions[total / 2].truth) {
/* Success, read back the solution */
partition_entry *ptr = &partitions[total / 2];
unsigned int number = ptr->value;
size = partitions[total / 2].count;
*solution = malloc(size * sizeof(unsigned int));
if (*solution) {
for (i = 0, j = size - 1; i < size; i++, j--) {
ptr = ptr->prev;
(*solution)[j] = number - ptr->value;
number = ptr->value;
}
}
else {
size = 0;
}
}
free(partitions);
return size;
}


Here is an example that looks for a partition of the numbers from 1 to 7:

int main(void)
{
unsigned int numbers[] = {7, 6, 5, 4, 3, 2, 1};
const size_t len = sizeof(numbers) / sizeof(numbers[0]);
unsigned int *solution;
unsigned int i;
unsigned int size = partition(numbers, len, &solution);
printf("Partition size: %d\n", size);
for (i = 0; i < size; i++) {
printf("%u ", solution[i]);
}
putchar('\n');
free(solution);
return 0;
}


The output:

Partition size: 4
2 3 4 5


# Greedy Vertex Cover in C

The Vertex Cover Problem is to find a subset of the vertices of a graph that contains an endpoint of every edge. For example, in the graph shown above, the subset (0, 1) highlighted in red is a vertex cover. The minimisation problem of finding the smallest vertex cover is NP-hard. However a good approximation can be achieved using a greedy algorithm. The algorithm is simply:

• While there are uncovered edges:
1. Find the first uncovered edge
2. Put one of its endpoints in the covering set
3. Mark all of the edges incident upon that endpoint as covered

Below is an implementation in C. The function vertex_cover() takes a graph in edge list format, its size (number of edges) and order (number of vertices), and the address of a pointer through which to return the covering set as an out parameter. The return value is the size of the covering set.

#include <stdlib.h>

typedef struct {
unsigned int first;
unsigned int second;
} edge;

unsigned int vertex_cover(const edge *edges, unsigned int size, unsigned int order,
unsigned int **cover)
{
unsigned int uncovered_size = size;
unsigned int cover_size = 0;
unsigned int *covered = calloc(size, sizeof(unsigned int));
*cover = calloc(order, sizeof(unsigned int));
if (covered == NULL || cover == NULL) {
free(covered);
free(*cover);
*cover = NULL;
return 0;
}
while (uncovered_size > 0) {
unsigned int e, v;
/* Find the first edge that hasn't been covered */
for (e = 0; e < size && covered[e] == 1; e++);
/* Add its first endpoint to the covering set */
v = edges[e].first;
(*cover)[v] = 1;
cover_size++;
/* Mark all uncovered edges containing its first endpoint as covered */
for (e = 0; e < size; e++) {
if (!covered[e] && (edges[e].first == v || edges[e].second == v)) {
covered[e] = 1;
uncovered_size--;
}
}
}
free(covered);
return cover_size;
}


Here is an example program that finds a vertex cover on the graph shown above:

#include <stdio.h>
#include <stdlib.h>

/* Connect two edges */
void edge_connect(edge *edges, unsigned int first, unsigned int second,
unsigned int *pos)
{
edges[*pos].first = first;
edges[*pos].second = second;
(*pos)++;
}

int main(void)
{
const unsigned int size = 4; /* Edges */
const unsigned int order = 5; /* Vertices */
edge *edges = malloc(size * sizeof(edge));
unsigned int i = 0;
edge_connect(edges, 0, 1, &i);
edge_connect(edges, 0, 2, &i);
edge_connect(edges, 0, 3, &i);
edge_connect(edges, 1, 4, &i);
unsigned int *cover;
unsigned int c = vertex_cover(edges, size, order, &cover);
printf("Cover size is %u\n", c);
printf("Vertices in cover:\n");
for (i = 0; i < order; i++) {
if (cover[i]) {
printf("%u ", i);
}
}
putchar('\n');
free(edges);
free(cover);
return 0;
}


The output:

Cover size is 2
Vertices in cover:
0 1


# 0-1 Knapsack using backtracking in C

In the 0-1 Knapsack problem, we are given a set of items with individual weights and profits and a container of fixed capacity (the knapsack), and are required to compute a loading of the knapsack with items such that the total profit is maximised. We only have 1 of each item, so there is either 0 or 1 of each item in in the knapsack, hence the 0-1 in the name of the problem.

The 0-1 knapsack problem is NP-hard, but can be solved quite efficiently using backtracking. Below is a backtracking implementation in C. The function knapsack() takes arrays of weights, and profits, their size, the capacity, and the address of a pointer through which the solution array is returned. It returns the profit of the best knapsack. The profit density of each item (weight / profit) is first calculated, and the knapsack is filled in order of decreasing profit density. A bounding function is used to determine the upper bound on the profit achievable with the current knapsack so as to reduce the search space.

#include <stdlib.h>
#include <string.h>

/* A knapsack item */
typedef struct {
unsigned int id;
double weight;
double profit;
double profit_density;
} item;

/* Compare items by lesser profit density */
static int compare_items(const item *item1, const item *item2)
{
if (item1->profit_density > item2->profit_density) {
return -1;
}
if (item1->profit_density < item2->profit_density) {
return 1;
}
return 0;
}

/* Bounding function */
static double profit_bound(const item *items, size_t n, double capacity,
double current_weight, double current_profit,
unsigned int level)
{
double remaining_capacity = capacity - current_weight;
double bound = current_profit;
unsigned int lvl = level;
/* Fill in order of decreasing profit density */
while (lvl < n &&
items[lvl].weight <= remaining_capacity)
{
remaining_capacity -= items[lvl].weight;
bound += items[lvl].profit;
lvl++;
}
/* Pretend we can take a fraction of the next object */
if (lvl < n) {
bound += items[lvl].profit_density
* remaining_capacity;
}
return bound;
}

static void knapsack_recursive(const item *items, size_t n, double capacity,
unsigned int *current_knapsack, double *current_weight, double *current_profit,
unsigned int *max_knapsack, double *max_profit, unsigned int level)
{
if (level == n) {
/* Found a new max knapsack */
*max_profit = *current_profit;
memcpy(max_knapsack, current_knapsack, n * sizeof(unsigned int));
return;
}
if (*current_weight + items[level].weight <= capacity)
{   /* Try adding this item */
*current_weight += items[level].weight;
*current_profit += items[level].profit;
current_knapsack[items[level].id] = 1;
knapsack_recursive(items, n, capacity, current_knapsack, current_weight,
current_profit, max_knapsack, max_profit, level + 1);
*current_weight -= items[level].weight;
*current_profit -= items[level].profit;
current_knapsack[items[level].id] = 0;
}
if (profit_bound(items, n, capacity, *current_weight,
*current_profit, level + 1) > *max_profit) {
/* Still promising */
knapsack_recursive(items, n, capacity, current_knapsack, current_weight,
current_profit, max_knapsack, max_profit, level + 1);
}
}

double knapsack(const double *weights, const double *profits,
size_t n, double capacity, unsigned int **max_knapsack)
{
double current_weight = 0.0;
double current_profit = 0.0;
double max_profit = 0.0;
unsigned int i;
item *items  = malloc(n * sizeof(item));
unsigned int *current_knapsack = calloc(n, sizeof(unsigned int));
*max_knapsack = malloc(n * sizeof(unsigned int));
if (!(items && current_knapsack && *max_knapsack)) {
free(items);
free(current_knapsack);
free(*max_knapsack);
*max_knapsack = NULL;
return 0;
}
/* Populate the array of items */
for (i = 0; i < n; i++) {
items[i].id = i;
items[i].weight = weights[i];
items[i].profit = profits[i];
items[i].profit_density = profits[i] / weights[i];
}
/* Sort into decreasing density order */
qsort(items, n, sizeof(item),
(int (*)(const void *, const void *))compare_items);
knapsack_recursive(items, n, capacity, current_knapsack, &current_weight,
&current_profit, *max_knapsack, &max_profit, 0);
free(items);
free(current_knapsack);
return max_profit;
}


Here is an example program:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
double weights[] = {3, 5, 2, 1};
double profits[] = {9, 10, 7, 4};
const size_t n = sizeof(profits) / sizeof(profits[0]);
const double capacity = 7;
unsigned int *max_knapsack;
double max_profit = knapsack(weights, profits, n, capacity, &max_knapsack);
unsigned int i;
printf("Profit: %.2f\n", max_profit);
printf("Knapsack contains:\n");
for (i = 0; i < n; i++) {
if (max_knapsack[i] == 1) {
printf("Item %u with weight %.2f and profit %.2f\n", i, weights[i], profits[i]);
}
}
free(max_knapsack);
return 0;
}


The output:

Profit: 20.00
Knapsack contains:
Item 0 with weight 3.00 and profit 9.00
Item 2 with weight 2.00 and profit 7.00
Item 3 with weight 1.00 and profit 4.00


# Value-Independent Knapsack Problem using backtracking in C

In another post I described a backtracking algorithm for the Subset-Sum Problem, in which the aim is to find a subset of a set of weights that sums to a given weight. A related problem, which is sometimes called Subset-Sum but is also called Value-Independent Knapsack, is to find a subset of the set of weights that is the maximum weight less than or equal to the given weight, rather than the exact weight.

This problem can be thought of as a 0-1 knapsack problem in which the weights are equal to the values for all items. Like 0-1 knapsack, the problem is NP-hard, but a backtracking algorithm can produce an exact solution quite efficiently. This is a backtracking algorithm for Value Independent Knapsack in C. The function subset_sum() takes an array of weights and its size, a capacity, and an out parameter for the solution array, and returns the weight of the best set found.

static void subset_sum_recursive(const unsigned int *weights, size_t n, unsigned int capacity,
unsigned int *current_set, unsigned int *current_weight, unsigned int *max_set,
unsigned int *max_weight, unsigned int *remaining_weight, unsigned int level)
{
if (level == n) {
/* Found a new best set */
*max_weight = *current_weight;
memcpy(max_set, current_set, n * sizeof(unsigned int));
return;
}
*remaining_weight -= weights[level];
if (*current_weight + weights[level] <= capacity) {
/* Try solutions that include this item */
*current_weight += weights[level];
current_set[level] = 1;
subset_sum_recursive(weights, n, capacity, current_set, current_weight,
max_set, max_weight, remaining_weight, level + 1);
*current_weight -= weights[level];
current_set[level] = 0;
}
if (*current_weight + *remaining_weight > *max_weight) {
/* Still promising */
subset_sum_recursive(weights, n, capacity, current_set, current_weight,
max_set, max_weight, remaining_weight, level + 1);
}
*remaining_weight += weights[level];
}

static unsigned int sum(const unsigned int *numbers, size_t len)
{
unsigned int total = 0;
unsigned int i;
for (i = 0; i < len; i++) {
total += numbers[i];
}
}

static unsigned int subset_sum(const unsigned int *weights, size_t n, unsigned int capacity,
unsigned int **max_set)
{
unsigned int current_weight = 0;
unsigned int max_weight = 0;
unsigned int remaining_weight = sum(weights, n);
unsigned int *current_set = calloc(n, sizeof(unsigned int));
*max_set = malloc(n * sizeof(unsigned int));
if (current_set == NULL || *max_set == NULL) {
free(current_set);
free(*max_set);
return 0;
}
subset_sum_recursive(weights, n, capacity, current_set, &current_weight, *max_set,
&max_weight, &remaining_weight, 0);
free(current_set);
return max_weight;
}


Here is an example program. It prints the maximum weight found and the elements of the set:

int main(void)
{
unsigned int weights[] = {8, 6, 2, 3};
const size_t n = sizeof(weights) / sizeof(weights[0]);
const unsigned int capacity = 12;
unsigned int *set;
unsigned int weight = subset_sum(weights, n, capacity, &set);
unsigned int i;
printf("%u\n", weight);
for (i = 0; i < n; i++) {
if (set[i]) {
printf("%u ", weights[i]);
}
}
putchar('\n');
free(set);
return 0;
}


The output:

11
8 3


# Traveling Salesman Problem using backtracking in C

I have previously shown the Cheapest-Link, Nearest-Neigbour, and Repetitive-Nearest Neighbour algorithms for the Traveling Salesman Problem. These are all greedy algorithms that give an approximate result. The Traveling Salesman Problem is NP-complete, so an exact algorithm will have exponential running time unless $$P=NP$$. However, we can reduce the search space for the problem by using backtracking.

This is an implementation of TSP using backtracking in C. It searches the permutation space of vertices, fixing the start of each tour at vertex 0. This means that the last edge is always the one that connects the second-last edge to vertex 0, so it is not necessary to find this edge by permutation. The function traveling_salesman() takes a graph in the form of a matrix of distances (adjmat), the number of vertices (order), and the address of a pointer to an array of unsigned integers used as an output parameter (best_tour). It returns the cost of the best tour, and assigns an array containing the vertices of the tour in order to *best_tour.

#include <stdlib.h>
#include <string.h>

static void swap(unsigned int *a, unsigned int *b)
{
unsigned int temp = *a;
*a = *b;
*b = temp;
}

static void traveling_salesman_recursive(const unsigned int **adjmat, unsigned int order,
unsigned int *best_tour, unsigned int *best_tour_cost, unsigned int *partial_tour,
unsigned int *partial_tour_cost, unsigned int level)
{
if (level == order - 1) {
/* Add last two edges to complete the tour */
unsigned int tour_cost = *partial_tour_cost
+ adjmat[partial_tour[order - 2]][partial_tour[order - 1]]
if (*best_tour_cost == 0 || tour_cost < *best_tour_cost) {
/* Best tour so far */
memcpy(best_tour, partial_tour, order * sizeof(unsigned int));
*best_tour_cost = tour_cost;
}
}
else {
unsigned int i;
for (i = level; i < order; i++) {
if (*best_tour_cost == 0
|| *partial_tour_cost + adjmat[partial_tour[level - 1]][partial_tour[i]]
< *best_tour_cost)
{
/* Next permutation */
swap(&partial_tour[level], &partial_tour[i]);
unsigned int cost = adjmat[partial_tour[level - 1]][partial_tour[level]];
*partial_tour_cost += cost;
partial_tour, partial_tour_cost, level + 1);
*partial_tour_cost -= cost;
swap(&partial_tour[level], &partial_tour[i]);
}
}
}
}

unsigned int traveling_salesman(const unsigned int **adjmat, unsigned int order,
unsigned int **best_tour)
{
unsigned int i;
unsigned int best_tour_cost = 0;
unsigned int partial_tour_cost = 0;
unsigned int *partial_tour = malloc(order * sizeof(unsigned int));
*best_tour = malloc(order * sizeof(unsigned int));
if (partial_tour == NULL || *best_tour == NULL) {
free(partial_tour);
free(*best_tour);
*best_tour = NULL;
return 0;
}
for (i = 0; i < order; i++) {
partial_tour[i] = i;
}
&partial_tour_cost, 1);
free(partial_tour);
return best_tour_cost;
}


Here is an example program:

#include <stdlib.h>
#include <stdio.h>

int main(void)
{
unsigned int r1[] = {0, 5, 7, 9, 10};
unsigned int r2[] = {4, 0, 11, 3, 7};
unsigned int r3[] = {3, 1, 0, 4, 5};
unsigned int r4[] = {6, 5, 7, 0, 11} ;
unsigned int r5[] = {13, 2, 8, 3, 0} ;
const size_t order = 5; /* Vertices */
const unsigned int *adjmat[] = { r1, r2, r3, r4, r5 };
unsigned int *best_tour;
unsigned int cost = traveling_salesman(adjmat, order, &best_tour);
unsigned int i;
printf("Best tour cost: %u\n", cost);
printf("Vertices:\n");
for (i = 0; i < order; i++) {
printf("%u ", best_tour[i]);
}
putchar('\n');
printf("Edge weights:\n");
for (i = 0; i < order - 1; i++) {
}
free(best_tour);
return 0;
}


The output:

Best tour cost: 23
Vertices:
0 2 4 1 3
Edge weights:
7 5 2 3 6


# Maximum Clique using backtracking in C

A clique in a graph is a subgraph in which all vertices are connected to each other – a complete subgraph. The Maximum Clique Problem is to find the largest clique of a graph. For example, in the graph shown above the maximum clique size is 3, and the vertices (1, 2, 4) are a maximum clique, as are (0, 1, 4) and (0, 3, 4).

The Maximum Clique Problem is NP-complete, but can be solved efficiently using backtracking. The following is a backtracking implementation in C. The function maximum_clique() takes a graph in adjacency matrix form and its order (the number of vertices), and will return a maximum clique and its size.

#include <stdlib.h>
#include <string.h>

static void maximum_clique_recursive(const unsigned int **adjmat, unsigned int order,
unsigned int *current_clique, unsigned int *current_clique_size, unsigned int *max_clique,
unsigned int *max_clique_size, unsigned int level)
{
unsigned int i, connected;
if (level == order) {
/* Found a new max clique */
memcpy(max_clique, current_clique, order * sizeof(unsigned int));
*max_clique_size = *current_clique_size;
return;
}
/* Find out if current vertex is connected to all vertices in current clique */
connected = 1;
for (i = 0; i < level && connected; i++) {
connected = 0;
}
}

if (connected) {
/* Add this vertex to the clique */
current_clique[level] = 1;
(*current_clique_size)++;
max_clique_size, level + 1);
(*current_clique_size)--;
}
if (*current_clique_size + order - level > *max_clique_size) {
/* Still promising */
current_clique[level] = 0;
max_clique_size, level + 1);
}
}

unsigned int maximum_clique(const unsigned int **adjmat, unsigned int order,
unsigned int **max_clique)
{
unsigned int max_clique_size = 0;
unsigned int current_clique_size = 0;
unsigned int *current_clique = malloc(order * sizeof(unsigned int));
*max_clique = malloc(order * sizeof(unsigned int));
if (current_clique == NULL || *max_clique == NULL) {
free(current_clique);
free(max_clique);
return 0;
}
&max_clique_size, 0);
free(current_clique);
return max_clique_size;
}


Here is an example program that finds a maximum clique on the graph shown at the top:

#include <stdio.h>
#include <stdlib.h>

int main()
{
const unsigned int order = 5; /* Vertices */
unsigned int r1[] = {0, 1, 0, 1, 1};
unsigned int r2[] = {1, 0, 1, 0, 1};
unsigned int r3[] = {0, 1, 0, 0, 1};
unsigned int r4[] = {1, 0, 0, 0, 1};
unsigned int r5[] = {1, 1, 1, 1, 0};
const unsigned int *adjmat[] = {r1, r2, r3, r4, r5};
unsigned int *max_clique;
unsigned int max_clique_size = maximum_clique(adjmat, order, &max_clique);
unsigned int i;
printf("Max clique size is %u\n", max_clique_size);
for (i = 0; i < order; i++) {
if (max_clique[i] == 1) {
printf("%u ", i);
}
}
putchar('\n');
free(max_clique);
return 0;
}


The output:

Max clique size is 3
1 2 4