## Introduction

A connected component of a graph is a maximal subgraph in which the vertices are all connected, and there are no connections between the subgraph and the rest of the graph. A connected graph has only one connected component, which is the graph itself, while unconnected graphs have more than one component. For example, the graph shown below has three components, (0, 1, 2, 3), (4, 5, 6), and (7, 8).

The connected components of a graph can be found using either a depth-first search (DFS), or a breadth-first search (BFS). We start at an arbitrary vertex, and visit every vertex adjacent to it recursively, adding them to the first component. Once this search has finished, we have visited all of the vertices in the first connected component, so we choose another unvisited vertex (if any) and perform the same search starting from it, adding the vertices we find to the second component. This process continues until all vertices have been visited, at which point we know the number of connected components in the graph, and which vertices they contain.

These are implementations of both connected components algorithms in C. An array is used to store the number of the connected component for each vertex, starting with component 0. The array elements are initialised to -1 so the array is also used to determine which vertices have not yet been visited, as their component number will still be -1.

The graph edges are represented as a struct as follows:

typedef struct { unsigned int first; unsigned int second; } edge;

## Depth-first search

#include <stdlib.h> void connected_components_recursive(const edge *edges, unsigned int n, int *components, unsigned int order, unsigned int vertex, unsigned int component) { unsigned int i; /* Put this vertex in the current component */ components[vertex] = component; for (i = 0; i < n; i++) { if (edges[i].first == vertex || edges[i].second == vertex) { /* Adjacent */ const unsigned int neighbour = edges[i].first == vertex ? edges[i].second : edges[i].first; if (components[neighbour] == -1) { /* Not yet visited */ connected_components_recursive(edges, n, components, order, neighbour, component); } } } } unsigned int connected_components(const edge *edges, unsigned int n, unsigned int order, int **components) { unsigned int i; unsigned int component = 0; *components = malloc(order * sizeof(int)); if (components == NULL) { return 0; } for (i = 0; i < order; i++) { (*components)[i] = -1; } for (i = 0; i < order; i++) { if ((*components)[i] == -1) { connected_components_recursive(edges, n, *components, order, i, component); component++; } } return component; }

## Breadth-first search

#include <stdlib.h> #include <cirque.h> static void connected_components_internal(const edge *edges, unsigned int n, int *components, unsigned int order, unsigned int vertex, unsigned int component) { unsigned int i; /* Put this vertex in the current component */ components[vertex] = component; cirque *queue = cirque_create(); if (!queue) { cirque_delete(queue); return; } cirque_insert(queue, &vertex); while (cirque_get_count(queue)) { unsigned int e; unsigned int *current = cirque_remove(queue); for (e = 0; e < n; e++) { if (edges[e].first == *current || edges[e].second == *current) { const unsigned int *neighbour = edges[e].first == *current ? &edges[e].second : &edges[e].first; if (components[*neighbour] == -1) { components[*neighbour] = component; cirque_insert(queue, (void*)neighbour); } } } } cirque_delete(queue); } unsigned int connected_components(const edge *edges, unsigned int n, unsigned int order, int **components) { unsigned int i; unsigned int component = 0; *components = malloc(order * sizeof(int)); if (components == NULL) { return 0; } for (i = 0; i < order; i++) { (*components)[i] = -1; } for (i = 0; i < order; i++) { if ((*components)[i] == -1) { connected_components_internal(edges, n, *components, order, i, component); component++; } } return component; }

## Example program

Here is an example program that constructs the graph shown above and then finds its connected components:

#include <stdio.h> #include <stdlib.h> static void print_components(int *components, unsigned int order) { unsigned int i; for (i = 0; i < order; i++) { printf("Vertex %u is in component %d\n", i, components[i]); } } int main(void) { const unsigned int order = 9; /* Vertices */ const unsigned int n = 8; /* Edges */ edge *edges; int *components; unsigned int c; edges = malloc(n * sizeof(edge)); if (edges == NULL) { return 1; } /* Square */ edges[0].first = 0; edges[0].second = 1; edges[1].first = 1; edges[1].second = 2; edges[2].first = 2; edges[2].second = 3; edges[3].first = 3; edges[3].second = 0; /* Triangle */ edges[4].first = 4; edges[4].second = 5; edges[5].first = 5; edges[5].second = 6; edges[6].first = 6; edges[6].second = 4; /* Line */ edges[7].first = 7; edges[7].second = 8; c = connected_components(edges, n, order, &components); if (components == NULL) { free(edges); return 1; } printf("There are %u components:\n", c); print_components(components, order); free(edges); free(components); return 0; }

The output:

There are 3 components: Vertex 0 is in component 0 Vertex 1 is in component 0 Vertex 2 is in component 0 Vertex 3 is in component 0 Vertex 4 is in component 1 Vertex 5 is in component 1 Vertex 6 is in component 1 Vertex 7 is in component 2 Vertex 8 is in component 2

The connected components of a graph can also be represented as sets of edges, rather than vertices. This is called the spanning forest of a graph.