# Shortest Common Supersequence via LCS in C

In my Shortest Common Supersequence post I gave a dynamic programming algorithm that computes the SCS in a manner very similar to the Longest Common Subsequence algorithm. Another way of computing the SCS is to find the LCS first, and then insert into it those characters from the two strings that are not in the LCS.

As an example, given the two strings “pAqBrCs”, and “wAxByCz”, their LCS is “ABC”, and we can form their SCS by inserting the remaining characters from each string into it in the correct order. One such SCS would be “pwAqxBryCsz”, but there are several possibilities because it doesn’t matter whether we insert the characters from the first string first or the second string at any point, as long as we add them in the order they occur in that string.

The algorithm just needs to scan the two strings, adding the characters not in the LCS while keeping track of where the LCS characters are, and making sure they are only added once. This is how it looks in C:

#include <string.h>
#include <stdlib.h>

char *shortest_common_supersequence_by_lcs(const char *str1, const char *str2)
{
char *lcs = longest_common_subsequence(str1, str2);
const size_t lcs_len = strlen(lcs);
const size_t scs_len = lcs_len + (strlen(str1) - lcs_len) + (strlen(str2) - lcs_len);
char *scs = malloc(scs_len + 1);
unsigned int lcs_pos = 0, str1_pos = 0, str2_pos = 0, scs_pos = 0;

/* Add the characters up to the last LCS character */
while (lcs_pos < lcs_len) {
while (str1[str1_pos] != lcs[lcs_pos]) {
scs[scs_pos++] = str1[str1_pos++];
}
while (str2[str2_pos] != lcs[lcs_pos]) {
scs[scs_pos++] = str2[str2_pos++];
}
scs[scs_pos++] = lcs[lcs_pos++];
str1_pos++;
str2_pos++;
}
/* Add the characters after the LCS */
while (str1[str1_pos]) {
scs[scs_pos++] = str1[str1_pos++];
}
while (str2[str2_pos]) {
scs[scs_pos++] = str2[str2_pos++];
}
scs[scs_pos] = '\0';
free(lcs);
return scs;
}


An example program:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
char a[] = "pAqBrCs";
char b[] = "wAxByCz";
char *scs = shortest_common_supersequence_by_lcs(a, b);
printf("Shortest Common Supersequence is %s\n", scs);
free(scs);
return 0;
}


Output:

Shortest Common Supersequence is pwAqxBryCsz