Split a string in C

There are two problems with the standard C strtok function, which is used to split strings:

  • It stores the string being split between calls. While one string is being split, no other string can be split in the current thread or in any other thread. In other words, strtok is non-reentrant.
  • strtok overwrites the separators in the string being split with nul characters, so the original string is unusable afterwards. In other words, strtok is destructive.
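The destructive behaviour is easy to observe. The following is a minimal sketch: strtok_count is a hypothetical helper introduced only for this illustration, and it relies on nothing beyond the standard strtok.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: tokenises buf in place with strtok and returns the
   number of tokens found. buf must be writable, because strtok overwrites
   each separator it consumes with '\0'. */
size_t strtok_count(char *buf, const char *seps)
{
    size_t count = 0;
    char *tok;
    for (tok = strtok(buf, seps); tok != NULL; tok = strtok(NULL, seps))
        count++;
    return count;
}
```

Calling strtok_count on a writable array holding "first,second,third" returns 3, but the array then reads as just "first" when treated as a string, because the first comma has been replaced by a nul character.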

Here’s a function to split a string that has neither of the problems of strtok. It takes a pointer to a function from the caller, a callback, which is passed the start address and length of each token found as the string is split.

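#include <stddef.h>

/* Callback: receives the token's start address, its length and the caller's data */
typedef void(*split_fn)(const char *, size_t, void *);

void split(const char *str, char sep, split_fn fun, void *data)
{
    size_t start = 0, stop;
    for (stop = 0; str[stop]; stop++) {
        if (str[stop] == sep) {
            /* Token runs from start up to, but not including, the separator */
            fun(str + start, stop - start, data);
            start = stop + 1;
        }
    }
    /* Final token: everything after the last separator */
    fun(str + start, stop - start, data);
}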

Here’s how to use it to print the tokens:

#include <stdio.h>

#include <split.h>

void print(const char *str, size_t len, void *data)
{
    (void)data;  /* unused here */
    /* Tokens are not nul-terminated, so print exactly len characters */
    printf("%.*s\n", (int)len, str);
}

int main(void)
{
    char str[] = "first,second,third,fourth";
    split(str, ',', print, NULL);
    return 0;
}

Output:

first
second
third
fourth

The fourth argument to split is a void* pointer, which the caller can use to pass in anything it wants to use within its callback during the splitting process, such as a collection to which to add the tokens.

For example, here’s how to add the tokens to a dynamic array:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <split.h>
#include <dynarray.h>

void add_to_dynarray(const char *str, size_t len, void *data)
{
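    dynarray *array = data;
    char *token = calloc(len + 1, 1);  /* zero-filled, so the copy is nul-terminated */
    if (token == NULL)
        return;  /* out of memory: skip this token */
    memcpy(token, str, len);
    dynarray_add_tail(array, token);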
}

int main(void)
{
    char str[] = "first,second,third,fourth";
    dynarray *array = dynarray_create(0);
    split(str, ',', add_to_dynarray, array);
    dynarray_for_each(array, (dynarray_forfn)puts);
    dynarray_for_each(array, free);
    dynarray_delete(array);
    return 0;
}
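The data pointer does not have to refer to a collection. As a smaller sketch that needs no dynarray library, the callback below only accumulates statistics about the tokens. The names token_stats, collect_stats and split_stats are hypothetical, introduced for this illustration; split is repeated from above so that the block compiles on its own.

```c
#include <stddef.h>

typedef void (*split_fn)(const char *, size_t, void *);

/* split as defined earlier in this article, repeated here so the sketch
   is self-contained; indices are size_t */
void split(const char *str, char sep, split_fn fun, void *data)
{
    size_t start = 0, stop;
    for (stop = 0; str[stop]; stop++) {
        if (str[stop] == sep) {
            fun(str + start, stop - start, data);
            start = stop + 1;
        }
    }
    fun(str + start, stop - start, data);
}

/* Hypothetical accumulator handed to the callback through void *data */
struct token_stats {
    size_t count;    /* tokens seen, including empty ones */
    size_t longest;  /* length of the longest token */
};

/* Callback: updates the statistics without copying the token */
void collect_stats(const char *str, size_t len, void *data)
{
    struct token_stats *stats = data;
    (void)str;  /* only the length is needed */
    stats->count++;
    if (len > stats->longest)
        stats->longest = len;
}

/* Hypothetical convenience wrapper around split */
struct token_stats split_stats(const char *str, char sep)
{
    struct token_stats stats = {0, 0};
    split(str, sep, collect_stats, &stats);
    return stats;
}
```

Because split only reads its input, split_stats can be called on a string literal. Note also that, unlike strtok, split reports empty tokens: with ',' as the separator, "a,,b" yields three tokens, and an empty string yields a single empty token.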
