Martin Broadhurst
CSV Parser
This is a parser for CSV (Comma-Separated Values) files.
It processes input that conforms to the following rules:
- Values are separated by commas.
- Records are separated by newlines (DOS or Unix).
- Values containing commas, the quote character (") or carriage returns must be in quotes.
- Literal quote characters must be doubled (" becomes "").
- Blank lines are only permitted at the end.
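To make the quoting rules concrete, here is a minimal sketch of the writing side: a function that turns one value into a field that input following these rules would contain. The name csv_quote_value is hypothetical and is not part of the parser; it also assumes that a bare line feed needs quoting in the same way as a carriage return.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the parser: writes `value` into `out`
 * as a single CSV field. Returns 1 on success, 0 if `out` is too small. */
int csv_quote_value(const char *value, char *out, size_t outsize)
{
    size_t n = 0;
    const char *p;

    /* Plain values (no comma, quote or line break) are copied unchanged */
    if (strpbrk(value, ",\"\r\n") == NULL) {
        size_t len = strlen(value);
        if (len + 1 > outsize)
            return 0;
        memcpy(out, value, len + 1);
        return 1;
    }

    /* Need room for at least the two quotes and the terminator */
    if (outsize < 3)
        return 0;
    out[n++] = '"';
    for (p = value; *p; p++) {
        /* Conservative check: up to two characters for this step,
         * plus the closing quote and the terminator */
        if (n + 4 > outsize)
            return 0;
        if (*p == '"')
            out[n++] = '"';     /* literal quotes are doubled */
        out[n++] = *p;
    }
    out[n++] = '"';
    out[n] = '\0';
    return 1;
}
```

For example, the value Smith, John is written as "Smith, John", and 6" nail is written as "6"" nail".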
You construct a parser with MBcsvparser_create().
The parser can be given pointers to two callback functions: one that is called when a value has been read, and one that is called when a record has finished.
Both callbacks are optional.
You set these with MBcsvparser_set_valuefn() and MBcsvparser_set_recordfn() respectively.
Both callbacks take a void pointer, which is an optional item of user-defined data.
You provide this to the parser with MBcsvparser_set_userdata().
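The user data does not have to be a single counter. As a rough sketch, the pair of callbacks below shares a small structure through the void pointer. The structure and the function names are hypothetical; the signatures are the same as those of valuefn and recordfn in the example on this page, and the value is assumed to be a null-terminated string, as it is there.

```c
#include <string.h>

/* Hypothetical user-data structure; the callbacks receive it as a void pointer */
struct csv_stats {
    unsigned int records;   /* number of completed records */
    unsigned int values;    /* total number of values seen */
    size_t longest;         /* length of the longest value */
};

/* Value callback: counts the value and tracks the longest one */
void count_valuefn(const char *value, void *v)
{
    struct csv_stats *stats = v;
    size_t len = strlen(value);
    stats->values++;
    if (len > stats->longest)
        stats->longest = len;
}

/* Record callback: counts the finished record */
void count_recordfn(void *v)
{
    struct csv_stats *stats = v;
    stats->records++;
}
```

These would be registered with MBcsvparser_set_valuefn() and MBcsvparser_set_recordfn(), and the address of a struct csv_stats would be passed to MBcsvparser_set_userdata().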
Once the parser has been created, you call MBcsvparser_parse() with a buffer of data and its length.
A third argument is a boolean value indicating whether the buffer of data provided is the last.
You can get the data you pass to the parser from fread, fgets or any other file reading function such as my text file reader.
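One way to work out the "last buffer" flag when reading with fread is to treat a short read as the final chunk. The sketch below shows only that loop. The chunk_fn type is a hypothetical stand-in for the call to MBcsvparser_parse(), and tally_chunk is a sample consumer included so the loop can be tried without the parser.

```c
#include <stdio.h>

/* Hypothetical consumer type standing in for a call to MBcsvparser_parse():
 * receives a chunk, its length and the "last chunk" flag; returns 0 on failure */
typedef int (*chunk_fn)(const char *buf, size_t len, int last, void *userdata);

/* Reads fp in chunks of at most bufsize bytes and passes each one to fn.
 * A read that returns fewer bytes than requested is treated as the last chunk.
 * Returns 1 if every chunk was accepted, 0 otherwise. */
int feed_chunks(FILE *fp, char *buf, size_t bufsize, chunk_fn fn, void *userdata)
{
    int last = 0;
    if (bufsize == 0)
        return 0;
    while (!last) {
        size_t bytes = fread(buf, 1, bufsize, fp);
        last = bytes < bufsize;
        if (!fn(buf, bytes, last, userdata))
            return 0;
    }
    return 1;
}

/* Sample consumer: records how many chunks and bytes it was given */
struct chunk_tally {
    unsigned int chunks;
    size_t bytes;
    unsigned int last_flags;    /* how many times the last flag was set */
};

int tally_chunk(const char *buf, size_t len, int last, void *userdata)
{
    struct chunk_tally *t = userdata;
    (void)buf;                  /* the contents are not needed for the tally */
    t->chunks++;
    t->bytes += len;
    if (last)
        t->last_flags++;
    return 1;
}
```

If the file length is an exact multiple of the buffer size, the final call receives zero bytes with the last flag set.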
Once you have finished with the parser you delete it with MBcsvparser_delete().
Here is an example of using the parser and a couple of simple functions to print the first two columns of a file called customers.txt.
#include <stdio.h>
#include "csvparser.h"
#define BUF_SIZE 4096

/* Called for each value */
void valuefn(const char *value, void *v)
{
    unsigned int *column = v;
    /* Print only the first two columns */
    if (*column < 3) {
        printf("%s", value);
    }
    /* Separate the first column from the second with a tab */
    if (*column == 1) {
        putchar('\t');
    }
    (*column)++;
}

/* Called at the end of each record */
void recordfn(void *v)
{
    unsigned int *column = v;
    /* Reset the column counter for the next record */
    *column = 1;
    putchar('\n');
}

int main(void)
{
    FILE *fptr;
    char filename[] = "customers.txt";
    if ((fptr = fopen(filename, "r")) != NULL) {
        MBcsvparser *parser = MBcsvparser_create();
        if (parser) {
            char buf[BUF_SIZE];
            unsigned int ok = 1;
            unsigned int last = 0;
            unsigned int column = 1;
            MBcsvparser_set_valuefn(parser, valuefn);
            MBcsvparser_set_recordfn(parser, recordfn);
            MBcsvparser_set_userdata(parser, &column);
            /* A short read means end of file or a read error,
               so that buffer is passed as the last one */
            while (!last && ok) {
                size_t bytes = fread(buf, 1, BUF_SIZE, fptr);
                last = bytes < BUF_SIZE;
                if (!MBcsvparser_parse(parser, buf, bytes, last)) {
                    const char *message;
                    unsigned int record;
                    unsigned int character;
                    MBcsvparser_get_error(parser, &message, &record, &character);
                    fprintf(stderr, "Error: %s at record %u, character %u\n",
                            message, record, character);
                    ok = 0;
                }
            }
            MBcsvparser_delete(parser);
        }
        fclose(fptr);
    }
    else {
        fprintf(stderr, "Couldn't open %s for reading\n", filename);
    }
    return 0;
}