Although phrase structure grammars are used in the development of parsers, they really describe how to generate sentences in a language. If a grammar is context-free, there is a simple algorithm to generate sentences from it. The algorithm uses a queue to perform a breadth-first search as follows:
- Put the start symbol in the queue
- While the queue is not empty and more sentences are wanted:
- Pop the first sentential form from the queue
- Find the first non-terminal in it
- For every right-hand side of that non-terminal
- Make a copy of the sentential form
- Substitute the right-hand side for the non-terminal
- Put this new sentential form into the queue
- If there are no non-terminals left in the sentential form, it is a complete sentence, so return it
Note that the queue, and consequent breadth-first search, are required because if the rule’s right-hand side is left-recursive, the algorithm would go into an infinite recursion without producing anything if it performed a depth-first search.
Here is the code in C++, using the class I described earlier in Context-free grammar in C++. The generator
class is instantiated with a grammar, and its generate()
method will write n
sentences to the supplied iterator, if n
sentences can be generated.
#ifndef GENERATOR_HPP #define GENERATOR_HPP #include <vector> #include <list> #include <queue> #include <algorithm> #include <grammar.hpp> namespace MB { class generator { public: generator(const grammar& grammar) : grammar_(grammar) { // Store the left-hand sides std::transform(grammar_.rules().begin(), grammar_.rules().end(), std::back_inserter(lefts_), [](rule::const_ptr r)->const std::string& { return r->left(); }); } template <class InputIt> InputIt generate(InputIt init, unsigned int n) const { std::queue<sentential_form> queue; unsigned int sentences = 0; // Add the start symbol to the queue queue.push(sentential_form(1, grammar_.get_start_left())); while (sentences < n && queue.size()) { sentential_form sf = queue.front(); queue.pop(); // Find the first non-terminal sentential_form::iterator it = std::find_first_of(sf.begin(), sf.end(), lefts_.begin(), lefts_.end()); if (it != sf.end()) { // Replace with each right-hand side alternative std::string left = *it; std::vector<rule::const_ptr> rules; grammar_.get_rules_for_left(left, std::back_inserter(rules)); for (rule::const_ptr rule : rules) { for (const std::vector<std::string>& alternative : rule->right()) { sentential_form new_sf(sf); it = std::find(new_sf.begin(), new_sf.end(), left); new_sf.insert(it, alternative.begin(), alternative.end()); new_sf.erase(it); queue.push(new_sf); } } } else { // Finished sentence *init++ = std::vector<std::string>(sf.begin(), sf.end()); ++sentences; } } return init; } private: typedef std::list<std::string> sentential_form; const grammar& grammar_; std::vector<std::string> lefts_; }; } // namespace MB #endif // GENERATOR_HPP
Here is an example grammar:
S -> NP VP NP -> NP PP NP -> DET N VP -> V NP VP -> V PP -> P NP DET -> a | the N -> man | dog | house | telescope | spoon V -> saw | fed | barked P -> with
And an example program that generates the first 50 sentences:
#include <fstream> #include <iostream> #include <vector> #include <algorithm> #include <grammar.hpp> #include <generator.hpp> int main() { std::string filename = "generator.dat"; std::ifstream ifs(filename); if (ifs) { MB::grammar grammar(ifs); MB::generator gen(grammar); std::vector<std::vector<std::string> > sentences; gen.generate(std::back_inserter(sentences), 50); for (auto sentence : sentences) { std::copy(sentence.begin(), sentence.end(), std::ostream_iterator<const std::string&>(std::cout, " ")); std::cout << '\n'; } } else { std::cerr << "Couldn't open " << filename << " for reading\n"; } }
This is the somewhat surreal output:
a man saw a man fed a man barked a dog saw a dog fed a dog barked a house saw a house fed a house barked a telescope saw a telescope fed a telescope barked a spoon saw a spoon fed a spoon barked the man saw the man fed the man barked the dog saw the dog fed the dog barked the house saw the house fed the house barked the telescope saw the telescope fed the telescope barked the spoon saw the spoon fed the spoon barked a man saw a man a man saw a dog a man saw a house a man saw a telescope a man saw a spoon a man saw the man a man saw the dog a man saw the house a man saw the telescope a man saw the spoon a man fed a man a man fed a dog a man fed a house a man fed a telescope a man fed a spoon a man fed the man a man fed the dog a man fed the house a man fed the telescope a man fed the spoon