4.12 DCG Grammar rules

Grammar rules form a comfortable interface to difference lists. They are designed both for writing parsers that build a parse tree from a list of characters or tokens and for generating a flat list from a term.

Grammar rules look like ordinary clauses using -->/2 for separating the head and body rather than :-/2. Expanding grammar rules is done by expand_term/2, which adds two additional arguments to each term for representing the difference list.
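To make the translation concrete, the sketch below places two small grammar rules next to hand-written clauses that thread the difference list explicitly. The names greeting//0, subject//0, greeting_plain/2 and subject_plain/2 are hypothetical and used only for illustration. The clauses actually produced by expand_term/2 may differ in variable naming and in where the unification against the list takes place.

```prolog
% Grammar rules (hypothetical example names).
greeting --> [hello], subject.
subject  --> [world].

% Hand-written clauses with the same meaning. The two extra
% arguments are the list before and after the matched part.
greeting_plain(S0, S) :-
        S0 = [hello|S1],
        subject_plain(S1, S).

subject_plain([world|S], S).
```

Both phrase(greeting, [hello, world]) and greeting_plain([hello, world], []) succeed.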

The body of a grammar rule can contain three types of terms. A callable term is interpreted as a reference to a grammar rule. Code between {...} is interpreted as plain Prolog code. Finally, a list is interpreted as a sequence of literals. The Prolog control constructs (\+/1, ->/2, ;/2, ,/2 and !/0) can be used in grammar rules.
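The following sketch combines the three kinds of body terms with if-then-else. The rules optional_sign//1 and signed_digit//1 are hypothetical names introduced for this illustration, and the input is assumed to be a list of character codes.

```prolog
% optional_sign(-S): S is -1 after a leading '-', otherwise 1.
optional_sign(S) -->
        (   [0'-]               % list: a literal character code
        ->  { S = -1 }          % {}/1: plain Prolog code
        ;   [0'+]
        ->  { S = 1 }
        ;   { S = 1 }           % no sign present, nothing consumed
        ).

% signed_digit(-N): an optional sign followed by one digit.
signed_digit(N) -->
        optional_sign(S),       % callable term: another grammar rule
        [D],
        { code_type(D, digit(W)),
          N is S * W
        }.
```

With these rules, phrase(signed_digit(N), [0'-, 0'7]) binds N to -7, and phrase(signed_digit(N), [0'7]) binds N to 7.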

We illustrate the behaviour by defining a rule set for parsing an integer.

integer(I) -->
        digit(D0),
        digits(D),
        { number_codes(I, [D0|D])
        }.

digits([D|T]) -->
        digit(D), !,
        digits(T).
digits([]) -->
        [].

digit(D) -->
        [D],
        { code_type(D, digit)
        }.

Grammar rule sets are called using the built-in predicates phrase/2 and phrase/3:

phrase(:DCGBody, ?List)
Equivalent to phrase(DCGBody, List, []).
phrase(:DCGBody, ?List, ?Rest)
True when DCGBody applies to the difference List/Rest. Although DCGBody is typically a callable term that denotes a grammar rule, it can be any term that is valid as the body of a DCG rule.
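The difference between the two predicates can be seen with a minimal grammar. The rule ab//0 is a hypothetical name chosen for this sketch. The queries are given as comments, with the results that follow from the definitions above.

```prolog
% ab//0 matches the two-element sequence a, b.
ab --> [a], [b].

% ?- phrase(ab, [a,b]).            % succeeds: the whole list is consumed
% ?- phrase(ab, [a,b,c]).          % fails: c is left over, but Rest must be []
% ?- phrase(ab, [a,b,c], Rest).    % succeeds with Rest = [c]
% ?- phrase(([a],[b]), [a,b]).     % succeeds: a body used without a named rule
```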

The example below calls the rule set `integer' defined above, binding Rest to the remainder of the input after matching the integer.

?- phrase(integer(X), "42 times", Rest).
X = 42
Rest = [32, 116, 105, 109, 101, 115]

The next example passes a complete grammar body, rather than the name of a single rule, as the first argument of phrase/2.

digit_weight(W) -->
        [D],
        { code_type(D, digit(W)) }.

?- phrase(("Version ",
           digit_weight(Major),".",digit_weight(Minor)),
          "Version 3.4").
Major = 3,
Minor = 4.

See also portray_text/1, which can be used to print lists of character codes as strings in the top-level and debugger. This facilitates debugging DCGs that process character codes. Loading library(apply_macros) compiles calls to phrase/3 if the DCGBody argument is sufficiently instantiated, eliminating the runtime overhead of translating DCGBody and meta-calling.

As stated above, grammar rules are a general interface to difference lists. To illustrate, we show a DCG-based implementation of reverse/2:

reverse(List, Reversed) :-
        phrase(reverse(List), Reversed).

reverse([])    --> [].
reverse([H|T]) --> reverse(T), [H].
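The same technique covers generating a flat list from a term. As a minimal sketch, the rules below produce the in-order list of values in a binary tree. The tree representation (nil for the empty tree, t(Left, Value, Right) for a node) and the names inorder//1 and tree_to_list/2 are assumptions made for this illustration.

```prolog
% inorder(+Tree)// describes the values of Tree, left to right.
inorder(nil) --> [].
inorder(t(L, V, R)) -->
        inorder(L),
        [V],
        inorder(R).

% tree_to_list(+Tree, -List): List holds the values of Tree in order.
tree_to_list(Tree, List) :-
        phrase(inorder(Tree), List).
```

For example, tree_to_list(t(t(nil,1,nil), 2, t(nil,3,nil)), L) binds L to [1,2,3]. No call to append/3 is needed, because each rule passes the open tail of the list on to the next one.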