10. Dealing with parens

Our parsers already support the precedence of the different operators. Let's add support for parens as well, so users can override the precedence rules when they need to.

We can add a new parser for parsing (and evaluating) expressions in parens. First we introduce tokens for parsing the ( and ) symbols:

> using lparen_token = token<lit_c<'('>>;
> using rparen_token = token<lit_c<')'>>;

A paren can contain an expression with any operators in it, so we add a parser for parsing (and evaluating) an expression containing operators of the lowest precedence:

> using plus_exp1 = \
...> foldl_start_with_parser< \
...>   sequence<one_of<plus_token, minus_token>, mult_exp4>, \
...>   mult_exp4, \
...>   boost::mpl::quote2<binary_op> \
...> >;

This was just a refactoring of our last parser for the calculator language. We can build the parser for our calculator language by using build_parser<plus_exp1> now. Let's write a parser for a paren expression:

> using paren_exp1 = sequence<lparen_token, plus_exp1, rparen_token>;

This definition parses a left paren, then a complete expression, then a right paren. The result of parsing a paren expression is a vector of three elements: the ( character, the value of the expression and the ) character. We only need the value of the expression, which is the middle element. We could wrap the whole thing with a transform that gets the middle element and throws the rest away, but we don't need to. This pattern is so common that Metaparse provides middle_of for it:

> #include <boost/metaparse/middle_of.hpp>
> using paren_exp2 = middle_of<lparen_token, plus_exp1, rparen_token>;

This implementation is almost the same as paren_exp1. The difference is that the result of parsing will be the value of the wrapped expression (the result of the plus_exp1 parser).

Let's define a parser for a primary expression which is either a number or an expression in parens:

> using primary_exp1 = one_of<int_token, paren_exp2>;

This parser accepts either a number using int_token or an expression in parens using paren_exp2.

Wherever one can write a number (parsed by int_token), one can write a complete expression in parens as well. Our current parser implementation parses int_tokens in unary_exp, therefore we need to change that to use primary_exp instead of int_token.

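There is a problem here: this makes the definitions of our parsers recursive. Think about it: plus_exp is built from mult_exp, mult_exp from unary_exp, unary_exp from primary_exp, primary_exp from paren_exp, and paren_exp from plus_exp again.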

Note

Because we version the different parser implementations in Metashell (paren_exp1, paren_exp2, etc.), defining these recursive parsers there might seem to work at first, since each new version can refer to an earlier one. You will face this issue later, when you create a parser as part of a library (by saving your Metashell environment to a file or re-implementing the important/successful elements).

We have been using type aliases (typedef and using) for defining the parsers. We can do that as long as their definitions are not recursive. We cannot refer to a type alias until we have defined it, and type aliases cannot be forward declared, so we can't find a point in the recursive cycle where we could start defining things.

A solution for this is making one of the parsers a new class instead of a type alias. Classes can be forward declared, therefore we can declare the class, implement the rest of the parsers as they can refer to that class and then define the class at the end.
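The trick can be tried in isolation with a minimal sketch that does not use Metaparse. The wraps template and the names expr, paren and primary are hypothetical stand-ins; wraps only records which type it was given, which is enough to show that a forward-declared class can close a cycle of aliases.

```cpp
#include <type_traits>

// Hypothetical stand-in for a parser combinator: it only records
// the type it wraps.
template <class Inner> struct wraps {
    using inner = Inner;
};

struct expr;                      // forward declaration: expr is incomplete here

using paren   = wraps<expr>;      // an alias may name the incomplete class
using primary = wraps<paren>;

struct expr : wraps<primary> {};  // the definition closes the cycle

// Following the chain from expr leads back to expr itself.
static_assert(std::is_same<expr::inner, primary>::value,
              "expr is built from primary");
static_assert(std::is_same<expr::inner::inner::inner, expr>::value,
              "the cycle ends where it started");
```

Replacing the forward declaration and the final struct with a single alias would not compile, because paren would then name expr before expr exists.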

Let's make plus_exp a class. So as a first step, let's forward declare it:

> struct plus_exp2;

Now we can write the rest of the parsers and they can refer to plus_exp2:

> using paren_exp3 = middle_of<lparen_token, plus_exp2, rparen_token>;
> using primary_exp2 = one_of<int_token, paren_exp3>;
> using unary_exp2 = \
...> foldr_start_with_parser< \
...>   minus_token, \
...>   primary_exp2, \
...>   boost::mpl::lambda<boost::mpl::negate<boost::mpl::_1>>::type \
...> >;
> using mult_exp5 = \
...> foldl_start_with_parser< \
...>   sequence<one_of<times_token, divides_token>, unary_exp2>, \
...>   unary_exp2, \
...>   boost::mpl::quote2<binary_op> \
...> >;

There is nothing new in the definition of these parsers. They build up the hierarchy we have worked out in the earlier sections of this tutorial. The only element missing is plus_exp2:

> struct plus_exp2 : \
...> foldl_start_with_parser< \
...>   sequence<one_of<plus_token, minus_token>, mult_exp5>, \
...>   mult_exp5, \
...>   boost::mpl::quote2<binary_op> \
...> > {};

This definition makes use of inheritance instead of type aliasing. Now we can write the parser for the calculator that supports parens as well:
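Why inheritance is an acceptable substitute for an alias can be sketched as follows, assuming a parser is used through a nested apply template (the way exp_parser19 is used in this section). The always_yields template and both parser names are hypothetical and exist only for this illustration.

```cpp
#include <type_traits>

// Hypothetical parser-like template with a nested apply metafunction.
template <class Result>
struct always_yields {
    template <class Input>
    struct apply {
        using type = Result;
    };
};

// An alias is just another name for the same type.
using alias_parser = always_yields<int>;

// A derived class is a new, distinct type that could have been
// forward declared before this point.
struct class_parser : always_yields<int> {};

static_assert(std::is_same<alias_parser, always_yields<int>>::value,
              "the alias names the same type");
static_assert(!std::is_same<class_parser, always_yields<int>>::value,
              "the derived class is a different type");

// The nested apply is inherited, so both can be used the same way.
static_assert(std::is_same<class_parser::apply<void>::type, int>::value,
              "the derived class exposes the inherited apply");
```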

> using exp_parser19 = build_parser<plus_exp2>;

Let's try this parser out:

> exp_parser19::apply<BOOST_METAPARSE_STRING("(1 + 2) * 3")>::type
mpl_::integral_c<int, 9>

Our parser accepts and can deal with parens in the expressions.
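For comparison, the same grammar can be sketched as ordinary recursive functions. This is not how Metaparse works internally. It is a hypothetical C++14 constexpr analogue whose function names mirror the parsers of this section, with no error handling. The forward declaration of plus_exp plays the role that the forward declaration of plus_exp2 plays above.

```cpp
// Hypothetical cursor over a null-terminated input; not a Metaparse type.
struct cursor {
    const char* p;

    constexpr void skip_ws() { while (*p == ' ') ++p; }

    // Consumes ch (after optional spaces) if it is next in the input.
    constexpr bool accept(char ch) {
        skip_ws();
        if (*p == ch) { ++p; return true; }
        return false;
    }
};

constexpr int plus_exp(cursor& c);  // forward declaration closes the cycle

// A number or an expression in parens.
constexpr int primary_exp(cursor& c) {
    if (c.accept('(')) {
        int v = plus_exp(c);
        c.accept(')');              // sketch only: a missing ')' is not reported
        return v;
    }
    c.skip_ws();
    int v = 0;
    while (*c.p >= '0' && *c.p <= '9') { v = v * 10 + (*c.p - '0'); ++c.p; }
    return v;
}

// Any number of leading minus signs, then a primary expression.
constexpr int unary_exp(cursor& c) {
    return c.accept('-') ? -unary_exp(c) : primary_exp(c);
}

// Left-associative * and / (division by zero is not handled).
constexpr int mult_exp(cursor& c) {
    int v = unary_exp(c);
    for (;;) {
        if (c.accept('*')) v *= unary_exp(c);
        else if (c.accept('/')) v /= unary_exp(c);
        else return v;
    }
}

// Left-associative + and -, the lowest precedence level.
constexpr int plus_exp(cursor& c) {
    int v = mult_exp(c);
    for (;;) {
        if (c.accept('+')) v += mult_exp(c);
        else if (c.accept('-')) v -= mult_exp(c);
        else return v;
    }
}

constexpr int evaluate(const char* s) {
    cursor c{s};
    return plus_exp(c);
}

static_assert(evaluate("(1 + 2) * 3") == 9, "parens override precedence");
static_assert(evaluate("1 + 2 * 3") == 7, "without parens, * binds tighter");
```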