5. Parsing longer expressions

[Note] Note

Note that you can find everything that has been included and defined so far here.

So far we can parse simple expressions that add two numbers together, but not expressions that add three, four, or more. In this section we will implement a parser for expressions that add any number of values together.

We can't solve this problem with sequence alone: sequence takes a fixed list of parsers, and we don't know in advance how many numbers the input will have. We need a parser that:

  • parses the first number
  • keeps parsing + <number> elements until the end of the input

Parsing the first number is something we can already do: the int_token parser does it for us. Parsing the + <number> elements is trickier. Metaparse offers several tools for this. The simplest is repeated:

> #include <boost/metaparse/repeated.hpp>

repeated takes a parser (here, one that parses a single + <number> element) and keeps applying it to the input for as long as that parser succeeds. For well-formed input, this consumes everything that follows the first number. Let's create a parser for our expressions using it:

> using exp_parser7 = \
...> build_parser< \
...>   sequence< \
...>     int_token,                                /* The first <number> */ \
...>     repeated<sequence<plus_token, int_token>> /* The "+ <number>" elements */ \
...>   > \
...> >;

We have a sequence with two elements:

  • The first number (int_token)
  • The + <number> parts

The second part is a repeated parser, which parses the + <number> elements. One such element is parsed by sequence<plus_token, int_token>, which is simply a sequence of the + symbol and a number.

Let's try parsing an expression using this:

> exp_parser7::apply<BOOST_METAPARSE_STRING("1 + 2 + 3 + 4")>::type

Here is a formatted version of the result which is easier to read:

boost_::mpl::vector<
  // The result of int_token
  mpl_::integral_c<int, 1>,

  // The result of repeated< sequence<plus_token, int_token> >
  boost_::mpl::vector<
    boost_::mpl::vector<mpl_::char_<'+'>, mpl_::integral_c<int, 2> >,
    boost_::mpl::vector<mpl_::char_<'+'>, mpl_::integral_c<int, 3> >,
    boost_::mpl::vector<mpl_::char_<'+'>, mpl_::integral_c<int, 4> >
  >
>

The result is a vector of two elements. The first element is the result of parsing the input with int_token. The second element is the result of parsing the rest of the input with repeated<sequence<plus_token, int_token>>, and it is itself a vector: each of its elements is the result of applying sequence<plus_token, int_token> once. Here is a diagram showing how exp_parser7 parses the input 1 + 2 + 3 + 4:

The diagram shows that the + <number> elements are parsed by sequence<plus_token, int_token> elements and their results are collected by repeated, which constructs a vector of these results. The value of the first <number> and this vector are placed in another vector, which is the result of parsing.

