So far we have focused on parsing valid user input. However, users of our parsers will make mistakes, and we should help them find the source of the problem without making the process too painful.
The major difficulty in error reporting is that we have no direct way of showing error messages to the user. The parsers are template metaprograms. When they detect that the input is invalid, they can make the compilation fail and have the compiler (which runs the metaprogram) display an error message. What we can do is keep those error messages short, make them contain all information about the parsing error, and make that information easy to find in whatever the compiler displays.
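The mechanism described above can be illustrated without the library. The following is a minimal sketch, not Metaparse's implementation: `X_PARSING_FAILED_X`, `always_false`, `success`, `failure` and `unwrap` are hypothetical names, and the `Line, Col, Msg` argument order is an assumption. It shows how a metaprogram can force a compilation failure whose diagnostic names a type carrying the position and the message.

```cpp
#include <type_traits>

// Hypothetical message tag, modelled on the one that appears in this section.
template <char C> struct literal_expected {};

// Always false, but dependent on the template arguments, so the check
// below only fires when the marker template is actually instantiated.
template <int Line, int Col, class Msg>
struct always_false : std::false_type {};

// Conspicuous marker: instantiating it fails, and the compiler names the
// full specialization (position and message) in its diagnostic.
template <int Line, int Col, class Msg>
struct X_PARSING_FAILED_X {
  static_assert(always_false<Line, Col, Msg>::value, "parsing failed");
};

// Hypothetical parse outcomes.
template <class Value> struct success {};
template <int Line, int Col, class Msg> struct failure {};

// unwrap: yields the value on success, instantiates the marker on failure.
template <class Result> struct unwrap;

template <class Value>
struct unwrap<success<Value>> { using type = Value; };

template <int Line, int Col, class Msg>
struct unwrap<failure<Line, Col, Msg>> : X_PARSING_FAILED_X<Line, Col, Msg> {};

// Compiles: a successful parse just produces its value.
using ok = unwrap<success<int>>::type;

// Would not compile; the diagnostic mentions the specialization
// X_PARSING_FAILED_X<1, 1, literal_expected<'('>> (exact formatting
// depends on the compiler):
// using bad = unwrap<failure<1, 1, literal_expected<'('>>>::type;
```

The marker name is deliberately loud so that it stands out in a long diagnostic. Everything the user needs is encoded in its template arguments.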
So let's try to parse an invalid expression and see what happens:
> exp_parser19::apply<BOOST_METAPARSE_STRING("hello")>::type
<< compilation error >>
You will get a lot of error messages (although if you have seen error messages coming from template metaprograms before, you know that this is not a lot). Take a closer look. The output contains this:
x__________________PARSING_FAILED__________________x<
  1, 1,
  boost::metaparse::v1::error::literal_expected<'('>
>
You can see a formatted version above. There are no line breaks in the real output. This is relatively easy to spot (thanks to the ____________ part) and contains answers to the main questions one has when parsing fails:
- Where is the error? In column 1 in line 1 (inside BOOST_METAPARSE_STRING). This is the 1, 1 part.
- What is the problem? literal_expected<'('>. This is a bit misleading, as it contains only a part of the problem. An open paren is not the only acceptable token here; a number would also be fine. This misleading error message is our fault: we (the parser authors) need to make the parsing errors more descriptive.
So how can we improve the error messages? Let's look at what went wrong in the previous case:
- The input was hello.
- plus_exp2 tried to parse it.
- plus_exp2 tried to parse it using mult_exp5 (assuming that this is the initial mult_exp in the list of + / - separated mult_exps).
- mult_exp5 tried to parse it.
- mult_exp5 tried to parse it using unary_exp2 (assuming that this is the initial unary_exp in the list of * / / separated unary_exps).
- unary_exp2 tried to parse it.
- unary_exp2 parsed all of the - symbols using minus_token. There were none of them (the input started with an h character).
- unary_exp2 tried to parse it using primary_exp2.
- primary_exp2 is one_of<int_token, paren_exp2>. It tried parsing the input with int_token (which failed) and then with paren_exp2 (which failed as well). So one_of could not parse the input with any of the choices and therefore it failed as well. In such situations one_of checks which parser made the most progress (consumed the most characters of the input) before failing and assumes that this is the parser the user intended to use, so it returns the error message coming from that parser. In this example none of the parsers could make any progress, in which case one_of returns the error coming from the last parser in the list. This was paren_exp2, and it expects the expression to start with an open paren. This is where the error message came from. The rest of the layers did not change or improve this error message, so this was the error message displayed to the user.
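The selection rule described in the last step can be modelled in a few lines. The sketch below is not Metaparse's implementation: `failure`, `pick_error` and `int_expected` are hypothetical names, and it assumes that a failure records only how many characters were consumed before failing. It keeps the failure that made the most progress and, on a tie, the one coming from the later alternative.

```cpp
#include <type_traits>

// Hypothetical model of a failed alternative: how many characters it
// consumed before failing, and the message it wants to report.
template <int Consumed, class Msg>
struct failure {
  static constexpr int consumed = Consumed;
  using message = Msg;
};

// Hypothetical message tags.
struct int_expected {};
template <char C> struct literal_expected {};

// pick_error<F1, F2, ...>: the failure with the most progress wins;
// when two failures made the same progress, the later one wins.
template <class... Failures> struct pick_error;

template <class F>
struct pick_error<F> { using type = F; };

template <class F1, class F2, class... Rest>
struct pick_error<F1, F2, Rest...>
  : pick_error<
      typename std::conditional<(F2::consumed >= F1::consumed), F2, F1>::type,
      Rest...> {};

// Both alternatives fail on "hello" without consuming anything, so the
// error of the last alternative (the paren parser) is reported.
using reported = pick_error<
  failure<0, int_expected>,
  failure<0, literal_expected<'('>>
>::type;

static_assert(
  std::is_same<reported::message, literal_expected<'('>>::value,
  "on a tie the last alternative's message is kept");
```

Under this model, the order of the alternatives decides which message the user sees whenever no alternative makes progress, which is exactly the situation in the example.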
We, the parser authors, know that we expect a primary expression there. When one_of fails, it means that none was found.
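One way to act on this observation is to let a layer replace the message of a failure while keeping its position. The following is a minimal sketch under stated assumptions, not the library's API: `relabel`, `success`, `failure` and `missing_primary_expression` are hypothetical names introduced only for illustration.

```cpp
#include <type_traits>

// Hypothetical parse outcomes; not Metaparse's real types.
template <class Value>
struct success {};

template <int Line, int Col, class Msg>
struct failure {};

// Low-level message as seen earlier, and a hypothetical descriptive one.
template <char C> struct literal_expected {};
struct missing_primary_expression {};

// relabel<Result, NewMsg>: a success passes through unchanged...
template <class Result, class NewMsg>
struct relabel { using type = Result; };

// ...while a failure keeps its position but reports the new message.
template <int Line, int Col, class OldMsg, class NewMsg>
struct relabel<failure<Line, Col, OldMsg>, NewMsg> {
  using type = failure<Line, Col, NewMsg>;
};

// The failure from the example, relabelled with the descriptive message.
using improved = relabel<
  failure<1, 1, literal_expected<'('>>,
  missing_primary_expression
>::type;
```

With such a layer around the primary expression parser, the position 1, 1 would stay the same, but the user would read that a primary expression is missing instead of seeing only one of the acceptable tokens.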