Parsnip Parser Library

parser combinators for C++

Author: Alex Rubinsteyn

Introduction

The Parsnip library lets you build complex parsers from a rich set of parser primitives. This method of parser construction is inspired by Parsec and other parser combinator libraries for Haskell and ML. Parsnip parsers use packrat parsing as their default parse strategy. Packrat parsing is a dynamic programming optimization of backtracking top-down parsing: by memoizing the result of each rule at each input position, it allows linear-time processing of parsing expression grammars (PEGs).
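The memoization idea behind packrat parsing can be sketched without Parsnip itself. The toy grammar, the PackratDemo struct and all of its member names below are assumptions made up for illustration; they are not Parsnip's internals.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Toy grammar, assumed for this sketch only:
//   Expr <- Term '+' Expr / Term
//   Term <- '(' Expr ')' / digit
// Each rule returns the position after its match, or NO_MATCH.
const std::size_t NO_MATCH = static_cast<std::size_t>(-1);

struct PackratDemo {
    typedef std::map<std::pair<int, std::size_t>, std::size_t> Memo;

    std::string input;
    Memo memo;            // (rule id, start position) -> end position or NO_MATCH
    int termEvaluations;  // how often the body of Term actually ran

    explicit PackratDemo(const std::string& s) : input(s), termEvaluations(0) {}

    std::size_t expr(std::size_t pos) {
        assert(pos <= input.size());
        const std::pair<int, std::size_t> key(0, pos);
        Memo::const_iterator hit = memo.find(key);
        if (hit != memo.end()) return hit->second;

        std::size_t result = NO_MATCH;
        const std::size_t afterTerm = term(pos);
        // First alternative: Term '+' Expr
        if (afterTerm != NO_MATCH && afterTerm < input.size() && input[afterTerm] == '+')
            result = expr(afterTerm + 1);
        // Second alternative: backtrack to pos and apply Term again.
        // The memo table answers this call without re-parsing.
        if (result == NO_MATCH) result = term(pos);

        memo[key] = result;
        return result;
    }

    std::size_t term(std::size_t pos) {
        assert(pos <= input.size());
        const std::pair<int, std::size_t> key(1, pos);
        Memo::const_iterator hit = memo.find(key);
        if (hit != memo.end()) return hit->second;

        ++termEvaluations;
        std::size_t result = NO_MATCH;
        if (pos < input.size() && input[pos] == '(') {
            const std::size_t inner = expr(pos + 1);
            if (inner != NO_MATCH && inner < input.size() && input[inner] == ')')
                result = inner + 1;
        } else if (pos < input.size() &&
                   std::isdigit(static_cast<unsigned char>(input[pos]))) {
            result = pos + 1;
        }
        memo[key] = result;
        return result;
    }
};
```

Without the memo table, every level of nesting in an input such as ((1)) would parse its Term twice, once per alternative of Expr, so the work doubles with each extra pair of parentheses. With the table, the body of Term runs at most once per input position.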

Note: Parsnip is currently under heavy development and not intended for use in production systems. If you find any bugs or have ideas for improvements, send me an email.

Downloading

You can download the latest release of Parsnip from the fine folks at SourceForge.

Compiling

Thus far, Parsnip has been developed and tested under Visual C++ 2005. Compatibility with gcc 4.x will be maintained once the features are finalized.

Examples

I will write some formal documentation and tutorials in the future, but for now I have put up example code as posts on a blog.

Function Reference

ch (char c) ⇒ CharParser<string, string>
Returns a CharParser which matches character c and returns it as a string.
str (string s) ⇒ StringParser<string, string>
Returns a StringParser which matches a sequence of input characters against string s and returns s.
range (char l, char u) ⇒ CharRangeParser<string, string>
Returns a CharRangeParser which succeeds only if the input character is within the inclusive range [l, u].
oneOf (string s) ⇒ OneOfParser<string, string>
Returns a OneOfParser which matches an input character against any character in string s. Returns the matched character as a string.
seq <I, R> (Parser<I, R> p1, Parser<I, R> p2) ⇒ SeqTupleParser<I, R>
Creates a sequence parser: if both parsers succeed, it returns their results in a tuple; otherwise it fails.
The shorthand for this parser is:
p1 >> p2
seq_vec <I, R> (Parser<I, R> p1, Parser<I, R> p2) ⇒ SeqVecParser<I, R>
Creates a vector sequence parser: if both parsers succeed, it returns their results in a vector; otherwise it fails. Vectors, unlike tuples, can be of arbitrary length, but they require all the parsers to share the same input and output types.
The shorthand for this parser is:
p1 && p2
concat (Parser<string, string> p1, Parser<string, string> p2) ⇒ ConcatParser<string, string>
Creates a concat parser: parses in sequence but concatenates results rather than tupling them.
The shorthand for this parser is:
p1 + p2
choice <I, R> (Parser<I, R> p1, Parser<I, R> p2) ⇒ ChoiceParser <I, R>
Parser choice: returns the result of the first of its two parsers to succeed, and fails if neither does. If p1 fails, the input stream is backtracked to the starting position before p2 is tried.
The shorthand for this parser is:
p1 | p2
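The way concatenation and ordered choice interact with backtracking can be sketched in a few lines. The Result struct and the Parser typedef below are hypothetical stand-ins rather than Parsnip's own classes; only the names ch, concat and choice are borrowed from the reference above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical stand-ins for illustration; these are not Parsnip's classes.
struct Result {
    bool ok;            // did the parse succeed?
    std::string value;  // matched text
    std::size_t next;   // position after the match (start position on failure)
};
typedef std::function<Result(const std::string&, std::size_t)> Parser;

// Matches exactly the character c and returns it as a string.
Parser ch(char c) {
    return [c](const std::string& in, std::size_t pos) -> Result {
        if (pos < in.size() && in[pos] == c)
            return Result{true, std::string(1, c), pos + 1};
        return Result{false, "", pos};
    };
}

// Runs p1 then p2 and concatenates their results.
Parser concat(Parser p1, Parser p2) {
    assert(p1 && p2);
    return [p1, p2](const std::string& in, std::size_t pos) -> Result {
        const Result r1 = p1(in, pos);
        if (!r1.ok) return Result{false, "", pos};
        const Result r2 = p2(in, r1.next);
        if (!r2.ok) return Result{false, "", pos};  // forget what p1 consumed
        return Result{true, r1.value + r2.value, r2.next};
    };
}

// Ordered choice: tries p1, then p2 from the same starting position.
Parser choice(Parser p1, Parser p2) {
    assert(p1 && p2);
    return [p1, p2](const std::string& in, std::size_t pos) -> Result {
        const Result r1 = p1(in, pos);
        if (r1.ok) return r1;
        return p2(in, pos);  // backtrack: p2 starts where p1 started
    };
}
```

With p = choice(concat(ch('a'), ch('b')), ch('a')), the input "ac" makes the first branch fail after it has already matched 'a'. The second branch then starts again at position 0 and yields "a".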
not<I, R> (Parser<I, R> p) ⇒ NotParser<I, R>
Creates a NotParser which succeeds only if p fails. A NotParser never consumes input.
call0<I, R> (Parser<I, void> p, R (*fn) (void) ) ⇒ CallParser0<I, R>
Creates a CallParser which calls fn if p's parse succeeds.
call1<I, T, R> (Parser<I, T> p, R (*fn) (T)) ⇒ CallParser1<I, R>
Creates a CallParser which calls fn with the results of p's parse.
call2<I, T1, T2, R> (Parser<I, Tuple2<T1, T2>> p, R (*fn) (T1, T2) ) ⇒ CallParser2<I, R>
Creates a CallParser which unpacks the 2-tuple that p returns and passes the components as arguments to fn.
call3<I, T1, T2, T3, R> (Parser<I, Tuple3<T1, T2, T3>> p, R (*fn) (T1, T2, T3) ) ⇒ CallParser3<I, R>
Creates a CallParser which unpacks the 3-tuple that p returns and passes the components as arguments to fn.
many <I, R, Acc> (Parser<I, R> p, int min = 0, int max = INT_MAX) ⇒ ManyParser<I, Acc::ResultType>
The ManyParser repeatedly parses p until max is reached or p fails. If the number of parses is less than min then the ManyParser fails. Each time p is parsed successfully the result is passed to an accumulator of type Acc. If the ManyParser is successful it returns the accumulated data of its Acc object.
many1 <I, R, Acc> (Parser<I, R> p, int max = INT_MAX) ⇒ ManyParser<I, Acc::ResultType>
Generates a ManyParser which requires at least one parse of p to succeed.
atleast <I, R, Acc> (Parser<I, R> p, int min) ⇒ ManyParser<I, Acc::ResultType>
Generates a ManyParser which requires at least min parses of p to succeed.
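The repeat-and-accumulate behaviour described for many, many1 and atleast can be sketched as follows. The accumulator interface (an add member, a result member and a ResultType typedef) is an assumption made for this example, as are CountAcc, VectorAcc, digit and manyDigits; Parsnip's actual accumulator classes may look different.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical accumulator interface, assumed for this sketch only:
// add(item) is called once per successful parse, result() yields the total.
struct CountAcc {
    typedef int ResultType;
    int n;
    CountAcc() : n(0) {}
    void add(const std::string&) { ++n; }
    ResultType result() const { return n; }
};

struct VectorAcc {
    typedef std::vector<std::string> ResultType;
    ResultType items;
    void add(const std::string& s) { items.push_back(s); }
    ResultType result() const { return items; }
};

// Stand-in element parser: matches one decimal digit at pos.
bool digit(const std::string& in, std::size_t pos, std::string& out) {
    if (pos < in.size() && in[pos] >= '0' && in[pos] <= '9') {
        out = std::string(1, in[pos]);
        return true;
    }
    return false;
}

// Applies digit until it fails or max parses are reached.
// Fails, leaving pos untouched, if fewer than min parses succeeded.
template <typename Acc>
bool manyDigits(const std::string& in, std::size_t& pos,
                typename Acc::ResultType& out, int min = 0, int max = INT_MAX) {
    assert(min >= 0 && min <= max);
    Acc acc;
    std::size_t cur = pos;
    int count = 0;
    std::string item;
    while (count < max && digit(in, cur, item)) {
        acc.add(item);  // hand each successful result to the accumulator
        ++cur;
        ++count;
    }
    if (count < min) return false;
    pos = cur;
    out = acc.result();
    return true;
}
```

In this sketch, many1 corresponds to calling manyDigits with min = 1, and atleast to calling it with min = n. Swapping the accumulator type changes what is returned (a count or a vector of matches) without touching the loop.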
sepBy
sepBy1
sepByAtleast
skip <I, R> (Parser<I, R> p) ⇒ SkipParser<I, void>
Requires p to succeed but discards its result. If used in a sequence, skip parsers are ignored during tuple construction. Thus, if pS is a SkipParser, then the parse of
p1 >> pS >> p2
will generate a Tuple2 containing the results of p1 and p2.
skipMany <I, R> (Parser<I, R> p) ⇒ SkipParser<I, void>
Repeatedly applies p to the input stream, discarding the results.
skipMany1<I, R> (Parser<I, R> p) ⇒ SkipParser<I, void>
Applies p to the input stream one or more times. Discards the results.
skipAtleast<I, R> (Parser<I, R> p, unsigned n) ⇒ SkipParser<I, void>
Applies p to the input stream at least n times. Discards the results.
token<I, R> (Parser<I, R> p) ⇒ Parser<I, R>
Pads the parser p with skipped whitespace.
token_ch (char c) ⇒ Parser<string, string>
Creates a CharParser surrounded by skipped whitespace.
token_str (string s) ⇒ Parser<string, string>
Creates a StringParser surrounded by skipped whitespace.
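A token wrapper of this kind can be sketched as a parser that skips whitespace on both sides of another parser. The Parser typedef, the NO_MATCH sentinel and the skipSpaces helper are assumptions for this example, and results are reduced to end positions to keep it short.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical stand-in: a parser returns the end position or NO_MATCH.
const std::size_t NO_MATCH = static_cast<std::size_t>(-1);
typedef std::function<std::size_t(const std::string&, std::size_t)> Parser;

// Matches exactly the character c.
Parser ch(char c) {
    return [c](const std::string& in, std::size_t pos) -> std::size_t {
        return (pos < in.size() && in[pos] == c) ? pos + 1 : NO_MATCH;
    };
}

// Advances past any run of whitespace (possibly empty).
std::size_t skipSpaces(const std::string& in, std::size_t pos) {
    while (pos < in.size() && std::isspace(static_cast<unsigned char>(in[pos])))
        ++pos;
    return pos;
}

// Skips whitespace, applies p, then skips trailing whitespace.
Parser token(Parser p) {
    assert(static_cast<bool>(p));
    return [p](const std::string& in, std::size_t pos) -> std::size_t {
        const std::size_t start = skipSpaces(in, pos);
        const std::size_t end = p(in, start);
        return end == NO_MATCH ? NO_MATCH : skipSpaces(in, end);
    };
}
```

Under these assumptions, token(ch(c)) plays the role that token_ch(c) plays in the reference above.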
skip_ch (char c) ⇒ SkipParser<string, void>
Equivalent to skip(ch(c)).
skip_str (string s) ⇒ SkipParser<string, void>
Equivalent to skip(str(s)).
fail<I, R> (string msg) ⇒ FailParser<I, R>
Always returns a failed result with error message msg.
require<I, R> (Parser<I, R> p, string msg) ⇒ RequireParser<I, R>
Creates a RequireParser which passes along p's result if it succeeds and throws an exception with message msg otherwise.
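The difference between an ordinary failure and a required parse can be sketched like this. The Parser typedef, the NO_MATCH sentinel, the parenX rule and the choice of std::runtime_error as the exception type are all assumptions for this example; the reference above does not say which exception type Parsnip throws.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in: a parser returns the end position or NO_MATCH.
const std::size_t NO_MATCH = static_cast<std::size_t>(-1);
typedef std::function<std::size_t(const std::string&, std::size_t)> Parser;

// Matches exactly the character c.
Parser ch(char c) {
    return [c](const std::string& in, std::size_t pos) -> std::size_t {
        return (pos < in.size() && in[pos] == c) ? pos + 1 : NO_MATCH;
    };
}

// Passes p's result along on success; turns a failure into an exception.
// std::runtime_error is an assumed exception type.
Parser require(Parser p, const std::string& msg) {
    assert(static_cast<bool>(p));
    return [p, msg](const std::string& in, std::size_t pos) -> std::size_t {
        const std::size_t end = p(in, pos);
        if (end == NO_MATCH) throw std::runtime_error(msg);
        return end;
    };
}

// Example rule: '(' 'x' followed by a mandatory ')'.
std::size_t parenX(const std::string& in, std::size_t pos) {
    const Parser open = ch('(');
    const Parser x = ch('x');
    const Parser close = require(ch(')'), "expected ')'");

    std::size_t p = open(in, pos);
    if (p == NO_MATCH) return NO_MATCH;  // soft failure: caller may try another rule
    p = x(in, p);
    if (p == NO_MATCH) return NO_MATCH;
    return close(in, p);                 // hard failure: throws if ')' is missing
}
```

A soft failure lets an enclosing choice try its next alternative. A required parser is useful once the input has committed to a rule, so that a missing closing parenthesis is reported with a specific message instead of a generic parse failure.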
optional <In, Out> (Parser<In,Out> p, Out default) ⇒ Parser<In, Out>
Creates an OptionalParser which returns p's value if p succeeds, and returns default otherwise.
lazy<I, R> () ⇒ LazyParser<I, R>
Creates a LazyParser, which acts as a proxy for its target parser. This is required to allow a parser to refer to itself recursively.
setLazy<I, R> (Parser<I, R> proxy, Parser<I, R> target) ⇒ Parser<I, R>
Sets the target of proxy, whose type must be LazyParser, to target.
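The proxy idea behind lazy and setLazy can be sketched with a parser that forwards to a target filled in later. LazyProxy, nestedRule, the Parser typedef and NO_MATCH are hypothetical names for this example and do not reflect how Parsnip implements LazyParser.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>

// Hypothetical stand-in: a parser returns the end position or NO_MATCH.
const std::size_t NO_MATCH = static_cast<std::size_t>(-1);
typedef std::function<std::size_t(const std::string&, std::size_t)> Parser;

// A proxy that forwards to a target parser which is set later.
struct LazyProxy {
    std::shared_ptr<Parser> target;  // owned by the proxy, initially empty

    LazyProxy() : target(std::make_shared<Parser>()) {}

    void set(const Parser& p) { *target = p; }

    // The returned parser holds only a weak reference, so a rule that
    // refers to itself does not create an ownership cycle.
    Parser parser() const {
        std::weak_ptr<Parser> weak = target;
        return [weak](const std::string& in, std::size_t pos) -> std::size_t {
            std::shared_ptr<Parser> t = weak.lock();
            assert(t && static_cast<bool>(*t));  // proxy alive and target set
            return (*t)(in, pos);
        };
    }
};

// Grammar: nested <- '(' nested ')' / 'x'
// self stands in for the rule being defined.
Parser nestedRule(Parser self) {
    return [self](const std::string& in, std::size_t pos) -> std::size_t {
        if (pos < in.size() && in[pos] == '(') {
            const std::size_t inner = self(in, pos + 1);
            if (inner != NO_MATCH && inner < in.size() && in[inner] == ')')
                return inner + 1;
            return NO_MATCH;
        }
        if (pos < in.size() && in[pos] == 'x') return pos + 1;
        return NO_MATCH;
    };
}
```

Usage follows the same two steps as lazy and setLazy: create the proxy, build the rule in terms of proxy.parser(), then call proxy.set(rule) before parsing. In this sketch the proxy object must outlive the parsers built from it.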
longest <I, R> (Parser<I,R> p1, Parser<I,R> p2) ⇒ LongestParser<I, R>
Parses the input stream using p1 and then backtracks to parse again using p2. The returned value is that of whichever parser ends at a later position.
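A longest-match combinator can be sketched as follows. The Parser typedef, NO_MATCH, literal and letters are assumptions for this example, and so is the tie-breaking rule: the description above does not say which parser wins when both end at the same position, so this sketch arbitrarily prefers p1.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical stand-in: a parser returns the end position or NO_MATCH.
const std::size_t NO_MATCH = static_cast<std::size_t>(-1);
typedef std::function<std::size_t(const std::string&, std::size_t)> Parser;

// Matches the literal string s.
Parser literal(const std::string& s) {
    return [s](const std::string& in, std::size_t pos) -> std::size_t {
        if (pos <= in.size() && in.compare(pos, s.size(), s) == 0)
            return pos + s.size();
        return NO_MATCH;
    };
}

// Matches one or more ASCII letters.
std::size_t letters(const std::string& in, std::size_t pos) {
    std::size_t cur = pos;
    while (cur < in.size() && std::isalpha(static_cast<unsigned char>(in[cur])))
        ++cur;
    return cur > pos ? cur : NO_MATCH;
}

// Runs both parsers from the same position and keeps the longer match.
Parser longest(Parser p1, Parser p2) {
    assert(p1 && p2);
    return [p1, p2](const std::string& in, std::size_t pos) -> std::size_t {
        const std::size_t e1 = p1(in, pos);
        const std::size_t e2 = p2(in, pos);  // second attempt restarts at pos
        if (e1 == NO_MATCH) return e2;
        if (e2 == NO_MATCH) return e1;
        return e2 > e1 ? e2 : e1;            // ties go to p1 (an assumption)
    };
}
```

This differs from ordered choice: on the input "iffy", an ordered choice of literal("if") and letters would stop after "if", while longest returns the full four-character match.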
op_table <I, R> () ⇒ OpTableParser<I, R>
Creates an OpTableParser which accepts input of type I and returns an object of type R. The member functions used to add operators are infix_left and infix_right. Eventually prefix and postfix parsing will also be added.
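How an operator table can drive expression parsing is sketched below using precedence climbing. This is one common technique and not necessarily what OpTableParser does internally. The OpTable class, the argument lists of its infix_left and infix_right members, and the restriction to single-character operators over unsigned integers are all assumptions for this example.

```cpp
#include <cassert>
#include <cctype>
#include <cmath>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical operator table for illustration; not Parsnip's OpTableParser.
class OpTable {
public:
    typedef double (*BinaryFn)(double, double);

    // Register a left- or right-associative infix operator.
    void infix_left(char op, int prec, BinaryFn fn)  { add(op, prec, false, fn); }
    void infix_right(char op, int prec, BinaryFn fn) { add(op, prec, true, fn); }

    // Evaluates unsigned integers joined by registered operators
    // (no whitespace, no parentheses). Returns false on a syntax error.
    bool parse(const std::string& in, double& out) const {
        std::size_t pos = 0;
        return expr(in, pos, 0, out) && pos == in.size();
    }

private:
    struct Op { int prec; bool rightAssoc; BinaryFn fn; };
    std::map<char, Op> ops_;

    void add(char op, int prec, bool rightAssoc, BinaryFn fn) {
        assert(prec >= 0 && fn != 0);
        Op o = {prec, rightAssoc, fn};
        ops_[op] = o;
    }

    static bool number(const std::string& in, std::size_t& pos, double& out) {
        const std::size_t start = pos;
        double v = 0;
        while (pos < in.size() && std::isdigit(static_cast<unsigned char>(in[pos])))
            v = v * 10 + (in[pos++] - '0');
        out = v;
        return pos > start;
    }

    // Precedence climbing: only consume operators with precedence >= minPrec.
    bool expr(const std::string& in, std::size_t& pos, int minPrec, double& out) const {
        double lhs;
        if (!number(in, pos, lhs)) return false;
        while (pos < in.size()) {
            std::map<char, Op>::const_iterator it = ops_.find(in[pos]);
            if (it == ops_.end() || it->second.prec < minPrec) break;
            const Op& op = it->second;
            ++pos;
            // Left-associative: the right operand may only contain tighter
            // operators. Right-associative: the same precedence is allowed.
            const int nextMin = op.rightAssoc ? op.prec : op.prec + 1;
            double rhs;
            if (!expr(in, pos, nextMin, rhs)) return false;
            lhs = op.fn(lhs, rhs);
        }
        out = lhs;
        return true;
    }
};

double sub(double a, double b) { return a - b; }
double mul(double a, double b) { return a * b; }
double power(double a, double b) { return std::pow(a, b); }
```

With '-' registered through infix_left, 10-3-2 groups as (10-3)-2. With '^' registered through infix_right, 2^3^2 groups as 2^(3^2). The only difference between the two registration functions is the minimum precedence passed to the recursive call for the right operand.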