How-To Guides

How to Prevent StackError?

Consider a deeply nested JSON array: [[[[…]]]]. If the nesting is too deep, Peppa PEG may end up with a StackError, since Peppa PEG is a recursion-based parser. To keep the call stack from being exhausted, one can set a limit on the recursion depth. By default, the limit is 8192. The depth tracks the frames in the stack; each expression being evaluated has one frame in the stack.
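The idea of a depth limit can be illustrated with a small standalone sketch. It is not Peppa PEG's implementation: the ParseResult codes and the parse_array/parse_nested helpers are hypothetical names made up for this example. It only shows how a recursive parser can count its frames and stop before the stack overflows.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes for this sketch; not Peppa PEG's error values. */
typedef enum { PARSE_OK, PARSE_MATCH_ERROR, PARSE_STACK_ERROR } ParseResult;

/*
 * Parse one nested array made only of '[' and ']', such as "[[[]]]".
 * `pos` is advanced past the consumed text. `depth` counts the frames
 * already in use; reaching `limit` aborts instead of recursing further.
 */
static ParseResult parse_array(const char *text, size_t *pos,
                               size_t depth, size_t limit)
{
    assert(text != NULL && pos != NULL);

    if (depth >= limit)
        return PARSE_STACK_ERROR;       /* would exceed the allowed depth */

    if (text[*pos] != '[')
        return PARSE_MATCH_ERROR;
    (*pos)++;                           /* consume '[' */

    if (text[*pos] == '[') {            /* optional nested array */
        ParseResult r = parse_array(text, pos, depth + 1, limit);
        if (r != PARSE_OK)
            return r;
    }

    if (text[*pos] != ']')
        return PARSE_MATCH_ERROR;
    (*pos)++;                           /* consume ']' */
    return PARSE_OK;
}

/* Convenience wrapper: parse the whole string with the given limit. */
static ParseResult parse_nested(const char *text, size_t limit)
{
    size_t pos = 0;
    ParseResult r = parse_array(text, &pos, 0, limit);
    if (r == PARSE_OK && text[pos] != '\0')
        return PARSE_MATCH_ERROR;       /* trailing input is not allowed */
    return r;
}
```

With this sketch, parse_nested("[[[]]]", 3) succeeds because three frames are enough, while the same input with a limit of 2 reports PARSE_STACK_ERROR instead of recursing deeper.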

If you need to adjust the depth, try P4_SetRecursionLimit().

>> P4_SetRecursionLimit(grammar, 1000);
P4_Ok

>> P4_GetRecursionLimit(grammar);
1000

See also: P4_SetRecursionLimit().

How to Parse Substring?

Peppa PEG supports parsing a subset of the source input. By default, the entire source content is parsed.

One can use P4_SetSourceSlice to set the start position and stop position in the source input.

In this example, Peppa PEG only parses the substring “YXXXY”[1:4], i.e. “XXX”.

>> P4_Source* source = P4_CreateSource("YXXXY", "entry");
>> P4_SetSourceSlice(source, 1, 4);
P4_Ok
>> P4_Parse(grammar, source);
P4_Ok

See also: P4_SetSourceSlice().
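The [1:4] notation above is a half-open range: the start position is included and the stop position is excluded. The following standalone sketch shows that convention. The SourceSlice struct and its helpers are hypothetical names for illustration only and are not part of the Peppa PEG API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical view over part of a string: bytes [start, stop). */
typedef struct {
    const char *content;
    size_t      start;  /* index of the first byte to parse */
    size_t      stop;   /* index one past the last byte to parse */
} SourceSlice;

/* Build a slice; returns 0 on success, -1 if the bounds are invalid. */
static int slice_init(SourceSlice *slice, const char *content,
                      size_t start, size_t stop)
{
    assert(slice != NULL && content != NULL);
    if (start > stop || stop > strlen(content))
        return -1;
    slice->content = content;
    slice->start = start;
    slice->stop = stop;
    return 0;
}

/* Number of bytes inside the slice. */
static size_t slice_length(const SourceSlice *slice)
{
    return slice->stop - slice->start;
}

/* Return 1 if every byte inside the slice equals `ch`, else 0. */
static int slice_is_all(const SourceSlice *slice, char ch)
{
    for (size_t i = slice->start; i < slice->stop; i++) {
        if (slice->content[i] != ch)
            return 0;
    }
    return 1;
}
```

For “YXXXY” with start 1 and stop 4, the slice has length 3 and contains only “X” characters; the surrounding “Y” characters are never looked at.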

How to Prevent Partial Parse?

Consider the case below:

P4_Grammar* grammar = P4_CreateGrammar();
assert(P4_Ok == P4_AddZeroOrMore(grammar, "entry", P4_CreateLiteral("a", true)));

P4_Source* source = P4_CreateSource("aaab", "entry");
P4_Parse(grammar, source); // P4_Ok!

Guess what, P4_Parse() returns P4_Ok! Peppa PEG eats 3 “a” characters and ignores the rest of the input.

The Start-of-Input and End-of-Input expressions match the start and the end of the input. They don’t consume text.

To match “aaab” as a whole, we need to add Start-of-Input and End-of-Input before and after the ZeroOrMore rule:

If you are using the PEG API, use &. and !.:

entry = &. a*  !. ;

If you are using the low-level API, use P4_CreateStartOfInput() and P4_CreateEndOfInput().

P4_Grammar* grammar = P4_CreateGrammar();
assert(P4_Ok == P4_AddSequenceWithMembers(grammar, "entry", 3,
    P4_CreateStartOfInput(),
    P4_CreateZeroOrMore(P4_CreateLiteral("a", true)),
    P4_CreateEndOfInput()
));
assert(P4_Ok == P4_AddZeroOrMore(grammar, "a", P4_CreateLiteral("a", true)));

P4_Source* source = P4_CreateSource("aaab", "entry");
P4_Parse(grammar, source); // P4_MatchError

See also: P4_CreateStartOfInput(), P4_CreateEndOfInput().
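The difference between a partial and a full match can be sketched without Peppa PEG at all. The helper names below are hypothetical and the matcher is hand-written for the single rule a*. It only illustrates why a repetition alone succeeds on “aaab” and why a zero-width End-of-Input check after it makes the parse fail.

```c
#include <assert.h>
#include <stddef.h>

/* Match zero or more 'a' characters starting at `pos`; always succeeds.
 * Returns the position just past the last 'a' consumed. */
static size_t match_zero_or_more_a(const char *text, size_t pos)
{
    assert(text != NULL);
    while (text[pos] == 'a')
        pos++;
    return pos;
}

/* Zero-width check: true only when `pos` is at the end of the input.
 * It inspects the position but consumes nothing. */
static int at_end_of_input(const char *text, size_t pos)
{
    return text[pos] == '\0';
}

/* Partial match: succeeds as soon as a prefix matches (always, for a*).
 * `*consumed` receives the number of characters eaten. */
static int parse_partial(const char *text, size_t *consumed)
{
    *consumed = match_zero_or_more_a(text, 0);
    return 1;
}

/* Full match: a* followed by End-of-Input. */
static int parse_full(const char *text)
{
    size_t pos = match_zero_or_more_a(text, 0);
    return at_end_of_input(text, pos);
}
```

Here parse_partial("aaab", &n) succeeds with n set to 3, while parse_full("aaab") fails because the “b” is still unread when the end-of-input check runs.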

How to Join Expressions by Separators?

Joining a rule by a separator is a common pattern, such as f(p1, p2, p3) or [1, 2, 3]. Peppa PEG provides syntactic sugar that makes such patterns easier to match.

For example, let’s match 1,2,3:

# define ROW 1
# define NUM 2
P4_Grammar* grammar = P4_CreateGrammar();

// Or: P4_AddGrammarRule(grammar, ROW, P4_CreateJoin(",", NUM))
assert(P4_Ok == P4_AddJoin(grammar, ROW, ",", NUM));

assert(P4_Ok == P4_AddRange(grammar, NUM, '0', '9', 1));

When parsing 1,2,3, it produces the following data structure:

Node(0..5, ROW):
    Node(0..1, NUM)
    Node(2..3, NUM)
    Node(4..5, NUM)

The separator does not get a node of its own, while every joined member does.
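To make the node layout concrete, here is a hand-written sketch of what a join over single digits does. It does not use Peppa PEG; the Span type, MAX_NODES and parse_row are hypothetical names for this example. It records one span per member and skips over separators without recording them.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 16

/* Hypothetical node span: covers text[start, stop). */
typedef struct { size_t start, stop; } Span;

/*
 * Match `digit (',' digit)*` at the start of `text`.
 * Each digit is recorded in `nodes`; separators are consumed but not recorded.
 * Returns the number of digit nodes, or 0 when the first digit is missing.
 * `*end` receives the position just past the last digit matched.
 */
static size_t parse_row(const char *text, Span nodes[MAX_NODES], size_t *end)
{
    size_t pos = 0, count = 0;
    assert(text != NULL && nodes != NULL && end != NULL);

    *end = 0;
    if (text[pos] < '0' || text[pos] > '9')
        return 0;
    nodes[count++] = (Span){ pos, pos + 1 };
    pos++;

    /* Only consume a separator when a member follows it. */
    while (count < MAX_NODES && text[pos] == ','
           && text[pos + 1] >= '0' && text[pos + 1] <= '9') {
        nodes[count++] = (Span){ pos + 1, pos + 2 };
        pos += 2;
    }
    *end = pos;
    return count;
}
```

For the input 1,2,3 this sketch records the spans 0..1, 2..3 and 4..5 and stops at position 5, which corresponds to the ROW node covering 0..5 in the structure above.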

How to Replace Malloc/Free/Realloc?

You may choose your own memory management solution by redefining the macros P4_MALLOC, P4_FREE and P4_REALLOC.

For example, to replace the stdlib malloc/free/realloc with the bdwgc GC_* functions, define the above macros before including “peppa.h”:

# include "gc.h"

# define P4_MALLOC GC_MALLOC
# define P4_FREE
# define P4_REALLOC GC_REALLOC

# include "peppa.h"
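The general pattern behind such overridable macros can be sketched as follows. This is an assumption about how a header typically supports this, not a copy of peppa.h: the MY_MALLOC/MY_FREE macros, the counting allocator and the mylib_* functions are all hypothetical. The application defines the hooks first, and the library falls back to the stdlib only when they are not already defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* --- application side: define the hooks before the library header --- */
static size_t live_allocations = 0;   /* blocks allocated and not yet freed */

static void *counting_malloc(size_t size)
{
    assert(size > 0);                 /* this sketch never requests empty blocks */
    void *ptr = malloc(size);
    if (ptr != NULL)
        live_allocations++;
    return ptr;
}

static void counting_free(void *ptr)
{
    if (ptr != NULL)
        live_allocations--;
    free(ptr);
}

#define MY_MALLOC counting_malloc
#define MY_FREE   counting_free

/* --- library side (hypothetical "mylib.h"): fall back to stdlib --- */
#ifndef MY_MALLOC
#define MY_MALLOC malloc
#endif
#ifndef MY_FREE
#define MY_FREE free
#endif

/* Library code only ever calls the macros, never malloc/free directly. */
static int *mylib_make_counter(void)
{
    int *counter = MY_MALLOC(sizeof *counter);
    if (counter != NULL)
        *counter = 0;
    return counter;
}

static void mylib_drop_counter(int *counter)
{
    MY_FREE(counter);
}
```

Because the library code goes through the macros, every allocation made by mylib_make_counter is routed to counting_malloc, so the application can observe or replace the allocator without touching the library source.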

How to Transfer the Ownership of Source AST?

You may transfer the ownership of the AST out of the source object using P4_AcquireSourceAst().

If you want to keep the AST but do not want to keep track of the source, you can:

P4_Node* root = P4_AcquireSourceAst(source);
P4_DeleteSource(source);

// do something.
P4_DeleteNode(root);
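What transferring ownership means can be shown with a standalone sketch. The Node and Source structs and the source_*/node_* functions below are hypothetical stand-ins, not Peppa PEG's types. The point is that the acquire step clears the owner's pointer, so deleting the owner afterwards leaves the tree intact and the caller becomes responsible for freeing it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a parse tree node and the source that owns it. */
typedef struct Node { int rule_id; } Node;
typedef struct Source { Node *root; } Source;

/* Create a source that owns a single-node tree; NULL on allocation failure. */
static Source *source_create(int rule_id)
{
    Source *source = malloc(sizeof *source);
    if (source == NULL)
        return NULL;
    source->root = malloc(sizeof *source->root);
    if (source->root == NULL) {
        free(source);
        return NULL;
    }
    source->root->rule_id = rule_id;
    return source;
}

/* Hand the tree to the caller; the source forgets it so it is freed once. */
static Node *source_acquire_ast(Source *source)
{
    assert(source != NULL);
    Node *root = source->root;
    source->root = NULL;
    return root;
}

/* Free the source and whatever tree it still owns. */
static void source_delete(Source *source)
{
    if (source == NULL)
        return;
    free(source->root);   /* free(NULL) is a no-op after an acquire */
    free(source);
}

/* Free a tree that the caller owns. */
static void node_delete(Node *node)
{
    free(node);
}
```

After source_acquire_ast, calling source_delete no longer touches the tree, which mirrors the order of calls in the example above: acquire, delete the source, use the tree, then delete the tree.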