All APIs & References

Peppa PEG - Ultra lightweight PEG Parser in ANSI C.

MIT License

Copyright (c) 2021 Ju

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Author

Ju Lin

Copyright

MIT

Date

2021

See

https://github.com/soasme/PeppaPEG

DEFINES

Defines

P4_MALLOC

The malloc function. By default, it’s malloc.

P4_FREE

The free function. By default, it’s free.

P4_REALLOC

The realloc function. By default, it’s realloc.

P4_MAJOR_VERSION

Major version number.

P4_MINOR_VERSION

Minor version number.

P4_PATCH_VERSION

Patch version number.

P4_FLAG_NONE

No flag.

P4_FLAG_SQUASHED

When the flag is set, the grammar rule will squash all of its children nodes.

node->head, node->tail will be NULL.

For example, rule b has P4_FLAG_SQUASHED. After parsing, the children nodes d and e are gone:

     a   ===>   a
   (b) c ===>  b c
   d e

The flag will impact all of the descendant rules.

P4_FLAG_LIFTED

When the flag is set, the grammar rule will replace the node with its children nodes.

For example, rule b has P4_FLAG_LIFTED. After parsing, the node (b) is gone, and its children d and e become the children of a:

     a   ===>   a
   (b) c ===> d e c
   d e
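The two tree shapes drawn for P4_FLAG_SQUASHED and P4_FLAG_LIFTED can be reproduced on a toy tree. This is only a sketch under stated assumptions: ToyNode, toy_squash and toy_lift are hypothetical names, the sibling link `next` is invented for the illustration, and none of this is Peppa PEG's own node code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy node: `head` is the first child, `next` the next sibling. */
typedef struct ToyNode {
    const char     *name;
    struct ToyNode *head;
    struct ToyNode *next;
} ToyNode;

/* Squash: the node stays in the tree but loses all of its children. */
void toy_squash(ToyNode *node) {
    assert(node != NULL);
    node->head = NULL; /* a heap-allocated tree would also free the children */
}

/* Lift: replace child `target` of `parent` with target's own children. */
void toy_lift(ToyNode *parent, ToyNode *target) {
    assert(parent != NULL && target != NULL);
    ToyNode **link = &parent->head;
    while (*link != NULL && *link != target) link = &(*link)->next;
    if (*link == NULL) return;          /* target is not a child of parent */
    if (target->head == NULL) {         /* no children: simply unlink target */
        *link = target->next;
        return;
    }
    ToyNode *last = target->head;
    while (last->next != NULL) last = last->next;
    last->next = target->next;          /* last child now points at target's sibling */
    *link = target->head;               /* parent now points at target's first child */
}
```

With a tree a(b(d, e), c), calling toy_lift on b leaves a with the children d, e, c, while toy_squash on b leaves a with b, c and b without children.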

P4_FLAG_TIGHT

When the flag is set, the grammar rule will insert no P4_FLAG_SPACED rules inside the sequences and repetitions.

This flag only works for the repetition and sequence expressions.

P4_FLAG_SCOPED

When the flag is set, the effects of SQUASHED and TIGHT are canceled.

Regardless of whether an ancestor expression has the SQUASHED or TIGHT flag, starting from this expression, Peppa PEG will resume creating nodes and applying SPACED rules for sequences and repetitions.

P4_FLAG_SPACED

When the flag is set, the expression will be inserted between every node inside the sequences and repetitions.

If there are multiple SPACED expressions, Peppa PEG will iterate through all SPACED expressions.

This flag keeps the grammar clean and tidy, since a whitespace rule does not have to be inserted everywhere.

P4_FLAG_NON_TERMINAL

When the flag is set and a sequence/repetition node tree has only one child, the child node will replace the parent.

P4_DEFAULT_RECURSION_LIMIT

The default recursion limit.

It can be adjusted via function P4_SetRecursionLimit.

P4_MAX_RULE_NAME_LEN

The maximum length of a rule name.

E_WRONG_LIT
E_BACKREF_OUT_REACHED
E_BACKREF_TO_SELF
E_TEXT_TOO_SHORT
E_OUT_RANGED
E_INVALID_UNICODE_CHAR
E_INVALID_UNICODE_PROPERTY
E_INVALID_UNICODE_CATEGORY
E_NO_SUCH_RULE
E_NO_ALTERNATIVE
E_NO_PROGRESSING
E_INSUFFICIENT_REPEAT
E_INSUFFICIENT_REPEAT2
E_EXCESSIVE_REPEAT
E_LEFT_RECUR_NO_LIFT
E_VIOLATE_NEGATIVE
E_MAX_RECURSION
E_INVALID_MATCH_CALLBACK
E_INVALID_ERROR_CALLBACK
E_NO_EXPR
E_WRONG_BACKREF
E_RECURSIVE_BACKREF

ENUMS

Enums

enum P4_ExpressionKind

The expression kind.

Values:

enumerator P4_Literal

Rule: Case-Sensitive Literal, Case-Insensitive Literal.

enumerator P4_Range

Rule: Range.

enumerator P4_UnicodeCategory

Rule: Unicode Category.

enumerator P4_Reference

Rule: Reference.

enumerator P4_Positive

Rule: Positive.

enumerator P4_Negative

Rule: Negative.

enumerator P4_Sequence

Rule: Sequence.

enumerator P4_BackReference

Rule: Case-Sensitive BackReference, Case-Insensitive BackReference.

enumerator P4_Choice

Rule: Choice.

enumerator P4_Repeat

Rule: RepeatMinMax, RepeatMin, RepeatMax, RepeatExact, OnceOrMore, ZeroOrMore, ZeroOrOnce.

enumerator P4_Cut

Rule: Cut.

enumerator P4_LeftRecursion

Rule: Left Recursion.

enum P4_Error

The error code.

Values:

enumerator P4_Ok

No error.

enumerator P4_InternalError

When there is an internal error. Please raise an issue: https://github.com/soasme/peppapeg/issues.

enumerator P4_MatchError

When no text is matched.

enumerator P4_NameError

When no name is resolved.

enumerator P4_AdvanceError

When the parse gets stuck forever or has reached the end.

enumerator P4_MemoryError

When out of memory.

enumerator P4_ValueError

When the given value is of unexpected type.

enumerator P4_IndexError

When the index is out of boundary.

enumerator P4_KeyError

When the id is out of the table.

enumerator P4_NullError

When null is encountered.

enumerator P4_StackError

When recursion limit is reached.

enumerator P4_PegError

When the given value is not a valid PEG grammar.

enumerator P4_CutError

When the failure occurs after a @cut operator.

TYPEDEFS

Typedefs

typedef uint32_t ucs4_t

A single Unicode character. Used when libunistring is not installed.

typedef uint32_t P4_ExpressionFlag

The flag of expression.

typedef char *P4_String

The C string type in locale encoding, by default utf-8.

typedef uint8_t *P4_Utf8

The utf-8 string type.

typedef void *P4_UserData

The reference of user data.

typedef void (*P4_UserDataFreeFunc)(P4_UserData)

The function to free user data.

typedef struct P4_Grammar P4_Grammar

The grammar object that holds all grammar rules.

typedef struct P4_Expression P4_Expression

The grammar rule expression.

typedef struct P4_Frame P4_Frame

The stack frame.

typedef struct P4_Node P4_Node

The node object of abstract syntax tree.

typedef struct P4_Source P4_Source

The source object that holds text to parse.

typedef struct P4_Slice P4_Slice

The slice of a string.

typedef P4_Error (*P4_MatchCallback)(P4_Grammar*, P4_Expression*, P4_Node*)

The callback for a successful match.

typedef P4_Error (*P4_ErrorCallback)(P4_Grammar*, P4_Expression*)

The callback for a failed match.

typedef void (*P4_Formatter)(FILE *stream, P4_Node *node)

A formatter function for the node.

typedef struct P4_Position P4_Position

The position.

P4_Position does not hold a pointer to the string.

Example:

 P4_Position pos = { .pos=10, .lineno=1, .offset=2 };
 printf("%zu..%zu\n", pos.lineno, pos.offset);

typedef struct P4_RuneRange P4_RuneRange

P4_RuneRange specifies a range between two runes.

Example:

 P4_RuneRange range = { .lower='a', .upper='z', .stride=1 };

typedef struct P4_Result P4_Result

The result object that holds either value or errors.

FUNCTIONS

Functions

P4_String P4_Version(void)

Provide the version string for the library.

Example:

 P4_String   version = P4_Version();
 printf("version=%s\n", version);

Returns

a string like “1.0.0”.

size_t P4_ReadRune(P4_String s, ucs4_t *c)

Read a single code point (rune) from a UTF-8 string.
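As an illustration of what reading one rune from UTF-8 involves, here is a minimal standalone decoder. It is a sketch, not the library's implementation: toy_read_rune is a hypothetical name, and the choice to return 0 on malformed input is an assumption of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one code point from a NUL-terminated UTF-8 string.
 * Returns the number of bytes consumed (1 to 4), or 0 on a bad lead byte
 * or a truncated sequence. Overlong forms and surrogates are not rejected.
 * An empty string decodes as code point 0 with a size of 1. */
size_t toy_read_rune(const char *s, uint32_t *c) {
    assert(s != NULL && c != NULL);
    const unsigned char *p = (const unsigned char *)s;
    size_t n;
    if (p[0] < 0x80)                { *c = p[0];        n = 1; }
    else if ((p[0] & 0xE0) == 0xC0) { *c = p[0] & 0x1F; n = 2; }
    else if ((p[0] & 0xF0) == 0xE0) { *c = p[0] & 0x0F; n = 3; }
    else if ((p[0] & 0xF8) == 0xF0) { *c = p[0] & 0x07; n = 4; }
    else return 0;
    for (size_t i = 1; i < n; i++) {
        /* Continuation bytes look like 10xxxxxx; this also catches an early NUL. */
        if ((p[i] & 0xC0) != 0x80) return 0;
        *c = (*c << 6) | (p[i] & 0x3F);
    }
    return n;
}
```

The returned size is what a caller would add to its read position before decoding the next rune.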

size_t P4_ReadEscapedRune(char *text, ucs4_t *rune)

Read an escaped single code point (rune) from a UTF-8 string.

For example:

 ucs4_t rune;
 P4_ReadEscapedRune("\\u000D", &rune); // 0xd

Parameters
  • text – The text.

  • rune – The rune.

Returns

The size of rune.

void *P4_ConcatRune(void *str, ucs4_t chr, size_t n)

Append a rune to str (in-place).

Parameters
  • str – The string to be appended.

  • chr – The rune.

  • n – The size of rune. Should be 1, 2, 3, or 4.

Returns

The string starting from appended rune.
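The sizes 1 to 4 correspond to the byte length of a code point in UTF-8. The following sketch encodes one code point and reports that length. toy_write_rune is a hypothetical helper written for this illustration; unlike P4_ConcatRune it computes the size itself rather than taking it as a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode code point `c` as UTF-8 into `out`, which must have room for 4 bytes.
 * Returns the number of bytes written (1 to 4), or 0 if `c` is above U+10FFFF. */
size_t toy_write_rune(char *out, uint32_t c) {
    assert(out != NULL);
    if (c < 0x80) {
        out[0] = (char)c;
        return 1;
    }
    if (c < 0x800) {
        out[0] = (char)(0xC0 | (c >> 6));
        out[1] = (char)(0x80 | (c & 0x3F));
        return 2;
    }
    if (c < 0x10000) {
        out[0] = (char)(0xE0 | (c >> 12));
        out[1] = (char)(0x80 | ((c >> 6) & 0x3F));
        out[2] = (char)(0x80 | (c & 0x3F));
        return 3;
    }
    if (c <= 0x10FFFF) {
        out[0] = (char)(0xF0 | (c >> 18));
        out[1] = (char)(0x80 | ((c >> 12) & 0x3F));
        out[2] = (char)(0x80 | ((c >> 6) & 0x3F));
        out[3] = (char)(0x80 | (c & 0x3F));
        return 4;
    }
    return 0;
}
```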

P4_Expression *P4_CreateLiteral(const P4_String literal, bool sensitive)

Create a P4_Literal expression.

Example:

 // It can match "let", "Let", "LET", etc.
 P4_Expression* expr = P4_CreateLiteral("let", false);

 // It can only match "let".
 P4_Expression* expr = P4_CreateLiteral("let", true);

The object holds a full copy of the literal.

Parameters
  • literal – The exact string to match.

  • sensitive – Whether the string is case-sensitive.

Returns

A P4_Expression.

P4_Error P4_AddLiteral(P4_Grammar *grammar, P4_String name, const P4_String literal, bool sensitive)

Add a literal expression as grammar rule.

Example:

 P4_AddLiteral(grammar, "R1", "let", true);

Parameters
  • grammar – The grammar.

  • name – The grammar rule name.

  • literal – The exact string to match.

  • sensitive – Whether the string is case-sensitive.

Returns

The error code.

P4_Expression *P4_CreateRange(ucs4_t lower, ucs4_t upper, size_t stride)

Create a P4_Range expression.

Example:

 // It can match 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.
 P4_Expression* expr = P4_CreateRange('0', '9', 1);

 // It can match any code point from U+4E00 to U+9FFF (CJK unified ideographs block).
 P4_Expression* expr = P4_CreateRange(0x4E00, 0x9FFF, 1);

Parameters
  • lower – The lower bound of the code points to match (inclusive).

  • upper – The upper bound of the code points to match (inclusive).

  • stride – The stride when iterating the characters in range.

Returns

A P4_Expression.
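One way to read the lower, upper and stride parameters is as a membership test on a single code point. The sketch below assumes that stride means "every stride-th code point counting from lower"; that interpretation and the name toy_in_range are assumptions of this example, not something the reference states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true if `c` lies in [lower, upper] and is reachable from `lower`
 * in steps of `stride`. A stride of 1 accepts every code point in range. */
bool toy_in_range(uint32_t lower, uint32_t upper, size_t stride, uint32_t c) {
    assert(stride > 0);
    if (c < lower || c > upper) return false;
    return (c - lower) % stride == 0;
}
```

Under this reading, a range of '0' to '9' with stride 2 would accept 0, 2, 4, 6, 8 and reject the odd digits.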

P4_Expression *P4_CreateRanges(size_t count, P4_RuneRange *ranges)

Create a P4_Range expression that holds multiple ranges.

Example:

 P4_RuneRange alphadigits[] = {{'a', 'z', 1}, {'0', '9', 1}};
 P4_Expression* range = P4_CreateRanges(
     sizeof(alphadigits) / sizeof(P4_RuneRange),
     alphadigits
 );

Parameters
  • count – The total number of ranges.

  • ranges – A list of P4_RuneRange.

Returns

A P4_Expression.

P4_Error P4_AddRange(P4_Grammar *grammar, P4_String name, ucs4_t lower, ucs4_t upper, size_t stride)

Add a range expression as grammar rule.

Example:

 P4_AddRange(grammar, "R1", '0', '9', 1);

 P4_AddRange(grammar, "R2", 0x4E00, 0x9FFF, 1);

Parameters
  • grammar – The grammar.

  • name – The grammar rule name.

  • lower – The lower bound of the code points to match (inclusive).

  • upper – The upper bound of the code points to match (inclusive).

  • stride – The stride when iterating the characters in range.

Returns

The error code.

P4_Error P4_AddRanges(P4_Grammar *grammar, P4_String name, size_t count, P4_RuneRange *ranges)

Add a range expression that holds multiple ranges as grammar rule.

Example:

 P4_RuneRange alphadigits[] = {{'a', 'z', 1}, {'0', '9', 1}};
 P4_Error err = P4_AddRanges(
     grammar, "R1",
     sizeof(alphadigits) / sizeof(P4_RuneRange),
     alphadigits
 );

Parameters
  • grammar – The grammar.

  • name – The grammar rule name.

  • count – The total number of ranges.

  • ranges – A list of P4_RuneRange.

Returns

The error code.

P4_Expression *P4_CreateReference(P4_String reference)

Create a P4_Reference expression.

Example:

 P4_Expression* expr = P4_CreateReference("entry");

Parameters
  • reference – The grammar rule name.

Returns

A P4_Expression.

P4_Error P4_AddReference(P4_Grammar *grammar, P4_String name, P4_String ref_name)

Add a reference expression as grammar rule.

Example:

 P4_AddReference(grammar, "R1", "R2");

Parameters
  • grammar – The grammar.

  • name – The grammar rule name.

  • ref_name – The name of referenced grammar rule.

Returns

The error code.

P4_Expression *P4_CreatePositive(P4_Expression *refexpr)

Create a P4_Positive expression.

Example:

 // If the following text starts with "let", the match is successful.
 P4_Expression* expr = P4_CreatePositive(P4_CreateLiteral("let", true));

Parameters
  • refexpr – The positive pattern to check.

Returns

A P4_Expression.

P4_Error P4_AddPositive(P4_Grammar *grammar, P4_String name, P4_Expression *refexpr)

Add a positive expression as grammar rule.

Example:

 P4_AddPositive(grammar, "R1", P4_CreateLiteral("let", true));

Parameters
  • grammar – The grammar.

  • name – The grammar rule name.

  • refexpr – The positive pattern to check.

Returns

The error code.

P4_Expression *P4_CreateNegative(P4_Expression *expr)

Create a P4_Negative expression.

Example:

 // If the following text does not start with "let", the match is successful.
 P4_Expression* expr = P4_CreateNegative(P4_CreateLiteral("let", true));

Parameters
  • expr – The negative pattern to check.

Returns

A P4_Expression.

P4_Error P4_AddNegative(P4_Grammar *grammar, P4_String name, P4_Expression *refexpr)

Add a negative expression as grammar rule.

Example:

 P4_AddNegative(grammar, "R1", P4_CreateLiteral("let", true));

Parameters
  • grammar – The grammar.

  • name – The grammar rule name.

  • refexpr – The negative pattern to check.

Returns

The error code.

P4_Expression *P4_CreateCut()

Create a P4_Cut expression.

Returns

A P4_Expression.

P4_Expression *P4_CreateLeftRecursion(P4_Expression *lhs, P4_Expression *rhs)

Create a P4_LeftRecursion expression.

Parameters
  • lhs – Left-hand side of left recursion.

  • rhs – Right-hand side of left recursion.

Returns

A P4_Expression.

P4_Error P4_AddLeftRecursion(P4_Grammar *grammar, P4_String name, P4_Expression *lhs, P4_Expression *rhs)

Add a P4_LeftRecursion expression as grammar rule.

Parameters
  • grammar – The grammar.

  • name – The grammar rule name.

  • lhs – Left-hand side of left recursion.

  • rhs – Right-hand side of left recursion.

Returns

The error code.

P4_Expression *P4_CreateSequence(size_t count)

Create a P4_Sequence expression.

Example:

 P4_Expression* expr = P4_CreateSequence(3);

Note that such an expression is useless as its members are all empty. Please set members using P4_SetMember.

This function can be useful if you need to add members dynamically.

Parameters
  • count – The number of sequence members.

Returns

A P4_Expression.

P4_Expression *P4_CreateSequenceWithMembers(size_t count, ...)

Create a P4_Sequence expression.

Example:

 // It can match { BODY }.
 P4_Expression* expr = P4_CreateSequenceWithMembers(3,
     P4_CreateLiteral("{", true),
     P4_CreateReference("BODY"),
     P4_CreateLiteral("}", true)
 );

Parameters
  • count – The number of sequence members.

  • ... – The vararg of every sequence member.

Returns

A P4_Expression.

P4_Error P4_AddSequence(P4_Grammar *grammar, P4_String name, size_t count)

Add a sequence expression as grammar rule.

Example:

 P4_AddSequence(grammar, "R1", 3);

Note that such an expression is useless as its members are all empty. Please set members using P4_SetMember.

This function can be useful if you need to add members dynamically.

Parameters
  • grammar – The grammar.

  • name – The grammar rule name.

  • count – The number of sequence members.

Returns

The error code.

P4_Error P4_AddSequenceWithMembers(P4_Grammar *grammar, P4_String name, size_t count, ...)

Add a sequence expression as grammar rule.

Example:

 P4_AddSequenceWithMembers(grammar, "R1", 3,
     P4_CreateLiteral("{", true),
     P4_CreateReference("BODY"),
     P4_CreateLiteral("}", true)
 );

Parameters
  • grammar – The grammar.

  • name – The grammar rule name.

  • count – The number of sequence members.

  • ... – The members.

Returns

The error code.

P4_Expression *P4_CreateChoice(size_t count)

Create a P4_Choice expression.

Example:

 P4_Expression* expr = P4_CreateChoice(3);

Note that such an expression is useless as its members are all empty. Please set members using P4_SetMember.

This function can be useful if you need to add members dynamically.

Parameters
  • count – The number of choice members.

Returns

A P4_Expression.

P4_Expression *P4_CreateChoiceWithMembers(size_t count, ...)

Create a P4_Choice expression.

Example:

 // It can match a space, a tab, or a newline.
 P4_Expression* expr = P4_CreateChoiceWithMembers(3,
     P4_CreateLiteral(" ", true),
     P4_CreateLiteral("\t", true),
     P4_CreateLiteral("\n", true)
 );

Parameters
  • count – The number of choice members.

  • ... – The vararg of every choice member.

Returns

A P4_Expression.

P4_Error P4_AddChoice(P4_Grammar *grammar, P4_String name, size_t count)

Add a choice expression as grammar rule.

Example:

 P4_AddChoice(grammar, "R1", 3);

Note that such an expression is useless as its members are all empty. Please set members using P4_SetMember.

This function can be useful if you need to add members dynamically.

Parameters
  • grammar – The grammar.

  • name – The grammar rule name.

  • count – The number of choice members.

Returns

The error code.

P4_Error P4_AddChoiceWithMembers(P4_Grammar *grammar, P4_String name, size_t count, ...)

Add a choice expression as grammar rule.

Example:

 P4_AddChoiceWithMembers(grammar, "R1", 3,
     P4_CreateLiteral(" ", true),
     P4_CreateLiteral("\t", true),
     P4_CreateLiteral("\n", true)
 );

Parameters
  • grammar – The grammar.

  • name – The grammar rule name.

  • count – The number of choice members.

  • ... – The members.

Returns

The error code.

P4_Expression *P4_CreateRepeatMinMax(P4_Expression *repeat_expr, size_t min, size_t max)

Create a P4_Repeat expression that matches at least min times and at most max times.

Example:

 // It can match string "a", "aa", or "aaa".
 P4_Expression* expr = P4_CreateRepeatMinMax(
     P4_CreateLiteral("a", true),
     1, 3
 );

Parameters
  • repeat_expr – The repeated expression.

  • min – The minimum repeat times.

  • max – The maximum repeat times.

Returns

A P4_Expression.
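The min/max semantics can be sketched as a greedy loop over a literal. This is an illustration only: toy_repeat is a hypothetical function that repeats a plain string rather than a P4_Expression, and the greedy strategy is the usual PEG behavior rather than a documented detail of this library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Greedily match `lit` at the start of `text`, at least `min` and at most
 * `max` times. On success, store the consumed byte count in *consumed. */
bool toy_repeat(const char *text, const char *lit,
                size_t min, size_t max, size_t *consumed) {
    assert(text != NULL && lit != NULL && consumed != NULL);
    size_t len = strlen(lit), count = 0, pos = 0;
    if (len == 0) return false;  /* refuse a pattern that cannot make progress */
    while (count < max && strncmp(text + pos, lit, len) == 0) {
        pos += len;
        count++;
    }
    if (count < min) return false;  /* too few repetitions */
    *consumed = pos;
    return true;
}
```

With min 1 and max 3, the input "aaaa" succeeds after consuming three bytes and leaves the fourth "a" unread; the other repeat constructors differ only in the bounds they pass.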

P4_Error P4_AddRepeatMinMax(P4_Grammar *grammar, P4_String name, P4_Expression *repeat_expr, size_t min, size_t max)

Add a P4_Repeat expression that matches at least min times and at most max times as grammar rule.

Example:

 // It can match string "a", "aa", or "aaa".
 P4_AddRepeatMinMax(grammar, "R1",
     P4_CreateLiteral("a", true),
     1, 3
 );

Parameters
  • grammar – The grammar.

  • name – The grammar rule name.

  • repeat_expr – The repeated expression.

  • min – The minimum repeat times.

  • max – The maximum repeat times.

Returns

The error code.

P4_Expression *P4_CreateRepeatMin(P4_Expression *repeat_expr, size_t min)

Create a P4_Repeat expression that matches at least min times and at most SIZE_MAX times.

Example:

 // It can match string "a", "aa", "aaa", ....
 P4_Expression* expr = P4_CreateRepeatMin(P4_CreateLiteral("a", true), 1);

It’s equivalent to P4_CreateRepeatMinMax(expr, min, SIZE_MAX);

Parameters
  • repeat_expr – The repeated expression.

  • min – The minimum repeat times.

Returns

A P4_Expression.

P4_Error P4_AddRepeatMin(P4_Grammar *grammar, P4_String name, P4_Expression *repeat_expr, size_t min)

Add a RepeatMin expression as grammar rule.

Example:

 // It can match string "a", "aa", "aaa", ....
 P4_AddRepeatMin(grammar, "R1", P4_CreateLiteral("a", true), 1);

Parameters
  • grammar – The grammar.

  • name – The grammar rule name.

  • repeat_expr – The repeated expression.

  • min – The minimum repeat times.

Returns

The error code.

P4_Expression *P4_CreateRepeatMax(P4_Expression *repeat_expr, size_t max)

Create a P4_Repeat expression that matches at most max times.

Example:

 // It can match string "", "a", "aa", "aaa".
 P4_Expression* expr = P4_CreateRepeatMax(P4_CreateLiteral("a", true), 3);

It’s equivalent to P4_CreateRepeatMinMax(expr, 0, max);

Parameters
  • repeat_expr – The repeated expression.

  • max – The maximum repeat times.

Returns

A P4_Expression.

P4_Error P4_AddRepeatMax(P4_Grammar *grammar, P4_String name, P4_Expression *repeat_expr, size_t max)

Add a RepeatMax expression as grammar rule.

Example:

 // It can match string "", "a", "aa", "aaa".
 P4_AddRepeatMax(grammar, "name", P4_CreateLiteral("a", true), 3);

Parameters
  • grammar – The grammar.

  • name – The grammar rule name.

  • repeat_expr – The repeated expression.

  • max – The maximum repeat times.

Returns

The error code.

P4_Expression *P4_CreateRepeatExact(P4_Expression *repeat_expr, size_t times)

Create a P4_Repeat expression that matches exactly the given number of times.

Example:

 // It can match string "aaa".
 P4_Expression* expr = P4_CreateRepeatExact(P4_CreateLiteral("a", true), 3);

It’s equivalent to P4_CreateRepeatMinMax(expr, times, times);

Parameters
  • repeat_expr – The repeated expression.

  • times – The repeat times.

Returns

A P4_Expression.

P4_Error P4_AddRepeatExact(P4_Grammar *grammar, P4_String name, P4_Expression *repeat_expr, size_t times)

Add a RepeatExact expression as grammar rule.

Example:

 // It can match string "aaa".
 P4_AddRepeatExact(grammar, "R1", P4_CreateLiteral("a", true), 3);

Parameters
  • grammar – The grammar.

  • name – The grammar rule name.

  • repeat_expr – The repeated expression.

  • times – The repeat times.

Returns

The error code.

P4_Expression *P4_CreateZeroOrOnce(P4_Expression *expr)

Create a P4_Repeat expression zero or once.

Example:

 // It can match string "" or "a".
 P4_Expression* expr = P4_CreateZeroOrOnce(P4_CreateLiteral("a", true));

It’s equivalent to P4_CreateRepeatMinMax(expr, 0, 1);

Parameters
  • expr – The repeated expression.

Returns

A P4_Expression.

P4_Error P4_AddZeroOrOnce(P4_Grammar *grammar, P4_String name, P4_Expression *repeat_expr)

Add a ZeroOrOnce expression as grammar rule.

Example:

 // It can match string "" or "a".
 P4_AddZeroOrOnce(grammar, "R1", P4_CreateLiteral("a", true));

It’s equivalent to P4_CreateRepeatMinMax(expr, 0, 1);

Parameters
  • grammar – The grammar.

  • name – The grammar rule name.

  • repeat_expr – The repeated expression.

Returns

The error code.

P4_Expression *P4_CreateZeroOrMore(P4_Expression *expr)

Create a P4_Repeat expression zero or more times.

Example:

 // It can match string "", "a", "aa", "aaa", ....
 P4_Expression* expr = P4_CreateZeroOrMore(P4_CreateLiteral("a", true));

It’s equivalent to P4_CreateRepeatMinMax(expr, 0, SIZE_MAX);

Parameters
  • expr – The repeated expression.

Returns

A P4_Expression.

P4_Error P4_AddZeroOrMore(P4_Grammar *grammar, P4_String name, P4_Expression *repeat_expr)

Add a ZeroOrMore expression as grammar rule.

Example:

 // It can match string "", "a", "aa", "aaa", ....
 P4_AddZeroOrMore(grammar, "R1", P4_CreateLiteral("a", true));

Parameters
  • grammar – The grammar.

  • name – The grammar rule name.

  • repeat_expr – The repeated expression.

Returns

The error code.

P4_Expression *P4_CreateOnceOrMore(P4_Expression *expr)

Create a P4_Repeat expression once or more times.

Example:

 // It can match string "a", "aa", "aaa", ....
 P4_Expression* expr = P4_CreateOnceOrMore(P4_CreateLiteral("a", true));

It’s equivalent to P4_CreateRepeatMinMax(expr, 1, SIZE_MAX);

Parameters
  • expr – The repeated expression.

Returns

A P4_Expression.

P4_Error P4_AddOnceOrMore(P4_Grammar *grammar, P4_String name, P4_Expression *repeat_expr)

Add an OnceOrMore expression as grammar rule.

Example:

 // It can match string "a", "aa", "aaa", ....
 P4_AddOnceOrMore(grammar, "R1", P4_CreateLiteral("a", true));

Parameters
  • grammar – The grammar.

  • name – The grammar rule name.

  • repeat_expr – The repeated expression.

Returns

The error code.

P4_Expression *P4_CreateBackReference(size_t backref_index, bool sensitive)

Create a P4_BackReference expression.

Example:

 // It can match string "EOF MULTILINE EOF" or "eof MULTILINE eof", but not "EOF MULTILINE eof".
 P4_Expression* expr = P4_CreateSequenceWithMembers(3,
     P4_CreateLiteral("EOF", false),
     P4_CreateReference("MULTILINE"),
     P4_CreateBackReference(0, true)
 );

Parameters
  • backref_index – The index of backref member in the sequence.

  • sensitive – Whether the backref matching is case sensitive.

Returns

A P4_Expression.
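The difference between a sensitive and an insensitive back reference comes down to how the previously matched text is compared with the input. A minimal sketch, assuming ASCII-only case folding and using the hypothetical name toy_match_backref:

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Check whether `text` starts with the `len` bytes previously captured
 * at `captured`. Case folding uses tolower(), so it is ASCII-only here. */
bool toy_match_backref(const char *captured, size_t len,
                       const char *text, bool sensitive) {
    assert(captured != NULL && text != NULL);
    for (size_t i = 0; i < len; i++) {
        unsigned char a = (unsigned char)captured[i];
        unsigned char b = (unsigned char)text[i];
        if (b == '\0') return false;  /* input ended before the capture did */
        if (sensitive ? (a != b) : (tolower(a) != tolower(b))) return false;
    }
    return true;
}
```

If the first member matched "eof", a sensitive back reference accepts only "eof" later on, while an insensitive one would also accept "EOF".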

P4_Error P4_SetMember(P4_Expression *expr, size_t index, P4_Expression *member)

Set the member of Sequence/Choice at a given index.

Example:

 P4_Error       err;
 P4_Expression* expr = P4_CreateSequence(2);
 if ((err = P4_SetMember(expr, 0, P4_CreateLiteral("a", true))) != P4_Ok) {
     // handle error.
 }
 if ((err = P4_SetMember(expr, 1, P4_CreateLiteral("b", true))) != P4_Ok) {
     // handle error.
 }

Parameters
  • expr – The sequence/choice expression.

  • index – The index of member.

  • member – The member expression.

Returns

The error code.

P4_Error P4_SetReferenceMember(P4_Expression *expr, size_t index, P4_String name)

Set the referenced member of Sequence/Choice at a given index.

It’s equivalent to:

 P4_SetMember(expr, index, P4_CreateReference(name));

Parameters
  • expr – The sequence/choice expression.

  • index – The index of member.

  • name – The reference name of grammar rule for member.

Returns

The error code.

size_t P4_GetMembersCount(P4_Expression *expr)

Get the total number of members for sequence/choice.

Parameters
  • expr – The sequence/choice expression.

Returns

The number. If something goes wrong, it returns 0.

Example:

 P4_Expression* expr = P4_CreateSequence(3);
 size_t  count = P4_GetMembersCount(expr); // 3

P4_Expression *P4_GetMember(P4_Expression *expr, size_t index)

Get the member of sequence/choice at a given index.

Example:

 P4_Expression* member = P4_GetMember(expr, 0);

Parameters
  • expr – The sequence/choice expression.

  • index – The index of a member.

Returns

The P4_Expression object of a member.

P4_Expression *P4_CreateStartOfInput()

Create a Start-Of-Input expression.

Example:

 P4_Expression* expr = P4_CreateStartOfInput();

Start-Of-Input can be used to insert whitespace before the actual content.

Example:

 P4_AddSequenceWithMembers(grammar, "R1", 2,
     P4_CreateStartOfInput(),
     P4_CreateLiteral("HELLO", true)
 );

Start-Of-Input is equivalent to P4_CreatePositive(P4_CreateRange(1, 0x10ffff, 1)).

Returns

A P4_Expression.

P4_Expression *P4_CreateEndOfInput()

Create an End-Of-Input expression.

Example:

 P4_Expression* expr = P4_CreateEndOfInput();

End-Of-Input can be used to insert whitespace after the actual content.

Example:

 P4_AddSequenceWithMembers(grammar, "R1", 2,
     P4_CreateLiteral("HELLO", true),
     P4_CreateEndOfInput()
 );

End-Of-Input is equivalent to P4_CreateNegative(P4_CreateRange(1, 0x10ffff, 1)).

Returns

A P4_Expression.

P4_Error P4_AddJoin(P4_Grammar *grammar, P4_String name, const P4_String joiner, P4_String reference)

Add a Join expression as grammar rule.

Example:

 P4_Error err = P4_AddJoin(grammar, "R1", ",", "element");

Parameters
  • grammar – The grammar.

  • name – The grammar rule name.

  • joiner – A joiner literal.

  • reference – The repeated pattern rule name.

Returns

The error code.

P4_Expression *P4_CreateJoin(const P4_String joiner, P4_String rule_name)

Create a Join expression.

Example:

 P4_Expression* row = P4_CreateJoin(",", "element");

Join is syntactic sugar for the following structure:

 Sequence(pattern, ZeroOrMore(Sequence(Literal(joiner), pattern)))

Parameters
  • joiner – A joiner literal.

  • rule_name – The repeated pattern rule name.

Returns

A P4_Expression.
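The Sequence/ZeroOrMore structure behind Join can be traced with plain strings. In this sketch the repeated pattern is a literal instead of a rule name, and toy_match_join is a hypothetical function; it only illustrates the matching order, including the backtrack when a joiner is not followed by another pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Match `pattern (joiner pattern)*` at the start of `text`.
 * Returns the number of bytes consumed, or 0 if the first pattern fails. */
size_t toy_match_join(const char *text, const char *joiner, const char *pattern) {
    assert(text != NULL && joiner != NULL && pattern != NULL);
    size_t jl = strlen(joiner), pl = strlen(pattern), pos;
    if (pl == 0 || strncmp(text, pattern, pl) != 0) return 0;
    pos = pl;
    /* Each round must match joiner AND pattern. If the pattern is missing,
     * the inner sequence fails and the match ends before the joiner. */
    while (strncmp(text + pos, joiner, jl) == 0 &&
           strncmp(text + pos + jl, pattern, pl) == 0) {
        pos += jl + pl;
    }
    return pos;
}
```

For the input "x,x," with joiner "," and pattern "x", three bytes are consumed and the trailing comma is left unread.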

void P4_DeleteExpression(P4_Expression *expr)

Free an expression.

Example:

 P4_Expression* expr = P4_CreateLiteral("a", true);
 P4_DeleteExpression(expr);

The members of a sequence/choice expression will be deleted. The ref_expr of a positive/negative expression will be deleted.

P4_Reference only holds a reference, so the referenced expression won’t be freed.

Parameters
  • expr – The expression.

bool P4_IsRule(P4_Expression *expr)

Check if an expression is a grammar rule.

Parameters
  • expr – The expression.

Returns

True if the expression is a grammar rule.

Example:

 P4_AddLiteral(grammar, "R1", "a", true);
 P4_IsRule(P4_GetGrammarRule(grammar, "R1")); // true

bool P4_IsSquashed(P4_Expression *expr)

Check if an expression has P4_FLAG_SQUASHED flag.

bool P4_IsLifted(P4_Expression *expr)

Check if an expression has P4_FLAG_LIFTED flag.

bool P4_IsTight(P4_Expression *expr)

Check if an expression has P4_FLAG_TIGHT flag.

bool P4_IsScoped(P4_Expression *expr)

Check if an expression has P4_FLAG_SCOPED flag.

bool P4_IsSpaced(P4_Expression *expr)

Check if an expression has P4_FLAG_SPACED flag.

void P4_SetExpressionFlag(P4_Expression *expr, P4_ExpressionFlag flag)

Set the flag for an expression.

Example:

 P4_SetExpressionFlag(expr, P4_FLAG_SQUASHED);
 P4_SetExpressionFlag(expr, P4_FLAG_TIGHT | P4_FLAG_SQUASHED);

Parameters
  • expr – The expression.

  • flag – The flag to set.

P4_Grammar *P4_CreateGrammar(void)

Create a P4_Grammar object.

Example:

 P4_Grammar*     grammar = P4_CreateGrammar();

Returns

A P4_Grammar object.

void P4_DeleteGrammar(P4_Grammar *grammar)

Delete the grammar object.

It will also free all of the expression rules.

Parameters
  • grammar – The grammar.

P4_Error P4_AddGrammarRule(P4_Grammar *grammar, P4_String name, P4_Expression *expr)

Add a grammar rule.

Parameters
  • grammar – The grammar.

  • name – The grammar rule name.

  • expr – The grammar rule expression.

Returns

The error code.

void P4_DeleteGrammarRule(P4_Grammar *grammar, const P4_String name)

Delete a grammar rule.

WARNING: NOT IMPLEMENTED.

Parameters
  • grammar – The grammar.

  • name – The grammar rule name.

P4_Expression *P4_GetGrammarRule(P4_Grammar *grammar, P4_String name)

Get a grammar rule by its name.

Parameters
  • grammar – The grammar.

  • name – The grammar rule name.

Returns

The grammar rule expression. Returns NULL if not found.

 P4_AddLiteral(grammar, "a", "a", true);
 P4_Expression* expr = P4_GetGrammarRule(grammar, "a"); // The literal expression.

P4_Error P4_SetGrammarRuleFlag(P4_Grammar *grammar, P4_String name, P4_ExpressionFlag flag)

Set the flag of a grammar rule.

Example:

 P4_Error err = P4_SetGrammarRuleFlag(grammar, "entry", P4_FLAG_SQUASHED | P4_FLAG_LIFTED | P4_FLAG_TIGHT);
 if (err != P4_Ok) {
     printf("err=%u\n", err);
     exit(1);
 }

Parameters
  • grammar – The grammar.

  • name – The grammar rule name.

  • flag – The bits of P4_ExpressionFlag.

Returns

The error code. If successful, return P4_Ok.

P4_Error P4_SetRecursionLimit(P4_Grammar *grammar, size_t limit)

Set the maximum allowed recursion calls.

Example:

 // on a machine with small memory.
 P4_SetRecursionLimit(grammar, 256);

 // on a machine with large memory.
 P4_SetRecursionLimit(grammar, 1024*20);

Parameters
  • grammar – The grammar.

  • limit – The number of maximum recursion calls.

Returns

An error. If successful, return P4_Ok.
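A recursion limit protects a recursive-descent parser from exhausting the call stack on deeply nested input. The following sketch shows the idea on a toy grammar of nested parentheses; ToyParseError, toy_parse_nested and the way depth is counted are assumptions of this example, not the library's internals.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    TOY_PARSE_OK = 0,
    TOY_PARSE_MATCH_ERROR,  /* input does not match the grammar */
    TOY_PARSE_STACK_ERROR   /* recursion limit exceeded */
} ToyParseError;

/* Parse the rule: nested := '(' nested ')' | "" starting at s[*pos].
 * `depth` is the current recursion depth; pass 0 at the top level. */
ToyParseError toy_parse_nested(const char *s, size_t *pos,
                               size_t depth, size_t limit) {
    assert(s != NULL && pos != NULL);
    if (depth > limit) return TOY_PARSE_STACK_ERROR;
    if (s[*pos] != '(') return TOY_PARSE_OK;  /* empty alternative */
    (*pos)++;
    ToyParseError err = toy_parse_nested(s, pos, depth + 1, limit);
    if (err != TOY_PARSE_OK) return err;
    if (s[*pos] != ')') return TOY_PARSE_MATCH_ERROR;
    (*pos)++;
    return TOY_PARSE_OK;
}
```

The same input can succeed with a generous limit and fail with a stack error under a tight one, which is why the limit is tuned to the memory available.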

P4_Error P4_SetUserDataFreeFunc(P4_Grammar *grammar, P4_UserDataFreeFunc free_func)

Set free function for the user data.

Parameters
  • grammar – The grammar.

  • free_func – The free function.

Returns

The error code.

size_t P4_GetRecursionLimit(P4_Grammar *grammar)

Get the maximum allowed recursion calls.

Example:

 size_t  limit = P4_GetRecursionLimit(grammar);
 printf("limit=%zu\n", limit);

Parameters
  • grammar – The grammar.

Returns

The maximum allowed recursion calls. If something goes wrong, return 0.

P4_Source *P4_CreateSource(P4_String content, P4_String entry_name)

Create a P4_Source* object.

Example:

 char*       content = ... // read from some files.

 P4_Source*  source  = P4_CreateSource(content, "entry");

 // do something...

 P4_DeleteSource(source);

Parameters
  • content – The content of input.

  • entry_name – The entry grammar rule name.

Returns

A source object.

void P4_DeleteSource(P4_Source *source)

Free the allocated memory of a source. If the source has been parsed and has an AST, the entire AST will also be freed.

Example:

 P4_DeleteSource(source);

Parameters
  • source – The source.

void P4_ResetSource(P4_Source *source)

Reset the source so it can be called with P4_Parse again. All of its internal states are cleared.

Example:

 // parse full text.
 P4_Parse(grammar, source);

 P4_ResetSource(source);
 P4_SetSourceSlice(source, 1, 10);

 // parse text[1:10].
 P4_Parse(grammar, source);

Parameters
  • source – The source.

P4_Error P4_SetSourceSlice(P4_Source *source, size_t start, size_t stop)

Set the slice of the source content to parse. If not set, the whole content is parsed, with strlen(source->content) as its size.

Example:

 P4_String  input = "(a)";
 P4_Source* source = P4_CreateSource(input, "a");
 if (P4_Ok != P4_SetSourceSlice(source, 1, 2)) // only parse "a"
     printf("set buf size error\n");

Parameters
  • source – The source.

  • start – The start position, inclusive.

  • stop – The stop position, exclusive.

Returns

The error code.
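
The half-open convention (start inclusive, stop exclusive) can be illustrated without the library. The following is a minimal sketch in plain C that does not call Peppa PEG; demo_slice_copy is a hypothetical helper introduced only for this illustration, and its bounds checks are an assumption of the sketch, not a description of how P4_SetSourceSlice validates its arguments.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of Peppa PEG: copy the bytes of
 * input[start, stop) into a newly allocated NUL-terminated string.
 * Returns NULL if the range is invalid or the allocation fails.
 * The caller frees the result. */
char *demo_slice_copy(const char *input, size_t start, size_t stop) {
    assert(input != NULL);
    size_t len = strlen(input);
    if (start > stop || stop > len) {
        return NULL;            /* range is reversed or out of bounds */
    }
    size_t n = stop - start;    /* stop is exclusive, so no +1 here */
    char *out = malloc(n + 1);
    if (out == NULL) {
        return NULL;
    }
    memcpy(out, input + start, n);
    out[n] = '\0';
    return out;
}
```

With the input "(a)", the range start=1, stop=2 covers exactly one byte, the letter a, which matches the "only parse a" comment in the example above.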

P4_Node *P4_GetSourceAst(P4_Source *source)

Get the root node of abstract syntax tree of the source.

Parameters
  • source – The source.

Returns

The root node. If the parse failed or the root node is lifted, NULL is returned.

P4_Node *P4_AcquireSourceAst(P4_Source *source)

Transfer the ownership of source ast to the caller.

You’re on your own now to manage the de-allocation of the source ast.

The source is implicitly reset after the source ast is acquired by the caller.

Example:

 P4_Node* root = P4_AcquireSourceAst(source);
 // ...
 P4_DeleteNode(grammar, root);

Parameters
  • source – The source.

Returns

The root node. If the parse failed or the root node is lifted, NULL is returned.
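
The ownership transfer described above follows a common C pattern: the owner hands out its pointer and forgets it, so that deleting the owner later does not free the tree a second time. The sketch below shows that pattern with stand-in types. DemoSource, DemoTree, demo_acquire_root and demo_delete_source are hypothetical names for illustration and are not the Peppa PEG implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in types for illustration only. */
typedef struct DemoTree   { int value; }      DemoTree;
typedef struct DemoSource { DemoTree *root; } DemoSource;

/* Hand the tree to the caller and clear the owner's pointer.
 * After this call the caller is responsible for freeing the tree. */
DemoTree *demo_acquire_root(DemoSource *source) {
    assert(source != NULL);
    DemoTree *root = source->root;
    source->root = NULL;
    return root;
}

/* Free the source and whatever tree it still owns. */
void demo_delete_source(DemoSource *source) {
    if (source == NULL) {
        return;
    }
    free(source->root);   /* free(NULL) is a no-op after an acquire */
    free(source);
}
```

In this sketch a second acquire on the same source yields NULL, because the first call already cleared the pointer.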

size_t P4_GetSourcePosition(P4_Source *source)

Get the last position in the input after a parse.

Parameters
  • source – The source.

Returns

The position in the input.

void P4_JsonifySourceAst(FILE *stream, P4_Node *node, P4_Formatter formatter)

Print the node tree in JSON format to the given stream.

Example:

 P4_Node* root = P4_GetSourceAst(source);
 P4_JsonifySourceAst(stdout, root, NULL);

Parameters
  • stream – The output stream.

  • node – The root node of source ast.

  • formatter – A callback function to format node.

P4_Error P4_InspectSourceAst(P4_Node *node, void *userdata, P4_Error (*inspector)(P4_Node*, void*))

Inspect the node tree.

Example:

 P4_Error MyInspector(P4_Node* node, void* userdata) {
     printf("%zu\t%s\t%p\n", node->slice.start.pos, P4_GetRuleName(node), userdata);
     return P4_Ok;
 }

 P4_Error err = P4_InspectSourceAst(root, NULL, MyInspector);

Parameters
  • node – The root node of source ast.

  • userdata – Any additional information you want to pass in.

  • inspector – The inspecting function. It should return P4_Ok for a successful inspection.
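
The callback-with-userdata pattern used by the inspector can be sketched with stand-in types. The code below does not call Peppa PEG: DemoNode, DemoError, demo_inspect and demo_count_nodes are hypothetical names, and both the pre-order traversal and the choice to stop at the first non-OK result are assumptions of this sketch, not documented behavior of P4_InspectSourceAst.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types for illustration only. */
typedef enum { DEMO_OK = 0, DEMO_ERROR = 1 } DemoError;

typedef struct DemoNode {
    const char      *rule;  /* name of the rule that produced the node */
    struct DemoNode *head;  /* first child, or NULL */
    struct DemoNode *next;  /* next sibling, or NULL */
} DemoNode;

typedef DemoError (*DemoInspector)(DemoNode *node, void *userdata);

/* Visit the node first, then each child subtree in order.
 * Assumption of this sketch: stop at the first non-OK result. */
DemoError demo_inspect(DemoNode *node, void *userdata, DemoInspector inspector) {
    assert(inspector != NULL);
    if (node == NULL) {
        return DEMO_OK;
    }
    DemoError err = inspector(node, userdata);
    if (err != DEMO_OK) {
        return err;
    }
    for (DemoNode *child = node->head; child != NULL; child = child->next) {
        err = demo_inspect(child, userdata, inspector);
        if (err != DEMO_OK) {
            return err;
        }
    }
    return DEMO_OK;
}

/* Example inspector: userdata points to a size_t counter. */
DemoError demo_count_nodes(DemoNode *node, void *userdata) {
    (void)node;
    size_t *count = userdata;
    *count += 1;
    return DEMO_OK;
}
```

Passing a pointer to a local counter as userdata lets the inspector accumulate a result across the whole tree without global state.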

P4_Error P4_Parse(P4_Grammar *grammar, P4_Source *source)

Parse the source given a grammar.

Example:

 if ((err = P4_Parse(grammar, source)) != P4_Ok) {
     // error handling ...
 }

Parameters
  • grammar – The grammar.

  • source – The source.

Returns

The error code. If successful, return P4_Ok.

bool P4_HasError(P4_Source *source)

Determine whether the parse has failed.

Example:

 if (P4_HasError(source)) {
     printf("err=%u\n", P4_GetError(source));
     printf("msg=%s\n", P4_GetErrorMessage(source));
 }

Parameters
  • source – The source.

Returns

Whether the parse has failed.

P4_Error P4_GetError(P4_Source *source)

Get the error code if parsing the source failed.

Example:

 if (P4_Ok != P4_Parse(grammar, source)) {
     printf("err=%u\n", P4_GetError(source));
 }

Parameters
  • source – The source.

Returns

The error code.

P4_String P4_GetErrorString(P4_Error err)

Get the error string given an error code.

Example:

 P4_String errstr = P4_GetErrorString(P4_MatchError);
 printf("%s\n", errstr); // "MatchError"

Parameters
  • err – The error code.

Returns

The error string.

P4_String P4_GetErrorMessage(P4_Source *source)

Get the error message if parsing the source failed.

Example:

 if (P4_Ok != P4_Parse(grammar, source)) {
     printf("msg=%s\n", P4_GetErrorMessage(source));
 }

The returned value is a reference to the internal string. Don’t free it after use.

Parameters
  • source – The source.

Returns

The error message.

P4_Node *P4_CreateNode(P4_String text, P4_Position *start, P4_Position *stop, P4_String rule)

Create a node.

Example:

 P4_String       text    = "Hello world";
 P4_Position     start   = { .pos = 0 };
 P4_Position     stop    = { .pos = 11 };
 P4_String       rule    = "entry";

 P4_Node* node = P4_CreateNode(text, &start, &stop, rule);

 // do something.

 P4_DeleteNode(grammar, node);

Parameters
  • text – The source text.

  • start – The starting position of the text.

  • stop – The stopping position of the text.

  • rule – The name of rule expression that matches to the slice of the text.

Returns

The node.

void P4_DeleteNode(P4_Grammar *grammar, P4_Node *node)

Delete the node. This frees the memory occupied by the node. The text of the node is not freed, since the node owns only a slice of the string, not the string itself.

Example:

 P4_DeleteNode(grammar, node);

Parameters
  • grammar – The grammar.

  • node – The node.

void P4_DeleteNodeChildren(P4_Grammar *grammar, P4_Node *node)

Delete the node children. This will free the occupied memory for all node children.

Example:

 P4_DeleteNodeChildren(grammar, node);

Parameters
  • grammar – The grammar.

  • node – The node.

P4_Slice *P4_GetNodeSlice(P4_Node *node)

Get the slice that the node covers. The slice is owned by the node so don’t free it.

Example:

 P4_Slice* slice = P4_GetNodeSlice(node);
 printf("node slice=[%zu..%zu]\n", slice->start.pos, slice->stop.pos);

Parameters
  • node – The node.

Returns

The slice.

size_t P4_GetNodeChildrenCount(P4_Node *node)

Get the total number of node children.

Example:

 P4_Node* root = P4_GetSourceAst(source);
 size_t count = P4_GetNodeChildrenCount(root);
 printf("There are %zu children for root node\n", count);

Parameters
  • node – The node.

Returns

The total number of node children.

P4_String P4_CopyNodeString(P4_Node *node)

Copy the string that the node covers. The caller is responsible for freeing the string.

Example:

 P4_String str = P4_CopyNodeString(node);
 printf("node str=%s\n", str);
 free(str);

Parameters
  • node – The node.

Returns

The string.

P4_Error P4_SetGrammarCallback(P4_Grammar *grammar, P4_MatchCallback matchcb, P4_ErrorCallback errcb)

Set the match and error callback functions for the grammar.

Parameters
  • grammar – The grammar.

  • matchcb – The callback on a successful match.

  • errcb – The callback on a failed match.

Returns

The error code.

P4_Error P4_ReplaceGrammarRule(P4_Grammar *grammar, P4_String name, P4_Expression *expr)

Replace an existing grammar rule.

The original grammar rule will be deleted.

Parameters
  • grammar – The grammar.

  • name – The rule name.

  • expr – The rule expression to replace.

Returns

The error code.

P4_Grammar *P4_CreatePegGrammar()

Create a grammar that can parse other grammars written in PEG syntax.

Example:

 P4_Grammar* peg = P4_CreatePegGrammar();
 P4_DeleteGrammar(peg);

Returns

The grammar object.

P4_Error P4_LoadGrammarResult(P4_String rules, P4_Result *result)

Load peg grammar result from a string.

If the function returns P4_Ok, the loaded grammar is available as result->grammar.

Example:

 P4_Result result = {0};

 if (P4_Ok == P4_LoadGrammarResult(RULES, &result)) {
     P4_Grammar* grammar = result.grammar;
 } else {
     printf("%s\n", result.errmsg);
 }

Parameters
  • rules – The rules string.

  • result – The P4_Result object.

Returns

The error code.

P4_Grammar *P4_LoadGrammar(P4_String rules)

Load a PEG grammar from a string.

Example:

 P4_Grammar* grammar = P4_LoadGrammar(
     "entry = one one;\n"
     "one   = \"1\";\n"
 );
 P4_Source* source1 = P4_CreateSource("11", "entry");
 P4_Parse(grammar, source1);

 P4_Source* source2 = P4_CreateSource("1", "one");
 P4_Parse(grammar, source2);

This function exits the program when an error occurs.

Parameters
  • rules – The grammar rules string.

Returns

The grammar object.

P4_ConstString P4_GetRuleName(P4_Expression *expr)

Get the rule name.

Example:

 P4_ConstString name = P4_GetRuleName(expr);

Parameters
  • expr – The rule expression.

Returns

The rule name.