Applied to an Rsec grammar, parses the string str. Any syntax errors are reported as INVALID_TOKEN.


Applied to an Rsec grammar, parses the string str. Any syntax errors are reported in detail.


Converts a terminal symbol (regex or string) into a token that can be processed by Rsec,
e.g. as a value assigned to a non-terminal, or a token that can have any of the following operators
applied to it.
utf8_tail = /[\u0080-\u00bf]/.r
utf8_2 = /[\u00c2-\u00df]/.r  | utf8_tail


A lazy parser for a rule is constructed only when parsing starts. It is useful for referencing a rule
that has not been defined yet: this applies to forward references (the rule is defined later)
and to recursive references.
parser = lazy{future}
future = 'jim'.r
recurse = seq( lazy{recurse} , 'a' ) | 'b'
assert_equal 'jim', parser.parse('jim')
assert_equal 'recurse', parser.parse 'baaaaa'

one_of(str) helper

Parses one of the chars in str
multiplicative = one_of '*/%'
assert_equal '/', multiplicative.parse '/'
assert_equal Rsec::INVALID, multiplicative.parse('+')

one_of_(str) helper

Like one_of, but with optional leading and trailing breakable spaces
additive = one_of_('+-')
assert_equal '+', additive.parse('  +')

prim(type, options={}) helper

Primitive parser for numbers. The value of the expression is the number, as opposed to the textual value
returned by other terminal symbols. Returns nil on overflow or underflow.
An optional '+' or '-' is allowed at the beginning of the string, except for unsigned_int32 and unsigned_int64.
type =
  :double |
  :hex_double |
  :int32 |
  :int64 |
  :unsigned_int32 |
  :unsigned_int64
options =
  :allowed_sign => '+' | '-' | '' | '+-' (default '+-')
  :allowed_signs => (same as :allowed_sign)
  :base => integer types only (default 10)
p = prim :double
assert_equal 1.23, p.parse('1.23')
p = prim :double, allowed_sign: '-'
assert_equal 1.23, p.parse('1.23')
assert_equal -1.23, p.parse('-1.23')
assert_equal Rsec::INVALID, p.parse('+1.23')
p = prim :int32, base: 36
assert_equal 49713, p.parse('12cx')
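
The base and overflow behaviour described above can be illustrated with plain Ruby. This is only a sketch and not Rsec's implementation: the helper name to_int32 and the constant INT32_RANGE are hypothetical, and the range check merely imitates the "nil on overflow" rule.

```ruby
# Hypothetical helper, not part of Rsec: convert text in the given base and
# return nil when the value does not fit in a signed 32-bit integer.
INT32_RANGE = (-2**31..2**31 - 1)

def to_int32(text, base = 10)
  value = Integer(text, base)            # raises ArgumentError on malformed input
  INT32_RANGE.cover?(value) ? value : nil
end

in_range  = to_int32('12cx', 36)         # same digits as the base 36 example above
too_large = to_int32('zzzzzzz', 36)      # 36**7 - 1 exceeds the int32 range
```

The result is an Integer rather than the matched text, which is the distinction drawn between prim and the other terminal symbols.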

seq(*xs) helper

Sequence parser. Processes a sequence of terminal or non-terminal symbols, and returns
a list of their values as evaluated by their respective rules. (Textual strings, for terminal symbols.) 
assert_equal ['a', 'b', 'c'], seq('a', 'b', 'c').parse('abc')

seq_(*xs) helper

Sequence parser with skippable pattern (or parser)
  :skip default= /\s*/
assert_equal ['a', 'b', 'c'], seq_('a', 'b', 'c', skip: ',').parse('a,b,c')

symbol(pattern, skip=/\s*/) helper

A symbol is a token wrapped with optional space

word(pattern) helper

A word is a token wrapped with word boundaries
assert_equal ['yes', '3'], seq('yes', '3').parse('yes3')
assert_equal INVALID, seq(word('yes'), '3').parse('yes3')
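
The effect of word boundaries can be reproduced with the standard library's StringScanner and the regex anchor \b. This is a sketch of the idea only; it assumes that a word boundary behaves like \b and does not show how Rsec implements word.

```ruby
require 'strscan'

# Plain token: matches the prefix 'yes' even though a word character follows.
plain = StringScanner.new('yes3').scan(/yes/)

# Word-bounded token: \b fails between 's' and '3', so nothing matches.
bounded = StringScanner.new('yes3').scan(/\byes\b/)

# With a non-word character after it, the bounded token matches.
bounded_ok = StringScanner.new('yes 3').scan(/\byes\b/)
```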


Transforms the result: applies a procedure to the value generated by the rule.
It is implicit if a rule is followed by a Ruby block.
parser = /\w+/.r.map{|word| word * 2}
assert_equal 'hellohello', parser.parse!('hello')


"p.join('+')" parses strings like "p+p+p+p+p", and returns
a list of the values of 'p' interspersed with '+'.
Note that at least 1 instance of p must appear in the string.
Sometimes it is useful to reverse the joining:
/\s*/.r.join('p').odd parses strings like " p p  p "


Branch parser. Note that Rsec is a PEG parser generator;
beware of the difference between PEG and CFG. Like all PEGs,
Rsec selects the first applicable option out of a list of choices,
and ignores all others.
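
The consequence of ordered choice can be sketched with StringScanner from the standard library. The helper ordered_choice is hypothetical and only imitates the "first applicable option wins" rule; it is not Rsec code.

```ruby
require 'strscan'

# Hypothetical helper: try each pattern in order and commit to the first match.
def ordered_choice(scanner, *patterns)
  patterns.each do |pattern|
    matched = scanner.scan(pattern)
    return matched if matched
  end
  nil
end

# The shorter alternative comes first, so the longer one is never tried.
short_first = StringScanner.new('ifdef')
first_result = ordered_choice(short_first, /if/, /ifdef/)
leftover = short_first.rest

# Listing the longer alternative first gives the result a CFG reader expects.
long_first = StringScanner.new('ifdef')
second_result = ordered_choice(long_first, /ifdef/, /if/)
```

In a CFG both orderings describe the same language; in a PEG the order of the alternatives decides which one is taken, so longer or more specific options usually go first.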

*(n, begin..end)

Repeats the parser n times, or a number of times within the given range.
If range.end < 0, repeats at least range.begin times.
(Infinity and -Infinity are also handled as range bounds.)


Appears 0 or 1 times; the result is wrapped in an array
parser = 'a'.r.maybe
assert_equal ['a'], parser.parse('a')
assert_equal [], parser.parse('')


Kleene star: 0 or more times. Note that, like other
PEGs, Rsec is greedy:
Kleene stars and pluses will behave in counterintuitive ways when followed by
another term, and may need to be replaced by recursion (x = y* z, replaced by
x = z | y lazy{x})
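
The difference between a greedy star and the recursive form can be sketched in plain Ruby with StringScanner. Both helper names are hypothetical, and the code imitates PEG semantics by hand rather than using Rsec; here y is "any character" and z is a closing ';'.

```ruby
require 'strscan'

# Greedy star followed by a term (y* z): the star consumes every y it can
# and never gives one back, so z may have nothing left to match.
def greedy_star_then(scanner, y, z)
  start = scanner.pos
  while scanner.scan(y)
    # keep consuming
  end
  return true if scanner.scan(z)
  scanner.pos = start                    # restore the position on failure
  false
end

# Recursive form (z | y x): z is tried first at every position.
def recursive_until(scanner, y, z)
  return true if scanner.scan(z)
  start = scanner.pos
  return true if scanner.scan(y) && recursive_until(scanner, y, z)
  scanner.pos = start
  false
end

greedy_ok    = greedy_star_then(StringScanner.new('ab;'), /./, /;/)
recursive_ok = recursive_until(StringScanner.new('ab;'), /./, /;/)
```

The greedy version fails because /./ also consumes the ';', while the recursive version stops at the first ';'.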


Lookahead term. Note that other can be a very complex parser. Do not confuse this with
the semantic lookahead predicates of Treetop (blocks of Ruby code affecting parsing).


Negative lookahead predicate. Unlike in Treetop, the negative lookahead term must follow
rather than precede another term: x ^ y means "x, not followed by y". The Treetop
expression (!x y), which means "y, which we have ruled out as an instance of x",
is rendered in Rsec as seq( ''.r ^ x , y): empty string not followed by x, followed by y.
Do not confuse this with
the semantic lookahead predicates of Treetop (blocks of Ruby code affecting parsing).


When parsing fails on the preceding term, shows an "expect tokens" error listing the given tokens.


Short for seq_(parser, other)[1]. Used to ignore a preceding delimiter in evaluating the value
of the term.


Short for seq_(parser, other)[0]. Used to ignore a following delimiter in evaluating the value
of the term.


Should be end of input after parse. Used to identify the root non-terminal of a grammar; e.g.
arithmetic = expr.eof means that the grammar arithmetic has expr as its root, and a parse of
expr can only be followed by end-of-file.
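
The end-of-input requirement can be pictured with StringScanner#eos?. The helper parse_to_eof is hypothetical and a regex stands in for the grammar; it is a sketch of the idea, not of how Rsec implements eof.

```ruby
require 'strscan'

# Hypothetical helper: succeed only if the pattern consumes the whole input.
def parse_to_eof(text, pattern)
  scanner = StringScanner.new(text)
  matched = scanner.scan(pattern)
  matched && scanner.eos? ? matched : nil
end

sum_pattern = /\d(\+\d)*/
whole   = parse_to_eof('1+2', sum_pattern)    # everything is consumed
partial = parse_to_eof('1+2)', sum_pattern)   # a trailing ')' is left over
```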


Packrat parser combinator. Returns a parser that caches parse results, which may improve performance
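
The caching idea behind packrat parsing can be sketched as memoization keyed by input position. The class CachedRule is hypothetical and much simpler than a real packrat parser: it assumes the same text is passed on every call and wraps a single regex.

```ruby
require 'strscan'

# Hypothetical illustration of packrat-style caching, not Rsec code.
class CachedRule
  attr_reader :calls

  def initialize(pattern)
    @pattern = pattern
    @cache = {}        # position => [matched text or nil, position after match]
    @calls = 0         # number of times the pattern was actually run
  end

  def parse_at(text, pos)
    @cache.fetch(pos) do
      @calls += 1
      scanner = StringScanner.new(text)
      scanner.pos = pos
      matched = scanner.scan(@pattern)
      @cache[pos] = [matched, scanner.pos]
    end
  end
end

rule = CachedRule.new(/\d+/)
first  = rule.parse_at('123abc', 0)
second = rule.parse_at('123abc', 0)   # answered from the cache
```

Caching pays off when backtracking makes the same rule run repeatedly at the same position; for rules that are only tried once, the bookkeeping is pure overhead.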

[](idx) seq, seq_

Given that the result of a sequence parser is a list of all the terms recognised, this
returns the parse result for the idx-th value in the list. This computation is shorter 
and faster than map{|array| array[idx]}
assert_equal 'b', seq('a', 'b', 'c')[1].parse('abc')

unbox seq, seq_, join, join.even, join.odd

If the parse result contains only 1 element, returns the element instead of the array.


Think of "innerHTML"! Ignore the initial and the final values in the list returned
by a sequence parser.
parser = seq('<b>', /[\w\s]+/, '</b>').inner
parser.parse('<b>the inside</b>')


Operating on the results of join, keeps only the even (left, token) parts.
For example, unit.join(/\s+/) creates a list of [unit, \s+, unit, \s+, unit...];
unit.join(/\s+/).even only retains [unit, unit, unit...]


Operating on the results of join, keeps only the odd (right, inter) parts
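
The even and odd selections can be pictured on a plain Ruby array shaped like a join result. This sketch uses only core Array methods; the variable names are illustrative and nothing here calls Rsec.

```ruby
# A list shaped like the result of a join: tokens at even indices,
# separators at odd indices.
joined = ['1', '+', '2', '-', '3']

even_parts = joined.select.with_index { |_, index| index.even? }  # the tokens
odd_parts  = joined.select.with_index { |_, index| index.odd? }   # the separators
```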


Scans until the pattern occurs. (Corresponds to StringScanner#scan_until.) So
x.until parses from the current point up to and including the pattern in x;
x.until corresponds to a non-greedy .*? x
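
The correspondence with scan_until and with a non-greedy match can be checked with the standard library alone. The input string is made up for the illustration.

```ruby
require 'strscan'

text = 'key = value; rest; more'

# scan_until consumes everything up to and including the first match.
scanner = StringScanner.new(text)
consumed  = scanner.scan_until(/;/)
remaining = scanner.rest

# The equivalent non-greedy regular expression, anchored at the start.
lazy_match = text[/\A.*?;/]

# A greedy .* would run on to the last ';' instead.
greedy_match = text[/\A.*;/]
```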


alias for maybe


alias for fail