class RLexer

Multistate lexer class.

Lexemes can be defined on the fly. If a particular lexer instance is meant to be used with Parle\RParser, the token IDs need to be taken from there. Otherwise, arbitrary token IDs can be supplied. Note that Parle\Parser is not compatible with this lexer.

Constants

ICASE

DOT_NOT_LF

DOT_NOT_CRLF

SKIP_WS

MATCH_ZERO_LEN

Properties

bool $bol
int $flags
int $state
int $marker
int $cursor

Methods

void
advance()

Processes the next rule and prepares the resulting token data.

void
build()

Finalize the lexer rule set

void
callout(int $id, callable $callback)

Define token callback

void
consume(string $data)

Pass the data for processing

void
dump()

Dump the state machine

Token
getToken()

Retrieve the current token.

void
push(string $state, string $regex, string $newState)

Add a lexer rule

int
pushState(string $state)

Push a new start state. This lexer type can have more than one state machine.

void
reset(int $pos)

Reset lexer

Details

void advance()

Processes the next rule and prepares the resulting token data.

Return Value

void

void build()

Finalize the lexer rule set

Rules previously added with Parle\RLexer::push() are finalized. This method has to be called after all the necessary rules have been pushed. The rule set then becomes read only, and lexing can begin.

Return Value

void

See also

RLexer::push
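
The following is a minimal sketch of where build() sits in a typical lexing loop, assuming the parle extension is loaded. The token IDs, patterns, and input are arbitrary illustrative values. The two-argument push($regex, $id) form and the Token::EOI and Token::UNKNOWN constants come from the PHP manual, not from the signatures listed on this page.

```php
<?php
use Parle\RLexer;
use Parle\Token;

// Arbitrary token IDs chosen for this illustration (name => id).
$ids = ['NUMBER' => 1, 'COMMA' => 2];
$names = array_flip($ids);

$lex = new RLexer;
$lex->push('\d+', $ids['NUMBER']);
$lex->push('[,]', $ids['COMMA']);
$lex->build();                 // the rule set is read only from here on

$lex->consume('10,20,30');

$seen = [];
while (true) {
    $lex->advance();           // process the next rule
    $tok = $lex->getToken();   // fetch the token prepared by advance()
    if ($tok->id == Token::EOI) {
        break;                 // end of input
    }
    if ($tok->id == Token::UNKNOWN) {
        throw new RuntimeException("Unknown token at offset {$lex->marker}");
    }
    $seen[] = $names[$tok->id] . ':' . $tok->value;
}

echo implode(' ', $seen), PHP_EOL;
```

Calling push() after build() is expected to fail, because the rule set is read only at that point.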

void callout(int $id, callable $callback)

Define token callback

Define a callback to be invoked once the lexer encounters a particular token.

Parameters

int $id

Token id.

callable $callback

Callable to be invoked. The callable doesn't receive any arguments and its return value is ignored.

Return Value

void

See also

https://php.net/manual/en/parle-rlexer.callout.php
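
As a rough illustration, a callback can count how often a given token ID is matched. This sketch assumes the parle extension is loaded. The IDs 1 and 2 and the patterns are hypothetical, and the two-argument push($regex, $id) form is taken from the PHP manual. Because the callable receives no arguments, any state it needs is captured with use.

```php
<?php
use Parle\RLexer;
use Parle\Token;

$lex = new RLexer;
$lex->push('[a-z]+', 1);   // 1: word (arbitrary id)
$lex->push('[ ]+', 2);     // 2: blank (arbitrary id)

$words = 0;
// The callback takes no arguments, so the counter is captured by reference.
$lex->callout(1, function () use (&$words) {
    $words++;
});

$lex->build();
$lex->consume('foo bar baz');

// Drive the lexer to the end of the input; the callback does the counting.
do {
    $lex->advance();
    $tok = $lex->getToken();
} while ($tok->id != Token::EOI && $tok->id != Token::UNKNOWN);

echo "words seen: $words", PHP_EOL;
```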

void consume(string $data)

Pass the data for processing

Consume the data for lexing.

Parameters

string $data

Data to be lexed.

Return Value

void

See also

https://php.net/manual/en/parle-rlexer.consume.php

void dump()

Dump the state machine

Dump the current state machine to stdout.

Token getToken()

Retrieve the current token.

Return Value

Token

void push(string $state, string $regex, string $newState)

Add a lexer rule

Push a pattern for lexeme recognition. A 'start state' and 'exit state' can be specified by using a suitable signature.

Parameters

string $state

State name. If '*' is used as start state, then the rule is applied to all lexer states.

string $regex

Regular expression used for token matching.

string $newState

Name of the state to enter after the rule has been applied. If '.' is specified as the exit state, the lexer state is unchanged when the rule matches. An exit state with '>' before the name means push. Use the signature without an id either for continuation or to start matching when a continuation or recursion is required. If '<' is specified as the exit state, it means pop; in that case, the signature containing the id can be used to identify the match. Note that even when an id is specified, the rule only finishes once all previous pushes have been popped.

Return Value

void
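
The state parameters can be sketched with a small rule set that recognizes C-style comments. It assumes the parle extension is loaded and that the default start state is named 'INITIAL', as in the underlying lexertl library. The state name 'COMMENT', the token IDs, and the input are hypothetical. The four-argument push($state, $regex, $id, $newState) form is taken from the PHP manual.

```php
<?php
use Parle\RLexer;
use Parle\Token;

$lex = new RLexer;
$lex->pushState('COMMENT');                    // declare the extra state first

// Rules without an id continue the match across the state change.
$lex->push('INITIAL', '"/*"', 'COMMENT');      // enter the comment
$lex->push('COMMENT', '[^*]+|.', '.');         // '.' keeps the state unchanged
// Rules with an id produce a token.
$lex->push('COMMENT', '"*/"', 1, 'INITIAL');   // id 1: end of comment
$lex->push('INITIAL', '[a-z]+', 2, '.');       // id 2: identifier
$lex->push('INITIAL', '[ ]+', 3, '.');         // id 3: blank
$lex->build();

$lex->consume('foo /* bar */ baz');

$idsSeen = [];
while (true) {
    $lex->advance();
    $tok = $lex->getToken();
    if ($tok->id == Token::EOI || $tok->id == Token::UNKNOWN) {
        break;
    }
    $idsSeen[] = $tok->id;
}

echo implode(',', $idsSeen), PHP_EOL;
```

Only the closing rule carries an id, so the comment is reported as a single token and not as one token per rule.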

int pushState(string $state)

Push a new start state. This lexer type can have more than one state machine.

This allows you to lex different tokens depending on context, thus allowing simple parsing to take place. Once a state has been pushed, it can be used with a suitable Parle\RLexer::push() signature variant.

Parameters

string $state

Name of the state.

Return Value

int

See also

RLexer::push

void reset(int $pos)

Reset lexer

Reset lexing, optionally supplying the desired offset.

Parameters

int $pos

Reset position.

Return Value

void
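
A short sketch of re-lexing from a chosen offset, assuming the parle extension is loaded and that $pos is a byte offset into the data previously passed to consume(). The helper drain(), the token IDs, and the input are hypothetical. The two-argument push($regex, $id) form is taken from the PHP manual.

```php
<?php
use Parle\RLexer;
use Parle\Token;

// Hypothetical helper: collect token values until the input is exhausted.
function drain(RLexer $lex): array
{
    $values = [];
    while (true) {
        $lex->advance();
        $tok = $lex->getToken();
        if ($tok->id == Token::EOI || $tok->id == Token::UNKNOWN) {
            return $values;
        }
        $values[] = $tok->value;
    }
}

$lex = new RLexer;
$lex->push('\d+', 1);   // 1: number (arbitrary id)
$lex->push('[,]', 2);   // 2: comma (arbitrary id)
$lex->build();

$lex->consume('10,20,30');
$first = drain($lex);   // lex the whole input once

$lex->reset(3);         // offset 3 is where '20' begins
$second = drain($lex);  // lex again, starting from that offset

echo implode(' ', $first), PHP_EOL;
echo implode(' ', $second), PHP_EOL;
```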