
tokenizer-dsl

Type aliases

CharRange: string | number | [number | string, number | string]
Reader<Context>: ReaderFunction<Context> | ReaderCodegen

The reader definition that can be compiled into a function that reads chars from the input string.

Type parameters

  • Context = void

    The context passed by the tokenizer.

ReaderFunction<Context>: (input: string, offset: number, context: Context) => number

Type parameters

  • Context = void

    The context passed by the tokenizer.

Type declaration

    • (input: string, offset: number, context: Context): number
    • Takes the input string and an offset into this string, and returns the next offset that is greater than or equal to offset if the reader matched, or an offset that is less than offset if the reader didn't match. The reader may return offsets that exceed the input length.

      const abcReader: Reader = (input, offset) => {
        return input.startsWith('abc', offset) ? offset + 3 : -1;
      };

      Parameters

      • input: string
      • offset: number
      • context: Context

      Returns number

StageProvider<Stage, Context>: (chunk: string, offset: number, length: number, context: Context, state: TokenizerState) => Stage

Type parameters

  • Stage

    The tokenizer stage type.

  • Context

    The context passed by the tokenizer.

Type declaration

    • (chunk: string, offset: number, length: number, context: Context, state: TokenizerState): Stage
    • Returns the stage to which the tokenizer should transition.

      Parameters

      • chunk: string

        The input chunk from which the current token was read.

      • offset: number

        The chunk-relative offset where the current token was read.

      • length: number

        The number of chars read by the rule.

      • context: Context

        The context passed by the tokenizer.

      • state: TokenizerState

        The current state of the tokenizer.

      Returns Stage

      The stage to which the tokenizer should transition.
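
      For example, a stage provider can pick the next stage based on the text of the token that was just read. A minimal sketch (the 'key' and 'value' stage names are illustrative):

      const stageProvider: StageProvider<'key' | 'value', void> = (chunk, offset, length) => {
        // Reconstruct the token text and use it to choose the next stage.
        const tokenValue = chunk.substring(offset, offset + length);
        return tokenValue === ':' ? 'value' : 'key';
      };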

TokenHandler<Type, Context>: (type: Type, chunk: string, offset: number, length: number, context: Context, state: TokenizerState) => void

Type parameters

  • Type = unknown

    The type of tokens emitted by rules.

  • Context = void

    The context passed by the tokenizer.

Type declaration

    • (type: Type, chunk: string, offset: number, length: number, context: Context, state: TokenizerState): void
    • Triggered when a token was read from the input stream.

      The substring of the current token:

      const tokenValue = chunk.substring(offset, offset + length);
      

      The offset of this token from the start of the input stream (useful if you're using Tokenizer.write):

      const absoluteOffset = state.chunkOffset + offset;
      

      Parameters

      • type: Type

        The type of the token that was read.

      • chunk: string

        The input chunk from which the token was read.

      • offset: number

        The chunk-relative offset where the token starts.

      • length: number

        The number of chars read by the rule.

      • context: Context

        The context passed by the tokenizer.

      • state: TokenizerState

        The current state of the tokenizer.

      Returns void
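
      For example, a handler usually maps the raw offsets back to token text. A minimal sketch that collects tokens into an array (the token type union is illustrative):

      const tokens: { type: string; value: string }[] = [];

      const handler: TokenHandler<'WORD' | 'NUMBER'> = (type, chunk, offset, length) => {
        // Reconstruct the token text from the chunk-relative offset and length.
        tokens.push({ type, value: chunk.substring(offset, offset + length) });
      };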

Variables

never: Reader<any> = ...

The singleton reader that always returns -1.

none: Reader<any> = ...

The singleton reader that always returns the current offset.

See also: skip, end

Functions

  • all<Context>(reader: Reader<Context>, options?: AllOptions): Reader<Context>
  • Creates a reader that repeatedly reads chars using reader.

    Type parameters

    • Context = any

      The context passed by the tokenizer.

    Parameters

    • reader: Reader<Context>

      The reader that reads chars.

    • options: AllOptions = {}

      Reader options.

    Returns Reader<Context>
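
    For example, a run of digits can be read by repeatedly applying a single-char reader. A sketch; the minimumCount option is an assumption about AllOptions:

    import { all, char } from 'tokenizer-dsl';

    // Reads one or more ASCII digits.
    const digitsReader = all(char([['0', '9']]), { minimumCount: 1 });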

  • char(chars: CharRange[]): Reader<any>
  • Creates a reader that matches a single char by its code.

    See also: text

    Parameters

    • chars: CharRange[]

      An array of strings (each char from the string is used for matching), char codes, or tuples of lower/upper chars (or char codes) that define an inclusive range of codes.

    Returns Reader<any>
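
    For example, a sketch that mixes strings, char codes, and range tuples:

    import { char } from 'tokenizer-dsl';

    // Matches a single lowercase letter, digit, underscore (char code 95),
    // or one of the chars '$' and '#'.
    const identifierCharReader = char([['a', 'z'], ['0', '9'], 95, '$#']);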

  • createTokenizer<Type, Context>(rules: Rule<Type, void, Context>[]): Tokenizer<Type, void, Context>
  • createTokenizer<Type, Stage, Context>(rules: Rule<Type, Stage, Context>[], initialStage: Stage): Tokenizer<Type, Stage, Context>
  • Creates a new pure tokenizer function.

    Type parameters

    • Type

      The type of tokens emitted by the tokenizer.

    • Context = void

      The context that rules may consume.

    Parameters

    • rules: Rule<Type, void, Context>[]

      The list of rules the tokenizer uses to read tokens from the input chunks.

    Returns Tokenizer<Type, void, Context>

  • Creates a new pure tokenizer function.

    Type parameters

    • Type

      The type of tokens emitted by the tokenizer.

    • Stage

      The type of stages at which rules are applied.

    • Context = void

      The context that rules may consume.

    Parameters

    • rules: Rule<Type, Stage, Context>[]

      The list of rules the tokenizer uses to read tokens from the input chunks.

    • initialStage: Stage

      The initial stage from which tokenization starts.

    Returns Tokenizer<Type, Stage, Context>
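
    Putting the pieces together, a minimal sketch of a tokenizer; the { type, reader } rule shape and the (input, handler) invocation are assumptions, so consult the Rule and Tokenizer docs for the exact shapes:

    import { all, char, createTokenizer } from 'tokenizer-dsl';

    const tokenize = createTokenizer<'NUMBER' | 'SPACE'>([
      // Each rule pairs a token type with the reader that recognizes it (assumed shape).
      { type: 'NUMBER', reader: all(char([['0', '9']]), { minimumCount: 1 }) },
      { type: 'SPACE', reader: all(char([' \t']), { minimumCount: 1 }) },
    ]);

    // The handler is invoked once per token (invocation shape assumed).
    tokenize('12 34', (type, chunk, offset, length) => {
      console.log(type, chunk.substring(offset, offset + length));
    });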

  • end(offset?: number): Reader<any>
  • Creates a reader that returns the input length plus the offset.

    See also: skip

    Parameters

    • offset: number = 0

      The offset added to the input length.

    Returns Reader<any>

  • lookahead<Context>(reader: Reader<Context>): Reader<Context>
  • Creates a reader that returns the current offset if the reader matches.

    Type parameters

    • Context = any

    Parameters

    • reader: Reader<Context>

      The reader that reads chars.

    Returns Reader<Context>

  • Creates a reader that returns the underlying reader's result, or the current offset if that reader didn't match.

    Type parameters

    • Context = any

      The context passed by the tokenizer.

    Parameters

    • reader: Reader<Context>

      The reader whose match is considered optional.

    Returns Reader<Context>

  • or<Context>(...readers: Reader<Context>[]): Reader<Context>
  • Creates a reader that returns the result of the first matched reader.

    Type parameters

    • Context = any

      The context passed by the tokenizer.

    Parameters

    • Rest ...readers: Reader<Context>[]

      Readers that are called.

    Returns Reader<Context>
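
    For example, a sketch that matches whichever keyword appears first in the list:

    import { or, text } from 'tokenizer-dsl';

    // Tries each reader in order and returns the first match.
    const keywordReader = or(text('let'), text('const'), text('var'));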

  • regex(re: RegExp): Reader<any>
  • Creates a reader that matches a substring using a regular expression.

    Parameters

    • re: RegExp

      The RegExp to match.

    Returns Reader<any>
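
    A sketch, assuming the pattern is applied at the current offset:

    import { regex } from 'tokenizer-dsl';

    // Matches a run of digits.
    const numberReader = regex(/\d+/);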

  • seq<Context>(...readers: Reader<Context>[]): Reader<Context>
  • Creates a reader that applies readers one after another.

    Type parameters

    • Context = any

      The context passed by the tokenizer.

    Parameters

    • Rest ...readers: Reader<Context>[]

      Readers that are called.

    Returns Reader<Context>
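
    For example, a sketch of a line-comment reader built from smaller readers:

    import { seq, text, until } from 'tokenizer-dsl';

    // Reads '//' and then everything up to the next newline.
    const commentReader = seq(text('//'), until(text('\n')));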

  • skip(charCount: number): Reader<any>
  • Creates a reader that skips the given number of chars.

    See also: end

    Parameters

    • charCount: number

      The number of chars to skip.

    Returns Reader<any>
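
    For example, a sketch of a double-quoted string reader, assuming until stops before its match by default so that skip(1) consumes the closing quote:

    import { seq, skip, text, until } from 'tokenizer-dsl';

    // Reads '"', the string body, and then the closing '"'.
    const stringReader = seq(text('"'), until(text('"')), skip(1));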

  • text(str: string, options?: TextOptions): Reader<any>
  • Creates a reader that reads a substring from the input.

    See also: char

    Parameters

    • str: string

      The text to match.

    • options: TextOptions = {}

      Reader options.

    Returns Reader<any>
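
    A sketch; the caseInsensitive flag is an assumption about TextOptions:

    import { text } from 'tokenizer-dsl';

    // Matches 'return' regardless of case.
    const returnReader = text('return', { caseInsensitive: true });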

  • until<Context>(reader: Reader<Context>, options?: UntilOptions): Reader<Context>
  • Creates a reader that reads chars until reader matches.

    Type parameters

    • Context = any

      The context passed by the tokenizer.

    Parameters

    • reader: Reader<Context>

      The reader that reads chars.

    • options: UntilOptions = {}

      Reader options.

    Returns Reader<Context>
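
    For example, a sketch of a block-comment reader; the inclusive option is an assumption about UntilOptions and would make the reader consume the terminating match as well:

    import { seq, text, until } from 'tokenizer-dsl';

    // Reads '/*' and then everything up to and including '*/'.
    const blockCommentReader = seq(text('/*'), until(text('*/'), { inclusive: true }));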
