Langium - v4.0.0

    Class IndentationAwareTokenBuilder<Terminals, KeywordName>

    A token builder that is sensitive to indentation in the input text. It will generate tokens for indentation and dedentation based on the indentation level.

    The first generic parameter corresponds to the names of terminal tokens, the second to the names of keyword tokens. Both parameters are optional; the matching types can be imported from ./generated/ast.js.

    Inspired by https://github.com/chevrotain/chevrotain/blob/master/examples/lexer/python_indentation/python_indentation.js
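
As a rough illustration of the idea described above, the following standalone sketch turns changes in leading whitespace into INDENT and DEDENT markers. It does not use Langium or Chevrotain; `markIndentation` and the string markers are hypothetical names chosen for this example, and the real class works on token types and offsets rather than on whole lines.

```typescript
// Hypothetical standalone sketch; not the Langium implementation.
function markIndentation(text: string): string[] {
    const result: string[] = [];
    const stack: number[] = [0]; // indentation levels that are currently open
    for (const line of text.split('\n')) {
        if (line.trim() === '') {
            continue; // in this sketch, blank lines do not affect indentation
        }
        const leading = /^[ \t]*/.exec(line);
        const level = leading ? leading[0].length : 0;
        if (level > stack[stack.length - 1]) {
            // Deeper than the innermost open level: open a new block.
            stack.push(level);
            result.push('INDENT');
        }
        while (level < stack[stack.length - 1]) {
            // Shallower: close blocks until the level fits again.
            stack.pop();
            result.push('DEDENT');
        }
        result.push(line.trim());
    }
    // Close every block that is still open at the end of the input.
    while (stack.length > 1) {
        stack.pop();
        result.push('DEDENT');
    }
    return result;
}

const markers = markIndentation('if a:\n    b\n    if c:\n        d\ne');
console.log(markers.join(' '));
```

Note that a single line can close several blocks at once, which is why the dedent step is a loop rather than a single check.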

    Type Parameters

    • Terminals extends string = string
    • KeywordName extends string = string

    Properties

    dedentTokenType: TokenType

    The token type to be used for dedentation tokens

    diagnostics: LexingDiagnostic[] = []

    The list of diagnostics stored during the lexing process of a single text.

    indentationStack: number[] = ...

    The stack stores all previously matched indentation levels to determine how deeply the next tokens are nested. The stack is only valid for a single lexing run.

    indentTokenType: TokenType

    The token type to be used for indentation tokens

    whitespaceRegExp: RegExp = ...

    A regular expression to match a series of tabs and/or spaces. Override this to customize what the indentation is allowed to consist of.
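
To make the effect of such a pattern concrete, here is a minimal sketch that is independent of Langium. It assumes a sticky pattern matching tabs and spaces and compares it with a stricter, spaces-only variant. The helper `matchIndentation` and both patterns are illustrative assumptions, not the library's actual values.

```typescript
// Hypothetical helper: match indentation characters starting exactly at `offset`.
function matchIndentation(pattern: RegExp, text: string, offset: number): string | null {
    pattern.lastIndex = offset; // a sticky ('y') pattern only matches at lastIndex
    const match = pattern.exec(text);
    return match ? match[0] : null;
}

// Assumed shapes of a permissive and a stricter indentation pattern.
const tabsOrSpaces = new RegExp('[ \\t]+', 'y');
const spacesOnly = new RegExp(' +', 'y');

const indentedText = 'a\n\t  b';
const mixed = matchIndentation(tabsOrSpaces, indentedText, 2);  // tab followed by two spaces
const strict = matchIndentation(spacesOnly, indentedText, 2);   // no match: the line starts with a tab
console.log(JSON.stringify(mixed), strict);
```

A subclass that only wants to accept spaces would replace the pattern with something like the stricter variant.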

    Methods

    • Helper function to create an instance of an indentation token.

      Parameters

      • tokenType: TokenType

        Indent or dedent token type

      • text: string

        Full input string, used to calculate the line number

      • image: string

        The original image of the token (tabs or spaces)

      • offset: number

        Current position in the input string

      Returns IToken

      The indentation token instance

    • A custom pattern for matching dedents

      Parameters

      • text: string

        The full input string.

      • offset: number

        The offset at which to attempt a match

      • tokens: IToken[]

        Previously scanned tokens

      • groups: Record<string, IToken[]>

        Token Groups

      Returns null | RegExpExecArray | CustomPatternMatcherReturn

    • Resets the indentation stack between different runs of the lexer

      Parameters

      • text: string

        Full text that was tokenized

      Returns IToken[]

      Remaining dedent tokens to match all previous indents at the end of the file
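
The end-of-file behaviour described here can be sketched without Langium as follows. `SketchToken` and `drainIndentationStack` are hypothetical names, and the assumption that the base level 0 always remains on the stack is made for this example only.

```typescript
// Hypothetical token shape used only for this sketch.
interface SketchToken {
    type: 'DEDENT';
    offset: number;
}

// Emits one dedent per open block and leaves the stack in its initial state.
function drainIndentationStack(stack: number[], text: string): SketchToken[] {
    const remaining: SketchToken[] = [];
    while (stack.length > 1) { // assumed: the base level (0) is never popped
        stack.pop();
        remaining.push({ type: 'DEDENT', offset: text.length });
    }
    return remaining;
}

const openLevels = [0, 4, 8]; // two blocks are still open at the end of the input
const closing = drainIndentationStack(openLevels, 'a:\n    b:\n        c');
console.log(closing.length, openLevels);
```

Because the stack ends up back at its initial state, the next lexer run starts from a clean slate.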

    • Helper function to get the line number at a given offset.

      Parameters

      • text: string

        Full input string, used to calculate the line number

      • offset: number

        Current position in the input string

      Returns number

      The line number at the given offset
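
One simple way to derive a line number from an offset is to count the line breaks before it. The sketch below assumes 1-based line numbers and treats `\r\n`, `\r`, and `\n` as line breaks; `lineNumberAt` is a hypothetical name and the actual method may differ in detail.

```typescript
// Hypothetical helper: 1-based line number of the character at `offset`.
function lineNumberAt(text: string, offset: number): number {
    // Everything before the offset is split at line breaks; the number of
    // resulting pieces equals the line the offset is on.
    return text.substring(0, offset).split(/\r\n|\r|\n/).length;
}

const source = 'first\nsecond\r\nthird';
const lineOfFirst = lineNumberAt(source, 0);
const lineOfSecond = lineNumberAt(source, 6);
const lineOfThird = lineNumberAt(source, 14);
console.log(lineOfFirst, lineOfSecond, lineOfThird);
```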

    • A custom pattern for matching indents

      Parameters

      • text: string

        The full input string.

      • offset: number

        The offset at which to attempt a match

      • tokens: IToken[]

        Previously scanned tokens

      • groups: Record<string, IToken[]>

        Token Groups

      Returns null | RegExpExecArray | CustomPatternMatcherReturn

    • Helper function to check if the current position is the start of a new line.

      Parameters

      • text: string

        The full input string.

      • offset: number

        The current position at which to check

      Returns boolean

      Whether the current position is the start of a new line
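
A check of this kind can be sketched in a few lines: a position starts a line if it is the very beginning of the text or directly follows a line-break character. `atLineStart` is a hypothetical name, and the treatment of `\r` as a line break is an assumption of this sketch.

```typescript
// Hypothetical helper: is `offset` the first position of a line?
function atLineStart(text: string, offset: number): boolean {
    if (offset === 0) {
        return true; // the beginning of the text always starts a line
    }
    const previous = text.charAt(offset - 1);
    return previous === '\n' || previous === '\r';
}

const sample = 'a\nb';
console.log(atLineStart(sample, 0), atLineStart(sample, 1), atLineStart(sample, 2));
```

Restricting indent and dedent matching to such positions keeps whitespace in the middle of a line from being mistaken for indentation.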

    • A helper function used in matching both indents and dedents.

      Parameters

      • text: string

        The full input string.

      • offset: number

        The current position at which to attempt a match

      • tokens: IToken[]

        Previously scanned tokens

      • groups: Record<string, IToken[]>

        Token Groups

      Returns {
          currIndentLevel: number;
          match: null | RegExpExecArray;
          prevIndentLevel: number;
      }

      The current and previous indentation levels and the matched whitespace
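
The shape of this return value suggests how the indent and dedent matchers can share one measurement step. The following standalone sketch mirrors the three fields; `measureIndentation` is a hypothetical name, the whitespace pattern is assumed, and each tab or space counts as one unit of indentation here.

```typescript
interface WhitespaceMatch {
    currIndentLevel: number;
    prevIndentLevel: number;
    match: RegExpExecArray | null;
}

// Hypothetical helper: measure the whitespace at `offset` and compare it
// with the innermost level on the indentation stack.
function measureIndentation(text: string, offset: number, stack: number[]): WhitespaceMatch {
    const whitespace = new RegExp('[ \\t]+', 'y'); // assumed pattern
    whitespace.lastIndex = offset;
    const match = whitespace.exec(text);
    return {
        currIndentLevel: match ? match[0].length : 0,
        prevIndentLevel: stack[stack.length - 1],
        match,
    };
}

const measured = measureIndentation('a:\n    b', 3, [0]);
// An indent matcher would accept when the level grew, a dedent matcher when it shrank.
const kind =
    measured.currIndentLevel > measured.prevIndentLevel ? 'indent'
    : measured.currIndentLevel < measured.prevIndentLevel ? 'dedent'
    : 'same';
console.log(kind, measured.currIndentLevel, measured.prevIndentLevel);
```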