Class IndentationAwareTokenBuilder<Terminals, KeywordName>

A token builder that is sensitive to indentation in the input text. It generates indent and dedent tokens when the indentation level changes from one line to the next.

The first generic parameter corresponds to the names of terminal tokens, while the second one corresponds to the names of keyword tokens. Both parameters are optional; suitable types can be imported from ./generated/ast.js.

Inspired by https://github.com/chevrotain/chevrotain/blob/master/examples/lexer/python_indentation/python_indentation.js

Type Parameters

Terminals extends string = string
KeywordName extends string = string

Hierarchy

DefaultTokenBuilder
- IndentationAwareTokenBuilder

Constructors

constructor

new IndentationAwareTokenBuilder<Terminals, KeywordName>(options?): IndentationAwareTokenBuilder<Terminals, KeywordName>
Type Parameters
- Terminals extends string = string
- KeywordName extends string = string
Parameters
- options: Partial<IndentationTokenBuilderOptions<NoInfer<Terminals>, NoInfer<KeywordName>>> = ...
Returns IndentationAwareTokenBuilder<Terminals, KeywordName>
Overrides DefaultTokenBuilder.constructor
- Defined in packages/langium/src/parser/indentation-aware.ts:112

Properties

`Readonly` dedentTokenType

dedentTokenType: TokenType

The token type to be used for dedentation tokens.

`Protected` diagnostics

diagnostics: LexingDiagnostic[] = []

The list of diagnostics stored during the lexing process of a single text.

`Protected` indentationStack

indentationStack: number[] = ...

The stack stores all the previously matched indentation levels to understand how deeply the next tokens are nested. The stack is only valid while lexing a single text and is reset between lexer runs.
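To illustrate how such a stack can drive indent and dedent tokens, here is a minimal, self-contained sketch. It is not Langium's implementation: the function `indentationEvents` and its event strings are hypothetical, and it assumes that the stack starts at `[0]` and that the indentation level is the number of leading whitespace characters.

```typescript
// Hypothetical event names; the real builder emits token instances instead.
type IndentEvent = 'INDENT' | 'DEDENT';

// Hypothetical helper: derives INDENT/DEDENT events from the lines of a text.
function indentationEvents(lines: string[]): IndentEvent[] {
    const stack: number[] = [0]; // assumption: lexing starts at column 0
    const result: IndentEvent[] = [];
    for (const line of lines) {
        const level = line.search(/\S/); // index of first non-whitespace character
        if (level < 0) continue; // blank line: leaves nesting unchanged in this sketch
        const top = stack[stack.length - 1] ?? 0;
        if (level > top) {
            // Deeper than before: remember the new level and emit one INDENT.
            stack.push(level);
            result.push('INDENT');
        } else {
            // Shallower than before: emit one DEDENT per level that is closed.
            while (stack.length > 1 && level < (stack[stack.length - 1] ?? 0)) {
                stack.pop();
                result.push('DEDENT');
            }
        }
    }
    return result;
}

const indentEvents = indentationEvents(['if x:', '    a', '        b', 'c']);
console.log(indentEvents.join(','));
```

Note that returning to column 0 from two nested levels yields two dedent events, one for each stack entry that is popped.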

`Readonly` indentTokenType

indentTokenType: TokenType

The token type to be used for indentation tokens.

`Readonly` options

options: IndentationTokenBuilderOptions<Terminals, KeywordName>

`Protected` whitespaceRegExp

whitespaceRegExp: RegExp = ...

A regular expression to match a series of tabs and/or spaces. Override this to customize what the indentation is allowed to consist of.
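The following sketch shows how a regular expression of this kind can be applied at a specific offset. The pattern `[ \t]+` with the sticky flag is an assumption about the default value (shown only as `...` here), and `matchIndentationAt` is a hypothetical helper, not part of the class.

```typescript
// Assumed pattern: one or more spaces/tabs. The sticky flag ('y') makes exec()
// match only at lastIndex instead of searching forward in the text.
const indentationPattern = new RegExp('[ \\t]+', 'y');

// Hypothetical helper: returns the whitespace run starting exactly at `offset`, or null.
function matchIndentationAt(text: string, offset: number): string | null {
    indentationPattern.lastIndex = offset;
    const match = indentationPattern.exec(text);
    return match ? match[0] : null;
}

const sampleText = 'a\n  \tb';
const atLineStart = matchIndentationAt(sampleText, 2); // two spaces and a tab
const atLetter = matchIndentationAt(sampleText, 0);    // null: 'a' is not whitespace
console.log(JSON.stringify(atLineStart), atLetter);
```

A subclass that wanted to allow only spaces could, under the same assumption, override the property with a pattern that omits `\t`.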

Methods

`Protected` buildKeywordPattern

buildKeywordPattern(keyword, caseInsensitive): TokenPattern
Parameters
- keyword: GrammarAST.Keyword
- caseInsensitive: boolean
Returns TokenPattern
Inherited from DefaultTokenBuilder.buildKeywordPattern
- Defined in packages/langium/src/parser/token-builder.ts:145

`Protected` buildKeywordToken

buildKeywordToken(keyword, terminalTokens, caseInsensitive): TokenType
Parameters
- keyword: GrammarAST.Keyword
- terminalTokens: TokenType[]
- caseInsensitive: boolean
Returns TokenType
Inherited from DefaultTokenBuilder.buildKeywordToken
- Defined in packages/langium/src/parser/token-builder.ts:130

`Protected` buildKeywordTokens

buildKeywordTokens(rules, terminalTokens, options?): TokenType[]
Parameters
- rules: Stream<GrammarAST.AbstractRule>
- terminalTokens: TokenType[]
- `Optional` options: TokenBuilderOptions
Returns TokenType[]
Inherited from DefaultTokenBuilder.buildKeywordTokens
- Defined in packages/langium/src/parser/token-builder.ts:119

`Protected` buildTerminalToken

buildTerminalToken(terminal): TokenType
Parameters
- terminal: GrammarAST.TerminalRule
Returns TokenType
Overrides DefaultTokenBuilder.buildTerminalToken
- Defined in packages/langium/src/parser/indentation-aware.ts:332

`Protected` buildTerminalTokens

buildTerminalTokens(rules): TokenType[]
Parameters
- rules: Stream<GrammarAST.AbstractRule>
Returns TokenType[]
Inherited from DefaultTokenBuilder.buildTerminalTokens
- Defined in packages/langium/src/parser/token-builder.ts:76

buildTokens

buildTokens(grammar, options?): TokenVocabulary
Parameters
- grammar: Grammar
- `Optional` options: TokenBuilderOptions
Returns TokenVocabulary
Overrides DefaultTokenBuilder.buildTokens
- Defined in packages/langium/src/parser/indentation-aware.ts:132

`Protected` createIndentationTokenInstance

createIndentationTokenInstance(tokenType, text, image, offset): IToken
Helper function to create an instance of an indentation token.
Parameters
- tokenType: TokenType
  Indent or dedent token type
- text: string
  Full input string, used to calculate the line number
- image: string
  The original image of the token (tabs or spaces)
- offset: number
  Current position in the input string
Returns IToken
The indentation token instance
- Defined in packages/langium/src/parser/indentation-aware.ts:230

`Protected` dedentMatcher

dedentMatcher(text, offset, tokens, groups): null | RegExpExecArray | CustomPatternMatcherReturn
A custom pattern for matching dedents
Parameters
- text: string
  The full input string.
- offset: number
  The offset at which to attempt a match
- tokens: IToken[]
  Previously scanned tokens
- groups: Record<string, IToken[]>
  Token Groups
Returns null | RegExpExecArray | CustomPatternMatcherReturn
- Defined in packages/langium/src/parser/indentation-aware.ts:286

`Protected` findLongerAlt

findLongerAlt(keyword, terminalTokens): TokenType[]
Parameters
- keyword: GrammarAST.Keyword
- terminalTokens: TokenType[]
Returns TokenType[]
Inherited from DefaultTokenBuilder.findLongerAlt
- Defined in packages/langium/src/parser/token-builder.ts:151

flushLexingReport

flushLexingReport(text): IndentationLexingReport
Produces a lexing report for the given text that was just tokenized using the tokens provided by this builder.
Parameters
- text: string
  The text that was tokenized.
Returns IndentationLexingReport
Overrides DefaultTokenBuilder.flushLexingReport
- Defined in packages/langium/src/parser/indentation-aware.ts:182

flushRemainingDedents

flushRemainingDedents(text): IToken[]
Resets the indentation stack between different runs of the lexer and returns the dedent tokens that are still outstanding.
Parameters
- text: string
  Full text that was tokenized
Returns IToken[]
Remaining dedent tokens to match all previous indents at the end of the file
- Defined in packages/langium/src/parser/indentation-aware.ts:356
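As a rough illustration of this behavior, the sketch below emits one dedent per indentation level that is still open and then resets the stack. The `DedentFlusher` class and the `SketchToken` shape are hypothetical stand-ins; placing the tokens at the end of the text with an empty image is an assumption of this sketch.

```typescript
// Hypothetical, simplified token shape (the real method returns IToken instances).
interface SketchToken {
    type: 'DEDENT';
    image: string;
    startOffset: number;
}

// Hypothetical stand-in for the builder, holding only the indentation stack.
class DedentFlusher {
    stack: number[] = [0];

    flush(text: string): SketchToken[] {
        const remaining: SketchToken[] = [];
        // Every entry above the base level is an indent that was never closed.
        while (this.stack.length > 1) {
            remaining.push({ type: 'DEDENT', image: '', startOffset: text.length });
            this.stack.pop();
        }
        // Start the next lexer run from a clean state.
        this.stack = [0];
        return remaining;
    }
}

const flusher = new DedentFlusher();
flusher.stack = [0, 4, 8]; // state after lexing two nested blocks
const flushedText = 'if x:\n    a\n        b';
const flushed = flusher.flush(flushedText);
console.log(flushed.length, flusher.stack);
```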

`Protected` getLineNumber

getLineNumber(text, offset): number
Helper function to get the line number at a given offset.
Parameters
- text: string
  Full input string, used to calculate the line number
- offset: number
  Current position in the input string
Returns number
The line number at the given offset
- Defined in packages/langium/src/parser/indentation-aware.ts:248
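One possible way to compute such a line number is to count the line breaks that precede the offset. The function `lineNumberAt` below is a hypothetical sketch; that line numbers are 1-based and that `\r\n`, `\r`, and `\n` all count as line breaks are assumptions.

```typescript
// Hypothetical helper: 1-based line number of the character at `offset`.
function lineNumberAt(text: string, offset: number): number {
    // Splitting the prefix on line breaks yields one segment per line seen so far.
    return text.substring(0, offset).split(/\r\n|\r|\n/).length;
}

const numberedText = 'a\nb\r\nc';
console.log(lineNumberAt(numberedText, 0)); // offset of 'a'
console.log(lineNumberAt(numberedText, 2)); // offset of 'b'
console.log(lineNumberAt(numberedText, 5)); // offset of 'c'
```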

`Protected` indentMatcher

indentMatcher(text, offset, tokens, groups): null | RegExpExecArray | CustomPatternMatcherReturn
A custom pattern for matching indents
Parameters
- text: string
  The full input string.
- offset: number
  The offset at which to attempt a match
- tokens: IToken[]
  Previously scanned tokens
- groups: Record<string, IToken[]>
  Token Groups
Returns null | RegExpExecArray | CustomPatternMatcherReturn
- Defined in packages/langium/src/parser/indentation-aware.ts:260

`Protected` isStartOfLine

isStartOfLine(text, offset): boolean
Helper function to check if the current position is the start of a new line.
Parameters
- text: string
  The full input string.
- offset: number
  The current position at which to check
Returns boolean
Whether the current position is the start of a new line
- Defined in packages/langium/src/parser/indentation-aware.ts:197
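A check of this kind can be sketched by looking at the character just before the offset. `startsLine` is a hypothetical function; treating offset 0 as the start of a line and accepting both `\n` and `\r` as line terminators are assumptions.

```typescript
// Hypothetical helper: true if `offset` is the first position of a line.
function startsLine(text: string, offset: number): boolean {
    if (offset === 0) return true; // the very beginning of the input
    const previous = text.charAt(offset - 1);
    return previous === '\n' || previous === '\r';
}

const lineText = 'a\n  b';
console.log(startsLine(lineText, 0)); // beginning of the input
console.log(startsLine(lineText, 2)); // right after the line break
console.log(startsLine(lineText, 4)); // in the middle of the second line
```

Indentation only carries meaning at such positions, which is why whitespace between two tokens on the same line would not be treated as an indent in this sketch.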

`Protected` matchWhitespace

matchWhitespace(text, offset, tokens, groups): {
    currIndentLevel: number;
    match: null | RegExpExecArray;
    prevIndentLevel: number;
}
A helper function used in matching both indents and dedents.
Parameters
- text: string
  The full input string.
- offset: number
  The current position at which to attempt a match
- tokens: IToken[]
  Previously scanned tokens
- groups: Record<string, IToken[]>
  Token Groups
Returns {
    currIndentLevel: number;
    match: null | RegExpExecArray;
    prevIndentLevel: number;
}
The current and previous indentation levels and the matched whitespace
- currIndentLevel: number
- match: null | RegExpExecArray
- prevIndentLevel: number
- Defined in packages/langium/src/parser/indentation-aware.ts:211
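The sketch below shows how the three returned values could be derived and then compared to tell an indent from a dedent. It is a simplified, hypothetical version: `matchWhitespaceSketch` takes the stack as an explicit argument, ignores the `tokens` and `groups` parameters, and assumes a sticky `[ \t]+` pattern and a level equal to the length of the matched whitespace.

```typescript
interface WhitespaceMatch {
    currIndentLevel: number;
    prevIndentLevel: number;
    match: RegExpExecArray | null;
}

// Hypothetical helper mirroring the documented return shape.
function matchWhitespaceSketch(text: string, offset: number, stack: number[]): WhitespaceMatch {
    const pattern = new RegExp('[ \\t]+', 'y'); // assumed whitespace pattern
    pattern.lastIndex = offset;
    const match = pattern.exec(text);
    return {
        currIndentLevel: match ? match[0].length : 0, // no whitespace means level 0
        prevIndentLevel: stack[stack.length - 1] ?? 0, // top of the indentation stack
        match
    };
}

// Offset 6 is the first character after the line break in this text.
const wsResult = matchWhitespaceSketch('if x:\n    a', 6, [0]);
const classification =
    wsResult.currIndentLevel > wsResult.prevIndentLevel ? 'indent'
    : wsResult.currIndentLevel < wsResult.prevIndentLevel ? 'dedent'
    : 'same';
console.log(wsResult.currIndentLevel, wsResult.prevIndentLevel, classification);
```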

`Protected` popDiagnostics

popDiagnostics(): LexingDiagnostic[]
Returns LexingDiagnostic[]
Inherited from DefaultTokenBuilder.popDiagnostics
- Defined in packages/langium/src/parser/token-builder.ts:70

`Protected` regexPatternFunction

regexPatternFunction(regex): CustomPatternMatcherFunc
Parameters
- regex: RegExp
Returns CustomPatternMatcherFunc
Inherited from DefaultTokenBuilder.regexPatternFunction
- Defined in packages/langium/src/parser/token-builder.ts:110

`Protected` requiresCustomPattern

requiresCustomPattern(regex): boolean
Parameters
- regex: RegExp
Returns boolean
Inherited from DefaultTokenBuilder.requiresCustomPattern
- Defined in packages/langium/src/parser/token-builder.ts:98
