Unit PasDoc_Tokenizer

Description

Simple Pascal tokenizer.

The TTokenizer object creates TToken objects (tokens) for the Pascal programming language from a character input stream.

The PasDoc_Scanner unit does the same (it actually uses this unit's tokenizer), except that it also evaluates compiler directives, which are comments whose content starts with a dollar sign.

Source: source/component/PasDoc_Tokenizer.pas (line 38).

Uses

Overview

Classes, Interfaces, Objects and Records

Name              Description
Class TToken      Stores the exact type and additional information on one token.
Class TTokenizer  Converts an input TStream to a sequence of TToken objects.

Functions and Procedures

function StandardDirectiveByName(const Name: string): TStandardDirective;
function KeyWordByName(const Name: string): TKeyword;

Types

TTokenType = (...);
TKeyword = (...);
TStandardDirective = (...);
TStandardDirectives = set of TStandardDirective;
TSymbolType = (...);

Constants

TOKEN_TYPE_NAMES: array[TTokenType] of string = ( 'whitespace', 'comment ((**)-style)', 'comment ({}-style)', 'comment (///-style)', 'comment (//-style)', 'identifier', 'number', 'string', 'symbol', 'directive', 'reserved word', 'AT&T assembler register name');
TokenCommentTypes: set of TTokenType = [ TOK_COMMENT_PAS, TOK_COMMENT_EXT, TOK_COMMENT_HELPINSIGHT, TOK_COMMENT_CSTYLE ];
SymbolNames: array[TSymbolType] of string = ( '+', '-', '*', '/', '=', '<', '<=', '>', '>=', '[', ']', ',', '(', ')', ':', ';', '^', '.', '@', '$', ':=', '..', '**', '\' );
KeyWordArray: array[Low(TKeyword)..High(TKeyword)] of string = ('x', 'AND', 'ARRAY', 'AS', 'ASM', 'BEGIN', 'CASE', 'CLASS', 'OBJCCLASS', 'CONST', 'CONSTRUCTOR', 'DESTRUCTOR', 'DISPINTERFACE', 'DIV', 'DO', 'DOWNTO', 'ELSE', 'END', 'EXCEPT', 'EXPORTS', 'FILE', 'FINALIZATION', 'FINALLY', 'FOR', 'FUNCTION', 'GOTO', 'IF', 'IMPLEMENTATION', 'IN', 'INHERITED', 'INITIALIZATION', 'INLINE', 'INTERFACE', 'IS', 'LABEL', 'LIBRARY', 'MOD', 'NIL', 'NOT', 'OBJECT', 'OF', 'ON', 'OR', 'PACKED', 'PROCEDURE', 'PROGRAM', 'PROPERTY', 'RAISE', 'RECORD', 'REPEAT', 'RESOURCESTRING', 'SET', 'SHL', 'SHR', 'STRING', 'THEN', 'THREADVAR', 'TO', 'TRY', 'TYPE', 'UNIT', 'UNTIL', 'USES', 'VAR', 'WHILE', 'WITH', 'XOR', 'OUT');
StandardDirectiveArray: array[Low(TStandardDirective)..High(TStandardDirective)] of PChar = ('x', 'ABSOLUTE', 'ABSTRACT', 'APIENTRY', 'ASSEMBLER', 'AUTOMATED', 'CDECL', 'CVAR', 'DEFAULT', 'DISPID', 'DYNAMIC', 'EXPERIMENTAL', 'EXPORT', 'EXTERNAL', 'FAR', 'FORWARD', 'GENERIC', 'HELPER', 'INDEX', 'INLINE', 'MESSAGE', 'NAME', 'NEAR', 'NODEFAULT', 'OPERATOR', 'OUT', 'OVERLOAD', 'OVERRIDE', 'PASCAL', 'PRIVATE', 'PROTECTED', 'PUBLIC', 'PUBLISHED', 'READ', 'REFERENCE', 'REGISTER', 'REINTRODUCE', 'RESIDENT', 'SEALED', 'SPECIALIZE', 'STATIC', 'STDCALL', 'STORED', 'STRICT', 'VIRTUAL', 'WRITE', 'DEPRECATED', 'SAFECALL', 'PLATFORM', 'VARARGS', 'FINAL');

Description

Functions and Procedures

function StandardDirectiveByName(const Name: string): TStandardDirective;

Checks whether Name (case ignored) is a Pascal standard directive. Returns SD_INVALIDSTANDARDDIRECTIVE if not.

Source: source/component/PasDoc_Tokenizer.pas (line 463).

function KeyWordByName(const Name: string): TKeyword;

Checks whether Name (case ignored) is a Pascal keyword. Returns KEY_INVALIDKEYWORD if not.

Source: source/component/PasDoc_Tokenizer.pas (line 467).
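
The lookup pattern described above (a case-insensitive search over a name array, with a sentinel value for "not found") can be sketched as follows. This is a minimal illustration, not the unit's actual implementation: TDemoKeyword, DemoKeywordNames and DemoKeywordByName are hypothetical names, and the array holds only a small subset of the real keywords.

```pascal
program KeywordLookupSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

type
  { Hypothetical, reduced stand-in for TKeyword (illustration only). }
  TDemoKeyword = (DK_INVALID, DK_BEGIN, DK_END, DK_UNIT);

const
  { The first entry is a placeholder for the invalid value,
    mirroring the 'x' entry at the start of KeyWordArray. }
  DemoKeywordNames: array[TDemoKeyword] of string =
    ('x', 'BEGIN', 'END', 'UNIT');

{ Returns the keyword whose name matches Name, ignoring case,
  or DK_INVALID when no entry matches. Skipping the placeholder
  entry is an assumption of this sketch. }
function DemoKeywordByName(const Name: string): TDemoKeyword;
var
  K: TDemoKeyword;
  UpperName: string;
begin
  Result := DK_INVALID;
  UpperName := UpperCase(Name);
  for K := Succ(Low(TDemoKeyword)) to High(TDemoKeyword) do
    if DemoKeywordNames[K] = UpperName then
    begin
      Result := K;
      Exit;
    end;
end;

begin
  { Each line prints whether the stated expectation holds. }
  WriteLn(DemoKeywordByName('Begin') = DK_BEGIN);
  WriteLn(DemoKeywordByName('unit') = DK_UNIT);
  WriteLn(DemoKeywordByName('foo') = DK_INVALID);
end.
```

With the real unit, the equivalent calls would be KeyWordByName and StandardDirectiveByName, compared against KEY_INVALIDKEYWORD and SD_INVALIDSTANDARDDIRECTIVE respectively.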

Types

TTokenType = (...);

Enumeration type that provides all types of tokens; each value's name starts with TOK_.

TOK_DIRECTIVE is a compiler directive (like $ifdef, $define).

Note that the tokenizer cannot tell whether a standard directive (e.g. 'Register') is used as an identifier (e.g. you are declaring a procedure named 'Register') or as a real standard directive (e.g. the calling specifier 'register'). For this reason there is no value like TOK_STANDARD_DIRECTIVE here; standard directives are always reported as TOK_IDENTIFIER. Check TToken.Info.StandardDirective to find out whether an identifier may be in use as a real standard directive.

Values
  • TOK_WHITESPACE
  • TOK_COMMENT_PAS
  • TOK_COMMENT_EXT
  • TOK_COMMENT_HELPINSIGHT
  • TOK_COMMENT_CSTYLE
  • TOK_IDENTIFIER
  • TOK_NUMBER
  • TOK_STRING
  • TOK_SYMBOL
  • TOK_DIRECTIVE
  • TOK_KEYWORD
  • TOK_ATT_ASSEMBLER_REGISTER

Source: source/component/PasDoc_Tokenizer.pas (line 67).
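
The note on standard directives can be illustrated with a small self-contained sketch. It assumes a simplified token record: TDemoToken, TDemoTokenType, TDemoDirective and the two helper functions are hypothetical stand-ins, not the real TToken class or its Info field. The point is only that the token type stays "identifier" while a separate field records which directive the text might be.

```pascal
program DirectiveAmbiguitySketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

type
  { Hypothetical, reduced stand-ins for TTokenType and TStandardDirective. }
  TDemoTokenType = (DT_IDENTIFIER, DT_KEYWORD);
  TDemoDirective = (DD_INVALID, DD_REGISTER, DD_CDECL);

  { Hypothetical token record used only in this sketch. }
  TDemoToken = record
    TokenType: TDemoTokenType;
    Data: string;
    Directive: TDemoDirective;
  end;

{ Case-insensitive directive lookup over a two-entry subset. }
function DemoDirectiveByName(const Name: string): TDemoDirective;
begin
  if SameText(Name, 'register') then
    Result := DD_REGISTER
  else if SameText(Name, 'cdecl') then
    Result := DD_CDECL
  else
    Result := DD_INVALID;
end;

{ Builds an identifier token and records which directive, if any,
  its text could stand for. The token type is never changed. }
function MakeIdentifier(const Data: string): TDemoToken;
begin
  Result.TokenType := DT_IDENTIFIER;
  Result.Data := Data;
  Result.Directive := DemoDirectiveByName(Data);
end;

var
  T: TDemoToken;
begin
  { 'Register' may be a procedure name or a calling specifier;
    the token alone cannot decide, so both facts are kept. }
  T := MakeIdentifier('Register');
  WriteLn(T.TokenType = DT_IDENTIFIER);
  WriteLn(T.Directive = DD_REGISTER);

  { An ordinary identifier carries the invalid sentinel. }
  T := MakeIdentifier('Counter');
  WriteLn(T.Directive = DD_INVALID);
end.
```

A parser that consumes such tokens would decide from context (for example, the position after a routine header) whether the directive reading applies.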

TKeyword = (...);

This item has no description.

Values
  • KEY_INVALIDKEYWORD
  • KEY_AND
  • KEY_ARRAY
  • KEY_AS
  • KEY_ASM
  • KEY_BEGIN
  • KEY_CASE
  • KEY_CLASS
  • KEY_OBJCCLASS
  • KEY_CONST
  • KEY_CONSTRUCTOR
  • KEY_DESTRUCTOR
  • KEY_DISPINTERFACE
  • KEY_DIV
  • KEY_DO
  • KEY_DOWNTO
  • KEY_ELSE
  • KEY_END
  • KEY_EXCEPT
  • KEY_EXPORTS
  • KEY_FILE
  • KEY_FINALIZATION
  • KEY_FINALLY
  • KEY_FOR
  • KEY_FUNCTION
  • KEY_GOTO
  • KEY_IF
  • KEY_IMPLEMENTATION
  • KEY_IN
  • KEY_INHERITED
  • KEY_INITIALIZATION
  • KEY_INLINE
  • KEY_INTERFACE
  • KEY_IS
  • KEY_LABEL
  • KEY_LIBRARY
  • KEY_MOD
  • KEY_NIL
  • KEY_NOT
  • KEY_OBJECT
  • KEY_OF
  • KEY_ON
  • KEY_OR
  • KEY_PACKED
  • KEY_PROCEDURE
  • KEY_PROGRAM
  • KEY_PROPERTY
  • KEY_RAISE
  • KEY_RECORD
  • KEY_REPEAT
  • KEY_RESOURCESTRING
  • KEY_SET
  • KEY_SHL
  • KEY_SHR
  • KEY_STRING
  • KEY_THEN
  • KEY_THREADVAR
  • KEY_TO
  • KEY_TRY
  • KEY_TYPE
  • KEY_UNIT
  • KEY_UNTIL
  • KEY_USES
  • KEY_VAR
  • KEY_WHILE
  • KEY_WITH
  • KEY_XOR
  • KEY_OUT

Source: source/component/PasDoc_Tokenizer.pas (line 74).

TStandardDirective = (...);

This item has no description.

Values
  • SD_INVALIDSTANDARDDIRECTIVE
  • SD_ABSOLUTE
  • SD_ABSTRACT
  • SD_APIENTRY
  • SD_ASSEMBLER
  • SD_AUTOMATED
  • SD_CDECL
  • SD_CVAR
  • SD_DEFAULT
  • SD_DISPID
  • SD_DYNAMIC
  • SD_EXPERIMENTAL
  • SD_EXPORT
  • SD_EXTERNAL
  • SD_FAR
  • SD_FORWARD
  • SD_GENERIC
  • SD_HELPER
  • SD_INDEX
  • SD_INLINE
  • SD_MESSAGE
  • SD_NAME
  • SD_NEAR
  • SD_NODEFAULT
  • SD_OPERATOR
  • SD_OUT
  • SD_OVERLOAD
  • SD_OVERRIDE
  • SD_PASCAL
  • SD_PRIVATE
  • SD_PROTECTED
  • SD_PUBLIC
  • SD_PUBLISHED
  • SD_READ
  • SD_REFERENCE
  • SD_REGISTER
  • SD_REINTRODUCE
  • SD_RESIDENT
  • SD_SEALED
  • SD_SPECIALIZE
  • SD_STATIC
  • SD_STDCALL
  • SD_STORED
  • SD_STRICT
  • SD_VIRTUAL
  • SD_WRITE
  • SD_DEPRECATED
  • SD_SAFECALL
  • SD_PLATFORM
  • SD_VARARGS
  • SD_FINAL

Source: source/component/PasDoc_Tokenizer.pas (line 145).

TStandardDirectives = set of TStandardDirective;

This item has no description.

Source: source/component/PasDoc_Tokenizer.pas (line 199).

TSymbolType = (...);

Enumeration type that provides all types of symbols; each value's name starts with SYM_.

Values
  • SYM_PLUS
  • SYM_MINUS
  • SYM_ASTERISK
  • SYM_SLASH
  • SYM_EQUAL
  • SYM_LESS_THAN
  • SYM_LESS_THAN_EQUAL
  • SYM_GREATER_THAN
  • SYM_GREATER_THAN_EQUAL
  • SYM_LEFT_BRACKET
  • SYM_RIGHT_BRACKET
  • SYM_COMMA
  • SYM_LEFT_PARENTHESIS
  • SYM_RIGHT_PARENTHESIS
  • SYM_COLON
  • SYM_SEMICOLON
  • SYM_DEREFERENCE
  • SYM_PERIOD
  • SYM_AT
  • SYM_DOLLAR
  • SYM_ASSIGN
  • SYM_RANGE
  • SYM_POWER
  • SYM_BACKSLASH: May occur when writing the char constant "^\"; see ../../tests/ok_caret_character.pas

Source: source/component/PasDoc_Tokenizer.pas (line 219).

Constants

TOKEN_TYPE_NAMES: array[TTokenType] of string = ( 'whitespace', 'comment ((**)-style)', 'comment ({}-style)', 'comment (///-style)', 'comment (//-style)', 'identifier', 'number', 'string', 'symbol', 'directive', 'reserved word', 'AT&T assembler register name');

Names of the token types. All start with a lowercase letter. Each should describe the given TTokenType in a few short words.

Source: source/component/PasDoc_Tokenizer.pas (line 205).

TokenCommentTypes: set of TTokenType = [ TOK_COMMENT_PAS, TOK_COMMENT_EXT, TOK_COMMENT_HELPINSIGHT, TOK_COMMENT_CSTYLE ];

This item has no description.

Source: source/component/PasDoc_Tokenizer.pas (line 211).
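
A set constant such as TokenCommentTypes is typically used for a membership test with the in operator. The sketch below shows that usage on a reduced, hypothetical enumeration (TDemoTokenType and DemoCommentTypes are illustration names, not part of the unit).

```pascal
program CommentSetSketch;
{$mode objfpc}{$H+}

type
  { Hypothetical, reduced stand-in for TTokenType. }
  TDemoTokenType = (DT_WHITESPACE, DT_COMMENT_PAS, DT_COMMENT_EXT,
    DT_COMMENT_HELPINSIGHT, DT_COMMENT_CSTYLE, DT_IDENTIFIER);

const
  { Mirrors the shape of TokenCommentTypes: every comment-like type. }
  DemoCommentTypes: set of TDemoTokenType =
    [DT_COMMENT_PAS, DT_COMMENT_EXT, DT_COMMENT_HELPINSIGHT,
     DT_COMMENT_CSTYLE];

{ True when the token type is any kind of comment. }
function IsComment(T: TDemoTokenType): Boolean;
begin
  Result := T in DemoCommentTypes;
end;

begin
  WriteLn(IsComment(DT_COMMENT_CSTYLE));
  WriteLn(IsComment(DT_IDENTIFIER));
end.
```

The set form keeps a single test at each call site even if further comment styles are added to the enumeration later.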

SymbolNames: array[TSymbolType] of string = ( '+', '-', '*', '/', '=', '<', '<=', '>', '>=', '[', ']', ',', '(', ')', ':', ';', '^', '.', '@', '$', ':=', '..', '**', '\' );

Symbols as strings. This array is useful as a mapping from TSymbolType to string, but remember that some symbols in the tokenizer have more than one possible representation; e.g. "right bracket" is usually written as "]" but can also be written as ".)".

Source: source/component/PasDoc_Tokenizer.pas (line 235).
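
Because one symbol can have several spellings, the name array gives a canonical string per symbol but cannot be inverted naively. The following sketch shows one way to handle this, under stated assumptions: TDemoSymbol, DemoSymbolNames and TryParseSymbol are hypothetical, the enumeration is a three-value subset, and only the ".)" alternative mentioned above is handled.

```pascal
program SymbolNameSketch;
{$mode objfpc}{$H+}

type
  { Hypothetical, reduced stand-in for TSymbolType. }
  TDemoSymbol = (DS_LEFT_BRACKET, DS_RIGHT_BRACKET, DS_ASSIGN);

const
  { Canonical spelling of each symbol, in the style of SymbolNames. }
  DemoSymbolNames: array[TDemoSymbol] of string = ('[', ']', ':=');

{ Maps source text to a symbol. The alternative spelling ".)" is
  accepted for the right bracket. Returns False, leaving Sym
  unassigned, when the text is not a known symbol. }
function TryParseSymbol(const S: string; out Sym: TDemoSymbol): Boolean;
var
  Candidate: TDemoSymbol;
begin
  Result := True;
  if S = '.)' then
  begin
    Sym := DS_RIGHT_BRACKET;
    Exit;
  end;
  for Candidate := Low(TDemoSymbol) to High(TDemoSymbol) do
    if DemoSymbolNames[Candidate] = S then
    begin
      Sym := Candidate;
      Exit;
    end;
  Result := False;
end;

var
  Sym: TDemoSymbol;
begin
  { The alternative spelling maps back to the canonical name. }
  if TryParseSymbol('.)', Sym) then
    WriteLn(DemoSymbolNames[Sym]);
  WriteLn(TryParseSymbol('??', Sym));
end.
```

Converting a symbol to text and back is therefore not guaranteed to reproduce the original source spelling; it yields the canonical one.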

KeyWordArray: array[Low(TKeyword)..High(TKeyword)] of string = ('x', 'AND', 'ARRAY', 'AS', 'ASM', 'BEGIN', 'CASE', 'CLASS', 'OBJCCLASS', 'CONST', 'CONSTRUCTOR', 'DESTRUCTOR', 'DISPINTERFACE', 'DIV', 'DO', 'DOWNTO', 'ELSE', 'END', 'EXCEPT', 'EXPORTS', 'FILE', 'FINALIZATION', 'FINALLY', 'FOR', 'FUNCTION', 'GOTO', 'IF', 'IMPLEMENTATION', 'IN', 'INHERITED', 'INITIALIZATION', 'INLINE', 'INTERFACE', 'IS', 'LABEL', 'LIBRARY', 'MOD', 'NIL', 'NOT', 'OBJECT', 'OF', 'ON', 'OR', 'PACKED', 'PROCEDURE', 'PROGRAM', 'PROPERTY', 'RAISE', 'RECORD', 'REPEAT', 'RESOURCESTRING', 'SET', 'SHL', 'SHR', 'STRING', 'THEN', 'THREADVAR', 'TO', 'TRY', 'TYPE', 'UNIT', 'UNTIL', 'USES', 'VAR', 'WHILE', 'WITH', 'XOR', 'OUT');

All Object Pascal keywords. The first entry, 'x', is a placeholder at the index of KEY_INVALIDKEYWORD.

Source: source/component/PasDoc_Tokenizer.pas (line 435).

StandardDirectiveArray: array[Low(TStandardDirective)..High(TStandardDirective)] of PChar = ('x', 'ABSOLUTE', 'ABSTRACT', 'APIENTRY', 'ASSEMBLER', 'AUTOMATED', 'CDECL', 'CVAR', 'DEFAULT', 'DISPID', 'DYNAMIC', 'EXPERIMENTAL', 'EXPORT', 'EXTERNAL', 'FAR', 'FORWARD', 'GENERIC', 'HELPER', 'INDEX', 'INLINE', 'MESSAGE', 'NAME', 'NEAR', 'NODEFAULT', 'OPERATOR', 'OUT', 'OVERLOAD', 'OVERRIDE', 'PASCAL', 'PRIVATE', 'PROTECTED', 'PUBLIC', 'PUBLISHED', 'READ', 'REFERENCE', 'REGISTER', 'REINTRODUCE', 'RESIDENT', 'SEALED', 'SPECIALIZE', 'STATIC', 'STDCALL', 'STORED', 'STRICT', 'VIRTUAL', 'WRITE', 'DEPRECATED', 'SAFECALL', 'PLATFORM', 'VARARGS', 'FINAL');

All Object Pascal standard directives. The first entry, 'x', is a placeholder at the index of SD_INVALIDSTANDARDDIRECTIVE.

Source: source/component/PasDoc_Tokenizer.pas (line 449).

Authors


Generated by PasDoc 0.17.0.snapshot.