Tokenizer class | tok.t[84] |
class Tokenizer : object
deleteRule
deleteRuleAt
insertRule
insertRuleAt
tokCvtLower
tokCvtSkip
tokenize
rules_ | tok.t[123] |
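Each rule in rules_ combines a name, a regular expression to match, a token type, a value computation rule, and a value test rule. The name of a rule is just an arbitrary string that identifies the rule; it can be used to insert new rules in order relative to known existing rules, or to delete known existing rules.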
If the value computation rule is nil, we'll just use the matching text as the token value. If the value rule is a string, we'll use the string as a replacement pattern (with rexReplace). If it's a property ID, we'll invoke the property of self with the following arguments:
txt, typ, toks
'txt' is the matched text; 'typ' is the token type from the rule; and 'toks' is a vector to which the new token or tokens are to be added. The routine is responsible for adding the appropriate values to the result list. Note that the routine can add more than one token to the results if desired.
If the value test rule is non-nil, it must be either a method or a function; we'll call the method or function to test to see if the matched value is valid. We'll call the method (on self) with the matching text as the argument; if the method returns true, the rule matches, otherwise the rule fails, and we'll continue looking for another rule as though we hadn't matched the rule's regular expression in the first place. This can be used for rules that require more than a simple regular expression match; for example, the value test can be used to look up the match in a dictionary, so that the rule only matches tokens that are defined in the dictionary.
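As a rough illustration of the rule semantics above, the following Python sketch models how a rule's value computation and value test might interact. It is not the tok.t implementation: the class name TokenizerSketch, the TokType values, the sample rules, and the ValueError raised on unmatched text are all assumptions made for the example, and Python's re.sub stands in for rexReplace.

```python
import re
from enum import Enum, auto


class TokType(Enum):
    # Hypothetical token types, invented for this sketch.
    WORD = auto()
    UNKNOWN = auto()
    INT = auto()
    PUNCT = auto()


class TokenizerSketch:
    """Illustrative model of the rule semantics; not the library's code."""

    def __init__(self, dictionary):
        self.dictionary = dictionary
        # Each rule: (name, pattern, token type, value rule, value test).
        self.rules = [
            ("whitespace", re.compile(r"\s+"), None, self.cvt_skip, None),
            # Value test: only matches words found in the dictionary.
            ("dict-word", re.compile(r"[A-Za-z]+"), TokType.WORD,
             self.cvt_lower, self.in_dictionary),
            # Fallback for words that fail the dictionary test.
            ("unknown-word", re.compile(r"[A-Za-z]+"), TokType.UNKNOWN,
             None, None),
            # String value rule: used as a replacement pattern.
            ("ampersand", re.compile(r"&"), TokType.WORD, "and", None),
            ("integer", re.compile(r"[0-9]+"), TokType.INT, None, None),
            ("punct", re.compile(r"[.,;:?!]"), TokType.PUNCT, None, None),
        ]

    def in_dictionary(self, txt):
        return txt.lower() in self.dictionary

    def cvt_lower(self, txt, typ, toks):
        # The routine itself adds the token: lower-cased value, original kept.
        toks.append([txt.lower(), typ, txt])

    def cvt_skip(self, txt, typ, toks):
        # Adds nothing, so the matched text is dropped from the result.
        pass

    def tokenize(self, s):
        toks = []
        pos = 0
        while pos < len(s):
            for name, pat, typ, val_rule, val_test in self.rules:
                m = pat.match(s, pos)
                if m is None or m.end() == pos:
                    continue
                txt = m.group(0)
                # A failed value test behaves as if the pattern never matched.
                if val_test is not None and not val_test(txt):
                    continue
                if val_rule is None:
                    toks.append([txt, typ, txt])
                elif isinstance(val_rule, str):
                    toks.append([pat.sub(val_rule, txt), typ, txt])
                else:
                    val_rule(txt, typ, toks)
                pos = m.end()
                break
            else:
                # Assumption: the sketch signals unmatched text with an error.
                raise ValueError(f"no rule matches at offset {pos}")
        return toks


tokens = TokenizerSketch({"take", "lamp"}).tokenize("Take LAMP & xyzzy 42!")
```

In this model, "xyzzy" matches the dict-word pattern but fails its value test, so the search continues and the unknown-word rule picks it up. The whitespace rule shows a value routine that adds no tokens at all.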
deleteRule (name) | tok.t[195] |
deleteRuleAt (idx) | tok.t[208] |
insertRule (rule, curName, after) | tok.t[154] |
insertRuleAt (rule, idx) | tok.t[185] |
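The four rule-editing signatures listed above suggest a list of rules addressed either by name or by position. A minimal Python sketch of that idea follows. The RuleList class is hypothetical, the behavior when a name is not found is an assumption, and the indices are Python's 0-based ones, which may not match the convention tok.t uses.

```python
class RuleList:
    """Toy rule list; each rule is a tuple whose first element is its name."""

    def __init__(self, rules):
        self.rules = list(rules)

    def _index_of(self, name):
        for i, rule in enumerate(self.rules):
            if rule[0] == name:
                return i
        return None

    def insert_rule(self, rule, cur_name, after):
        # Insert before or after the rule named cur_name.
        idx = self._index_of(cur_name)
        if idx is None:
            # Assumption: with no such rule, append at the end.
            self.rules.append(rule)
        else:
            self.rules.insert(idx + 1 if after else idx, rule)

    def insert_rule_at(self, rule, idx):
        # Insert before the element currently at idx (0-based in this sketch).
        self.rules.insert(idx, rule)

    def delete_rule(self, name):
        # Assumption: deleting an unknown name is a silent no-op.
        idx = self._index_of(name)
        if idx is not None:
            del self.rules[idx]

    def delete_rule_at(self, idx):
        del self.rules[idx]

    def names(self):
        return [rule[0] for rule in self.rules]


rules = RuleList([("whitespace",), ("word",), ("integer",)])
rules.insert_rule(("contraction",), "word", False)  # before "word"
rules.insert_rule(("punct",), "integer", True)      # after "integer"
rules.delete_rule("whitespace")
```

Position matters if rules are tried in list order, so a more specific rule would normally be placed before the general rule it overlaps with.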
tokCvtLower (txt, typ, toks) | tok.t[215] |
tokCvtSkip (txt, typ, toks) | tok.t[226] |
tokenize (str) | tok.t[248] |
Each token in the result is a list of three elements:
- The first element gives the token's value.
- The second element gives the token type (as a token type enum value).
- The third element gives the original token string, before any conversions or evaluations were performed. For example, this preserves the original case of strings that are lower-cased for the corresponding token values.