cmdTokenizer object | tokens.t[66]
[Required]
cmdTokenizer : Tokenizer
Superclass Tree
cmdTokenizer
    Tokenizer
        object
Property Summary
endAssert
patAlphaDashAlpha
patPunct
patSpelledTens
patSpelledUnits
punctChars
rules_
squote
wordPunct
Method Summary
acceptAbbrTok
buildOrigText
tokCvtAbbr
tokCvtApostropheS
tokCvtSpelledNumber
Inherited from Tokenizer:
deleteRule
deleteRuleAt
insertRule
insertRuleAt
tokCvtLower
tokCvtSkip
tokenize
Property Details
endAssert | tokens.t[199]
patAlphaDashAlpha | tokens.t[258]
patPunct | tokens.t[375]
patSpelledTens | tokens.t[371]
patSpelledUnits | tokens.t[373]
punctChars | tokens.t[196]
rules_ OVERRIDDEN | tokens.t[74]
squote | tokens.t[206]
wordPunct | tokens.t[212]
Method Details
acceptAbbrTok (txt) | tokens.t[270]
buildOrigText (toks) | tokens.t[311]
[Required]
tokCvtAbbr (txt, typ, toks) | tokens.t[290]
When we find an abbreviation, we enter it as two tokens: the abbreviated word minus its trailing period, followed by the period as a separate token. We mark the period as an "abbreviation period" so that grammar rules can consider treating it as part of an abbreviation. Because it is also a regular period, grammar rules that treat periods as ordinary punctuation can still try to match the result. This ensures that we try it both ways, as an abbreviation and as a word followed by punctuation, and pick the one that gives us the best result.
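The splitting described above can be sketched in a few lines. This is an illustrative Python sketch, not the library's TADS 3 source. The `(value, type, original text)` token layout, the `ABBR_PERIOD` tag, and the `"tokWord"` type name are assumptions made for the example.

```python
# Hypothetical type tag marking a period that ended an abbreviation.
ABBR_PERIOD = "tokAbbrPeriod"


def tok_cvt_abbr(txt, typ, toks):
    """Split an abbreviation such as 'Mr.' into a word token and a period token.

    Each token is modeled as a (value, type, original_text) tuple; this
    layout is an assumption for the sketch.
    """
    # The word itself, without the trailing period.
    word = txt[:-1]
    toks.append((word, typ, word))

    # The period as its own token, tagged so that a grammar can read it
    # either as an abbreviation marker or as ordinary punctuation.
    toks.append((".", ABBR_PERIOD, "."))


toks = []
tok_cvt_abbr("Mr.", "tokWord", toks)
print(toks)
```

Keeping the period as a separate, specially tagged token lets later grammar matching consider both readings, rather than forcing the tokenizer to decide early.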
tokCvtApostropheS (txt, typ, toks) | tokens.t[220]
tokCvtSpelledNumber (txt, typ, toks) | tokens.t[244]