Working from the EBNF https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form syntax in the Scala language specification:
upper ::= ‘A’ | ... | ‘Z’ | ‘$’ | ‘_’ and Unicode category Lu
lower ::= ‘a’ | ... | ‘z’ and Unicode category Ll
letter ::= upper | lower and Unicode categories Lo, Lt, Nl
digit ::= ‘0’ | ... | ‘9’
opchar ::= “all other characters in \u0020-007F and Unicode
categories Sm, So except parentheses ([]) and periods”
But also taking into account the very beginning of the Lexical Syntax chapter, which defines:
Parentheses ‘(’ | ‘)’ | ‘[’ | ‘]’ | ‘{’ | ‘}’.
Delimiter characters ‘`’ | ‘'’ | ‘"’ | ‘.’ | ‘;’ | ‘,’
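Here is what I came up with. Working by elimination in the range \u0020-007F, removing letters, digits, parentheses and delimiters, we are left with the following for opchar... (drumroll):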
! # % & * + - / : < = > ? @ \ ^ | ~
and also Sm http://www.fileformat.info/info/unicode/category/Sm/list.htm and So http://www.fileformat.info/info/unicode/category/So/list.htm, except for parentheses and periods.
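The elimination can also be checked mechanically. The following is a minimal sketch, not taken from the spec: the object and member names are made up for illustration, and it assumes that the space (\u0020) and DEL (\u007F) at the ends of the range are not meant to be operator characters, so it only walks '!' to '~'.

```scala
// Sketch: derive the ASCII opchars by elimination (names are hypothetical).
object OpCharsByElimination {
  // Parentheses and delimiters as listed at the start of the Lexical Syntax chapter.
  val parentheses: Set[Char] = "()[]{}".toSet
  val delimiters: Set[Char] = "`'\".;,".toSet

  // '$' and '_' count as upper-case letters in the grammar quoted above.
  def isLetter(c: Char): Boolean = c.isLetter || c == '$' || c == '_'

  // Printable, non-space part of \u0020-007F, minus everything that is
  // a letter, a digit, a parenthesis or a delimiter.
  val asciiOpChars: Seq[Char] =
    ('!' to '~')
      .filterNot(isLetter)
      .filterNot(_.isDigit)
      .filterNot(parentheses)
      .filterNot(delimiters)
}
```

Calling `println(OpCharsByElimination.asciiOpChars.mkString(" "))` prints `! # % & * + - / : < = > ? @ \ ^ | ~`, the same 18 characters as the list above.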
Anyway, here are some valid examples that highlight all the cases. Watch out for \ in the REPL https://en.wikipedia.org/wiki/Read-eval-print_loop; I had to escape it as \\:
val !#%&*+-/:<=>?@\^|~ = 1 // All simple opchars
val simpleName = 1
val withDigitsAndUnderscores_ab_12_ab12 = 1
val wordEndingInOpChars_!#%&*+-/:<=>?@\^|~ = 1
val !^©® = 1 // opchars and symbols
val abcαβγ_!^©® = 1 // Mixing Unicode letters and symbols
Note 1:
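I found this Unicode category index http://www.fileformat.info/info/unicode/category/index.htm to figure out Lu, Ll, Lo, Lt, Nl:
- Lu (uppercase letters)
- Ll (lowercase letters)
- Lo (other letters)
- Lt (titlecase letters)
- Nl (letter numbers, such as Roman numerals)
- Sm (symbol, math)
- So (symbol, other)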
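To see which category a given character falls into without consulting the index, the JVM's `java.lang.Character.getType` can be used. This is only a sketch: the `UnicodeCategory` object and its `of` method are hypothetical names, and the result depends on the Unicode version shipped with your JVM (the characters below are long-standing assignments).

```scala
// Sketch: map a character to one of the category abbreviations listed above.
object UnicodeCategory {
  // Character.* category constants are Bytes, getType returns an Int.
  private val names: Map[Int, String] = Map(
    Character.UPPERCASE_LETTER.toInt -> "Lu",
    Character.LOWERCASE_LETTER.toInt -> "Ll",
    Character.OTHER_LETTER.toInt     -> "Lo",
    Character.TITLECASE_LETTER.toInt -> "Lt",
    Character.LETTER_NUMBER.toInt    -> "Nl",
    Character.MATH_SYMBOL.toInt      -> "Sm",
    Character.OTHER_SYMBOL.toInt     -> "So"
  )

  // Returns "other" for any category that is not in the list above.
  def of(c: Char): String = names.getOrElse(Character.getType(c), "other")
}
```

For example, `UnicodeCategory.of('¬')` gives `"Sm"` and `UnicodeCategory.of('©')` gives `"So"`, while `UnicodeCategory.of('£')` gives `"other"` because £ belongs to Sc, which the opchar rule does not mention.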
Note 2:
val #^ = 1 // legal - two opchars
val # = 1 // illegal - reserved word like class or => or @
val + = 1 // legal - opchar
val &+ = 1 // legal - two opchars
val &2 = 1 // illegal - opchars and digits do not mix arbitrarily
val £2 = 1 // working - £ is part of Sc (Symbol currency) - undefined by spec
val ¬ = 1 // legal - part of Sm
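Most of these cases follow from the identifier productions in the same chapter of the spec (`op ::= opchar {opchar}`, `idrest ::= {letter | digit} [‘_’ op]`): an identifier is either made entirely of opchars, or it starts with a letter and may end in a run of opchars only directly after an underscore. Below is a rough sketch of that rule. `PlainIdSketch` and its helpers are hypothetical names, the reserved set contains only the operator-like reserved words from Note 3 (not keywords such as `class`), and backquoted identifiers are ignored.

```scala
// Sketch of the plain-identifier rule; not the compiler's actual scanner.
object PlainIdSketch {
  private val asciiOpChars: Set[Char] = "!#%&*+-/:<=>?@\\^|~".toSet

  private val letterTypes: Set[Int] = Set(
    Character.UPPERCASE_LETTER, Character.LOWERCASE_LETTER, Character.OTHER_LETTER,
    Character.TITLECASE_LETTER, Character.LETTER_NUMBER
  ).map(_.toInt)

  private val symbolTypes: Set[Int] =
    Set(Character.MATH_SYMBOL, Character.OTHER_SYMBOL).map(_.toInt)

  // Operator-like reserved words only; \u21D2 and \u2190 are the two arrows.
  private val reservedOps: Set[String] =
    Set("_", ":", "=", "=>", "<-", "<:", "<%", ">:", "#", "@", "\u21D2", "\u2190")

  def isLetter(c: Char): Boolean =
    c == '$' || c == '_' || letterTypes(Character.getType(c))

  def isDigit(c: Char): Boolean = c >= '0' && c <= '9'

  def isOpChar(c: Char): Boolean =
    asciiOpChars(c) || (c > '\u007F' && symbolTypes(Character.getType(c)))

  def isPlainId(s: String): Boolean =
    if (s.isEmpty || reservedOps(s)) false
    else if (s.forall(isOpChar)) true // op ::= opchar {opchar}
    else {
      // Leading letters and digits, then an optional opchar tail after '_'.
      val (word, ops) = s.span(c => isLetter(c) || isDigit(c))
      isLetter(s.head) && (ops.isEmpty || (word.last == '_' && ops.forall(isOpChar)))
    }
}
```

With this sketch, `isPlainId("#^")` and `isPlainId("&+")` are true, while `isPlainId("#")` and `isPlainId("&2")` are false. It follows the quoted grammar strictly, so it also rejects `£2`, even though that definition worked in practice.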
Note 3:
Other operator-looking things that are reserved words: _ : = => <- <: <% >: # @
and also \u21D2 ⇒ and \u2190 ←