java - Write a grammar rule name in Unicode [ANTLR 4]


I am still a beginner in ANTLR 4, and I am wondering whether there is a way to write a grammar rule name in Unicode. For example, the following rule works fine:

    atomexp returns [double value]
        : n=number {$value = Double.parseDouble($n.text);}
        | '(' exp=additionexp ')' {$value = $exp.value;}
        ;

However, let's say I want to write the same rule, but instead of the name "atomexp" I want to use the Arabic word "تعبير":

    تعبير returns [double value]
        : n=number {$value = Double.parseDouble($n.text);}
        | '(' exp=additionexp ')' {$value = $exp.value;}
        ;

But when I try to write it this way, I get a "no viable alternative" error. Can anyone help me solve this problem, please? Thanks in advance.

When looking at the lexer grammar of ANTLR 4, you can see that lexer and parser rule names support Unicode characters:

    /** Allow unicode rule/token names */
    ID  :   NameStartChar NameChar*;

    fragment NameChar
        :   NameStartChar
        |   '0'..'9'
        |   '_'
        |   '\u00b7'
        |   '\u0300'..'\u036f'
        |   '\u203f'..'\u2040'
        ;

    fragment NameStartChar
        :   'A'..'Z'
        |   'a'..'z'
        |   '\u00c0'..'\u00d6'
        |   '\u00d8'..'\u00f6'
        |   '\u00f8'..'\u02ff'
        |   '\u0370'..'\u037d'
        |   '\u037f'..'\u1fff'
        |   '\u200c'..'\u200d'
        |   '\u2070'..'\u218f'
        |   '\u2c00'..'\u2fef'
        |   '\u3001'..'\ud7ff'
        |   '\uf900'..'\ufdcf'
        |   '\ufdf0'..'\ufffd'
        ; // ignores | ['\u10000-'\ueffff] ;

    INT : [0-9]+
        ;

But it appears that the ID تعبير does not comply with the NameChar* part of the ID rule.
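
One way to test a name against the ranges quoted above is to check each of its code points by hand. The following is only a minimal sketch: the class and method names (IdNameCheck, matchesId, and so on) are hypothetical, and the ranges are copied from the grammar excerpt above rather than taken from any ANTLR API.

```java
final class IdNameCheck {
    // Inclusive code point ranges copied from the NameStartChar fragment quoted above.
    private static final int[][] NAME_START = {
        {'A', 'Z'}, {'a', 'z'},
        {0x00C0, 0x00D6}, {0x00D8, 0x00F6}, {0x00F8, 0x02FF},
        {0x0370, 0x037D}, {0x037F, 0x1FFF}, {0x200C, 0x200D},
        {0x2070, 0x218F}, {0x2C00, 0x2FEF}, {0x3001, 0xD7FF},
        {0xF900, 0xFDCF}, {0xFDF0, 0xFFFD}
    };

    // Extra ranges that NameChar allows in addition to NameStartChar.
    private static final int[][] NAME_EXTRA = {
        {'0', '9'}, {'_', '_'}, {0x00B7, 0x00B7},
        {0x0300, 0x036F}, {0x203F, 0x2040}
    };

    private static boolean inRanges(int cp, int[][] ranges) {
        for (int[] r : ranges) {
            if (cp >= r[0] && cp <= r[1]) {
                return true;
            }
        }
        return false;
    }

    static boolean isNameStartChar(int cp) {
        return inRanges(cp, NAME_START);
    }

    static boolean isNameChar(int cp) {
        return isNameStartChar(cp) || inRanges(cp, NAME_EXTRA);
    }

    // Mirrors "ID : NameStartChar NameChar*" for a whole string.
    static boolean matchesId(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!isNameStartChar(first)) {
            return false;
        }
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!isNameChar(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        // The second name is the Arabic word from the question, written as escapes
        // so the source file encoding does not matter.
        String[] names = {"atomexp", "\u062A\u0639\u0628\u064A\u0631"};
        for (String name : names) {
            System.out.println(name + " -> " + matchesId(name));
        }
    }
}
```

Running it prints, for each name, whether every code point falls inside the quoted ranges. It only mirrors the character ranges of the ID rule and does not model anything else the ANTLR tool does when it reads a grammar file.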

