Lexical Analysis Libraries
Simple lexical analysis libraries for JavaScript and Python
This is a set of lexical analyzers for tokenizing languages. Currently there are libraries for processing JavaScript, Python, CSS, and XML/HTML, with source code in JavaScript and Python 2/3.
It was primarily written to address some edge-case JavaScript parsing issues found in several major applications (Notepad++, Firefox, Sublime Text, GitHub/Ace). These cases usually involve regular expressions or sign-prefixed numbers.
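To illustrate why regular expressions are an edge case: in JavaScript a `/` can either begin a regex literal or act as a division operator, and a lexer can only tell the two apart by looking at the preceding significant token. The sketch below shows one common heuristic in Python. It is not taken from this library's source; the function name, the keyword set, and the use of plain strings for token types are all assumptions made for illustration.

```python
# Hypothetical heuristic, NOT this library's actual implementation.
# Keywords after which an expression (and so a regex literal) may begin.
REGEX_PRECEDING_KEYWORDS = {
    "return", "typeof", "case", "in", "of", "delete", "void",
    "throw", "new", "instanceof", "do", "else",
}

def slash_starts_regex(prev_type, prev_text):
    """Guess whether a '/' begins a regex literal.

    prev_type and prev_text describe the previous token that is neither
    whitespace nor a comment; both are None at the start of the input.
    """
    if prev_type is None:
        return True   # nothing to divide yet, so it must be a regex
    if prev_type in ("NUMBER", "STRING", "IDENTIFIER", "LITERAL", "REGEX"):
        return False  # a value has just ended, so this is division
    if prev_type == "KEYWORD":
        return prev_text in REGEX_PRECEDING_KEYWORDS
    if prev_type == "OPERATOR":
        # After ')' or ']' a value has usually just ended: division.
        return prev_text not in (")", "]")
    return False
```

With these assumptions, `slash_starts_regex("OPERATOR", "=")` is true (as in `x = /ab+c/`) and `slash_starts_regex("IDENTIFIER", "a")` is false (as in `a / b`). The heuristic is deliberately imperfect: a `/` after `)` can still begin a regex, for example after an `if (...)` condition, which is the kind of case that trips up simpler tokenizers.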
Files named lex.* are the base classes; files named lexlang.* are the language descriptor generation files.
Example: lex.js and lexpy.js are the files needed for processing Python code on a JavaScript interpreter.
The general format for using these libraries is:
1. Include the lex.* file.
2. Include the lexlang.* file.
3. Generate the descriptor: lexlang = lexlang.gen(lex);
4. Create a lex.Lexer object with the descriptor as the first argument and the input string as the second.
5. Call the get_token method until it returns null (or the language equivalent).
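The last two steps amount to a simple drain loop. The Python sketch below shows the shape of that loop, where the null equivalent is `None`. Because the real library is not imported here, `StubLexer` is a hypothetical stand-in that only mimics the described call shape (descriptor and input string in the constructor, `get_token` returning `None` at the end of input); it returns plain strings rather than real Token objects, and `iter_tokens` is an illustrative helper name, not part of the library.

```python
import re

class StubLexer:
    """Hypothetical stand-in for lex.Lexer so the loop below can run."""

    def __init__(self, descriptor, text):
        self.descriptor = descriptor  # ignored by this stub
        # Crude split into whitespace runs, word runs, and single symbols.
        self._chunks = re.findall(r"\s+|\w+|[^\w\s]", text)
        self._pos = 0

    def get_token(self):
        if self._pos >= len(self._chunks):
            return None               # end of input
        chunk = self._chunks[self._pos]
        self._pos += 1
        return chunk

def iter_tokens(lexer):
    """Yield tokens until get_token() returns None."""
    while True:
        token = lexer.get_token()
        if token is None:
            return
        yield token

# With the real library, the descriptor would come from lexlang.gen(lex)
# and the lexer from lex.Lexer(descriptor, source).
tokens = list(iter_tokens(StubLexer(None, "x = 1")))
```

In Python the same loop can also be written as `iter(lexer.get_token, None)`, which calls `get_token` repeatedly until it returns `None`.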
When it does not return null, get_token returns a Token object with 4 fields:
text – the token string
type – the type constant of the token
flags – flags for the token
state – the Lexer state the token was generated in
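As a rough illustration of working with these four fields, the Python sketch below filters out whitespace and comment tokens by their type. The `Token` namedtuple is a hypothetical stand-in that mirrors the documented field names; the type constants are shown as plain strings and the `flags` and `state` values are placeholder zeros, since the actual representation of these values is not described here.

```python
from collections import namedtuple

# Hypothetical stand-in mirroring the four documented fields.
Token = namedtuple("Token", ["text", "type", "flags", "state"])

# Type constants as plain strings (an assumption for this sketch).
WHITESPACE, COMMENT = "WHITESPACE", "COMMENT"

def significant(tokens):
    """Drop whitespace and comment tokens, keeping everything else."""
    return [t for t in tokens if t.type not in (WHITESPACE, COMMENT)]

# Hand-built sample standing in for real lexer output.
sample = [
    Token("x", "IDENTIFIER", 0, 0),
    Token(" ", "WHITESPACE", 0, 0),
    Token("=", "OPERATOR", 0, 0),
    Token(" ", "WHITESPACE", 0, 0),
    Token("1", "NUMBER", 0, 0),
    Token(" ", "WHITESPACE", 0, 0),
    Token("// one", "COMMENT", 0, 0),
]

kept = [t.text for t in significant(sample)]
print(kept)  # ['x', '=', '1']
```

Filtering on `type` like this is a typical first step before feeding tokens to a parser or syntax highlighter.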
JavaScript token types: INVALID, KEYWORD, LITERAL, IDENTIFIER, NUMBER, STRING, REGEX, OPERATOR, WHITESPACE, COMMENT
Python token types: INVALID, KEYWORD, LITERAL, IDENTIFIER, NUMBER, STRING, OPERATOR, WHITESPACE, COMMENT
CSS token types: INVALID, WHITESPACE, COMMENT, STRING, WORD, OPERATOR, AT_RULE, SEL_TAG, SEL_CLASS, SEL_ID, SEL_PSEUDO_CLASS, SEL_PSEUDO_ELEMENT, SEL_N_EXPRESSION, NUMBER, COLOR
XML/HTML token types: COMMENT, CDATA, TEXT, RAW_DATA, TAG_OPEN, TAG_CLOSE, TAG_NAME, ATTRIBUTE, ATTRIBUTE_WHITESPACE, ATTRIBUTE_OPERATOR, ATTRIBUTE_STRING
Token flag constants (defined on Lexer):
flags.MEMBER, // indicates the word is a member (identifier_word.member_word)
flags.BRACKET, // this operator is a bracket of some sort
flags.BRACKET_CLOSE, // this operator is a closing bracket
...
// Additional token flag constants can be found by opening the library's source
For additional help, view some of these test files, as examples are often more useful than wordy documentation.