- Mutexes have been consolidated. Instead of one per DFA (which can easily add up to hundreds of them) there is now only one mutex in the Recognizer class, and all other parties use this one for serialization (a locking sketch follows after this list). It's only about protecting the DFA anyway, which is stored in a recognizer (lexer/parser).
- ATNState::getStateType() returns a size_t value now (actually an enum).
- Replaced RTTI-based checks for transitions with checks on the (serialization) type of the transition, for simplicity (a before/after sketch follows after this list).
- Added some missing initialization for fields in certain ATN state classes.
- Fixed a memory leak in DFA by shadowing the s0 field. That way we still hold a reference to the self-created instance, even if s0 was replaced later.
- Added variable initialization in code generation for rule context declarations (e.g. for labels).
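A minimal sketch of the consolidated locking scheme mentioned above; the member and accessor names are illustrative assumptions, not necessarily the actual runtime API:

```cpp
#include <mutex>

// Hypothetical sketch: a single mutex lives in the recognizer and is handed
// out to anyone who needs to serialize access to the DFAs stored there.
class Recognizer {
public:
  // All DFA updates done by the simulators go through this one lock instead
  // of one mutex per DFA. (The accessor name is an assumption.)
  std::mutex &getStateLock() { return _stateLock; }

private:
  std::mutex _stateLock; // protects the DFAs owned by this recognizer
};

// Usage inside a simulator (illustrative only):
//   std::lock_guard<std::mutex> lock(recognizer->getStateLock());
//   ... mutate the shared DFA safely ...
```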
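And a hypothetical before/after for the transition checks; the accessor and constant names are assumptions for illustration only:

```cpp
// Before: RTTI-based check, one dynamic_cast per transition type.
// if (dynamic_cast<EpsilonTransition *>(transition) != nullptr) {
//   ... handle epsilon transition ...
// }

// After: dispatch on the (serialization) type of the transition instead.
if (transition->getSerializationType() == Transition::EPSILON) {
  // ... handle epsilon transition ...
}
```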
The transitions field was accessed via trivial getters and setters which served no real purpose, so I made the vector public and removed all the obsolete accessor functions. This simplifies the API (e.g. use .size() instead of getNumberOfTransitions()).
This change required a few other changes in the runtime.
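A hypothetical before/after of a call site affected by this change (header paths, the namespace and the helper function are assumptions and may differ between runtime versions):

```cpp
#include "atn/ATNState.h"     // ANTLR4 C++ runtime headers assumed to be on the include path
#include "atn/Transition.h"

using namespace antlr4::atn;  // namespace may differ depending on the runtime version

// Counts the non-epsilon transitions of a state.
static size_t countNonEpsilonTransitions(ATNState *state) {
  size_t count = 0;
  // Before: for (size_t i = 0; i < state->getNumberOfTransitions(); ++i)
  //           Transition *t = state->transition(i);
  // After: the vector is public, so standard container access applies.
  for (Transition *t : state->transitions) {
    if (!t->isEpsilon())
      ++count;
  }
  return count;
}
```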
Additionally, I radically cut down the ATN::toString method to return only what the Java runtime returns, instead of a full dump of the states and their transitions.
There are 2 parts in an ANTLR-generated parser where memory is allocated: the actual parsing (with or without creating a parse tree) and the prediction part (via DFA/ATN etc.). The first part is highly volatile as it recreates parse tree instances (the context objects) on each parser run. In fact, lexer tokens also belong to that part, but are already managed via unique pointers. This first part works without any smart pointer now. Instead there is a simple tracker class which holds all created references and frees them when the parser is reset or destroyed. This is a bit less optimal if the parser is set to create no parse tree, as created rule context objects are not freed immediately (like with smart pointers), but only during reset. On the other hand this change gives (depending on the input) a nice speed-up (0%-100%, after the warm-up phase). Additionally, memory consumption drops by a good amount.
Everything in the simulators (and interpreters) remains unchanged. This is the shared prediction part.
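A minimal sketch of such a tracker class, assuming it simply owns everything created during a parse run; the class and method names are illustrative, not the actual runtime API:

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical allocation tracker: the parser registers every object it
// creates (rule contexts etc.) and frees them all at once when the parser
// is reset or destroyed, instead of reference counting each one.
class ParseObjectTracker {
public:
  // Create an object and take ownership of it; the caller works with a
  // plain pointer that stays valid until reset() or destruction.
  template <typename T, typename... Args>
  T *create(Args &&...args) {
    auto owned = std::make_shared<T>(std::forward<Args>(args)...);
    _allocations.push_back(owned); // shared_ptr<void> type-erases the deleter
    return owned.get();
  }

  // Called from (for example) Parser::reset() and the parser destructor.
  void reset() { _allocations.clear(); }

private:
  std::vector<std::shared_ptr<void>> _allocations;
};
```

The point of this design is that the generated parser code only ever hands around plain pointers, while ownership is centralized in one place, so the per-object release bookkeeping disappears from the hot path.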
- Switched most symbolic signed constants to unsigned variants. Redefined EOF in particular to become (size_t)-1, to avoid having to use signed token type values (a sketch of these constants follows after this list).
- Introduced INVALID_INDEX for all previous -1 values, to indicate e.g. not-found indexes.
- Added 2 helpers to convert between symbolic and numeric form (mostly for intervals and toString()).
- Removed many no longer needed type casts to size_t.
- Updated templates for these changes.
- Limited runtime tests to C++ tests only, to see how Travis CI copes with that.
- Had to revert some of the message beautification, as it wouldn't match the expected runtime test output.
- Updated C++ test stg file for recent runtime changes. Regenerated tests (only one file changed actually).
- Reworked C++ test preparation. The C++ runtime is now built on first invocation of a test. This works only on Linux + OSX/macOS. Windows needs extra handling.
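A rough sketch of the unsigned constants and the two symbolic/numeric conversion helpers mentioned in the list above; apart from INVALID_INDEX, the names and exact locations are assumptions:

```cpp
#include <cstddef>

// The previous signed sentinels become unsigned (sketch; the actual
// constants live in their respective runtime classes).
static constexpr size_t INVALID_INDEX = static_cast<size_t>(-1); // replaces the old -1 markers
static constexpr size_t TOKEN_EOF     = static_cast<size_t>(-1); // EOF as an unsigned token type

// Hypothetical helpers converting between the symbolic (signed, -1 based)
// and the numeric (unsigned, INVALID_INDEX based) form, e.g. for intervals
// and toString().
inline size_t symbolicToNumeric(std::ptrdiff_t symbol) {
  return symbol < 0 ? INVALID_INDEX : static_cast<size_t>(symbol);
}

inline std::ptrdiff_t numericToSymbolic(size_t value) {
  return value == INVALID_INDEX ? -1 : static_cast<std::ptrdiff_t>(value);
}
```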
It's now possible to hide all symbols by default and publish only those marked with the ANTLR4CPP_PUBLIC macro (same as for Windows). The included XCode project makes use of this option now.
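A sketch of how such a visibility macro typically looks; the exact definition and the export define used by the runtime may differ:

```cpp
// Build the library with -fvisibility=hidden (gcc/clang) and mark only the
// public API with this macro; on Windows the same macro expands to
// dllexport/dllimport instead. ANTLR4CPP_EXPORTS is an assumed build define.
#ifdef _WIN32
  #ifdef ANTLR4CPP_EXPORTS
    #define ANTLR4CPP_PUBLIC __declspec(dllexport)
  #else
    #define ANTLR4CPP_PUBLIC __declspec(dllimport)
  #endif
#else
  #define ANTLR4CPP_PUBLIC __attribute__((visibility("default")))
#endif

// Usage: only classes marked this way are exported from the shared library.
class ANTLR4CPP_PUBLIC Recognizer { /* ... */ };
```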
The Python implementations closely mirror the Java version, even where some of the constructs could be expressed with simpler Python solutions, typically the all, any, count and next builtins or list comprehensions. Besides making the code clearer, these idioms are also preferred by the standard and can result in a performance speedup. The patch contains such equivalent transformations in the Python targets.
- Recommended project updates from XCode 8 applied.
- Bug: ANTLRInputStream not fully initialized when constructing from a stream.
- Account for more than one temporary error token in DefaultErrorStrategy.