There are two parts of an ANTLR generated parser where memory is allocated: the actual parsing (with or without creating a parse tree) and the prediction part (via DFA/ATN etc.). The first part is highly volatile, as it recreates parse tree instances (the class) on each parser run. Lexer tokens also belong to that part, but they are already managed via unique pointers. This first part now works without any smart pointers. Instead there is a simple tracker class which holds all created references and frees them when the parser is reset or destroyed. This is slightly less optimal if the parser is set to create no parse tree, as created rule context objects are not freed immediately (as they would be with smart pointers) but only during reset. On the other hand, this change gives a nice speed-up, depending on the input (0%-100%, after the warm-up phase). Additionally, memory consumption drops by a good amount.
Everything in the simulators (and interpreters) remains unchanged. This is the shared prediction part.
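A minimal sketch of the tracker idea described above, not the runtime's actual class: the names Node, NodeTracker and createInstance are assumptions made for illustration. The tracker owns every object it creates, hands out non-owning raw pointers, and frees everything in one go on reset or destruction.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parse tree node; not the runtime's class.
struct Node {
  explicit Node(int ruleIndex) : ruleIndex(ruleIndex) {}
  virtual ~Node() = default;
  int ruleIndex;
};

// Owns every node it creates and frees them all at once in reset().
class NodeTracker {
public:
  NodeTracker() = default;
  NodeTracker(const NodeTracker &) = delete;
  NodeTracker &operator=(const NodeTracker &) = delete;
  ~NodeTracker() { reset(); }

  // Creates a node, records it, and returns a non-owning raw pointer.
  template <typename T, typename... Args>
  T *createInstance(Args &&...args) {
    T *result = new T(std::forward<Args>(args)...);
    _allocated.push_back(result);
    return result;
  }

  // Frees all tracked nodes; meant to be called on parser reset.
  void reset() {
    for (Node *node : _allocated)
      delete node;
    _allocated.clear();
  }

  std::size_t size() const { return _allocated.size(); }

private:
  std::vector<Node *> _allocated;
};
```

The trade-off matches the description: nodes live until the next reset even if nothing refers to them anymore, but no reference counting happens while parsing.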
- Switched most symbolic signed constants to unsigned variants. Redefined EOF in particular to become (size_t)-1, to avoid having to use signed token type values.
- Introduced INVALID_INDEX for all previous -1 values to indicate e.g. not found indexes etc.
- Added 2 helpers to convert between symbolic and numeric form (mostly for intervals and toString()).
- Removed many no longer needed type casts to size_t.
- Updated templates for these changes.
- Limited runtime tests to C++ tests only, to see how Travis CI copes with that.
- Had to take back some of the message beautifying, as this won't match expected runtime test output.
- Updated C++ test stg file for recent runtime changes. Regenerated tests (only one file changed actually).
- Reworked C++ test preparation. The C++ runtime is now built on first invocation of a test. This works only on Linux + OSX/macOS. Windows needs extra handling.
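As an illustration of the unsigned constants and the two conversion helpers mentioned in the list above, here is a small sketch. The names kInvalidIndex, kEof, symbolToNumeric, numericToSymbol and tokenTypeToString are hypothetical and chosen for this example only; the runtime's own helpers may look different.

```cpp
#include <cstddef>
#include <limits>
#include <string>

// Hypothetical names mirroring the constants described in the list.
// Both sentinels share the same value in this sketch.
constexpr std::size_t kInvalidIndex = std::numeric_limits<std::size_t>::max();
constexpr std::size_t kEof = static_cast<std::size_t>(-1);

// Symbolic (unsigned) -> numeric (signed): maps the sentinel back to -1 so
// that text output, e.g. for intervals, still shows -1.
inline long long symbolToNumeric(std::size_t value) {
  return value == kEof ? -1LL : static_cast<long long>(value);
}

// Numeric (signed) -> symbolic (unsigned): any negative value becomes the
// sentinel.
inline std::size_t numericToSymbol(long long value) {
  return value < 0 ? kEof : static_cast<std::size_t>(value);
}

// Example consumer: a toString()-like helper for a token type.
inline std::string tokenTypeToString(std::size_t type) {
  return std::to_string(symbolToNumeric(type));
}
```

With a scheme like this, token types can be used directly as container indexes without casts, while printed output keeps the familiar -1 for EOF.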
It's now possible to hide all symbols by default and publish only those marked with the ANTLR4CPP_PUBLIC macro (same as for Windows). The included XCode project makes use of this option now.
The Python implementations are kept completely in sync
with the Java version, even where some of the constructs
can be expressed with simpler Python solutions. These are
typically the all, any, count, next builtins, list
comprehensions, etc. Besides making the code clearer,
they are also preferred by the standard and can
result in a performance speedup. The patch contains such
equivalent transformations in the Python targets.
- Recommended project updates from XCode 8 applied.
- Bug: ANTLRInputStream not fully initialized when constructing from a stream.
- Account for more than one temporary error token in DefaultErrorStrategy.
- Reduced use of shared_ptr, e.g. in listeners and some loops.
- Removed useless access methods for children in ParserRuleContext. The child list is public. Fixed initialization for start and stop nodes.
- Simplified parent + child organization in Tree and all derived classes. Instead of using overridable functions in various descendants we now have central parent + child fields in the base tree class (where they actually belong, considering this is about forming a tree). Users have to cast to the appropriate classes if necessary.
- Removed obsolete getChildren() function in Trees helper. We can just return the child vector.
- Changed edges member to an unordered_map, as this is a sparse container. This speeds up certain grammars by 1000% (e.g. highly recursive expression rules) and avoids wasting a lot of memory. This change also simplifies handling significantly.
- Had to escape tabs + linebreaks in DefaultErrorStrategy when generating a text representation. Also removed a few explicit string instance creations on the way.
- Member vars in parser context classes that take (optional) Token references must be initialized.
- Fixed a warning that copyFrom() would hide a virtual function in a ParserRuleContext.
- Another attempt to limit generating double semicolons.
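The edges change from the list above can be pictured with the following sketch. DfaState here is a hypothetical, heavily reduced type and getTarget/addEdge are illustrative helpers, not the runtime's API. The point is only that a hash map stores just the transitions that exist, whereas a dense array would need one slot per possible symbol.

```cpp
#include <cstddef>
#include <unordered_map>

// Hypothetical, simplified DFA state; the runtime's class has more members.
struct DfaState {
  int stateNumber = 0;
  // Sparse edge table: only symbols with a known transition take up space.
  std::unordered_map<std::size_t, DfaState *> edges;
};

// Returns the target for `symbol`, or nullptr if no edge was recorded yet.
inline DfaState *getTarget(const DfaState &from, std::size_t symbol) {
  auto it = from.edges.find(symbol);
  return it == from.edges.end() ? nullptr : it->second;
}

// Records (or overwrites) the transition for `symbol`.
inline void addEdge(DfaState &from, std::size_t symbol, DfaState *to) {
  from.edges[symbol] = to;
}
```

A lookup for a symbol without an edge simply misses, so no range checks or resizing are needed.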
The Java target initializes the conflictingAlt local variable
based on the conflictingAlts property of the target state.
However, the Python targets reset it to None. The patch makes the
initializations consistent.
At the end of the nextToken() function, setting the eofToken field was
attempted without the 'self' keyword, resulting in accessing
and setting a new local and unused variable. The patch supplements
the missing 'self' keywords for both targets.
In order to be able to build with cmake and to have a complete source package including the demo, the deployment scripts were moved to the root Cpp folder and updated.
Also the releasing-antlr.md doc file has been updated.
Apparently context references can disappear while such a ref is held in the (temporary) merge cache. Hence we need to do a full ref for the merge cache key pair.
Closes issue #12.
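A rough sketch of what "a full ref for the merge cache key pair" can look like, assuming shared_ptr-based context references. Context, MergeKey, MergeCache and remember are hypothetical names for this example. Because the key stores owning references, a context cannot disappear while the cache still refers to it.

```cpp
#include <map>
#include <memory>
#include <utility>

// Hypothetical stand-in for a prediction context.
struct Context {
  int id;
};

using ContextRef = std::shared_ptr<Context>;
// The key holds owning references, so both contexts stay alive as long as
// the cache entry exists.
using MergeKey = std::pair<ContextRef, ContextRef>;
using MergeCache = std::map<MergeKey, ContextRef>;

// Stores the merge result for the pair (a, b).
inline void remember(MergeCache &cache, const ContextRef &a,
                     const ContextRef &b, const ContextRef &merged) {
  cache[MergeKey(a, b)] = merged;
}
```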
- Added new rule to test grammar to get code generation for wildcard capture.
- Updated the Cpp.stg template file for that.
- Made the Unicode hack (auto extend 0xFFFF to 0x10FFFF) dependent on a parameter, so we only use this hack when deserializing an ATN. This avoids trouble with intervals used in other contexts (like string offsets).
- Added a few operator != overloads, to fix compilation after recent changes.
- Simplified operands comparison in SemanticContext (uses the Arrays class now). Some cleanup in that class too.
- The abstract parse tree visitor now uses const& for Any references, to avoid reallocating new instances over and over again.
- The lexer counts syntax errors the same way as the parser does. So we can directly determine if there was any error by simply examining that (which avoids having to use a temporary listener).
- Fixed an endless recursion in Any, caused by the removal of one of the (apparently) unneeded copy constructors. As it turned out both are required. That leads to a warning in VS about a duplicate copy c-tor, which had to be suppressed therefore.
- Raised the warning level to W4 in both VS 2013 and VS 2015 and fixed all warnings resulting from that.
The translation from Java generics to templates in C++ led to the need for virtual template functions, which are not supported by C++. Instead we now use the Any class for results of visits and no longer need templates for that part.
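The sketch below illustrates the idea with std::any (C++17) as a stand-in for the runtime's own Any class; all type names in it are hypothetical. Since a virtual member function cannot be a template, each visit method returns a type-erased value and the caller casts it back to the type it expects.

```cpp
#include <any>
#include <string>
#include <utility>

struct NumberNode;
struct TextNode;

// Each visit method returns a type-erased result instead of a template
// parameter type.
struct Visitor {
  virtual ~Visitor() = default;
  virtual std::any visitNumber(const NumberNode &node) = 0;
  virtual std::any visitText(const TextNode &node) = 0;
};

struct Node {
  virtual ~Node() = default;
  virtual std::any accept(Visitor &visitor) const = 0;
};

struct NumberNode : Node {
  explicit NumberNode(int v) : value(v) {}
  int value;
  std::any accept(Visitor &visitor) const override {
    return visitor.visitNumber(*this);
  }
};

struct TextNode : Node {
  explicit TextNode(std::string t) : text(std::move(t)) {}
  std::string text;
  std::any accept(Visitor &visitor) const override {
    return visitor.visitText(*this);
  }
};

// Example visitor producing different result types per node kind.
struct Evaluator : Visitor {
  std::any visitNumber(const NumberNode &node) override {
    return node.value * 2;
  }
  std::any visitText(const TextNode &node) override {
    return node.text + "!";
  }
};
```

The cost of this design is that result types are only checked at run time, when the caller casts the returned value.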
- Only require JRE
- Support out of tree build from antlr repository
- Support Superproject build with ExternalAntlr4Cpp cmake module
The ExternalAntlr4Cpp module has quickstart documentation so people can
start working quickly with antlr4cpp from the base demo sources;
see the source file for an example.
Document the minimum cmake version needed to build C++ target (if compiling with cmake). Also, fix a missing space in cmake command between directory path and defining where the ANTLR .jar is located.
The root cause was that ATNConfigSet was not using the required custom hashing strategy for ParserATNSimulator.
The commit includes a number of additional fixes, related to code that was never executed before due to the root cause.
A similar issue is also likely to exist in the JavaScript runtime; I'll fix it later.
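To show why the hashing strategy matters, here is a sketch of a lookup set with a custom hasher and comparer. The Config fields and the choice to ignore the context in both functions are assumptions made for this example, not a copy of the runtime's code. With the default strategy, configurations that differ only in their context would be stored as separate entries instead of being found and merged.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_set>

// Hypothetical, reduced configuration record.
struct Config {
  int state;
  int alt;
  int contextId;  // stands in for the prediction context
};

// Hashes only state and alt (assumption for this sketch); the context is
// deliberately left out.
struct ConfigHasher {
  std::size_t operator()(const Config &c) const {
    std::size_t h = std::hash<int>()(c.state);
    h ^= std::hash<int>()(c.alt) + 0x9e3779b9u + (h << 6) + (h >> 2);
    return h;
  }
};

// Must agree with the hasher: equal configs have to produce equal hashes.
struct ConfigComparer {
  bool operator()(const Config &a, const Config &b) const {
    return a.state == b.state && a.alt == b.alt;
  }
};

using ConfigLookup = std::unordered_set<Config, ConfigHasher, ConfigComparer>;
```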
- The previous approach to load and convert UTF-8 data via a stream didn't work well, so I replaced that with a simple load-to-buffer + convert buffer from UTF-8 to UTF-32.
- Removed deleted Token.cpp file from XCode project.
- Created deployment script for Windows + updated doc/releasing-antlr.md.
- Created projects for both VS2013 and VS2015 to be used by the deployment script.
- Fixed trouble with a bug in VS2015 where std::codecvt_utf8<char32_t> is not properly supported.
- Fixed a few #include paths + a number of warnings.
- Settled on a final library name scheme: base part is "libantlr4-runtime" on MacOS + Linux. The extension determines the type (.a static lib, .dylib dynamic lib in MacOS, .so dynamic lib in Linux). No more mention of target language (cpp) or type (static) in the lib name. On Windows we omit the lib prefix, so the name becomes: antlr4-runtime.dll + antlr4-runtime.lib. We may later want to add version information there, but doing that automatically is difficult.
- Updated XCode project and CMakeLists.txt file for the new naming scheme.
- Added deployment scripts for source code (for Linux + iOS) and MacOS.
- Added C++ section in docs/releasing-antlr.md.
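The "load-to-buffer + convert" approach from the list above could look roughly like the following hand-written decoder. This is a sketch under stated assumptions, not the runtime's implementation: utf8ToUtf32 is a hypothetical name, and validation is minimal (truncated sequences and bad continuation bytes are rejected, overlong forms and surrogates are not).

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Decodes a complete UTF-8 buffer into UTF-32 code points.
inline std::u32string utf8ToUtf32(const std::string &input) {
  std::u32string result;
  result.reserve(input.size());
  std::size_t i = 0;
  while (i < input.size()) {
    unsigned char lead = static_cast<unsigned char>(input[i]);
    char32_t codePoint = 0;
    std::size_t extra = 0;  // number of continuation bytes to read
    if (lead < 0x80) {
      codePoint = lead;
    } else if ((lead & 0xE0) == 0xC0) {
      codePoint = lead & 0x1F;
      extra = 1;
    } else if ((lead & 0xF0) == 0xE0) {
      codePoint = lead & 0x0F;
      extra = 2;
    } else if ((lead & 0xF8) == 0xF0) {
      codePoint = lead & 0x07;
      extra = 3;
    } else {
      throw std::invalid_argument("invalid UTF-8 lead byte");
    }

    if (extra > input.size() - i - 1)
      throw std::invalid_argument("truncated UTF-8 sequence");

    for (std::size_t k = 1; k <= extra; ++k) {
      unsigned char c = static_cast<unsigned char>(input[i + k]);
      if ((c & 0xC0) != 0x80)
        throw std::invalid_argument("invalid UTF-8 continuation byte");
      codePoint = (codePoint << 6) | (c & 0x3F);
    }
    result.push_back(codePoint);
    i += extra + 1;
  }
  return result;
}
```

Working on a full buffer keeps the decoder independent of stream state and of std::codecvt_utf8<char32_t>, which the list notes is not properly supported in VS2015.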
No need to use shared_ptr for management. Listeners are, like the other main classes (parser, lexer, input stream etc.) provided by the application and hence managed there.
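A small sketch of that ownership model, with hypothetical class names (ErrorListener, Recognizer, CollectingListener): the recognizer stores plain non-owning pointers, and the application keeps the listener alive for as long as it is registered.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct ErrorListener {
  virtual ~ErrorListener() = default;
  virtual void syntaxError(const std::string &message) = 0;
};

// Hypothetical recognizer that stores non-owning listener pointers.
class Recognizer {
public:
  void addErrorListener(ErrorListener *listener) {
    _listeners.push_back(listener);
  }

  void removeErrorListener(ErrorListener *listener) {
    _listeners.erase(
        std::remove(_listeners.begin(), _listeners.end(), listener),
        _listeners.end());
  }

  // Forwards an error message to every registered listener.
  void notifyError(const std::string &message) {
    for (ErrorListener *listener : _listeners)
      listener->syntaxError(message);
  }

private:
  std::vector<ErrorListener *> _listeners;  // not owned
};

// Application-side listener that simply collects messages.
struct CollectingListener : ErrorListener {
  std::vector<std::string> messages;
  void syntaxError(const std::string &message) override {
    messages.push_back(message);
  }
};
```

The application is responsible for removing a listener before destroying it, just as it is for the parser, lexer and input stream it creates.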