Commit Graph

103 Commits

Author SHA1 Message Date
Terence Parr 2db3691f6d added -depend cmd-line option; fixes #71 2012-09-30 18:27:36 -07:00
Terence Parr ac29e6cdac got unbufferedchar working I think. 2012-09-30 12:37:35 -07:00
Terence Parr 3575e9c3c7 fix playground 2012-09-29 17:02:33 -07:00
Terence Parr e78ecd418a rm isGreedy from DecisionState, but allow ATN construction for lexer to be nongreedy. error if '.' in parser. rm unit tests for parser nongreedy 2012-09-29 12:33:00 -07:00
Terence Parr ea652962ea allow "tokens {}" 2012-09-28 16:39:36 -07:00
Terence Parr 332c9f4452 push 2012-09-25 16:39:08 -07:00
parrt faff63366c tweak 2012-09-25 16:33:50 -07:00
Terence Parr 4bde79a666 playing with tests. 2012-09-24 08:54:32 -07:00
Terence Parr f43cd15dd9 rm'd $foo references from lexer actions and added a good error message. 2012-09-23 18:20:04 -07:00
Terence Parr 262a331a5b recursive rule bug in lexer; the lexer ATN simulator was not checking for empty stack at rule stop states. 2012-09-23 18:04:46 -07:00
Terence Parr 3638073efe *.g cmdline works now to topologically sort by tokenVocab dependencies. 2012-09-08 12:26:32 -07:00
Terence Parr b3b02a5449 rm T='literal' in tokens { }. Also it's comma-separated not ';' terminated now. tokens { A,B } 2012-09-06 14:50:44 -07:00
Terence Parr 26bf5acf19 made "a4 -message-format antlr T.g" work; deactivated the options that don't work yet. 2012-09-06 13:55:02 -07:00
Terence Parr e4a9a44671 grammar option cleanup. was a mess. -Doption=value works to override grammar options on cmd-line now. 2012-09-05 18:37:28 -07:00
Terence Parr 201db8b6d0 merge sam's pulls 2012-09-04 18:59:20 -07:00
Terence Parr 60d99e62dc rm ParseListener; tested the tracer with left recursive rules; weird but deterministic for entry events. 2012-08-27 11:22:42 -07:00
Terence Parr e33e355d66 tweak comment 2012-08-26 16:32:28 -07:00
Terence Parr 64c050f233 sets at top level work now: s : A | B ; in lexer and parser. 2012-08-05 10:17:00 -07:00
Terence Parr 799b7afc1c playwith grammar 2012-08-05 09:51:52 -07:00
Terence Parr 8637f31837 rm prints 2012-08-05 09:21:59 -07:00
Terence Parr 1bec176eaa Impl Sam's no viable alt avoidance idea that chooses min alt that dips into outer context. unit test 2012-08-04 14:18:57 -07:00
Terence Parr 4090621beb same literal different modes gens no literal in .tokens, rm warning 2012-08-04 13:51:18 -07:00
Terence Parr e7b65057a6 added var for sll loop tail recursion default value; updated unit tests 2012-08-03 18:12:52 -07:00
Terence Parr e3e739dfc7 The lexer and parser ATN simulators' adaptivePredict now synchronize on the specific DFA of the decision to be simulated. This should prevent a lot of contention that would occur if we synchronize the entire adaptivePredict method. When the individual DFA are created, we also synchronize on the shared DFA[] table quickly to create a DFA and insert it into the array. Code generation modified to have _decisionToDFA generated at the top of both the parser and the lexer. Simulators created now with the recognizer, ATN, DFA[]. Not sure the LexerInterp/ParserInterp work but pushing ahead anyway for the moment. 2012-07-29 09:49:35 -07:00
Terence Parr 109880b7ab added ALIAS_REASSIGNMENT warning so redef of string literal among rules caught. First literal goes to .tokens. 2012-07-28 14:24:39 -07:00
Terence Parr 3c7b4c2a33 big cleanup. 2012-07-26 17:28:10 -07:00
Terence Parr f7eeca274f reorg closure and fix bug where $ in arrayctx wouldn't perform global follow. fix case in array merge that didn't check both a[i], b[i] as $ (only matters in full ctx). unit tests for graph show fewer ctx's created. 2012-07-22 16:09:45 -07:00
Terence Parr 9539572ee7 simplify test. 2012-07-21 16:27:00 -07:00
Terence Parr f220212a95 couldn't get Horstmann's routine to do EPS not PS so had to backtrack. 2012-07-14 16:32:04 -07:00
Terence Parr 1d9aef0a5e replace .tokens file parser with regex to avoid \t becoming tab char. 2012-07-03 12:40:36 -07:00
Terence Parr 5c69d31e88 CommonTokenFactory now knows how to copy the text out of the character stream buffer before they disappear in unbuffered character strengths; added ctor.
Lexer now guarantees that the text of the current token is always available to the emit() method even if the character stream is unbuffered.

Added some hooks to see some of the internal data in the unbuffered character stream so that I can test it better.

Updated LexerInterpreter so that it uses the token factory.

Improved/added unit tests for the unbuffered character string.

Updated various comments
2012-06-30 16:40:16 -07:00
Terence Parr 740208ee4d test code. 2012-06-17 16:56:26 -07:00
Terence Parr b255509e96 fix a bug related to semantic predicates in the lexer and generally cleaned up variable and method names in the simulator. I moved all of the predicates to the right side of lexer rules in the unit tests. Later, we should ensure that predicates only occur on the right edge of lexer rules. We should state that the rule is not been accepted so we can't test things like getText(), we have to use more raw indexes into the character stream. In the lexer simulator, the addDFAState() method now does not try to compute whether there is a predicate in the configurations. That information has already been set into the ATNConfigSet by the getEpsilonTarget() method. [I should also point out that I have not tested the Java parsing in a while and now it hits a landmine on a number of common Java files in jdk :(. I'm not sure where that crept in] 2012-06-07 17:31:18 -07:00
Terence Parr c590ba8fd8 don't look backwards for err msg if EOF is entire input. make sure we don't use -1 rule index for ruleNames[] 2012-04-29 12:12:42 -07:00
Terence Parr 6314b7d31b -> becomes # for alt labels 2012-04-26 11:59:57 -07:00
Terence Parr 9c1e58db7c add {} in primary alt block to prevent ID|INT from becoming SET, which breaks code gen needs. 2012-03-27 16:21:01 -07:00
Terence Parr 9a0aaacbee rm k=1 chk to report early ambiguity. 2012-03-16 14:11:21 -07:00
Terence Parr 102980dffd make T.g same 2012-03-14 13:20:24 -07:00
Terence Parr ae08867ff3 alter visitTerminal interface, add visitErrorNode. 2012-02-26 22:07:45 -08:00
Terence Parr 8a34176d82 added listener unit tests. fixed bug that didn't create ctx getters properly for recursive rules. added Symbol extends Token to parse tree stuff. added visitTerminal to Visitor. recursive alts now track their original, unedited AltAST subtree so we can properly count rule refs etc... later. dup of RuleRefAST was making wrong node. don't gen dispatch methods if no listener. 2012-02-22 12:44:33 -08:00
parrt a923ad8765 Major update to v4. I backed out a change I made on Christmas then mistakenly prevented any lexer DFA creation. Per http://www.antlr.org/wiki/display/~admin/2011/12/29/Flaw+in+ANTLR+v3+LL%28*%29+analysis+algorithm I fixed a major flaw in ANTLR's notion of context. To do that, I needed to create a new LoopEndState, with all of its fanout to the serialization and parser ATN construction. got a very good start on ParserATNPathFinder, which uses basic recursion to find all possible paths and return a tree with the possibilities. I left it in the condition where he would sometimes loop forever; it needs to track sets of configurations in the busy set; it using states at the moment. added a new signal from the interpreter: reportAttemptingFullContext. I fixed a bug where configuration sets derived from a configuration that had reachesIntoOuterContext>0 were not being considered as dipping into the outer context. The ambiguity checker needed to switch so that a check for exact matches not suffixes when doing full context. It's faster at the very least for full context. added some more support routines to DFA. Added TraceTree in support of the new ParserATNPathFinder.
[git-p4: depot-paths = "//depot/code/antlr4/main/": change = 9764]
2011-12-29 17:04:40 -08:00
parrt 299c29d927 more lexer rule specialization in parser. got antlr almost back to working with new [Aa] notation in lexer.
[git-p4: depot-paths = "//depot/code/antlr4/main/": change = 9753]
2011-12-26 17:09:01 -08:00
parrt d9efffd104 Add [abc] syntax to allow set of char in lexer; args aren't allowed so unambig.
[git-p4: depot-paths = "//depot/code/antlr4/main/": change = 9750]
2011-12-26 15:58:40 -08:00
parrt 6fa5e52d5e added => skip, channel(99), more, mode(xx), push(xx), pop lexer syntax. separated lexer rules from others in parser / AST now.
[git-p4: depot-paths = "//depot/code/antlr4/main/": change = 9749]
2011-12-26 15:14:49 -08:00
parrt bb48deb354 tweak to dotgenerator, make parserinterp using new atn sim
[git-p4: depot-paths = "//depot/code/antlr4/main/": change = 9645]
2011-12-16 18:15:56 -08:00
parrt ebd1fbb63d within 2 or 3 unit test of where I was before I got it the ATN simulator
[git-p4: depot-paths = "//depot/code/antlr4/main/": change = 9642]
2011-12-16 15:07:28 -08:00
parrt 3d133e9417 broke out fullctx tests, some fixes.
[git-p4: depot-paths = "//depot/code/antlr4/main/": change = 9636]
2011-12-16 09:43:29 -08:00
parrt 5ad1505fdb almost got new ATN engine working; separated .* nongreedy tests, reorg args on reporting methods
[git-p4: depot-paths = "//depot/code/antlr4/main/": change = 9627]
2011-12-15 11:03:41 -08:00
parrt 92279bd6db almost got prediction working
[git-p4: depot-paths = "//depot/code/antlr4/main/": change = 9600]
2011-12-13 18:10:04 -08:00
parrt 131e9f7686 added comments, working on parser interpreter (not prediction) reorg. adding ParserInterpreter. adding unit tests.
[git-p4: depot-paths = "//depot/code/antlr4/main/": change = 9546]
2011-12-09 16:35:21 -08:00