Merge branch 'master' into token-stream-bugs

Sam Harwell 2012-11-14 15:05:10 -06:00
commit 18f5354d1b
336 changed files with 19704 additions and 12219 deletions

.gitignore

@@ -1,4 +1,5 @@
/tool/target/
/runtime/Java/target/
/gunit/target/
*.hprof
*.hprof
/antlr4-maven-plugin/target/


@@ -1,94 +1,59 @@
ANTLR v4 Honey Badger early access
ANTLR v4 Honey Badger
Feb 17, 2012
November 11, 2012
* added -parse-listener option and differentiated between parse and parse
tree listener interfaces now. Only parse tree listener stuff generated
by default.
* Change version to 4.0b4 (btw, forgot to push 4.0b3 in build.properties when
I made git tag 4.0b3...ooops).
* names changed. visit() -> visitX(). enter/exit() -> enter/exitX()
* capitalizing automatically now. rule s -> SContext not sContext
* no enter/exit method in generic rule context object if rule has alt labels, nor in interfaces.
* dup labels allowed in same rule
* label X or x illegal if rule x exists
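For example (hypothetical rule, using the -> alt labels shown under the
September 28 entry below):
e : a=e '*' b=e -> mult
| INT -> anInt
;
generates MultContext and AnIntContext (names capitalized automatically) with
enter/exitMult(), visitMult(), etc., no generic enter/exit in EContext itself,
and a label named e or E would be illegal here because rule e exists.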
November 4, 2012
Feb 14, 2012
* Kill box in tree dialog box makes dialog dispose of itself
* Fixed https://github.com/antlr/antlr4/issues/8 and lots of other little things.
October 29, 2012
Jan 30, 2012
* Sam fixes nongreedy more.
* -Werror added.
* Sam made speed improvement re preds in lexer.
* Moving to github.
October 20, 2012
Jan 28, 2012
* Merged Sam's fix for nongreedy lexer/parser. lots of unit tests. A fix in
prediction ctx merge. https://github.com/parrt/antlr4/pull/99
* ~[] stuff is allowed and works inside sets etc...
October 14, 2012
Jan 22, 2012
* Rebuilt how ANTLR detects SLL conflicts and fails over to full LL. LL is
a bit slower but correct now. Added ability to ask for exact ambiguity
detection.
* Added ranges, escapes to [a-z] notation in lexer:
October 8, 2012
a-z is the inclusive range
escape characters with special meaning: trnbf\'" such as \t
\uXXXX Unicode character with hex digits
\- is the - character
\] is the ] character
* Fixed a bug where labeling the alternatives of the start rule caused
a null pointer exception.
Missing final range value gives just the first char.
Inverted ranges give nothing.
Bad escape sequences give nothing.
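Hypothetical lexer rules illustrating the notation:
WS : [ \t\r\n]+ -> skip ; // escapes such as \t work inside [...]
OP : [+\-*/] ; // \- is the - character
UC : [\u0041-\u005A]+ ; // \uXXXX escapes work as range endpoints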
October 1, 2012 -- 4.0b2 release
Jan 21, 2012
September 30, 2012
* Added modeNames to gen'd lexers
* added lexer commands
skip
more
popMode
mode(x)
pushMode(x)
type(x)
channel(x)
* Fixed the unbuffered streams, which actually buffered everything
up by mistake. Tweaked a few comments.
WS : (' '|'\n')+ -> skip ;
* Added a getter to IntStream for the token factory
use commas to separate commands: "-> skip, mode(FOO)"; see the sketch after this entry
* Lexer fields moved from x to _x; e.g., type changed to _type.
* Added -depend cmd-line option.
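A hypothetical example combining commands (COMMENT is an assumed mode name):
LC : '/*' -> more, pushMode(COMMENT) ;
WS : (' '|'\n')+ -> skip ;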
Jan 14, 2012
September 29, 2012
* labels on tokens in left-recursive rules caused codegen exception.
* leave start/stop char index alone in CommonTokenFactory; refers to original text.
* reuse of -> label on multiple alts in rule caused dup ctx object defs.
* no nongreedy or wildcard in parser.
Jan 11, 2012
September 28, 2012
* -> id labels work now for outermost alternatives, even for left-recursive
rules; e.g.,
* empty "tokens {}" is ok now.
| a=e '*' b=e {$v = $a.v * $b.v;} -> mult
* Fixed a bug where visitTerminal got an NPE
* In tree views, spaces/newlines were blanks. I converted to \n and a middle dot
for space.
September 22, 2012
Jan 5, 2012
* Rule exception handlers weren't passed to the generated code
* $ruleattribute.foo wasn't handled properly
* Added -package option
* Deleted code to call specific listeners by mistake. added back.
* Labels allowed in left-recursive rules:
e returns [int v]
: a=e '*' b=e {$v = $a.v * $b.v;}
| a=e '+' b=e {$v = $a.v + $b.v;}
| INT {$v = $INT.int;}
| '(' x=e ')' {$v = $x.v;}
;
Jan 4, 2012
* '_' was allowed first in symbol names in grammar
* fix unit tests
* Allow labels in left-recursive rules
* lr rules now gen only 1 rule e->e not e->e_ etc... altered tests to build parse trees.
* no more -trace, use Parser.setTrace(true)
* add context rule option; not hooked up
* 1+2*3 now gives new parse tree: (e (e 1) + (e (e 2) * (e 3)))
September 18, 2012 -- 4.0b1 release


@@ -1,5 +1,5 @@
[The "BSD license"]
Copyright (c) 2011 Terence Parr
[The "BSD license"]
Copyright (c) 2012 Terence Parr, Sam Harwell
All rights reserved.
Redistribution and use in source and binary forms, with or without


@@ -1,4 +1,4 @@
ANTLR v4 early access
ANTLR v4
Terence Parr, parrt at cs usfca edu
ANTLR project lead and supreme dictator for life
@@ -6,4 +6,84 @@ University of San Francisco
INTRODUCTION
Coming soon...
Hi and welcome to the Honey Badger 4.0b2 release of ANTLR!
INSTALLATION
$ cd /usr/local/lib
$ curl -O --silent http://www.antlr.org/download/antlr-4.0b2-complete.jar
Or just download from http://www.antlr.org/download/antlr-4.0b2-complete.jar
and put it somewhere rational for your operating system.
You can either add to your CLASSPATH:
$ export CLASSPATH=".:/usr/local/lib/antlr-4.0b2-complete.jar:$CLASSPATH"
and launch org.antlr.v4.Tool directly:
$ java org.antlr.v4.Tool
ANTLR Parser Generator Version 4.0b2
-o ___ specify output directory where all output is generated
-lib ___ specify location of .tokens files
...
or use -jar option on java:
$ java -jar /usr/local/lib/antlr-4.0b2-complete.jar
ANTLR Parser Generator Version 4.0b2
-o ___ specify output directory where all output is generated
-lib ___ specify location of .tokens files
...
You can make a script, /usr/local/bin/antlr4:
#!/bin/sh
java -cp "/usr/local/lib/antlr4-complete.jar:$CLASSPATH" org.antlr.v4.Tool $*
On Windows, you can do something like this (assuming you put the
jar in C:\libraries) for antlr4.bat:
java -cp C:\libraries\antlr-4.0b2-complete.jar;%CLASSPATH% org.antlr.v4.Tool %*
You can also use an alias
$ alias antlr4='java -jar /usr/local/lib/antlr-4.0b2-complete.jar'
Either way, say just antlr4 to run ANTLR now.
The TestRig class is very useful for testing your grammars:
$ alias grun='java org.antlr.v4.runtime.misc.TestRig'
EXAMPLE
In /tmp/Hello.g4, paste this:
// Define a grammar called Hello
// match keyword hello followed by an identifier
// match lower-case identifiers
grammar Hello;
r : 'hello' ID ;
ID : [a-z]+ ;
WS : [ \t\n]+ -> skip ; // skip spaces, tabs, newlines
Then run the ANTLR tool on it:
$ cd /tmp
$ antlr4 Hello.g4
$ javac Hello*.java
Now test it:
$ grun Hello r -tree
hello parrt
^D
(r hello parrt)
(That ^D means EOF on Unix; it's ^Z on Windows.) The -tree option prints
the parse tree in LISP notation.
BOOK SOURCE CODE
http://pragprog.com/titles/tpantlr2/source_code


@@ -0,0 +1,7 @@
<?xml version="1.0" encoding="UTF-8"?>
<classpath>
<classpathentry kind="src" output="target/classes" path="src/main/java"/>
<classpathentry kind="con" path="org.eclipse.jdt.launching.JRE_CONTAINER/org.eclipse.jdt.internal.debug.ui.launcher.StandardVMType/JavaSE-1.6"/>
<classpathentry kind="con" path="org.eclipse.m2e.MAVEN2_CLASSPATH_CONTAINER"/>
<classpathentry kind="output" path="target/classes"/>
</classpath>


@@ -0,0 +1,23 @@
<?xml version="1.0" encoding="UTF-8"?>
<projectDescription>
<name>antlr4-maven-plugin</name>
<comment></comment>
<projects>
</projects>
<buildSpec>
<buildCommand>
<name>org.eclipse.jdt.core.javabuilder</name>
<arguments>
</arguments>
</buildCommand>
<buildCommand>
<name>org.eclipse.m2e.core.maven2Builder</name>
<arguments>
</arguments>
</buildCommand>
</buildSpec>
<natures>
<nature>org.eclipse.jdt.core.javanature</nature>
<nature>org.eclipse.m2e.core.maven2Nature</nature>
</natures>
</projectDescription>


@@ -0,0 +1,2 @@
eclipse.preferences.version=1
encoding/<project>=UTF-8


@@ -0,0 +1,5 @@
eclipse.preferences.version=1
org.eclipse.jdt.core.compiler.codegen.targetPlatform=1.6
org.eclipse.jdt.core.compiler.compliance=1.6
org.eclipse.jdt.core.compiler.problem.forbiddenReference=warning
org.eclipse.jdt.core.compiler.source=1.6


@@ -0,0 +1,4 @@
activeProfiles=
eclipse.preferences.version=1
resolveWorkspaceProjects=true
version=1


@@ -0,0 +1,36 @@
<?xml version="1.0" encoding="UTF-8"?>
<project-shared-configuration>
<!--
This file contains additional configuration written by modules in the NetBeans IDE.
The configuration is intended to be shared among all the users of project and
therefore it is assumed to be part of version control checkout.
Without this configuration present, some functionality in the IDE may be limited or fail altogether.
-->
<properties xmlns="http://www.netbeans.org/ns/maven-properties-data/1">
<!--
Properties that influence various parts of the IDE, especially code formatting and the like.
You can copy and paste the single properties, into the pom.xml file and the IDE will pick them up.
That way multiple projects can share the same settings (useful for formatting rules for example).
Any value defined here will override the pom.xml file value but is only applicable to the current project.
-->
<org-netbeans-modules-editor-indent.CodeStyle.usedProfile>project</org-netbeans-modules-editor-indent.CodeStyle.usedProfile>
<org-netbeans-modules-editor-indent.CodeStyle.project.spaces-per-tab>4</org-netbeans-modules-editor-indent.CodeStyle.project.spaces-per-tab>
<org-netbeans-modules-editor-indent.CodeStyle.project.tab-size>4</org-netbeans-modules-editor-indent.CodeStyle.project.tab-size>
<org-netbeans-modules-editor-indent.CodeStyle.project.indent-shift-width>4</org-netbeans-modules-editor-indent.CodeStyle.project.indent-shift-width>
<org-netbeans-modules-editor-indent.CodeStyle.project.expand-tabs>true</org-netbeans-modules-editor-indent.CodeStyle.project.expand-tabs>
<org-netbeans-modules-editor-indent.CodeStyle.project.text-limit-width>80</org-netbeans-modules-editor-indent.CodeStyle.project.text-limit-width>
<org-netbeans-modules-editor-indent.CodeStyle.project.text-line-wrap>none</org-netbeans-modules-editor-indent.CodeStyle.project.text-line-wrap>
<org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.indentCasesFromSwitch>false</org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.indentCasesFromSwitch>
<org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.spaces-per-tab>4</org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.spaces-per-tab>
<org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.tab-size>4</org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.tab-size>
<org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.indent-shift-width>4</org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.indent-shift-width>
<org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.expand-tabs>false</org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.expand-tabs>
<org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.text-limit-width>80</org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.text-limit-width>
<org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.text-line-wrap>none</org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.text-line-wrap>
<org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.continuationIndentSize>4</org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.continuationIndentSize>
<org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.allowConvertToStarImport>false</org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.allowConvertToStarImport>
<org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.allowConvertToStaticStarImport>false</org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.allowConvertToStaticStarImport>
<org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.importGroupsOrder>*;java</org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.importGroupsOrder>
<netbeans.compile.on.save>test</netbeans.compile.on.save>
</properties>
</project-shared-configuration>

antlr4-maven-plugin/pom.xml

@@ -0,0 +1,361 @@
<!--
[The "BSD license"]
ANTLR - Copyright (c) 2005-2010 Terence Parr
Maven Plugin - Copyright (c) 2009 Jim Idle
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. The name of the author may not be used to endorse or promote products
derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<!-- Maven model we are inheriting from
-->
<modelVersion>4.0.0</modelVersion>
<!--
Now that the ANTLR project has adopted Maven with a vengeance,
all ANTLR tools will be grouped under org.antlr and will be
controlled by a project member.
-->
<groupId>org.antlr</groupId>
<!--
This is the ANTLR plugin for ANTLR version 4.0 and above. It might
have been best to change the name of the plugin as the 4.0 plugins
behave a little differently, however for the sake of one transitional
phase to a much better plugin, it was decided that the name should
remain the same.
-->
<artifactId>antlr4-maven-plugin</artifactId>
<packaging>maven-plugin</packaging>
<!-- Note that as this plugin depends on the ANTLR tool itself
we cannot use the parent pom to control the version number
and MUST update <version> in this pom manually!
-->
<version>4.0-SNAPSHOT</version>
<name>Maven plugin for ANTLR V4</name>
<prerequisites>
<maven>3.0</maven>
</prerequisites>
<!--
Where does our actual project live on the interwebs.
-->
<url>http://antlr.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<description>
This is the brand new, re-written from scratch plugin for ANTLR v4.
Previous valiant efforts all suffered from being unable to modify the ANTLR Tool
itself to provide support not just for Maven oriented things but any other tool
that might wish to invoke ANTLR without resorting to the command line interface.
Rather than try to shoe-horn new code into the existing Mojo (in fact I think that
by incorporating a patch supplied by someone I ended up with two versions of the
Mojo), I elected to rewrite everything from scratch, including the documentation, so
that we might end up with a perfect Mojo that can do everything that ANTLR v4 supports
such as imported grammar processing, proper support for library directories and
locating token files from generated sources, and so on.
In the end I decided to also change the ANTLR Tool.java code so that it
would be the provider of all the things that a build tool needs, rather than
delegating things to 5 different tools. So, things like dependencies, dependency
sorting, option tracking, generating sources and so on are all folded back
in to ANTLR's Tool.java code, where they belong, and they now provide a
public interface to anyone that might want to interface with them.
One other goal of this rewrite was to completely document the whole thing
to death. Hence even this pom has more comments than functional elements,
in case I get run over by a bus or fall off a cliff while skiing.
Jim Idle - March 2009
</description>
<developers>
<developer>
<name>Jim Idle</name>
<url>http://www.temporal-wave.com</url>
<roles>
<role>Originator, version 4.0</role>
</roles>
</developer>
<developer>
<name>Terence Parr</name>
<url>http://antlr.org/wiki/display/~admin/Home</url>
<roles>
<role>Project lead - ANTLR</role>
</roles>
</developer>
<developer>
<name>David Holroyd</name>
<url>http://david.holroyd.me.uk/</url>
<roles>
<role>Originator - prior version</role>
</roles>
</developer>
<developer>
<name>Kenny MacDermid</name>
<url>mailto:kenny "at" kmdconsulting.ca</url>
<roles>
<role>Contributor - prior versions</role>
</roles>
</developer>
</developers>
<!-- Where do we track bugs for this project?
-->
<issueManagement>
<system>JIRA</system>
<url>http://antlr.org/jira/browse/ANTLR</url>
</issueManagement>
<!-- Location of the license description for this project
-->
<licenses>
<license>
<distribution>repo</distribution>
<name>The BSD License</name>
<url>http://www.antlr.org/LICENSE.txt</url>
</license>
</licenses>
<distributionManagement>
<repository>
<id>antlr-repo</id>
<name>ANTLR Testing repository</name>
<url>scpexe://antlr.org/home/mavensync/antlr-repo</url>
</repository>
<snapshotRepository>
<id>antlr-snapshot</id>
<name>ANTLR Testing Snapshot Repository</name>
<url>scpexe://antlr.org/home/mavensync/antlr-snapshot</url>
</snapshotRepository>
<site>
<id>antlr-repo</id>
<name>ANTLR Maven Plugin Web Site</name>
<url>scpexe://antlr.org/home/mavensync/antlr-maven-webs/antlr4-maven-plugin</url>
</site>
</distributionManagement>
<!--
Inform Maven of the ANTLR snapshot repository, which it will
need to consult to get the latest snapshot build of the runtime and tool
if it was not built and installed locally.
-->
<repositories>
<!--
This is the ANTLR repository.
-->
<repository>
<id>antlr-snapshot</id>
<name>ANTLR Testing Snapshot Repository</name>
<url>http://antlr.org/antlr-snapshot</url>
<snapshots>
<enabled>true</enabled>
<updatePolicy>always</updatePolicy>
</snapshots>
<releases>
<enabled>false</enabled>
</releases>
</repository>
</repositories>
<!-- Ancillary information for completeness
-->
<inceptionYear>2009</inceptionYear>
<mailingLists>
<mailingList>
<archive>http://antlr.markmail.org/</archive>
<otherArchives>
<otherArchive>http://www.antlr.org/pipermail/antlr-interest/</otherArchive>
</otherArchives>
<name>ANTLR Users</name>
<subscribe>http://www.antlr.org/mailman/listinfo/antlr-interest/</subscribe>
<unsubscribe>http://www.antlr.org/mailman/options/antlr-interest/</unsubscribe>
<post>antlr-interest@antlr.org</post>
</mailingList>
</mailingLists>
<organization>
<name>ANTLR.org</name>
<url>http://www.antlr.org</url>
</organization>
<!-- ============================================================================= -->
<!--
What are we dependent on for the Mojos to execute? We need the
plugin API itself and of course we need the ANTLR Tool and runtime
and any of their dependencies, which we inherit. The Tool itself provides
us with all the dependencies, so we need only name it here.
-->
<dependencies>
<!--
The things we need to build the target language recognizer
-->
<dependency>
<groupId>org.apache.maven</groupId>
<artifactId>maven-plugin-api</artifactId>
<version>2.0</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.apache.maven</groupId>
<artifactId>maven-project</artifactId>
<version>2.0</version>
</dependency>
<dependency>
<groupId>org.codehaus.plexus</groupId>
<artifactId>plexus-compiler-api</artifactId>
<version>1.8.6</version>
</dependency>
<!--
The version of ANTLR tool that this version of the plugin controls.
We have decided that this should be in lockstep with ANTLR itself, other
than -1 -2 -3 etc patch releases.
-->
<dependency>
<groupId>org.antlr</groupId>
<artifactId>antlr4</artifactId>
<version>4.0-SNAPSHOT</version>
</dependency>
<!--
Testing requirements...
-->
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.10</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.maven.shared</groupId>
<artifactId>maven-plugin-testing-harness</artifactId>
<version>1.1</version>
<scope>test</scope>
</dependency>
</dependencies>
<build>
<defaultGoal>install</defaultGoal>
<extensions>
<extension>
<groupId>org.apache.maven.wagon</groupId>
<artifactId>wagon-ssh-external</artifactId>
<version>2.2</version>
</extension>
</extensions>
<plugins>
<plugin>
<artifactId>maven-compiler-plugin</artifactId>
<version>2.3.2</version>
<configuration>
<source>1.6</source>
<target>1.6</target>
<showDeprecation>true</showDeprecation>
<showWarnings>true</showWarnings>
<compilerArguments>
<Xlint/>
</compilerArguments>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-site-plugin</artifactId>
<version>3.0</version>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-project-info-reports-plugin</artifactId>
<version>2.4</version>
<configuration>
<dependencyLocationsEnabled>false</dependencyLocationsEnabled>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-source-plugin</artifactId>
<version>2.1.2</version>
<executions>
<execution>
<id>attach-sources</id>
<goals>
<goal>jar</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>


@@ -0,0 +1,84 @@
/**
[The "BSD licence"]
ANTLR - Copyright (c) 2005-2008 Terence Parr
Maven Plugin - Copyright (c) 2009 Jim Idle
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. The name of the author may not be used to endorse or promote products
derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
package org.antlr.mojo.antlr4;
import org.antlr.v4.tool.ANTLRMessage;
import org.antlr.v4.tool.ANTLRToolListener;
import org.apache.maven.plugin.logging.Log;
/**
* The Maven plexus container gives us a Log logging provider
* which we can use to install an error listener so that the ANTLR
* tool can report its errors through the Maven log.
*/
public class Antlr4ErrorLog implements ANTLRToolListener {
private Log log;
/**
* Instantiate an ANTLR ErrorListener that communicates any messages
* it receives to the Maven error sink.
*
* @param log The Maven Error Log
*/
public Antlr4ErrorLog(Log log) {
this.log = log;
}
/**
* Sends an informational message to the Maven log sink.
* @param message The message to send to Maven
*/
@Override
public void info(String message) {
log.info(message);
}
/**
* Sends an error message from ANTLR analysis to the Maven Log sink.
*
* @param message The message to send to Maven.
*/
@Override
public void error(ANTLRMessage message) {
log.error(message.toString());
}
/**
* Sends a warning message to the Maven log sink.
*
* @param message
*/
@Override
public void warning(ANTLRMessage message) {
log.warn(message.toString());
}
}
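A minimal usage sketch (this mirrors how Antlr4Mojo, below, wires the listener
into the tool; the args array here is assumed):
Tool tool = new Tool(args); // org.antlr.v4.Tool
tool.addListener(new Antlr4ErrorLog(getLog())); // ANTLR messages now flow to the Maven log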


@@ -0,0 +1,572 @@
/**
[The "BSD licence"]
ANTLR - Copyright (c) 2005-2008 Terence Parr
Maven Plugin - Copyright (c) 2009 Jim Idle
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. The name of the author may not be used to endorse or promote products
derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
/* ========================================================================
* This is the definitive ANTLR4 Mojo set. All other sets are belong to us.
*/
package org.antlr.mojo.antlr4;
import antlr.RecognitionException;
import antlr.TokenStreamException;
import org.antlr.v4.Tool;
import org.antlr.v4.codegen.CodeGenerator;
import org.antlr.v4.tool.Grammar;
import org.apache.maven.plugin.AbstractMojo;
import org.apache.maven.plugin.MojoExecutionException;
import org.apache.maven.plugin.MojoFailureException;
import org.apache.maven.plugin.logging.Log;
import org.apache.maven.project.MavenProject;
import org.codehaus.plexus.compiler.util.scan.InclusionScanException;
import org.codehaus.plexus.compiler.util.scan.SimpleSourceInclusionScanner;
import org.codehaus.plexus.compiler.util.scan.SourceInclusionScanner;
import org.codehaus.plexus.compiler.util.scan.mapping.SourceMapping;
import org.codehaus.plexus.compiler.util.scan.mapping.SuffixMapping;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
/**
* Goal that picks up all the ANTLR grammars in a project and moves those that
* are required for generation of the compilable sources into the location
* that we use to compile them, such as target/generated-sources/antlr4 ...
*
* @goal antlr
*
* @phase process-sources
* @requiresDependencyResolution compile
* @requiresProject true
*
* @author <a href="mailto:jimi@temporal-wave.com">Jim Idle</a>
*/
public class Antlr4Mojo
extends AbstractMojo {
// First, let's deal with the options that the ANTLR tool itself
// can be configured by.
//
/**
* If set to true, then after the tool has processed an input grammar file
* it will report various statistics about the parser, such as information
* on cyclic DFAs, which rules may use backtracking, and so on.
*
* @parameter default-value="false"
*/
protected boolean report;
/**
* If set to true, then the ANTLR tool will print a version of the input
* grammar which is devoid of any actions that may be present in the input file.
*
* @parameter default-value="false"
*/
protected boolean printGrammar;
/**
* If set to true, then the code generated by the ANTLR code generator will
* be set to debug mode. This means that when run, the code will 'hang' and
* wait for a debug connection on a TCP port (49100 by default).
*
* @parameter default-value="false"
*/
protected boolean debug;
/**
* If set to true, then the generated parser will compute and report on
* profile information at runtime.
*
* @parameter default-value="false"
*/
protected boolean profile;
/**
* If set to true then the ANTLR tool will generate a description of the atn
* for each rule in <a href="http://www.graphviz.org">Dot format</a>
*
* @parameter default-value="false"
*/
protected boolean atn;
/**
* If set to true, the generated parser code will log rule entry and exit points
* to stdout as an aid to debugging.
*
* @parameter default-value="false"
*/
protected boolean trace;
/**
* If this parameter is set, it indicates that any warning or error messages returned
* by ANTLR should be formatted in the specified way. Currently, ANTLR supports the
* built-in formats of antlr, gnu and vs2005.
*
* @parameter default-value="antlr"
*/
protected String messageFormat;
/**
* If this parameter is set to true, then ANTLR will report all sorts of things
* about what it is doing such as the names of files and the version of ANTLR and so on.
*
* @parameter default-value="true"
*/
protected boolean verbose;
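// Additional pass-through flags for the tool; mapped to -Xverbose-dfa,
// -Xforce-atn and -abstract in execute() below.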
protected boolean verbose_dfa;
protected boolean force_atn;
protected boolean abstract_recognizer;
/**
* The number of alts, beyond which ANTLR will not generate a switch statement
* for the DFA.
*
* @parameter default-value="300"
*/
private int maxSwitchCaseLabels;
/**
* The number of alts, below which ANTLR will not choose to generate a switch
* statement over an if statement.
*/
private int minSwitchAlts;
/* --------------------------------------------------------------------
* The following are Maven specific parameters, rather than specifically
* options that the ANTLR tool can use.
*/
/**
* Provides an explicit list of all the grammars that should
* be included in the generate phase of the plugin. Note that the plugin
* is smart enough to realize that imported grammars should be included but
* not acted upon directly by the ANTLR Tool.
*
* Unless otherwise specified, the include list scans for and includes all
* files that end in ".g" in any directory beneath src/main/antlr4. Note that
* this version of the plugin looks for the directory antlr4 and not the directory
* antlr, so as to avoid clashes and confusion for projects that use both v3 and v4 grammars
* such as ANTLR itself.
*
* @parameter
*/
protected Set<String> includes = new HashSet<String>();
/**
* Provides an explicit list of any grammars that should be excluded from
* the generate phase of the plugin. Files listed here will not be sent for
* processing by the ANTLR tool.
*
* @parameter
*/
protected Set<String> excludes = new HashSet<String>();
/**
* @parameter expression="${project}"
* @required
* @readonly
*/
protected MavenProject project;
/**
* Specifies the Antlr directory containing grammar files. For
* antlr version 4.x we default this to a directory in the tree
* called antlr4 because the antlr3 directory is occupied by version
* 3.x grammars.
*
* @parameter default-value="${basedir}/src/main/antlr4"
* @required
*/
private File sourceDirectory;
/**
* Location for generated Java files. For antlr version 4.x we default
* this to a directory in the tree called antlr4 because the antlr
* directory is occupied by version 2.x grammars.
*
* @parameter default-value="${project.build.directory}/generated-sources/antlr4"
* @required
*/
private File outputDirectory;
/**
* Location for imported token files, e.g. <code>.tokens</code> and imported grammars.
* Note that ANTLR will not try to process grammars that it finds to be imported
* into other grammars (in the same processing session).
*
* @parameter default-value="${basedir}/src/main/antlr4/imports"
*/
private File libDirectory;
public File getSourceDirectory() {
return sourceDirectory;
}
public File getOutputDirectory() {
return outputDirectory;
}
public File getLibDirectory() {
return libDirectory;
}
void addSourceRoot(File outputDir) {
project.addCompileSourceRoot(outputDir.getPath());
}
/**
* An instance of the ANTLR tool build
*/
protected Tool tool;
/**
* The main entry point for this Mojo, it is responsible for converting
* ANTLR 4.x grammars into the target language specified by the grammar.
*
* @throws org.apache.maven.plugin.MojoExecutionException When something is discovered, such as a missing source
* @throws org.apache.maven.plugin.MojoFailureException When something really bad happens, such as not being able to create the ANTLR Tool
*/
@Override
public void execute()
throws MojoExecutionException, MojoFailureException {
Log log = getLog();
// Check to see if the user asked for debug information, then dump all the
// parameters we have picked up if they did.
//
if (log.isDebugEnabled()) {
// Excludes
//
for (String e : excludes) {
log.debug("ANTLR: Exclude: " + e);
}
// Includes
//
for (String e : includes) {
log.debug("ANTLR: Include: " + e);
}
// Output location
//
log.debug("ANTLR: Output: " + outputDirectory);
// Library directory
//
log.debug("ANTLR: Library: " + libDirectory);
// Flags
//
log.debug("ANTLR: report : " + report);
log.debug("ANTLR: printGrammar : " + printGrammar);
log.debug("ANTLR: debug : " + debug);
log.debug("ANTLR: profile : " + profile);
log.debug("ANTLR: atn : " + atn);
log.debug("ANTLR: trace : " + trace);
log.debug("ANTLR: messageFormat : " + messageFormat);
log.debug("ANTLR: maxSwitchCaseLabels : " + maxSwitchCaseLabels);
log.debug("ANTLR: minSwitchAlts : " + minSwitchAlts);
log.debug("ANTLR: verbose : " + verbose);
}
// Ensure that the output directory path is intact so that
// ANTLR can just write into it.
//
File outputDir = getOutputDirectory();
if (!outputDir.exists()) {
outputDir.mkdirs();
}
List<String> args = new ArrayList<String>();
if (getOutputDirectory() != null) {
args.add("-o");
args.add(outputDir.getAbsolutePath());
}
// Where do we want ANTLR to look for .tokens and import grammars?
//
if (getLibDirectory() != null && getLibDirectory().exists()) {
args.add("-lib");
args.add(libDirectory.getAbsolutePath());
}
// Next we need to set the options given to us in the pom into the
// tool instance we have created.
//
if (debug) {
args.add("-debug");
}
if (atn) {
args.add("-atn");
}
if (profile) {
args.add("-profile");
}
if (report) {
args.add("-report");
}
if (printGrammar) {
args.add("-print");
}
if (verbose_dfa) {
args.add("-Xverbose-dfa");
}
if (messageFormat != null && !"".equals(messageFormat)) {
args.add("-message-format");
args.add(messageFormat);
}
if (force_atn) {
args.add("-Xforce-atn");
}
if (abstract_recognizer) {
args.add("-abstract");
}
try {
// Now pick up all the files and process them with the Tool
//
processGrammarFiles(args, sourceDirectory, outputDirectory);
} catch (InclusionScanException ie) {
log.error(ie);
throw new MojoExecutionException("Fatal error occured while evaluating the names of the grammar files to analyze");
} catch (Exception e) {
getLog().error(e);
throw new MojoExecutionException(e.getMessage());
}
// Create an instance of the ANTLR 4 build tool
//
try {
tool = new Tool(args.toArray(new String[args.size()])) {
@Override
public void process(Grammar g, boolean gencode) {
getLog().info("Processing grammar: " + g.fileName);
super.process(g, gencode);
}
@Override
public Writer getOutputFileWriter(Grammar g, String fileName) throws IOException {
if (outputDirectory == null) {
return new StringWriter();
}
// output directory is a function of where the grammar file lives
// for subdir/T.g4, you get subdir here. Well, depends on -o etc...
// But, if this is a .tokens file, then we force the output to
// be the base output directory (or current directory if there is not a -o)
//
File outputDir;
if ( fileName.endsWith(CodeGenerator.VOCAB_FILE_EXTENSION) ) {
outputDir = new File(outputDirectory);
}
else {
outputDir = getOutputDirectory(g.fileName);
}
File outputFile = new File(outputDir, fileName);
if (!outputDir.exists()) {
outputDir.mkdirs();
}
URI relativePath = project.getBasedir().toURI().relativize(outputFile.toURI());
getLog().info(" Writing file: " + relativePath);
FileWriter fw = new FileWriter(outputFile);
return new BufferedWriter(fw);
}
};
tool.addListener(new Antlr4ErrorLog(log));
// we set some options directly
tool.trace = trace;
// Where do we want ANTLR to produce its output? (Base directory)
//
if (log.isDebugEnabled())
{
log.debug("Output directory base will be " + outputDirectory.getAbsolutePath());
}
// Tell ANTLR that we always want the output files to be produced in the output directory
// using the same relative path as the input file was to the input directory.
//
// tool.setForceRelativeOutput(true);
// Set working directory for ANTLR to be the base source directory
//
tool.inputDirectory = sourceDirectory;
if (!sourceDirectory.exists()) {
if (log.isInfoEnabled()) {
log.info("No ANTLR 4 grammars to compile in " + sourceDirectory.getAbsolutePath());
}
return;
} else {
if (log.isInfoEnabled()) {
log.info("ANTLR 4: Processing source directory " + sourceDirectory.getAbsolutePath());
}
}
} catch (Exception e) {
log.error("The attempt to create the ANTLR 4 build tool failed, see exception report for details", e);
throw new MojoFailureException("Jim failed you!");
}
tool.processGrammarsOnCommandLine();
// If any of the grammar files caused errors but did not throw exceptions
// then we should have accumulated errors in the counts
//
if (tool.getNumErrors() > 0) {
throw new MojoExecutionException("ANTLR 4 caught " + tool.getNumErrors() + " build errors.");
}
// All looks good, so we need to tell Maven about the sources that
// we just created.
//
if (project != null) {
// Tell Maven that there are some new source files underneath
// the output directory.
//
addSourceRoot(this.getOutputDirectory());
}
}
/**
* Scans sourceDirectory for grammar files and adds each one found, as a
* path relative to the source directory, to the ANTLR argument list.
*
* @param args            The ANTLR tool argument list being assembled
* @param sourceDirectory The root directory to scan for .g4 grammar files
* @param outputDirectory The directory where generated sources are written
* @throws antlr.TokenStreamException
* @throws antlr.RecognitionException
* @throws java.io.IOException
* @throws org.codehaus.plexus.compiler.util.scan.InclusionScanException
*/
private void processGrammarFiles(List<String> args, File sourceDirectory, File outputDirectory)
throws TokenStreamException, RecognitionException, IOException, InclusionScanException {
// Which files under the source set should we be looking for as grammar files
//
SourceMapping mapping = new SuffixMapping("g4", Collections.EMPTY_SET);
// What are the sets of includes (defaulted or otherwise).
//
Set<String> includes = getIncludesPatterns();
// Now, to the excludes, we need to add the imports directory
// as this is auto-scanned for imported grammars and so is auto-excluded from the
// set of grammar files we should be analyzing.
//
excludes.add("imports/**");
SourceInclusionScanner scan = new SimpleSourceInclusionScanner(includes, excludes);
scan.addSourceMapping(mapping);
Set<?> grammarFiles = scan.getIncludedSources(sourceDirectory, null);
if (grammarFiles.isEmpty()) {
if (getLog().isInfoEnabled()) {
getLog().info("No grammars to process");
}
} else {
// Tell the ANTLR tool that we want sorted build mode
//
// tool.setMake(true);
// Iterate each grammar file we were given and add it into the tool's list of
// grammars to process.
//
for (Object grammarObject : grammarFiles) {
if (!(grammarObject instanceof File)) {
getLog().error(String.format("Expected %s from %s.getIncludedSources, found %s.",
File.class.getName(),
scan.getClass().getName(),
grammarObject != null ? grammarObject.getClass().getName() : "null"));
}
File grammarFile = (File)grammarObject;
if (getLog().isDebugEnabled()) {
getLog().debug("Grammar file '" + grammarFile.getPath() + "' detected.");
}
String relPath = findSourceSubdir(sourceDirectory, grammarFile.getPath()) + grammarFile.getName();
if (getLog().isDebugEnabled()) {
getLog().debug(" ... relative path is: " + relPath);
}
args.add(relPath);
}
}
}
public Set<String> getIncludesPatterns() {
if (includes == null || includes.isEmpty()) {
return Collections.singleton("**/*.g4");
}
return includes;
}
/**
* Given the source directory File object and the full PATH to a
* grammar, produce the path to the named grammar file in relative
* terms to the sourceDirectory. This will then allow ANTLR to
* produce output relative to the base of the output directory and
* reflect the input organization of the grammar files.
*
* @param sourceDirectory The source directory File object
* @param grammarFileName The full path to the input grammar file
* @return The path to the grammar file relative to the source directory
*/
private String findSourceSubdir(File sourceDirectory, String grammarFileName) {
String srcPath = sourceDirectory.getPath() + File.separator;
if (!grammarFileName.startsWith(srcPath)) {
throw new IllegalArgumentException("expected " + grammarFileName + " to be prefixed with " + sourceDirectory);
}
File unprefixedGrammarFileName = new File(grammarFileName.substring(srcPath.length()));
return unprefixedGrammarFileName.getParent() + File.separator;
}
}


@@ -0,0 +1,8 @@
Imported Grammar Files
In order to have the ANTLR plugin automatically locate and use grammars used
as imports in your main .g4 files, you need to place the imported grammar
files in the imports directory beneath the root directory of your grammar
files (which is <<<src/main/antlr4>>> by default of course).
For a default layout, place your import grammars in the directory: <<<src/main/antlr4/imports>>>
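For example, a hypothetical layout in which Expr.g4 imports CommonLexer.g4:
+--
src/main/antlr4/org/foo/Expr.g4          processed by the plugin
src/main/antlr4/imports/CommonLexer.g4   found automatically via the imports directory
+--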


@@ -0,0 +1,47 @@
Libraries
The introduction of the import directive in a grammar allows reuse of common grammar files
as well as the ability to divide up functional components of large grammars. However it has
caused some confusion in regard to the fact that generated vocab files (<<<xxx.tokens>>>) can also
be searched for with the <<<<libDirectory>>>> directive.
This has confused two separate functions and imposes a structure upon the layout of
your grammar files in certain cases. If you have grammars that both use the import
directive and also require the use of a vocab file then you will need to locate
the grammar that generates the .tokens file alongside the grammar that uses it. This
is because you will need to use the <<<<libDirectory>>>> directive to specify the
location of your imported grammars and ANTLR will not find any vocab files in
this directory.
The .tokens files for any grammars are generated within the same output directory structure
as the .java files. So, wherever the .java files are generated, you will also find the .tokens
files. ANTLR looks for .tokens files in both the <<<<libDirectory>>>> and the output directory
where it is placing the generated .java files. Hence when you locate the grammars that generate
.tokens files in the same source directory as the ones that use the .tokens files, then
the Maven plugin will find the expected .tokens files.
The <<<<libDirectory>>>> is specified like any other directory parameter in Maven. Here is an
example:
+--
<plugin>
<groupId>org.antlr</groupId>
<artifactId>antlr4-maven-plugin</artifactId>
<version>4.0-SNAPSHOT</version>
<executions>
<execution>
<configuration>
<goals>
<goal>antlr</goal>
</goals>
<libDirectory>src/main/antlr4_imports</libDirectory>
</configuration>
</execution>
</executions>
</plugin>
+--


@@ -0,0 +1,40 @@
Simple configuration
If your grammar files are organized into the default locations as described in the {{{../index.html}introduction}},
then configuring the pom.xml file for your project is as simple as adding this to it
+--
<plugins>
<plugin>
<groupId>org.antlr</groupId>
<artifactId>antlr4-maven-plugin</artifactId>
<version>4.0-SNAPSHOT</version>
<executions>
<execution>
<goals>
<goal>antlr</goal>
</goals>
</execution>
</executions>
</plugin>
...
</plugins>
+--
When the mvn command is executed, all grammar files under <<<src/main/antlr4>>>, except any
import grammars under <<<src/main/antlr4/imports>>>, will be analyzed and converted to
Java source code in the output directory <<<target/generated-sources/antlr4>>>.
Your input files under <<<antlr4>>> should be stored in subdirectories that
reflect the package structure of your Java parsers. If your grammar file parser.g4 contains:
+---
@header {
package org.jimi.themuss;
}
+---
Then the .g4 file should be stored in: <<<src/main/antlr4/org/jimi/themuss/parser.g4>>>. This way
the generated .java files will correctly reflect the package structure in which they will
finally rest as classes.


@@ -0,0 +1,63 @@
-------------
ANTLR v4 Maven Plugin
-------------
Jim Idle
-------------
March 2009
-------------
ANTLR v4 Maven plugin
The ANTLR v4 Maven plugin is completely re-written as of version 4.0; if you are familiar
with prior versions, you should note that there are some behavioral differences that make
it worthwhile reading this documentation.
The job of the plugin is essentially to tell the standard ANTLR parser generator where the
input grammar files are and where the output files should be generated. As with all Maven
plugins, there are defaults, which you are advised, but not forced, to comply with.
This version of the plugin allows full control over ANTLR and allows configuration of all
options that are useful for a build system. The code required to calculate dependencies,
check the build order, and otherwise work with your grammar files is built into the ANTLR
tool as of version 4.0 of ANTLR and this plugin.
* Plugin Versioning
The plugin version tracks the version of the ANTLR tool that it controls. Hence if you
use version 4.0 of the plugin, you will build your grammars using version 4.0 of the
ANTLR tool, version 4.2 of the plugin will use version 4.2 of the ANTLR tool and so on.
You may also find that there are patch versions of the plugin such as 4.0-1, 4.0-2 and
so on. Use the latest patch release of the plugin.
The current version of the plugin is shown at the top of this page after the <<Last Deployed>> date.
* Default directories
As with all Maven plugins, this plugin will automatically default to standard locations
for your grammar and import files. Organizing your source code to reflect this standard
layout will greatly reduce the configuration effort required. The standard layout looks
like this:
+--
src/main/
|
+--- antlr4/... .g4 files organized in the required package structure
|
+--- imports/ .g4 files that are imported by other grammars.
+--
If your grammar is intended to be part of a package called org.foo.bar then you would
place it in the directory <<<src/main/antlr4/org/foo/bar>>>. The plugin will then produce
.java and .tokens files in the output directory <<<target/generated-sources/antlr4/org/foo/bar>>>
When the Java files are compiled they will be in the correct location for the javac
compiler without any special configuration. The generated java files are automatically
submitted for compilation by the plugin.
The <<<src/main/antlr4/imports>>> directory is treated in a special way. It should contain
any grammar files that are imported by other grammar files (do not make subdirectories here.)
Such files are never built on their own, but the plugin will automatically tell the ANTLR
tool to look in this directory for library files.


@@ -0,0 +1,193 @@
Usage
The Maven plugin for ANTLR is simple to use, and is at its simplest when you use the default
layouts for your grammars, like so:
+--
src/main/
|
+--- antlr4/... .g4 files organized in the required package structure
|
+--- imports/ .g4 files that are imported by other grammars.
+--
However, if you are not able to use this structure for whatever reason, you
can configure the locations of the grammar files, where library/import files
are located and where the output files should be generated.
* Plugin Descriptor
The current version of the plugin is shown at the top of this page after the <<Last Deployed>> date.
The full layout of the descriptor (at least, those parts that are not standard Maven things),
showing the default values of the configuration options, is as follows:
+--
<plugin>
<groupId>org.antlr</groupId>
<artifactId>antlr4-maven-plugin</artifactId>
<version>4.0-SNAPSHOT</version>
<executions>
<execution>
<configuration>
<goals>
<goal>antlr</goal>
</goals>
<conversionTimeout>10000</conversionTimeout>
<debug>false</debug>
<dfa>false</dfa>
<nfa>false</nfa>
<excludes><exclude/></excludes>
<includes><include/></includes>
<libDirectory>src/main/antlr4/imports</libDirectory>
<messageFormat>antlr</messageFormat>
<outputDirectory>target/generated-sources/antlr4</outputDirectory>
<printGrammar>false</printGrammar>
<profile>false</profile>
<report>false</report>
<sourceDirectory>src/main/antlr4</sourceDirectory>
<trace>false</trace>
<verbose>true</verbose>
</configuration>
</execution>
</executions>
</plugin>
+--
Note that you can create multiple executions, and thus build some grammars with different
options than others (for instance, setting the debug option).
** Configuration parameters
*** report
If set to true, then after the tool has processed an input grammar file
it will report various statistics about the parser, such as information
on cyclic DFAs, which rules may use backtracking, and so on.
default-value="false"
*** printGrammar
If set to true, then the ANTLR tool will print a version of the input
grammar which is devoid of any actions that may be present in the input file.
default-value = "false"
*** debug
If set to true, then the code generated by the ANTLR code generator will
be set to debug mode. This means that when run, the code will 'hang' and
wait for a debug connection on a TCP port (49100 by default).
default-value="false"
*** profile
If set to true, then the generated parser will compute and report on
profile information at runtime.
default-value="false"
*** nfa
If set to true then the ANTLR tool will generate a description of the nfa
for each rule in <a href="http://www.graphviz.org">Dot format</a>
default-value="false"
*** dfa
If set to true then the ANTLR tool will generate a description of the DFA
for each decision in the grammar in <a href="http://www.graphviz.org">Dot format</a>
default-value="false"
*** trace
If set to true, the generated parser code will log rule entry and exit points
to stdout as an aid to debugging.
default-value="false"
*** messageFormat
If this parameter is set, it indicates that any warning or error messages returned
by ANTLR should be formatted in the specified way. Currently, ANTLR supports the
built-in formats of antlr, gnu and vs2005.
default-value="antlr"
*** verbose
If this parameter is set to true, then ANTLR will report all sorts of things
about what it is doing such as the names of files and the version of ANTLR and so on.
default-value="true"
*** conversionTimeout
The number of milliseconds ANTLR will wait for analysis of each
alternative in the grammar to complete before giving up. You may raise
this value if ANTLR gives up on a complicated alt and tells you that
there are lots of ambiguities, but you know that it just needed to spend
more time on it. Note that this is an absolute time and not CPU time.
default-value="10000"
*** includes
Provides an explicit list of all the grammars that should
be included in the generate phase of the plugin. Note that the plugin
is smart enough to realize that imported grammars should be included but
not acted upon directly by the ANTLR Tool.
Unless otherwise specified, the include list scans for and includes all
files that end in ".g4" in any directory beneath src/main/antlr4. Note that
this version of the plugin looks for the directory antlr4 and not the directory
antlr, so as to avoid clashes and confusion for projects that use both v3 and v4 grammars
such as ANTLR itself.
*** excludes
Provides an explicit list of any grammars that should be excluded from
the generate phase of the plugin. Files listed here will not be sent for
processing by the ANTLR tool.
*** sourceDirectory
Specifies the Antlr directory containing grammar files. For
antlr version 4.x we default this to a directory in the tree
called antlr4 because the antlr directory is occupied by version
2.x grammars.
<<NB>> Take careful note that the default location for antlr grammars
is now <<antlr4>> and NOT <<antlr>>
default-value="<<<${basedir}/src/main/antlr4>>>"
*** outputDirectory
Location for generated Java files. For antlr version 4.x we default
this to a directory in the tree called antlr4 because the antlr
directory is occupied by version 2.x grammars.
default-value="<<<${project.build.directory}/generated-sources/antlr4>>>"
*** libDirectory
Location for imported token files, e.g. <code>.tokens</code> and imported grammars.
Note that ANTLR will not try to process grammars that it finds in this directory, but
will include this directory in the search for .tokens files and import grammars.
<<NB>> If you change the lib directory from the default but the directory is
still under <<<${basedir}/src/main/antlr4>>>, then you will need to exclude
the grammars from processing specifically, using the <<<<excludes>>>> option.
default-value="<<<${basedir}/src/main/antlr4/imports>>>"


@@ -0,0 +1,33 @@
<?xml version="1.0" encoding="UTF-8"?>
<project name="ANTLR v4 Maven plugin">
<publishDate position="left"/>
<version position="left"/>
<poweredBy>
<logo name="ANTLR Web Site" href="http://antlr.org/"
img="http://www.antlr.org/wiki/download/attachments/292/ANTLR4"/>
</poweredBy>
<body>
<links>
<item name="Antlr Web Site" href="http://www.antlr.org/"/>
</links>
<menu name="Overview">
<item name="Introduction" href="index.html"/>
<item name="Usage" href="usage.html"/>
</menu>
<menu name="Examples">
<item name="Simple configurations" href="examples/simple.html"/>
<item name="Using library directories" href="examples/libraries.html"/>
<item name="Using imported grammars" href="examples/import.html"/>
</menu>
<menu ref="reports" />
<menu ref="modules" />
</body>
</project>


@@ -1,4 +1,4 @@
version=4.0ea
version=4.0b4
antlr3.jar=/usr/local/lib/antlr-3.4-complete.jar
antlr3.jar=/usr/local/lib/antlr-3.5-complete.jar
build.sysclasspath=ignore


@@ -4,9 +4,9 @@
Make build.properties like this:
version=4.0ea
version=4.0b2
antlr3.jar=/usr/local/lib/antlr-3.4-complete.jar
antlr3.jar=/usr/local/lib/antlr-3.5-complete.jar
build.sysclasspath=ignore
-->
@@ -87,6 +87,7 @@ build.sysclasspath=ignore
</classpath>
</java>
<!--
<echo>gunit grammars</echo>
<java classname="org.antlr.Tool" fork="true" failonerror="false" maxmemory="300m"
dir="${basedir}/gunit/src/org/antlr/v4/gunit">
@@ -100,6 +101,7 @@ build.sysclasspath=ignore
<pathelement path="${java.class.path}"/>
</classpath>
</java>
-->
</target>
<target name="compile" depends="antlr" description="Compile for generic OS">
@@ -108,7 +110,7 @@ build.sysclasspath=ignore
<copy todir="${build.dir}/src" >
<fileset dir="${basedir}/tool/src/"/>
<fileset dir="${basedir}/runtime/Java/src/"/>
<fileset dir="${basedir}/gunit/src/"/>
<!-- <fileset dir="${basedir}/gunit/src/"/> -->
</copy>
<replace dir="${build.dir}/src" token="@version@" value="${version}"/>
<javac
@@ -161,12 +163,14 @@ build.sysclasspath=ignore
<include name="**/*.st"/>
<include name="**/*.stg"/>
</fileset>
<!--
<fileset dir="${basedir}/gunit/src/">
<include name="**/*.java"/>
<include name="**/*.g"/>
<include name="**/*.st"/>
<include name="**/*.stg"/>
</fileset>
-->
</copy>
<copy todir="${install.root.dir}">

contributors.txt

@@ -0,0 +1,53 @@
ANTLR Project Contributors Certification of Origin and Rights
All contributors to ANTLR v4 must formally agree to abide by this
certificate of origin by signing on the bottom with their github
userid, full name, email address (you can obscure your e-mail, but it
must be computable by human), and date.
By signing this agreement, you are warranting and representing that
you have the right to release code contributions or other content free
of any obligations to third parties and are granting Terence Parr and
ANTLR project contributors, henceforth referred to as The ANTLR
Project, a license to incorporate it into The ANTLR Project tools
(such as ANTLRWorks and StringTemplate) or related works under the BSD
license. You understand that The ANTLR Project may or may not
incorporate your contribution and you warrant and represent the
following:
1. I am the creator of all my contributions. I am the author of all
contributed work submitted and further warrant and represent that
such work is my original creation and I have the right to license
it to The ANTLR Project for release under the 3-clause BSD
license. I hereby grant The ANTLR Project a nonexclusive,
irrevocable, royalty-free, worldwide license to reproduce,
distribute, prepare derivative works, and otherwise use this
contribution as part of the ANTLR project, associated
documentation, books, and tools at no cost to The ANTLR Project.
2. I have the right to submit. This submission does not violate the
rights of any person or entity and that I have legal authority over
this submission and to make this certification.
3. If I violate another's rights, liability lies with me. I agree to
defend, indemnify, and hold The ANTLR Project and ANTLR users
harmless from any claim or demand, including reasonable attorney
fees, made by any third party due to or arising out of my violation
of these terms and conditions or my violation of the rights of
another person or entity.
4. I understand and agree that this project and the contribution are
public and that a record of the contribution (including all
personal information I submit with it, including my sign-off) is
maintained indefinitely and may be redistributed consistent with
this project or the open source license indicated in the file.
I have read this agreement and do so certify by adding my signoff to
the end of the following contributors list.
CONTRIBUTORS:
YYYY/MM/DD, github id, Full name, email
2012/07/12, parrt, Terence Parr, parrt@antlr.org
2012/09/18, sharwell, Sam Harwell, sam@tunnelvisionlabs.com
2012/10/10, stephengaito, Stephen Gaito, stephen@percepitsys.co.uk


@@ -1,80 +0,0 @@
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>org.antlr</groupId>
<artifactId>antlr4-gunit</artifactId>
<version>4.0-SNAPSHOT</version>
<packaging>jar</packaging>
<name>antlr4-gunit</name>
<url>http://www.antlr.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.10</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.antlr</groupId>
<artifactId>antlr4-runtime</artifactId>
<version>4.0-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.antlr</groupId>
<artifactId>antlr-runtime</artifactId>
<version>3.4.1-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.antlr</groupId>
<artifactId>ST4</artifactId>
<version>4.0.4</version>
</dependency>
</dependencies>
<build>
<sourceDirectory>src</sourceDirectory>
<resources>
<resource>
<directory>resources</directory>
</resource>
</resources>
<plugins>
<plugin>
<groupId>org.antlr</groupId>
<artifactId>antlr3-maven-plugin</artifactId>
<version>3.4</version>
<configuration>
<sourceDirectory>src</sourceDirectory>
<verbose>true</verbose>
</configuration>
<executions>
<execution>
<goals>
<goal>antlr</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<artifactId>maven-compiler-plugin</artifactId>
<version>2.3.2</version>
<configuration>
<source>1.6</source>
<target>1.6</target>
</configuration>
</plugin>
</plugins>
</build>
</project>


@@ -1,43 +0,0 @@
group jUnit;
jUnitClass(className, header, options, suites) ::= <<
<header>
import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.junit.Test;
import org.junit.Before;
import static org.junit.Assert.*;
public class <className> extends org.antlr.v4.gunit.gUnitBase {
@Before public void setup() {
lexerClassName = "<options.lexer>";
parserClassName = "<options.parser>";
<if(options.adaptor)>
adaptorClassName = "<options.adaptor>";
<endif>
}
<suites>
}
>>
header(action) ::= "<action>"
testSuite(name,cases) ::= <<
<cases:{c | <c>}; separator="\n\n"> <! use {...} iterator to get <i> !>
>>
parserRuleTestSuccess(input,expecting) ::= <<
>>
parserRuleTestAST(ruleName,scriptLine,input,expecting) ::= <<
@Test public void test_<name><i>() throws Exception {
// gunit test on line <scriptLine>
RuleReturnScope rstruct = (RuleReturnScope)execParser("<ruleName>", "<input>", <scriptLine>);
Object actual = ((Tree)rstruct.getTree()).toStringTree();
Object expecting = "<expecting>";
assertEquals("testing rule <ruleName>", expecting, actual);
}
>>
string(s) ::= "<s>"


@@ -1,46 +0,0 @@
tree grammar ASTVerifier;
options {
ASTLabelType=CommonTree;
tokenVocab = gUnit;
}
@header {
package org.antlr.v4.gunit;
}
gUnitDef
: ^('gunit' ID DOC_COMMENT? (optionsSpec|header)* testsuite+)
;
optionsSpec
: ^(OPTIONS option+)
;
option
: ^('=' ID ID)
| ^('=' ID STRING)
;
header : ^('@header' ACTION);
testsuite
: ^(SUITE ID ID DOC_COMMENT? testcase+)
| ^(SUITE ID DOC_COMMENT? testcase+)
;
testcase
: ^(TEST_OK DOC_COMMENT? input)
| ^(TEST_FAIL DOC_COMMENT? input)
| ^(TEST_RETVAL DOC_COMMENT? input RETVAL)
| ^(TEST_STDOUT DOC_COMMENT? input STRING)
| ^(TEST_STDOUT DOC_COMMENT? input ML_STRING)
| ^(TEST_TREE DOC_COMMENT? input TREE)
| ^(TEST_ACTION DOC_COMMENT? input ACTION)
;
input
: STRING
| ML_STRING
| FILENAME
;


@@ -1,151 +0,0 @@
package org.antlr.v4.gunit;
import org.antlr.runtime.*;
import org.antlr.runtime.tree.BufferedTreeNodeStream;
import org.antlr.runtime.tree.CommonTree;
import org.antlr.runtime.tree.CommonTreeNodeStream;
import org.antlr.stringtemplate.AutoIndentWriter;
import org.antlr.stringtemplate.StringTemplate;
import org.antlr.stringtemplate.StringTemplateGroup;
import java.io.*;
import java.util.ArrayList;
import java.util.List;
public class Gen {
// TODO: don't hardcode
public static final String TEMPLATE_FILE =
"/Users/parrt/antlr/code/antlr4/main/gunit/resources/org/antlr/v4/gunit/jUnit.stg";
public static void main(String[] args) throws Exception {
if ( args.length==0 ) System.exit(0);
String outputDirName = ".";
String fileName = args[0];
if ( args[0].equals("-o") ) {
if ( args.length<3 ) {
help();
System.exit(0);
}
outputDirName = args[1];
fileName = args[2];
}
new Gen().process(fileName, outputDirName);
}
public void process(String fileName, String outputDirName) throws Exception {
// PARSE SCRIPT
ANTLRFileStream fs = new ANTLRFileStream(fileName);
gUnitLexer lexer = new gUnitLexer(fs);
CommonTokenStream tokens = new CommonTokenStream(lexer);
gUnitParser parser = new gUnitParser(tokens);
RuleReturnScope r = parser.gUnitDef();
CommonTree scriptAST = (CommonTree)r.getTree();
System.out.println(scriptAST.toStringTree());
// ANALYZE
CommonTreeNodeStream nodes = new CommonTreeNodeStream(r.getTree());
Semantics sem = new Semantics(nodes);
sem.downup(scriptAST);
System.out.println("options="+sem.options);
// GENERATE CODE
FileReader fr = new FileReader(TEMPLATE_FILE);
StringTemplateGroup templates =
new StringTemplateGroup(fr);
fr.close();
BufferedTreeNodeStream bnodes = new BufferedTreeNodeStream(scriptAST);
jUnitGen gen = new jUnitGen(bnodes);
gen.setTemplateLib(templates);
RuleReturnScope r2 = gen.gUnitDef();
StringTemplate st = (StringTemplate)r2.getTemplate();
st.setAttribute("options", sem.options);
FileWriter fw = new FileWriter(outputDirName+"/"+sem.name+".java");
BufferedWriter bw = new BufferedWriter(fw);
st.write(new AutoIndentWriter(bw));
bw.close();
}
/** Borrowed from Leon Su in gunit v3 */
public static String escapeForJava(String inputString) {
// Gotta escape literal backslash before putting in specials that use escape.
inputString = inputString.replace("\\", "\\\\");
// Then double quotes need escaping (singles are OK of course).
inputString = inputString.replace("\"", "\\\"");
// note: replace newline to String ".\n", replace tab to String ".\t"
inputString = inputString.replace("\n", "\\n").replace("\t", "\\t").replace("\r", "\\r").replace("\b", "\\b").replace("\f", "\\f");
return inputString;
}
public static String normalizeTreeSpec(String t) {
List<String> words = new ArrayList<String>();
int i = 0;
StringBuilder word = new StringBuilder();
while ( i<t.length() ) {
if ( t.charAt(i)=='(' || t.charAt(i)==')' ) {
if ( word.length()>0 ) {
words.add(word.toString());
word.setLength(0);
}
words.add(String.valueOf(t.charAt(i)));
i++;
continue;
}
if ( Character.isWhitespace(t.charAt(i)) ) {
// upon WS, save word
if ( word.length()>0 ) {
words.add(word.toString());
word.setLength(0);
}
i++;
continue;
}
// ... "x" or ...("x"
if ( t.charAt(i)=='"' && (i-1)>=0 &&
(t.charAt(i-1)=='(' || Character.isWhitespace(t.charAt(i-1))) )
{
i++;
while ( i<t.length() && t.charAt(i)!='"' ) {
if ( t.charAt(i)=='\\' &&
(i+1)<t.length() && t.charAt(i+1)=='"' ) // handle \"
{
word.append('"');
i+=2;
continue;
}
word.append(t.charAt(i));
i++;
}
i++; // skip final "
words.add(word.toString());
word.setLength(0);
continue;
}
word.append(t.charAt(i));
i++;
}
if ( word.length()>0 ) {
words.add(word.toString());
}
//System.out.println("words="+words);
StringBuilder buf = new StringBuilder();
for (int j=0; j<words.size(); j++) {
if ( j>0 && !words.get(j).equals(")") &&
!words.get(j-1).equals("(") ) {
buf.append(' ');
}
buf.append(words.get(j));
}
return buf.toString();
}
public static void help() {
System.err.println("org.antlr.v4.gunit.Gen [-o output-dir] gunit-file");
}
}


@@ -1,21 +0,0 @@
package org.antlr.v4.gunit;
import org.antlr.runtime.*;
import org.antlr.runtime.tree.BufferedTreeNodeStream;
import org.antlr.runtime.tree.Tree;
public class Interp {
public static void main(String[] args) throws Exception {
String fileName = args[0];
ANTLRFileStream fs = new ANTLRFileStream(fileName);
gUnitLexer lexer = new gUnitLexer(fs);
CommonTokenStream tokens = new CommonTokenStream(lexer);
gUnitParser parser = new gUnitParser(tokens);
RuleReturnScope r = parser.gUnitDef();
System.out.println(((Tree)r.getTree()).toStringTree());
BufferedTreeNodeStream nodes = new BufferedTreeNodeStream(r.getTree());
ASTVerifier verifier = new ASTVerifier(nodes);
verifier.gUnitDef();
}
}


@@ -1,36 +0,0 @@
tree grammar Semantics;
options {
filter=true;
ASTLabelType=CommonTree;
tokenVocab = gUnit;
}
@header {
package org.antlr.v4.gunit;
import java.util.Map;
import java.util.HashMap;
}
@members {
public String name;
public Map<String,String> options = new HashMap<String,String>();
}
topdown
: optionsSpec
| gUnitDef
;
gUnitDef
: ^('gunit' ID .*) {name = $ID.text;}
;
optionsSpec
: ^(OPTIONS option+)
;
option
: ^('=' o=ID v=ID) {options.put($o.text, $v.text);}
| ^('=' o=ID v=STRING) {options.put($o.text, $v.text);}
;


@@ -1,155 +0,0 @@
grammar gUnit;
options {
output=AST;
ASTLabelType=CommonTree;
}
tokens { SUITE; TEST_OK; TEST_FAIL; TEST_RETVAL; TEST_STDOUT; TEST_TREE; TEST_ACTION; }
@header {
package org.antlr.v4.gunit;
}
@lexer::header {
package org.antlr.v4.gunit;
}
gUnitDef
: DOC_COMMENT? 'gunit' ID ';' (optionsSpec|header)* testsuite+
-> ^('gunit' ID DOC_COMMENT? optionsSpec? header? testsuite+)
;
optionsSpec
: OPTIONS (option ';')+ '}' -> ^(OPTIONS option+)
;
option
: ID '=' optionValue -> ^('=' ID optionValue)
;
optionValue
: ID
| STRING
;
header : '@header' ACTION -> ^('@header' ACTION);
testsuite
: DOC_COMMENT? treeRule=ID 'walks' parserRule=ID ':' testcase+
-> ^(SUITE $treeRule $parserRule DOC_COMMENT? testcase+)
| DOC_COMMENT? ID ':' testcase+ -> ^(SUITE ID DOC_COMMENT? testcase+)
;
testcase
: DOC_COMMENT? input 'OK' -> ^(TEST_OK DOC_COMMENT? input)
| DOC_COMMENT? input 'FAIL' -> ^(TEST_FAIL DOC_COMMENT? input)
| DOC_COMMENT? input 'returns' RETVAL -> ^(TEST_RETVAL DOC_COMMENT? input RETVAL)
| DOC_COMMENT? input '->' STRING -> ^(TEST_STDOUT DOC_COMMENT? input STRING)
| DOC_COMMENT? input '->' ML_STRING -> ^(TEST_STDOUT DOC_COMMENT? input ML_STRING)
| DOC_COMMENT? input '->' TREE -> ^(TEST_TREE DOC_COMMENT? input TREE)
| DOC_COMMENT? input '->' ACTION -> ^(TEST_ACTION DOC_COMMENT? input ACTION)
;
input
: STRING
| ML_STRING
| FILENAME
;
ACTION
: '{' ('\\}'|'\\' ~'}'|~('\\'|'}'))* '}' {setText(getText().substring(1, getText().length()-1));}
;
RETVAL
: NESTED_RETVAL {setText(getText().substring(1, getText().length()-1));}
;
fragment
NESTED_RETVAL :
'['
( options {greedy=false;}
: NESTED_RETVAL
| .
)*
']'
;
TREE : NESTED_AST (' '? NESTED_AST)*;
fragment
NESTED_AST
: '('
( NESTED_AST
| STRING_
| ~('('|')'|'"')
)*
')'
;
OPTIONS : 'options' WS* '{' ;
ID : ID_ ('.' ID_)* ;
fragment
ID_ : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
;
WS : ( ' '
| '\t'
| '\r'
| '\n'
) {$channel=HIDDEN;}
;
SL_COMMENT
: '//' ~('\r'|'\n')* '\r'? '\n' {$channel=HIDDEN;}
;
DOC_COMMENT
: '/**' (options {greedy=false;}:.)* '*/'
;
ML_COMMENT
: '/*' ~'*' (options {greedy=false;}:.)* '*/' {$channel=HIDDEN;}
;
STRING : STRING_ {setText(getText().substring(1, getText().length()-1));} ;
fragment
STRING_
: '"' ('\\"'|'\\' ~'"'|~('\\'|'"'))+ '"'
;
ML_STRING
: '<<' .* '>>' {setText(getText().substring(2, getText().length()-2));}
;
FILENAME
: '/' ID ('/' ID)*
| ID ('/' ID)+
;
/*
fragment
ESC : '\\'
( 'n'
| 'r'
| 't'
| 'b'
| 'f'
| '"'
| '\''
| '\\'
| '>'
| 'u' XDIGIT XDIGIT XDIGIT XDIGIT
| . // unknown, leave as it is
)
;
*/
fragment
XDIGIT :
'0' .. '9'
| 'a' .. 'f'
| 'A' .. 'F'
;


@@ -1,49 +0,0 @@
package org.antlr.v4.gunit;
import org.antlr.runtime.*;
import org.antlr.runtime.tree.TreeAdaptor;
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
public class gUnitBase {
public String lexerClassName;
public String parserClassName;
public String adaptorClassName;
public Object execParser(
String ruleName,
String input,
int scriptLine)
throws Exception
{
ANTLRStringStream is = new ANTLRStringStream(input);
Class lexerClass = Class.forName(lexerClassName);
Class[] lexArgTypes = new Class[]{CharStream.class};
Constructor lexConstructor = lexerClass.getConstructor(lexArgTypes);
Object[] lexArgs = new Object[]{is};
TokenSource lexer = (TokenSource)lexConstructor.newInstance(lexArgs);
is.setLine(scriptLine);
CommonTokenStream tokens = new CommonTokenStream(lexer);
Class parserClass = Class.forName(parserClassName);
Class[] parArgTypes = new Class[]{TokenStream.class};
Constructor parConstructor = parserClass.getConstructor(parArgTypes);
Object[] parArgs = new Object[]{tokens};
Parser parser = (Parser)parConstructor.newInstance(parArgs);
// set up customized tree adaptor if necessary
if ( adaptorClassName!=null ) {
parArgTypes = new Class[]{TreeAdaptor.class};
Method m = parserClass.getMethod("setTreeAdaptor", parArgTypes);
Class adaptorClass = Class.forName(adaptorClassName);
m.invoke(parser, adaptorClass.newInstance());
}
Method ruleMethod = parserClass.getMethod(ruleName);
// INVOKE RULE
return ruleMethod.invoke(parser);
}
}


@@ -1,53 +0,0 @@
tree grammar jUnitGen;
options {
output=template;
ASTLabelType=CommonTree;
tokenVocab = gUnit;
}
@header {
package org.antlr.v4.gunit;
}
gUnitDef
: ^('gunit' ID DOC_COMMENT? (optionsSpec|header)* suites+=testsuite+)
-> jUnitClass(className={$ID.text}, header={$header.st}, suites={$suites})
;
optionsSpec
: ^(OPTIONS option+)
;
option
: ^('=' ID ID)
| ^('=' ID STRING)
;
header : ^('@header' ACTION) -> header(action={$ACTION.text});
testsuite
: ^(SUITE rule=ID ID DOC_COMMENT? cases+=testcase[$rule.text]+)
| ^(SUITE rule=ID DOC_COMMENT? cases+=testcase[$rule.text]+)
-> testSuite(name={$rule.text}, cases={$cases})
;
testcase[String ruleName]
: ^(TEST_OK DOC_COMMENT? input)
| ^(TEST_FAIL DOC_COMMENT? input)
| ^(TEST_RETVAL DOC_COMMENT? input RETVAL)
| ^(TEST_STDOUT DOC_COMMENT? input STRING)
| ^(TEST_STDOUT DOC_COMMENT? input ML_STRING)
| ^(TEST_TREE DOC_COMMENT? input TREE)
-> parserRuleTestAST(ruleName={$ruleName},
input={$input.st},
expecting={Gen.normalizeTreeSpec($TREE.text)},
scriptLine={$input.start.getLine()})
| ^(TEST_ACTION DOC_COMMENT? input ACTION)
;
input
: STRING -> string(s={Gen.escapeForJava($STRING.text)})
| ML_STRING -> string(s={Gen.escapeForJava($ML_STRING.text)})
| FILENAME
;

runtime/Java/doxyfile

@@ -0,0 +1,238 @@
# Doxyfile 1.5.2
#---------------------------------------------------------------------------
# Project related configuration options
#---------------------------------------------------------------------------
DOXYFILE_ENCODING = UTF-8
PROJECT_NAME = "ANTLR v4 API"
PROJECT_NUMBER = 4.0
OUTPUT_DIRECTORY = api
CREATE_SUBDIRS = NO
OUTPUT_LANGUAGE = English
BRIEF_MEMBER_DESC = YES
REPEAT_BRIEF = YES
ABBREVIATE_BRIEF = "The $name class" \
"The $name widget" \
"The $name file" \
is \
provides \
specifies \
contains \
represents \
a \
an \
the
ALWAYS_DETAILED_SEC = YES
INLINE_INHERITED_MEMB = NO
FULL_PATH_NAMES = YES
STRIP_FROM_PATH = /Applications/
STRIP_FROM_INC_PATH =
SHORT_NAMES = NO
JAVADOC_AUTOBRIEF = NO
MULTILINE_CPP_IS_BRIEF = NO
DETAILS_AT_TOP = NO
INHERIT_DOCS = YES
SEPARATE_MEMBER_PAGES = NO
TAB_SIZE = 8
ALIASES =
OPTIMIZE_OUTPUT_FOR_C = NO
OPTIMIZE_OUTPUT_JAVA = YES
BUILTIN_STL_SUPPORT = NO
CPP_CLI_SUPPORT = NO
DISTRIBUTE_GROUP_DOC = NO
SUBGROUPING = YES
#---------------------------------------------------------------------------
# Build related configuration options
#---------------------------------------------------------------------------
EXTRACT_ALL = YES
EXTRACT_PRIVATE = YES
EXTRACT_STATIC = YES
EXTRACT_LOCAL_CLASSES = YES
EXTRACT_LOCAL_METHODS = NO
HIDE_UNDOC_MEMBERS = NO
HIDE_UNDOC_CLASSES = NO
HIDE_FRIEND_COMPOUNDS = NO
HIDE_IN_BODY_DOCS = NO
INTERNAL_DOCS = NO
CASE_SENSE_NAMES = NO
HIDE_SCOPE_NAMES = NO
SHOW_INCLUDE_FILES = YES
INLINE_INFO = YES
SORT_MEMBER_DOCS = YES
SORT_BRIEF_DOCS = NO
SORT_BY_SCOPE_NAME = NO
GENERATE_TODOLIST = YES
GENERATE_TESTLIST = NO
GENERATE_BUGLIST = NO
GENERATE_DEPRECATEDLIST= NO
ENABLED_SECTIONS =
MAX_INITIALIZER_LINES = 30
SHOW_USED_FILES = YES
SHOW_DIRECTORIES = NO
FILE_VERSION_FILTER =
#---------------------------------------------------------------------------
# configuration options related to warning and progress messages
#---------------------------------------------------------------------------
QUIET = NO
WARNINGS = YES
WARN_IF_UNDOCUMENTED = YES
WARN_IF_DOC_ERROR = YES
WARN_NO_PARAMDOC = NO
WARN_FORMAT = "$file:$line: $text"
WARN_LOGFILE =
#---------------------------------------------------------------------------
# configuration options related to the input files
#---------------------------------------------------------------------------
INPUT = /Users/parrt/antlr/code/antlr4/runtime/Java/src
INPUT_ENCODING = UTF-8
FILE_PATTERNS = *.java
RECURSIVE = YES
EXCLUDE =
EXCLUDE_SYMLINKS = NO
EXCLUDE_PATTERNS =
EXCLUDE_SYMBOLS = java::util \
java::io
EXAMPLE_PATH =
EXAMPLE_PATTERNS = *
EXAMPLE_RECURSIVE = NO
IMAGE_PATH =
INPUT_FILTER =
FILTER_PATTERNS =
FILTER_SOURCE_FILES = NO
#---------------------------------------------------------------------------
# configuration options related to source browsing
#---------------------------------------------------------------------------
SOURCE_BROWSER = YES
INLINE_SOURCES = NO
STRIP_CODE_COMMENTS = YES
REFERENCED_BY_RELATION = NO
REFERENCES_RELATION = NO
REFERENCES_LINK_SOURCE = YES
USE_HTAGS = NO
VERBATIM_HEADERS = YES
#---------------------------------------------------------------------------
# configuration options related to the alphabetical class index
#---------------------------------------------------------------------------
ALPHABETICAL_INDEX = NO
COLS_IN_ALPHA_INDEX = 5
IGNORE_PREFIX =
#---------------------------------------------------------------------------
# configuration options related to the HTML output
#---------------------------------------------------------------------------
GENERATE_HTML = YES
HTML_OUTPUT = .
HTML_FILE_EXTENSION = .html
HTML_HEADER =
HTML_FOOTER =
HTML_STYLESHEET =
HTML_ALIGN_MEMBERS = YES
GENERATE_HTMLHELP = NO
CHM_FILE =
HHC_LOCATION =
GENERATE_CHI = NO
BINARY_TOC = NO
TOC_EXPAND = NO
DISABLE_INDEX = NO
ENUM_VALUES_PER_LINE = 4
GENERATE_TREEVIEW = NO
TREEVIEW_WIDTH = 250
#---------------------------------------------------------------------------
# configuration options related to the LaTeX output
#---------------------------------------------------------------------------
GENERATE_LATEX = NO
LATEX_OUTPUT = latex
LATEX_CMD_NAME = latex
MAKEINDEX_CMD_NAME = makeindex
COMPACT_LATEX = NO
PAPER_TYPE = a4wide
EXTRA_PACKAGES =
LATEX_HEADER =
PDF_HYPERLINKS = NO
USE_PDFLATEX = YES
LATEX_BATCHMODE = NO
LATEX_HIDE_INDICES = NO
#---------------------------------------------------------------------------
# configuration options related to the RTF output
#---------------------------------------------------------------------------
GENERATE_RTF = NO
RTF_OUTPUT = rtf
COMPACT_RTF = NO
RTF_HYPERLINKS = NO
RTF_STYLESHEET_FILE =
RTF_EXTENSIONS_FILE =
#---------------------------------------------------------------------------
# configuration options related to the man page output
#---------------------------------------------------------------------------
GENERATE_MAN = NO
MAN_OUTPUT = man
MAN_EXTENSION = .3
MAN_LINKS = NO
#---------------------------------------------------------------------------
# configuration options related to the XML output
#---------------------------------------------------------------------------
GENERATE_XML = NO
XML_OUTPUT = xml
XML_SCHEMA =
XML_DTD =
XML_PROGRAMLISTING = YES
#---------------------------------------------------------------------------
# configuration options for the AutoGen Definitions output
#---------------------------------------------------------------------------
GENERATE_AUTOGEN_DEF = NO
#---------------------------------------------------------------------------
# configuration options related to the Perl module output
#---------------------------------------------------------------------------
GENERATE_PERLMOD = NO
PERLMOD_LATEX = NO
PERLMOD_PRETTY = YES
PERLMOD_MAKEVAR_PREFIX =
#---------------------------------------------------------------------------
# Configuration options related to the preprocessor
#---------------------------------------------------------------------------
ENABLE_PREPROCESSING = YES
MACRO_EXPANSION = NO
EXPAND_ONLY_PREDEF = NO
SEARCH_INCLUDES = YES
INCLUDE_PATH =
INCLUDE_FILE_PATTERNS =
PREDEFINED =
EXPAND_AS_DEFINED =
SKIP_FUNCTION_MACROS = YES
#---------------------------------------------------------------------------
# Configuration::additions related to external references
#---------------------------------------------------------------------------
TAGFILES =
GENERATE_TAGFILE =
ALLEXTERNALS = NO
EXTERNAL_GROUPS = YES
PERL_PATH = /usr/bin/perl
#---------------------------------------------------------------------------
# Configuration options related to the dot tool
#---------------------------------------------------------------------------
CLASS_DIAGRAMS = NO
MSCGEN_PATH = /Applications/Doxygen.app/Contents/Resources/
HIDE_UNDOC_RELATIONS = YES
HAVE_DOT = YES
CLASS_GRAPH = YES
COLLABORATION_GRAPH = YES
GROUP_GRAPHS = YES
UML_LOOK = YES
TEMPLATE_RELATIONS = NO
INCLUDE_GRAPH = YES
INCLUDED_BY_GRAPH = YES
CALL_GRAPH = NO
CALLER_GRAPH = NO
GRAPHICAL_HIERARCHY = YES
DIRECTORY_GRAPH = YES
DOT_IMAGE_FORMAT = png
DOT_PATH = /Applications/Doxygen.app/Contents/Resources/
DOTFILE_DIRS =
DOT_GRAPH_MAX_NODES = 50
DOT_TRANSPARENT = NO
DOT_MULTI_TARGETS = NO
GENERATE_LEGEND = YES
DOT_CLEANUP = YES
#---------------------------------------------------------------------------
# Configuration::additions related to the search engine
#---------------------------------------------------------------------------
SEARCHENGINE = NO


@@ -0,0 +1,36 @@
<?xml version="1.0" encoding="UTF-8"?>
<project-shared-configuration>
<!--
This file contains additional configuration written by modules in the NetBeans IDE.
The configuration is intended to be shared among all the users of project and
therefore it is assumed to be part of version control checkout.
Without this configuration present, some functionality in the IDE may be limited or fail altogether.
-->
<properties xmlns="http://www.netbeans.org/ns/maven-properties-data/1">
<!--
Properties that influence various parts of the IDE, especially code formatting and the like.
You can copy and paste the single properties, into the pom.xml file and the IDE will pick them up.
That way multiple projects can share the same settings (useful for formatting rules for example).
Any value defined here will override the pom.xml file value but is only applicable to the current project.
-->
<org-netbeans-modules-editor-indent.CodeStyle.usedProfile>project</org-netbeans-modules-editor-indent.CodeStyle.usedProfile>
<org-netbeans-modules-editor-indent.CodeStyle.project.spaces-per-tab>4</org-netbeans-modules-editor-indent.CodeStyle.project.spaces-per-tab>
<org-netbeans-modules-editor-indent.CodeStyle.project.tab-size>4</org-netbeans-modules-editor-indent.CodeStyle.project.tab-size>
<org-netbeans-modules-editor-indent.CodeStyle.project.indent-shift-width>4</org-netbeans-modules-editor-indent.CodeStyle.project.indent-shift-width>
<org-netbeans-modules-editor-indent.CodeStyle.project.expand-tabs>true</org-netbeans-modules-editor-indent.CodeStyle.project.expand-tabs>
<org-netbeans-modules-editor-indent.CodeStyle.project.text-limit-width>80</org-netbeans-modules-editor-indent.CodeStyle.project.text-limit-width>
<org-netbeans-modules-editor-indent.CodeStyle.project.text-line-wrap>none</org-netbeans-modules-editor-indent.CodeStyle.project.text-line-wrap>
<org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.indentCasesFromSwitch>false</org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.indentCasesFromSwitch>
<org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.spaces-per-tab>4</org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.spaces-per-tab>
<org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.tab-size>4</org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.tab-size>
<org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.indent-shift-width>4</org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.indent-shift-width>
<org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.expand-tabs>false</org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.expand-tabs>
<org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.text-limit-width>80</org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.text-limit-width>
<org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.text-line-wrap>none</org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.text-line-wrap>
<org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.continuationIndentSize>4</org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.continuationIndentSize>
<org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.allowConvertToStarImport>false</org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.allowConvertToStarImport>
<org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.allowConvertToStaticStarImport>false</org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.allowConvertToStaticStarImport>
<org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.importGroupsOrder>*;java</org-netbeans-modules-editor-indent.text.x-java.CodeStyle.project.importGroupsOrder>
<netbeans.compile.on.save>test</netbeans.compile.on.save>
</properties>
</project-shared-configuration>


@@ -19,21 +19,21 @@
<dependency>
<groupId>org.antlr</groupId>
<artifactId>ST4</artifactId>
<version>4.0.4-SNAPSHOT</version>
<version>4.0.4</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.abego</groupId>
<artifactId>treelayout.core</artifactId>
<version>1.0</version>
<scope>system</scope>
<systemPath>${project.basedir}/lib/org.abego.treelayout.core.jar</systemPath>
<groupId>org.abego.treelayout</groupId>
<artifactId>org.abego.treelayout.core</artifactId>
<version>1.0.1</version>
<scope>compile</scope>
</dependency>
</dependencies>
<build>
<sourceDirectory>src</sourceDirectory>
<resources/>
<plugins>
<plugin>
@@ -42,6 +42,12 @@
<configuration>
<source>1.6</source>
<target>1.6</target>
<showWarnings>true</showWarnings>
<showDeprecation>true</showDeprecation>
<compilerArguments>
<Xlint/>
</compilerArguments>
<compilerArgument>-Xlint:-serial</compilerArgument>
</configuration>
</plugin>
</plugins>


@@ -29,30 +29,36 @@
package org.antlr.v4.runtime;
import org.antlr.v4.runtime.atn.ATNConfigSet;
import org.antlr.v4.runtime.dfa.DFA;
import org.antlr.v4.runtime.misc.NotNull;
import org.antlr.v4.runtime.misc.Nullable;
import java.util.BitSet;
/** How to emit recognition errors */
public interface ANTLRErrorListener<Symbol> {
/** Upon syntax error, notify any interested parties. This is not how to
* recover from errors or compute error messages. The parser
* ANTLRErrorStrategy specifies how to recover from syntax errors
* and how to compute error messages. This listener's job is simply to
* emit a computed message, though it has enough information to
* create its own message in many cases.
public interface ANTLRErrorListener {
/** Upon syntax error, notify any interested parties. This is not
* how to recover from errors or compute error messages. The
* parser ANTLRErrorStrategy specifies how to recover from syntax
* errors and how to compute error messages. This listener's job
* is simply to emit a computed message, though it has enough
* information to create its own message in many cases.
*
* The RecognitionException is non-null for all syntax errors except
* when we discover mismatched token errors that we can recover from
* in-line, without returning from the surrounding rule (via the
* single token insertion and deletion mechanism).
* The RecognitionException is non-null for all syntax errors
* except when we discover mismatched token errors that we can
* recover from in-line, without returning from the surrounding
* rule (via the single token insertion and deletion mechanism).
*
* @param recognizer
* What parser got the error. From this object, you
* can access the context as well as the input stream.
* What parser got the error. From this
* object, you can access the context as well
* as the input stream.
* @param offendingSymbol
* The offending token in the input token stream, unless recognizer
* is a lexer (then it's null)
* If no viable alternative error, e has token
* at which we started production for the decision.
* The offending token in the input token
stream, unless recognizer is a lexer (then it's null). If
* no viable alternative error, e has token at which we
* started production for the decision.
* @param line
At what line in input did the error occur? This always refers to
* stopTokenIndex
@@ -66,10 +72,40 @@ public interface ANTLRErrorListener<Symbol> {
* the parser was able to recover in line without exiting the
* surrounding rule.
*/
public <T extends Symbol> void error(Recognizer<T, ?> recognizer,
@Nullable T offendingSymbol,
int line,
int charPositionInLine,
String msg,
@Nullable RecognitionException e);
public void syntaxError(Recognizer<?, ?> recognizer,
@Nullable Object offendingSymbol,
int line,
int charPositionInLine,
String msg,
@Nullable RecognitionException e);
/** Called when the parser detects a true ambiguity: an input
* sequence can be matched literally by two or more passes through
* the grammar. ANTLR resolves the ambiguity in favor of the
* alternative appearing first in the grammar. The start and stop
* index are zero-based absolute indices into the token
* stream. ambigAlts is a set of alternative numbers that can
* match the input sequence. This method is only called when we
* are parsing with full context.
*/
void reportAmbiguity(@NotNull Parser recognizer,
DFA dfa, int startIndex, int stopIndex,
@NotNull BitSet ambigAlts,
@NotNull ATNConfigSet configs);
void reportAttemptingFullContext(@NotNull Parser recognizer,
@NotNull DFA dfa,
int startIndex, int stopIndex,
@NotNull ATNConfigSet configs);
/** Called by the parser when it finds a conflict that is resolved
* by retrying the parse with full context. This is not a
* warning; it simply notifies you that your grammar is more
* complicated than Strong LL can handle. The parser moved up to
* full context parsing for that input sequence.
*/
void reportContextSensitivity(@NotNull Parser recognizer,
@NotNull DFA dfa,
int startIndex, int stopIndex,
@NotNull ATNConfigSet configs);
}
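
As a rough illustration of how the two halves of this listener interface fit together, here is a minimal sketch. It assumes the BaseErrorListener adapter added elsewhere in this commit; the class name DiagnosticLoggingListener is hypothetical.

import org.antlr.v4.runtime.*;
import org.antlr.v4.runtime.atn.ATNConfigSet;
import org.antlr.v4.runtime.dfa.DFA;
import java.util.BitSet;

// Hypothetical listener: logs syntax errors and full-context ambiguity reports.
public class DiagnosticLoggingListener extends BaseErrorListener {
    @Override
    public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol,
                            int line, int charPositionInLine,
                            String msg, RecognitionException e)
    {
        System.err.println("syntax error at "+line+":"+charPositionInLine+" "+msg);
    }

    @Override
    public void reportAmbiguity(Parser recognizer, DFA dfa,
                                int startIndex, int stopIndex,
                                BitSet ambigAlts, ATNConfigSet configs)
    {
        // Only called during full-context parsing; every alt in ambigAlts
        // matches the tokens from startIndex to stopIndex.
        System.err.println("ambiguous alts "+ambigAlts+" for tokens "+
                           startIndex+".."+stopIndex);
    }
}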


@@ -1,10 +1,5 @@
package org.antlr.v4.runtime;
import org.antlr.v4.runtime.atn.ATNConfigSet;
import org.antlr.v4.runtime.atn.DecisionState;
import org.antlr.v4.runtime.atn.SemanticContext;
import org.antlr.v4.runtime.dfa.DFA;
import org.antlr.v4.runtime.misc.IntervalSet;
import org.antlr.v4.runtime.misc.NotNull;
import org.antlr.v4.runtime.misc.Nullable;
@@ -114,30 +109,4 @@ public interface ANTLRErrorStrategy {
void reportError(@NotNull Parser recognizer,
@Nullable RecognitionException e)
throws RecognitionException;
/** Called when the parser detects a true ambiguity: an input sequence can be matched
* literally by two or more passes through the grammar. ANTLR resolves the ambiguity in
* favor of the alternative appearing first in the grammar. The start and stop index are
* zero-based absolute indices into the token stream. ambigAlts is a set of alternative numbers
* that can match the input sequence. This method is only called when we are parsing with
* full context.
*/
void reportAmbiguity(@NotNull Parser recognizer,
DFA dfa, int startIndex, int stopIndex, @NotNull IntervalSet ambigAlts,
@NotNull ATNConfigSet configs);
void reportAttemptingFullContext(@NotNull Parser recognizer,
@NotNull DFA dfa,
int startIndex, int stopIndex,
@NotNull ATNConfigSet configs);
/** Called by the parser when it finds a conflict that is resolved by retrying the parse
* with full context. This is not a warning; it simply notifies you that your grammar
* is more complicated than Strong LL can handle. The parser moved up to full context
* parsing for that input sequence.
*/
void reportContextSensitivity(@NotNull Parser recognizer,
@NotNull DFA dfa,
int startIndex, int stopIndex,
@NotNull ATNConfigSet configs);
}


@@ -28,6 +28,8 @@
*/
package org.antlr.v4.runtime;
import org.antlr.v4.runtime.misc.Interval;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
@@ -72,33 +74,37 @@ public class ANTLRInputStream implements CharStream {
this(r, INITIAL_BUFFER_SIZE, READ_BUFFER_SIZE);
}
public ANTLRInputStream(Reader r, int size) throws IOException {
this(r, size, READ_BUFFER_SIZE);
public ANTLRInputStream(Reader r, int initialSize) throws IOException {
this(r, initialSize, READ_BUFFER_SIZE);
}
public ANTLRInputStream(Reader r, int size, int readChunkSize) throws IOException {
load(r, size, readChunkSize);
public ANTLRInputStream(Reader r, int initialSize, int readChunkSize) throws IOException {
load(r, initialSize, readChunkSize);
}
public ANTLRInputStream(InputStream input) throws IOException {
this(new InputStreamReader(input), INITIAL_BUFFER_SIZE);
}
public ANTLRInputStream(InputStream input) throws IOException {
this(new InputStreamReader(input), INITIAL_BUFFER_SIZE);
}
public ANTLRInputStream(InputStream input, int size) throws IOException {
this(new InputStreamReader(input), size);
}
public ANTLRInputStream(InputStream input, int initialSize) throws IOException {
this(new InputStreamReader(input), initialSize);
}
public void load(Reader r, int size, int readChunkSize)
throws IOException
{
if ( r==null ) {
return;
}
if ( size<=0 ) {
size = INITIAL_BUFFER_SIZE;
}
if ( readChunkSize<=0 ) {
readChunkSize = READ_BUFFER_SIZE;
public ANTLRInputStream(InputStream input, int initialSize, int readChunkSize) throws IOException {
this(new InputStreamReader(input), initialSize, readChunkSize);
}
public void load(Reader r, int size, int readChunkSize)
throws IOException
{
if ( r==null ) {
return;
}
if ( size<=0 ) {
size = INITIAL_BUFFER_SIZE;
}
if ( readChunkSize<=0 ) {
readChunkSize = READ_BUFFER_SIZE;
}
// System.out.println("load "+size+" in chunks of "+readChunkSize);
try {
@@ -138,6 +144,11 @@ public class ANTLRInputStream implements CharStream {
@Override
public void consume() {
if (p >= n) {
assert LA(1) == CharStream.EOF;
throw new IllegalStateException("cannot consume EOF");
}
//System.out.println("prev p="+p+", c="+(char)data[p]);
if ( p < n ) {
p++;
@@ -153,13 +164,13 @@
if ( i<0 ) {
i++; // e.g., translate LA(-1) to use offset i=0; then data[p+0-1]
if ( (p+i-1) < 0 ) {
return CharStream.EOF; // invalid; no char before first char
return IntStream.EOF; // invalid; no char before first char
}
}
if ( (p+i-1) >= n ) {
//System.out.println("char LA("+i+")=EOF; p="+p);
return CharStream.EOF;
return IntStream.EOF;
}
//System.out.println("char LA("+i+")="+(char)data[p+i-1]+"; p="+p);
//System.out.println("LA("+i+"); p="+p+" n="+n+" data.length="+data.length);
@@ -210,7 +221,9 @@
}
@Override
public String substring(int start, int stop) {
public String getText(Interval interval) {
int start = interval.a;
int stop = interval.b;
if ( stop >= n ) stop = n-1;
int count = stop - start + 1;
if ( start >= n ) return "";


@@ -29,18 +29,25 @@
package org.antlr.v4.runtime;
import org.antlr.v4.runtime.misc.ParseCancellationException;
/** Bail out of parser at first syntax error. Do this to use it:
* myparser.setErrorHandler(new BailErrorStrategy<Token>());
* <p/>
* {@code myparser.setErrorHandler(new BailErrorStrategy());}
*/
public class BailErrorStrategy extends DefaultErrorStrategy {
/** Instead of recovering from exception e, Re-throw wrote it wrapped
* in a generic RuntimeException so it is not caught by the
* rule function catches. Exception e is the "cause" of the
* RuntimeException.
/** Instead of recovering from exception {@code e}, re-throw it wrapped
* in a {@link ParseCancellationException} so it is not caught by the
* rule function catches. Use {@link Exception#getCause()} to get the
* original {@link RecognitionException}.
*/
@Override
public void recover(Parser recognizer, RecognitionException e) {
throw new RuntimeException(e);
for (ParserRuleContext context = recognizer.getContext(); context != null; context = context.getParent()) {
context.exception = e;
}
throw new ParseCancellationException(e);
}
/** Make sure we don't attempt to recover inline; if the parser
@@ -50,7 +57,12 @@ public class BailErrorStrategy extends DefaultErrorStrategy {
public Token recoverInline(Parser recognizer)
throws RecognitionException
{
throw new RuntimeException(new InputMismatchException(recognizer));
InputMismatchException e = new InputMismatchException(recognizer);
for (ParserRuleContext context = recognizer.getContext(); context != null; context = context.getParent()) {
context.exception = e;
}
throw new ParseCancellationException(e);
}
/** Make sure we don't attempt to recover from problems in subrules. */
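
A sketch of typical usage: wrap the parse in a try/catch for ParseCancellationException. MyLexer, MyParser, and startRule are hypothetical generated names.

MyLexer lexer = new MyLexer(new ANTLRInputStream("some input"));
MyParser parser = new MyParser(new CommonTokenStream(lexer));
parser.setErrorHandler(new BailErrorStrategy());
try {
    parser.startRule(); // hypothetical start rule
}
catch (ParseCancellationException pce) {
    // The original RecognitionException travels as the cause.
    RecognitionException re = (RecognitionException)pce.getCause();
    System.err.println("parse aborted at "+re.getOffendingToken());
}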


@@ -0,0 +1,77 @@
/*
[The "BSD license"]
Copyright (c) 2012 Terence Parr
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. The name of the author may not be used to endorse or promote products
derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
package org.antlr.v4.runtime;
import org.antlr.v4.runtime.atn.ATNConfigSet;
import org.antlr.v4.runtime.dfa.DFA;
import java.util.BitSet;
/**
* @author Sam Harwell
*/
public class BaseErrorListener implements ANTLRErrorListener {
@Override
public void syntaxError(Recognizer<?, ?> recognizer,
Object offendingSymbol,
int line,
int charPositionInLine,
String msg,
RecognitionException e)
{
}
@Override
public void reportAmbiguity(Parser recognizer,
DFA dfa,
int startIndex,
int stopIndex,
BitSet ambigAlts,
ATNConfigSet configs)
{
}
@Override
public void reportAttemptingFullContext(Parser recognizer,
DFA dfa,
int startIndex,
int stopIndex,
ATNConfigSet configs)
{
}
@Override
public void reportContextSensitivity(Parser recognizer,
DFA dfa,
int startIndex,
int stopIndex,
ATNConfigSet configs)
{
}
}


@@ -29,9 +29,13 @@
package org.antlr.v4.runtime;
import org.antlr.v4.runtime.misc.Interval;
import org.antlr.v4.runtime.misc.NotNull;
import java.util.*;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
/** Buffer all input tokens but do on-demand fetching of new tokens from
* lexer. Useful when the parser or lexer has to set context/mode info before
@@ -47,7 +51,7 @@ import java.util.*;
* This is not a subclass of UnbufferedTokenStream because I don't want
* to confuse small moving window of tokens it uses for the full buffer.
*/
public class BufferedTokenStream<T extends Token> implements TokenStream {
public class BufferedTokenStream implements TokenStream {
@NotNull
protected TokenSource tokenSource;
@@ -56,7 +60,7 @@ public class BufferedTokenStream<T extends Token> implements TokenStream {
* as its moving window moves through the input. This list captures
* everything so we can access complete input text.
*/
protected List<T> tokens = new ArrayList<T>(100);
protected List<Token> tokens = new ArrayList<Token>(100);
/** The index into the tokens list of the current token (next token
* to consume). tokens[p] should be LT(1). p=-1 indicates need
@@ -153,7 +157,7 @@ public class BufferedTokenStream<T extends Token> implements TokenStream {
}
for (int i = 0; i < n; i++) {
T t = (T)tokenSource.nextToken();
Token t = tokenSource.nextToken();
if ( t instanceof WritableToken ) {
((WritableToken)t).setTokenIndex(tokens.size());
}
@@ -168,21 +172,21 @@ public class BufferedTokenStream<T extends Token> implements TokenStream {
}
@Override
public T get(int i) {
public Token get(int i) {
if ( i < 0 || i >= tokens.size() ) {
throw new NoSuchElementException("token index "+i+" out of range 0.."+(tokens.size()-1));
throw new IndexOutOfBoundsException("token index "+i+" out of range 0.."+(tokens.size()-1));
}
return tokens.get(i);
}
/** Get all tokens from start..stop inclusively */
public List<T> get(int start, int stop) {
public List<Token> get(int start, int stop) {
if ( start<0 || stop<0 ) return null;
lazyInit();
List<T> subset = new ArrayList<T>();
List<Token> subset = new ArrayList<Token>();
if ( stop>=tokens.size() ) stop = tokens.size()-1;
for (int i = start; i <= stop; i++) {
T t = tokens.get(i);
Token t = tokens.get(i);
if ( t.getType()==Token.EOF ) break;
subset.add(t);
}
@@ -192,13 +196,13 @@ public class BufferedTokenStream<T extends Token> implements TokenStream {
@Override
public int LA(int i) { return LT(i).getType(); }
protected T LB(int k) {
protected Token LB(int k) {
if ( (p-k)<0 ) return null;
return tokens.get(p-k);
}
@Override
public T LT(int k) {
public Token LT(int k) {
lazyInit();
if ( k==0 ) return null;
if ( k < 0 ) return LB(-k);
@@ -248,9 +252,9 @@ public class BufferedTokenStream<T extends Token> implements TokenStream {
p = -1;
}
public List<T> getTokens() { return tokens; }
public List<Token> getTokens() { return tokens; }
public List<T> getTokens(int start, int stop) {
public List<Token> getTokens(int start, int stop) {
return getTokens(start, stop, null);
}
@@ -258,16 +262,20 @@ public class BufferedTokenStream<T extends Token> implements TokenStream {
* the token type BitSet. Return null if no tokens were found. This
* method looks at both on and off channel tokens.
*/
public List<T> getTokens(int start, int stop, Set<Integer> types) {
public List<Token> getTokens(int start, int stop, Set<Integer> types) {
lazyInit();
if ( stop>=tokens.size() ) stop=tokens.size()-1;
if ( start<0 ) start=0;
if ( start<0 || stop>=tokens.size() ||
stop<0 || start>=tokens.size() )
{
throw new IndexOutOfBoundsException("start "+start+" or stop "+stop+
" not in 0.."+(tokens.size()-1));
}
if ( start>stop ) return null;
// list = tokens[start:stop]:{T t, t.getType() in types}
List<T> filteredTokens = new ArrayList<T>();
List<Token> filteredTokens = new ArrayList<Token>();
for (int i=start; i<=stop; i++) {
T t = tokens.get(i);
Token t = tokens.get(i);
if ( types==null || types.contains(t.getType()) ) {
filteredTokens.add(t);
}
@@ -278,43 +286,155 @@ public class BufferedTokenStream<T extends Token> implements TokenStream {
return filteredTokens;
}
public List<T> getTokens(int start, int stop, int ttype) {
public List<Token> getTokens(int start, int stop, int ttype) {
HashSet<Integer> s = new HashSet<Integer>();
s.add(ttype);
return getTokens(start,stop, s);
}
@Override
/** Given a starting index, return the index of the next token on channel.
* Return i if tokens[i] is on channel. Return -1 if there are no tokens
* on channel between i and EOF.
*/
protected int nextTokenOnChannel(int i, int channel) {
sync(i);
if ( i>=size() ) return -1;
Token token = tokens.get(i);
while ( token.getChannel()!=channel ) {
if ( token.getType()==Token.EOF ) return -1;
i++;
sync(i);
token = tokens.get(i);
}
return i;
}
/** Given a starting index, return the index of the previous token on channel.
* Return i if tokens[i] is on channel. Return -1 if there are no tokens
* on channel between i and 0.
*/
protected int previousTokenOnChannel(int i, int channel) {
while ( i>=0 && tokens.get(i).getChannel()!=channel ) {
i--;
}
return i;
}
/** Collect all tokens on specified channel to the right of
* the current token up until we see a token on DEFAULT_TOKEN_CHANNEL or
* EOF. If channel is -1, find any non default channel token.
*/
public List<Token> getHiddenTokensToRight(int tokenIndex, int channel) {
lazyInit();
if ( tokenIndex<0 || tokenIndex>=tokens.size() ) {
throw new IndexOutOfBoundsException(tokenIndex+" not in 0.."+(tokens.size()-1));
}
int nextOnChannel =
nextTokenOnChannel(tokenIndex + 1, Lexer.DEFAULT_TOKEN_CHANNEL);
int to;
int from = tokenIndex+1;
// if none on channel to the right, nextOnChannel==-1, so set 'to' to the last token
if ( nextOnChannel == -1 ) to = size()-1;
else to = nextOnChannel;
return filterForChannel(from, to, channel);
}
/** Collect all hidden tokens (any off-default channel) to the right of
* the current token up until we see a token on DEFAULT_TOKEN_CHANNEL
* or EOF.
*/
public List<Token> getHiddenTokensToRight(int tokenIndex) {
return getHiddenTokensToRight(tokenIndex, -1);
}
/** Collect all tokens on specified channel to the left of
* the current token up until we see a token on DEFAULT_TOKEN_CHANNEL.
* If channel is -1, find any non default channel token.
*/
public List<Token> getHiddenTokensToLeft(int tokenIndex, int channel) {
lazyInit();
if ( tokenIndex<0 || tokenIndex>=tokens.size() ) {
throw new IndexOutOfBoundsException(tokenIndex+" not in 0.."+(tokens.size()-1));
}
int prevOnChannel =
previousTokenOnChannel(tokenIndex - 1, Lexer.DEFAULT_TOKEN_CHANNEL);
if ( prevOnChannel == tokenIndex - 1 ) return null;
// if none on channel to the left, prevOnChannel==-1, so from=0
int from = prevOnChannel+1;
int to = tokenIndex-1;
return filterForChannel(from, to, channel);
}
/** Collect all hidden tokens (any off-default channel) to the left of
* the current token up until we see a token on DEFAULT_TOKEN_CHANNEL.
*/
public List<Token> getHiddenTokensToLeft(int tokenIndex) {
return getHiddenTokensToLeft(tokenIndex, -1);
}
protected List<Token> filterForChannel(int from, int to, int channel) {
List<Token> hidden = new ArrayList<Token>();
for (int i=from; i<=to; i++) {
Token t = tokens.get(i);
if ( channel==-1 ) {
if ( t.getChannel()!= Lexer.DEFAULT_TOKEN_CHANNEL ) hidden.add(t);
}
else {
if ( t.getChannel()==channel ) hidden.add(t);
}
}
if ( hidden.size()==0 ) return null;
return hidden;
}
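
For example, here is a rough sketch of recovering comments that the lexer sent to an off-default channel; the token index is illustrative and the lexer variable is assumed.

BufferedTokenStream tokens = new BufferedTokenStream(lexer);
tokens.fill(); // buffer all tokens up to EOF
// Any off-default-channel tokens immediately to the left of token 10:
List<Token> hidden = tokens.getHiddenTokensToLeft(10);
if ( hidden!=null ) {
    for (Token t : hidden) {
        System.out.println("hidden: "+t.getText());
    }
}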
@Override
public String getSourceName() { return tokenSource.getSourceName(); }
/** Grab *all* tokens from stream and return string */
@Override
public String toString() {
/** Get the text of all tokens in this buffer. */
@NotNull
@Override
public String getText() {
lazyInit();
fill();
return toString(0, tokens.size()-1);
}
fill();
return getText(Interval.of(0,size()-1));
}
@NotNull
@Override
public String toString(int start, int stop) {
public String getText(Interval interval) {
int start = interval.a;
int stop = interval.b;
if ( start<0 || stop<0 ) return "";
lazyInit();
if ( stop>=tokens.size() ) stop = tokens.size()-1;
StringBuilder buf = new StringBuilder();
for (int i = start; i <= stop; i++) {
T t = tokens.get(i);
if ( t.getType()==Token.EOF ) break;
buf.append(t.getText());
}
return buf.toString();
StringBuilder buf = new StringBuilder();
for (int i = start; i <= stop; i++) {
Token t = tokens.get(i);
if ( t.getType()==Token.EOF ) break;
buf.append(t.getText());
}
return buf.toString();
}
@NotNull
@Override
public String getText(RuleContext ctx) {
return getText(ctx.getSourceInterval());
}
@NotNull
@Override
public String toString(Token start, Token stop) {
public String getText(Token start, Token stop) {
if ( start!=null && stop!=null ) {
return toString(start.getTokenIndex(), stop.getTokenIndex());
return getText(Interval.of(start.getTokenIndex(), stop.getTokenIndex()));
}
return null;
return "";
}
/** Get all tokens from lexer until EOF */


@@ -1,42 +1,73 @@
/*
[The "BSD license"]
Copyright (c) 2011 Terence Parr
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. The name of the author may not be used to endorse or promote products
derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
* [The "BSD license"]
* Copyright (c) 2012 Terence Parr
* Copyright (c) 2012 Sam Harwell
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* 3. The name of the author may not be used to endorse or promote products
* derived from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
* IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
* NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
* THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
package org.antlr.v4.runtime;
/** A source of characters for an ANTLR lexer */
public interface CharStream extends IntStream {
public static final int EOF = -1;
public static final int MIN_CHAR = Character.MIN_VALUE;
public static final int MAX_CHAR = Character.MAX_VALUE-1; // FFFE is max
import org.antlr.v4.runtime.misc.Interval;
import org.antlr.v4.runtime.misc.NotNull;
/** For unbuffered streams, you can't use this; primarily I'm providing
* a useful interface for action code. Just make sure actions don't
* use this on streams that don't support it.
/** A source of characters for an ANTLR lexer. */
public interface CharStream extends IntStream {
/**
* The minimum allowed value for a character in a {@code CharStream}.
*/
public String substring(int start, int stop);
public static final int MIN_CHAR = Character.MIN_VALUE;
/**
* The maximum allowed value for a character in a {@code CharStream}.
* <p/>
* This value is {@code Character.MAX_VALUE - 1}, which reserves the value
* {@code Character.MAX_VALUE} for special use within an implementing class.
* For some implementations, the data buffers required for supporting the
* marked ranges of {@link IntStream} are stored as {@code char[]} instead
* of {@code int[]}, with {@code Character.MAX_VALUE} being used instead of
* {@code -1} to mark the end of the stream internally.
*/
public static final int MAX_CHAR = Character.MAX_VALUE-1;
/**
* This method returns the text for a range of characters within this input
* stream. This method is guaranteed to not throw an exception if the
* specified {@code interval} lies entirely within a marked range. For more
* information about marked ranges, see {@link IntStream#mark}.
*
* @param interval an interval within the stream
* @return the text of the specified interval
*
* @throws NullPointerException if {@code interval} is {@code null}
* @throws IllegalArgumentException if {@code interval.a < 0}, or if
* {@code interval.b < interval.a - 1}, or if {@code interval.b} lies at or
* past the end of the stream
* @throws UnsupportedOperationException if the stream does not support
* getting the text of the specified interval
*/
@NotNull
public String getText(@NotNull Interval interval);
}
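
A small usage sketch of the accessor that replaces the old substring(start,stop); the stream contents are illustrative.

CharStream input = new ANTLRInputStream("hello world");
// Interval bounds are inclusive character indices.
String word = input.getText(Interval.of(0, 4)); // "hello"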


@@ -28,6 +28,8 @@
*/
package org.antlr.v4.runtime;
import org.antlr.v4.runtime.misc.Interval;
import java.io.Serializable;
public class CommonToken implements WritableToken, Serializable {
@@ -109,7 +111,7 @@ public class CommonToken implements WritableToken, Serializable {
if ( input==null ) return null;
int n = input.size();
if ( start<n && stop<n) {
return input.substring(start,stop);
return input.getText(Interval.of(start,stop));
}
else {
return "<EOF>";


@@ -29,9 +29,24 @@
package org.antlr.v4.runtime;
import org.antlr.v4.runtime.misc.Interval;
public class CommonTokenFactory implements TokenFactory<CommonToken> {
public static final TokenFactory<CommonToken> DEFAULT = new CommonTokenFactory();
/** Copy text for token out of input char stream. Useful when input
* stream is unbuffered.
* @see UnbufferedCharStream
*/
protected final boolean copyText;
/** Create factory and indicate whether or not the factory copies
* text out of the char stream.
*/
public CommonTokenFactory(boolean copyText) { this.copyText = copyText; }
public CommonTokenFactory() { this(false); }
@Override
public CommonToken create(TokenSource source, int type, String text,
int channel, int start, int stop,
@@ -43,6 +58,12 @@ public class CommonTokenFactory implements TokenFactory<CommonToken> {
if ( text!=null ) {
t.setText(text);
}
else {
if ( copyText ) {
CharStream input = source.getInputStream();
t.setText(input.getText(Interval.of(start,stop)));
}
}
return t;
}
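
A sketch of when copyText matters: with an unbuffered char stream, a token cannot lazily re-read its text later, so the factory must copy it eagerly at creation time. The lexer variable and its setTokenFactory method are assumed here.

CommonTokenFactory factory = new CommonTokenFactory(true); // copyText = true
lexer.setTokenFactory(factory); // any generated Lexer assumed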


@@ -46,7 +46,7 @@ package org.antlr.v4.runtime;
* @see UnbufferedTokenStream
* @see BufferedTokenStream
*/
public class CommonTokenStream extends BufferedTokenStream<Token> {
public class CommonTokenStream extends BufferedTokenStream {
/** Skip tokens on any channel but this one; this is how we skip whitespace... */
protected int channel = Token.DEFAULT_CHANNEL;
@@ -61,7 +61,7 @@ public class CommonTokenStream extends BufferedTokenStream<Token> {
@Override
protected int adjustSeekIndex(int i) {
return skipOffTokenChannels(i);
return nextTokenOnChannel(i, channel);
}
@Override
@@ -73,7 +73,7 @@ public class CommonTokenStream extends BufferedTokenStream<Token> {
// find k good tokens looking backwards
while ( n<=k ) {
// skip off-channel tokens
i = skipOffTokenChannelsReverse(i-1);
i = previousTokenOnChannel(i - 1, channel);
n++;
}
if ( i<0 ) return null;
@ -92,7 +92,7 @@ public class CommonTokenStream extends BufferedTokenStream<Token> {
while ( n<k ) {
// skip off-channel tokens, but make sure to not look past EOF
if (sync(i + 1)) {
i = skipOffTokenChannels(i+1);
i = nextTokenOnChannel(i + 1, channel);
}
n++;
}
@ -100,27 +100,6 @@ public class CommonTokenStream extends BufferedTokenStream<Token> {
return tokens.get(i);
}
/** Given a starting index, return the index of the first on-channel
* token.
*/
protected int skipOffTokenChannels(int i) {
sync(i);
Token token = tokens.get(i);
while ( token.getType()!=Token.EOF && token.getChannel()!=channel ) {
i++;
sync(i);
token = tokens.get(i);
}
return i;
}
protected int skipOffTokenChannelsReverse(int i) {
while ( i>=0 && tokens.get(i).getChannel()!=channel ) {
i--;
}
return i;
}
/** Count EOF just once. */
public int getNumberOfOnChannelTokens() {
int n = 0;
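
The effect of the renamed helpers, sketched below. MyLexer is a hypothetical generated lexer whose WS rule is assumed to send whitespace to the hidden channel; construction details are illustrative:

    // Sketch: CommonTokenStream only surfaces tokens on its channel.
    CommonTokenStream tokens = new CommonTokenStream(new MyLexer(new ANTLRInputStream("a   b")));
    Token a = tokens.LT(1);  // 'a'
    tokens.consume();
    Token b = tokens.LT(1);  // 'b' -- the hidden WS token is skipped via nextTokenOnChannel()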

View File

@ -32,16 +32,16 @@ package org.antlr.v4.runtime;
*
* @author Sam Harwell
*/
public class ConsoleErrorListener implements ANTLRErrorListener<Object> {
public class ConsoleErrorListener extends BaseErrorListener {
public static final ConsoleErrorListener INSTANCE = new ConsoleErrorListener();
@Override
public <T extends Object> void error(Recognizer<T, ?> recognizer,
T offendingSymbol,
int line,
int charPositionInLine,
String msg,
RecognitionException e)
public void syntaxError(Recognizer<?, ?> recognizer,
Object offendingSymbol,
int line,
int charPositionInLine,
String msg,
RecognitionException e)
{
System.err.println("line " + line + ":" + charPositionInLine + " " + msg);
}
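
A registration sketch (assumes the Recognizer add/removeErrorListeners methods; the INSTANCE singleton is normally installed by default, so this is only needed after clearing the listeners):

    // Sketch: route syntax errors to the console listener only.
    parser.removeErrorListeners();
    parser.addErrorListener(ConsoleErrorListener.INSTANCE);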

View File

@ -29,8 +29,14 @@
package org.antlr.v4.runtime;
import org.antlr.v4.runtime.atn.*;
import org.antlr.v4.runtime.dfa.DFA;
import org.antlr.v4.runtime.atn.ATN;
import org.antlr.v4.runtime.atn.ATNState;
import org.antlr.v4.runtime.atn.BlockStartState;
import org.antlr.v4.runtime.atn.PlusBlockStartState;
import org.antlr.v4.runtime.atn.PlusLoopbackState;
import org.antlr.v4.runtime.atn.RuleTransition;
import org.antlr.v4.runtime.atn.StarLoopEntryState;
import org.antlr.v4.runtime.atn.StarLoopbackState;
import org.antlr.v4.runtime.misc.IntervalSet;
import org.antlr.v4.runtime.misc.NotNull;
@ -50,7 +56,7 @@ public class DefaultErrorStrategy implements ANTLRErrorStrategy {
/** The index into the input stream where the last error occurred.
* This is used to prevent infinite loops where an error is found
* but no token is consumed during recovery...another error is found,
* ad naseum. This is a failsafe mechanism to guarantee that at least
	 * ad nauseam. This is a failsafe mechanism to guarantee that at least
* one token/tree node is consumed for two errors.
*/
protected int lastErrorIndex = -1;
@ -104,7 +110,7 @@ public class DefaultErrorStrategy implements ANTLRErrorStrategy {
else {
System.err.println("unknown recognition error type: "+e.getClass().getName());
if ( recognizer!=null ) {
recognizer.notifyErrorListeners((Token) e.offendingToken, e.getMessage(), e);
recognizer.notifyErrorListeners(e.getOffendingToken(), e.getMessage(), e);
}
}
}
@ -120,7 +126,8 @@ public class DefaultErrorStrategy implements ANTLRErrorStrategy {
// lastErrorIndex+
// ", states="+lastErrorStates);
if ( lastErrorIndex==recognizer.getInputStream().index() &&
lastErrorStates.contains(recognizer._ctx.s) ) {
lastErrorStates != null &&
lastErrorStates.contains(recognizer._ctx.s) ) {
// uh oh, another error at same token index and previously-visited
// state in ATN; must be a case where LT(1) is in the recovery
// token set so nothing got consumed. Consume a single token
@ -159,7 +166,7 @@ public class DefaultErrorStrategy implements ANTLRErrorStrategy {
// If already recovering, don't try to sync
if ( errorRecoveryMode ) return;
SymbolStream<Token> tokens = recognizer.getInputStream();
TokenStream tokens = recognizer.getInputStream();
int la = tokens.LA(1);
// try cheaper subset first; might get lucky. seems to shave a wee bit off
@ -195,26 +202,26 @@ public class DefaultErrorStrategy implements ANTLRErrorStrategy {
NoViableAltException e)
throws RecognitionException
{
SymbolStream<Token> tokens = recognizer.getInputStream();
TokenStream tokens = recognizer.getInputStream();
String input;
if (tokens instanceof TokenStream) {
if ( e.startToken.getType()==Token.EOF ) input = "<EOF>";
else input = ((TokenStream)tokens).toString(e.startToken, e.offendingToken);
if ( e.getStartToken().getType()==Token.EOF ) input = "<EOF>";
else input = tokens.getText(e.getStartToken(), e.getOffendingToken());
}
else {
input = "<unknown input>";
}
String msg = "no viable alternative at input "+escapeWSAndQuote(input);
recognizer.notifyErrorListeners((Token) e.offendingToken, msg, e);
recognizer.notifyErrorListeners(e.getOffendingToken(), msg, e);
}
public void reportInputMismatch(Parser recognizer,
InputMismatchException e)
throws RecognitionException
{
String msg = "mismatched input "+getTokenErrorDisplay((Token)e.offendingToken)+
String msg = "mismatched input "+getTokenErrorDisplay(e.getOffendingToken())+
" expecting "+e.getExpectedTokens().toString(recognizer.getTokenNames());
recognizer.notifyErrorListeners((Token) e.offendingToken, msg, e);
recognizer.notifyErrorListeners(e.getOffendingToken(), msg, e);
}
public void reportFailedPredicate(Parser recognizer,
@ -222,8 +229,8 @@ public class DefaultErrorStrategy implements ANTLRErrorStrategy {
throws RecognitionException
{
String ruleName = recognizer.getRuleNames()[recognizer._ctx.getRuleIndex()];
String msg = "rule "+ruleName+" "+e.msg;
recognizer.notifyErrorListeners((Token) e.offendingToken, msg, e);
String msg = "rule "+ruleName+" "+e.getMessage();
recognizer.notifyErrorListeners(e.getOffendingToken(), msg, e);
}
public void reportUnwantedToken(Parser recognizer) {
@ -311,7 +318,8 @@ public class DefaultErrorStrategy implements ANTLRErrorStrategy {
// is free to conjure up and insert the missing token
ATNState currentState = recognizer.getInterpreter().atn.states.get(recognizer._ctx.s);
ATNState next = currentState.transition(0).target;
IntervalSet expectingAtLL2 = recognizer.getInterpreter().atn.nextTokens(next, recognizer._ctx);
ATN atn = recognizer.getInterpreter().atn;
IntervalSet expectingAtLL2 = atn.nextTokens(next, recognizer._ctx);
// System.out.println("LT(2) set="+expectingAtLL2.toString(recognizer.getTokenNames()));
if ( expectingAtLL2.contains(currentSymbolType) ) {
reportMissingToken(recognizer);
@ -367,8 +375,9 @@ public class DefaultErrorStrategy implements ANTLRErrorStrategy {
if ( expectedTokenType== Token.EOF ) tokenText = "<missing EOF>";
else tokenText = "<missing "+recognizer.getTokenNames()[expectedTokenType]+">";
Token current = currentSymbol;
if ( current.getType() == Token.EOF ) {
current = recognizer.getInputStream().LT(-1);
Token lookback = recognizer.getInputStream().LT(-1);
if ( current.getType() == Token.EOF && lookback!=null ) {
current = lookback;
}
return
_factory.create(current.getTokenSource(), expectedTokenType, tokenText,
@ -404,21 +413,11 @@ public class DefaultErrorStrategy implements ANTLRErrorStrategy {
}
protected String getSymbolText(@NotNull Token symbol) {
if (symbol instanceof Token) {
return ((Token)symbol).getText();
}
else {
return symbol.toString();
}
return symbol.getText();
}
protected int getSymbolType(@NotNull Token symbol) {
if (symbol instanceof Token) {
return ((Token)symbol).getType();
}
else {
return Token.INVALID_TYPE;
}
return symbol.getType();
}
protected String escapeWSAndQuote(String s) {
@ -549,25 +548,4 @@ public class DefaultErrorStrategy implements ANTLRErrorStrategy {
ttype = recognizer.getInputStream().LA(1);
}
}
@Override
public void reportAmbiguity(@NotNull Parser recognizer,
DFA dfa, int startIndex, int stopIndex, @NotNull IntervalSet ambigAlts,
@NotNull ATNConfigSet configs)
{
}
@Override
public void reportAttemptingFullContext(@NotNull Parser recognizer,
@NotNull DFA dfa,
int startIndex, int stopIndex,
@NotNull ATNConfigSet configs)
{
}
@Override
public void reportContextSensitivity(@NotNull Parser recognizer, @NotNull DFA dfa,
int startIndex, int stopIndex, @NotNull ATNConfigSet configs)
{
}
}

View File

@ -31,17 +31,21 @@ package org.antlr.v4.runtime;
import org.antlr.v4.runtime.atn.ATNConfigSet;
import org.antlr.v4.runtime.dfa.DFA;
import org.antlr.v4.runtime.misc.IntervalSet;
import org.antlr.v4.runtime.misc.Interval;
import org.antlr.v4.runtime.misc.NotNull;
public class DiagnosticErrorStrategy extends DefaultErrorStrategy {
import java.util.BitSet;
public class DiagnosticErrorListener extends BaseErrorListener {
@Override
public void reportAmbiguity(@NotNull Parser recognizer,
DFA dfa, int startIndex, int stopIndex, @NotNull IntervalSet ambigAlts,
DFA dfa, int startIndex, int stopIndex,
@NotNull BitSet ambigAlts,
@NotNull ATNConfigSet configs)
{
recognizer.notifyErrorListeners("reportAmbiguity d=" + dfa.decision + ": ambigAlts=" + ambigAlts + ":" + configs + ", input='" +
recognizer.getInputString(startIndex, stopIndex) + "'");
recognizer.notifyErrorListeners("reportAmbiguity d=" + dfa.decision +
": ambigAlts=" + ambigAlts + ", input='" +
recognizer.getTokenStream().getText(Interval.of(startIndex, stopIndex)) + "'");
}
@Override
@ -50,15 +54,19 @@ public class DiagnosticErrorStrategy extends DefaultErrorStrategy {
int startIndex, int stopIndex,
@NotNull ATNConfigSet configs)
{
recognizer.notifyErrorListeners("reportAttemptingFullContext d=" + dfa.decision + ": " + configs + ", input='" +
recognizer.getInputString(startIndex, stopIndex) + "'");
recognizer.notifyErrorListeners("reportAttemptingFullContext d=" +
dfa.decision + ", input='" +
recognizer.getTokenStream().getText(Interval.of(startIndex, stopIndex)) + "'");
}
@Override
public void reportContextSensitivity(@NotNull Parser recognizer, @NotNull DFA dfa,
int startIndex, int stopIndex, @NotNull ATNConfigSet configs)
public void reportContextSensitivity(@NotNull Parser recognizer,
@NotNull DFA dfa,
int startIndex, int stopIndex,
@NotNull ATNConfigSet configs)
{
recognizer.notifyErrorListeners("reportContextSensitivity d=" + dfa.decision + ": " + configs + ", input='" +
recognizer.getInputString(startIndex, stopIndex) + "'");
recognizer.notifyErrorListeners("reportContextSensitivity d=" +
dfa.decision + ", input='" +
recognizer.getTokenStream().getText(Interval.of(startIndex, stopIndex)) + "'");
}
}
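
A wiring sketch for the renamed listener class; MyParser, startRule, and the token stream are hypothetical stand-ins for any generated parser:

    // Sketch: print ambiguity/full-context diagnostics while parsing.
    MyParser parser = new MyParser(tokens);              // hypothetical generated parser
    parser.addErrorListener(new DiagnosticErrorListener());
    parser.startRule();                                  // reports print via notifyErrorListeners()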

View File

@ -30,6 +30,7 @@ package org.antlr.v4.runtime;
import org.antlr.v4.runtime.atn.ATNState;
import org.antlr.v4.runtime.atn.PredicateTransition;
import org.antlr.v4.runtime.misc.NotNull;
import org.antlr.v4.runtime.misc.Nullable;
/** A semantic predicate failed during validation. Validation of predicates
@ -38,22 +39,50 @@ import org.antlr.v4.runtime.misc.Nullable;
* prediction.
*/
public class FailedPredicateException extends RecognitionException {
public int ruleIndex;
public int predIndex;
public String msg;
private final int ruleIndex;
private final int predicateIndex;
private final String predicate;
public FailedPredicateException(Parser recognizer) {
public FailedPredicateException(@NotNull Parser recognizer) {
this(recognizer, null);
}
public FailedPredicateException(Parser recognizer, @Nullable String predicate) {
super(recognizer, recognizer.getInputStream(), recognizer._ctx);
public FailedPredicateException(@NotNull Parser recognizer, @Nullable String predicate) {
this(recognizer, predicate, null);
}
public FailedPredicateException(@NotNull Parser recognizer,
@Nullable String predicate,
@Nullable String message)
{
super(formatMessage(predicate, message), recognizer, recognizer.getInputStream(), recognizer._ctx);
ATNState s = recognizer.getInterpreter().atn.states.get(recognizer._ctx.s);
PredicateTransition trans = (PredicateTransition)s.transition(0);
ruleIndex = trans.ruleIndex;
predIndex = trans.predIndex;
this.msg = String.format("failed predicate: {%s}?", predicate);
Token la = recognizer.getCurrentToken();
this.offendingToken = la;
this.ruleIndex = trans.ruleIndex;
this.predicateIndex = trans.predIndex;
this.predicate = predicate;
this.setOffendingToken(recognizer.getCurrentToken());
}
public int getRuleIndex() {
return ruleIndex;
}
public int getPredIndex() {
return predicateIndex;
}
@Nullable
public String getPredicate() {
return predicate;
}
@NotNull
private static String formatMessage(@Nullable String predicate, @Nullable String message) {
if (message != null) {
return message;
}
return String.format("failed predicate: {%s}?", predicate);
}
}
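
A sketch of the new accessors. This is illustrative only: the default strategy reports and recovers in place, so catching the exception at the call site assumes an error strategy that rethrows; startRule is a hypothetical entry rule:

    // Sketch: inspect a failed predicate through the new getters.
    try {
        parser.startRule();
    }
    catch (FailedPredicateException e) {
        System.err.printf("predicate {%s}? failed (rule %d, pred %d)%n",
                          e.getPredicate(), e.getRuleIndex(), e.getPredIndex());
    }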

View File

@ -1,12 +1,39 @@
/*
[The "BSD license"]
Copyright (c) 2012 Terence Parr
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. The name of the author may not be used to endorse or promote products
derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
package org.antlr.v4.runtime;
/** This signifies any kind of mismatched input exception, such as
* when the current input does not match the expected token or tree node.
* when the current input does not match the expected token.
*/
public class InputMismatchException extends RecognitionException {
public InputMismatchException(Parser recognizer) {
super(recognizer, recognizer.getInputStream(), recognizer._ctx);
Token la = recognizer.getCurrentToken();
this.offendingToken = la;
this.setOffendingToken(recognizer.getCurrentToken());
}
}

View File

@ -1,93 +1,242 @@
/*
[The "BSD license"]
Copyright (c) 2011 Terence Parr
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. The name of the author may not be used to endorse or promote products
derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
* [The "BSD license"]
* Copyright (c) 2012 Terence Parr
* Copyright (c) 2012 Sam Harwell
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* 3. The name of the author may not be used to endorse or promote products
* derived from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
* IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
* NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
* THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
package org.antlr.v4.runtime;
/** A simple stream of integers used when all I care about is the char
* or token type sequence (such as interpretation).
import org.antlr.v4.runtime.misc.NotNull;
/**
* A simple stream of symbols whose values are represented as integers. This
* interface provides <em>marked ranges</em> with support for a minimum level
* of buffering necessary to implement arbitrary lookahead during prediction.
* For more information on marked ranges, see {@link #mark}.
* <p/>
* <strong>Initializing Methods:</strong> Some methods in this interface have
* unspecified behavior if no call to an initializing method has occurred after
* the stream was constructed. The following is a list of initializing methods:
*
* Do like Java IO and have new BufferedStream(new TokenStream) rather than
* using inheritance?
* <ul>
* <li>{@link #LA}</li>
* <li>{@link #consume}</li>
* <li>{@link #size}</li>
* </ul>
*/
public interface IntStream {
/**
* The value returned by {@link #LA LA()} when the end of the stream is
* reached.
*/
public static final int EOF = -1;
/**
* The value returned by {@link #getSourceName} when the actual name of the
* underlying source is not known.
*/
public static final String UNKNOWN_SOURCE_NAME = "<unknown>";
/**
* Consumes the current symbol in the stream. This method has the following
* effects:
*
* <ul>
* <li><strong>Forward movement:</strong> The value of {@link #index index()}
* before calling this method is less than the value of {@code index()}
* after calling this method.</li>
* <li><strong>Ordered lookahead:</strong> The value of {@code LA(1)} before
* calling this method becomes the value of {@code LA(-1)} after calling
* this method.</li>
* </ul>
*
* Note that calling this method does not guarantee that {@code index()} is
* incremented by exactly 1, as that would preclude the ability to implement
* filtering streams (e.g. {@link CommonTokenStream} which distinguishes
* between "on-channel" and "off-channel" tokens).
*
	 * @throws IllegalStateException if an attempt is made to consume the
* end of the stream (i.e. if {@code LA(1)==}{@link #EOF EOF} before calling
* {@code consume}).
*/
void consume();
/** Get int at current input pointer + i ahead where i=1 is next int.
* Negative indexes are allowed. LA(-1) is previous token (token
* just matched). LA(-i) where i is before first token should
* yield -1, invalid char / EOF.
/**
* Gets the value of the symbol at offset {@code i} from the current
* position. When {@code i==1}, this method returns the value of the current
* symbol in the stream (which is the next symbol to be consumed). When
* {@code i==-1}, this method returns the value of the previously read
* symbol in the stream. It is not valid to call this method with
* {@code i==0}, but the specific behavior is unspecified because this
* method is frequently called from performance-critical code.
* <p/>
* This method is guaranteed to succeed if any of the following are true:
*
* <ul>
* <li>{@code i>0}</li>
* <li>{@code i==-1} and {@link #index index()} returns a value greater
* than the value of {@code index()} after the stream was constructed
* and {@code LA(1)} was called in that order. Specifying the current
* {@code index()} relative to the index after the stream was created
* allows for filtering implementations that do not return every symbol
* from the underlying source. Specifying the call to {@code LA(1)}
* allows for lazily initialized streams.</li>
* <li>{@code LA(i)} refers to a symbol consumed within a marked region
* that has not yet been released.</li>
* </ul>
*
* If {@code i} represents a position at or beyond the end of the stream,
* this method returns {@link #EOF}.
* <p/>
* The return value is unspecified if {@code i&lt;0} and fewer than {@code -i}
* calls to {@link #consume consume()} have occurred from the beginning of
* the stream before calling this method.
*
* @throws UnsupportedOperationException if the stream does not support
* retrieving the value of the specified symbol
*/
int LA(int i);
/** Tell the stream to start buffering if it hasn't already. Return
* a marker, usually current input position, index().
* Calling release(mark()) should not affect the input cursor.
* Can seek to any index between where we were when mark() was called
* and current index() until we release this marker.
*/
/**
* A mark provides a guarantee that {@link #seek seek()} operations will be
* valid over a "marked range" extending from the index where {@code mark()}
* was called to the current {@link #index index()}. This allows the use of
* streaming input sources by specifying the minimum buffering requirements
* to support arbitrary lookahead during prediction.
* <p/>
* The returned mark is an opaque handle (type {@code int}) which is passed
* to {@link #release release()} when the guarantees provided by the marked
* range are no longer necessary. When calls to
* {@code mark()}/{@code release()} are nested, the marks must be released
* in reverse order of which they were obtained. Since marked regions are
* used during performance-critical sections of prediction, the specific
* behavior of invalid usage is unspecified (i.e. a mark is not released, or
* a mark is released twice, or marks are not released in reverse order from
* which they were created).
* <p/>
* The behavior of this method is unspecified if no call to an
* {@link IntStream initializing method} has occurred after this stream was
* constructed.
* <p/>
* This method does not change the current position in the input stream.
* <p/>
* The following example shows the use of {@link #mark mark()},
* {@link #release release(mark)}, {@link #index index()}, and
* {@link #seek seek(index)} as part of an operation to safely work within a
* marked region, then restore the stream position to its original value and
* release the mark.
* <pre>
* IntStream stream = ...;
* int index = -1;
* int mark = stream.mark();
* try {
* index = stream.index();
* // perform work here...
* } finally {
* if (index != -1) {
* stream.seek(index);
* }
* stream.release(mark);
* }
* </pre>
*
* @return An opaque marker which should be passed to
* {@link #release release()} when the marked range is no longer required.
*/
int mark();
/** Release requirement that stream holds tokens from marked location
* to current index(). Must release in reverse order (like stack)
* of mark() otherwise undefined behavior.
/**
* This method releases a marked range created by a call to
* {@link #mark mark()}. Calls to {@code release()} must appear in the
* reverse order of the corresponding calls to {@code mark()}. If a mark is
* released twice, or if marks are not released in reverse order of the
* corresponding calls to {@code mark()}, the behavior is unspecified.
* <p/>
* For more information and an example, see {@link #mark}.
*
* @param marker A marker returned by a call to {@code mark()}.
* @see #mark
*/
void release(int marker);
/** Return the current input symbol index 0..n where n indicates the
* last symbol has been read. The index is the symbol about to be
* read not the most recently read symbol.
*/
/**
* Return the index into the stream of the input symbol referred to by
* {@code LA(1)}.
* <p/>
* The behavior of this method is unspecified if no call to an
* {@link IntStream initializing method} has occurred after this stream was
* constructed.
*/
int index();
/** Set the input cursor to the position indicated by index. This is
* normally used to rewind the input stream but can move forward as well.
* It's up to the stream implementation to make sure that symbols are
* buffered as necessary to make seek land on a valid symbol.
* Or, they should avoid moving the input cursor.
/**
* Set the input cursor to the position indicated by {@code index}. If the
* specified index lies past the end of the stream, the operation behaves as
* though {@code index} was the index of the EOF symbol. After this method
	 * returns without throwing an exception, at least one of the following
* will be true.
*
* The index is 0..n-1. A seek to position i means that LA(1) will
* return the ith symbol. So, seeking to 0 means LA(1) will return the
* first element in the stream.
*
* For unbuffered streams, index i might not be in buffer. That throws
* index exception
* <ul>
* <li>{@link #index index()} will return the index of the first symbol
* appearing at or after the specified {@code index}. Specifically,
* implementations which filter their sources should automatically
* adjust {@code index} forward the minimum amount required for the
* operation to target a non-ignored symbol.</li>
* <li>{@code LA(1)} returns {@link #EOF}</li>
* </ul>
*
* This operation is guaranteed to not throw an exception if {@code index}
* lies within a marked region. For more information on marked regions, see
* {@link #mark}. The behavior of this method is unspecified if no call to
* an {@link IntStream initializing method} has occurred after this stream
* was constructed.
*
* @param index The absolute index to seek to.
*
* @throws IllegalArgumentException if {@code index} is less than 0
* @throws UnsupportedOperationException if the stream does not support
* seeking to the specified index
*/
void seek(int index);
/** Only makes sense for streams that buffer everything up probably, but
* might be useful to display the entire stream or for testing. This
* value includes a single EOF.
/**
* Returns the total number of symbols in the stream, including a single EOF
* symbol.
*
* @throws UnsupportedOperationException if the size of the stream is
* unknown.
*/
int size();
/** Where are you getting symbols from? Normally, implementations will
* pass the buck all the way to the lexer who can ask its input stream
* for the file name or whatever.
/**
* Gets the name of the underlying symbol source. This method returns a
* non-null, non-empty string. If such a name is not known, this method
* returns {@link #UNKNOWN_SOURCE_NAME}.
*/
@NotNull
public String getSourceName();
}
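
The LA()/consume() contract in miniature, as a sketch helper that works for any conforming implementation:

    // Sketch: walk any IntStream to the end.
    static void drain(IntStream stream) {
        while (stream.LA(1) != IntStream.EOF) { // LA(1) is the next symbol to be consumed
            stream.consume();                   // afterwards the old LA(1) is reachable as LA(-1)
        }
    }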

View File

@ -29,8 +29,10 @@
package org.antlr.v4.runtime;
import org.antlr.v4.runtime.atn.LexerATNSimulator;
import org.antlr.v4.runtime.misc.IntegerStack;
import org.antlr.v4.runtime.misc.Interval;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.EmptyStackException;
import java.util.List;
@ -89,7 +91,7 @@ public abstract class Lexer extends Recognizer<Integer, LexerATNSimulator>
/** The token type for the current token */
public int _type;
public ArrayDeque<Integer> _modeStack = new ArrayDeque<Integer>();
public final IntegerStack _modeStack = new IntegerStack();
public int _mode = Lexer.DEFAULT_MODE;
/** You can set the text for the current token to override what is in
@ -97,6 +99,8 @@ public abstract class Lexer extends Recognizer<Integer, LexerATNSimulator>
*/
public String _text;
public Lexer() { }
public Lexer(CharStream input) {
this._input = input;
}
@ -126,40 +130,57 @@ public abstract class Lexer extends Recognizer<Integer, LexerATNSimulator>
*/
@Override
public Token nextToken() {
if (_hitEOF) return anEOF();
if (_input == null) {
throw new IllegalStateException("nextToken requires a non-null input stream.");
}
outer:
while (true) {
_token = null;
_channel = Token.DEFAULT_CHANNEL;
_tokenStartCharIndex = _input.index();
_tokenStartCharPositionInLine = getInterpreter().getCharPositionInLine();
_tokenStartLine = getInterpreter().getLine();
_text = null;
do {
_type = Token.INVALID_TYPE;
// Mark start location in char stream so unbuffered streams are
			// guaranteed to at least have the text of the current token
int tokenStartMarker = _input.mark();
try{
outer:
while (true) {
if (_hitEOF) {
emitEOF();
return _token;
}
_token = null;
_channel = Token.DEFAULT_CHANNEL;
_tokenStartCharIndex = _input.index();
_tokenStartCharPositionInLine = getInterpreter().getCharPositionInLine();
_tokenStartLine = getInterpreter().getLine();
_text = null;
do {
_type = Token.INVALID_TYPE;
// System.out.println("nextToken line "+tokenStartLine+" at "+((char)input.LA(1))+
// " in mode "+mode+
// " at index "+input.index());
int ttype;
try {
ttype = getInterpreter().match(_input, _mode);
}
catch (LexerNoViableAltException e) {
notifyListeners(e); // report error
recover(e);
ttype = SKIP;
}
if ( _input.LA(1)==CharStream.EOF ) {
_hitEOF = true;
}
if ( _type == Token.INVALID_TYPE ) _type = ttype;
if ( _type ==SKIP ) {
continue outer;
}
} while ( _type ==MORE );
if ( _token ==null ) emit();
return _token;
int ttype;
try {
ttype = getInterpreter().match(_input, _mode);
}
catch (LexerNoViableAltException e) {
notifyListeners(e); // report error
recover(e);
ttype = SKIP;
}
if ( _input.LA(1)==IntStream.EOF ) {
_hitEOF = true;
}
if ( _type == Token.INVALID_TYPE ) _type = ttype;
if ( _type ==SKIP ) {
continue outer;
}
} while ( _type ==MORE );
if ( _token == null ) emit();
return _token;
}
}
finally {
// make sure we release marker after match or
// unbuffered char stream will keep buffering
_input.release(tokenStartMarker);
}
}
@ -183,7 +204,6 @@ public abstract class Lexer extends Recognizer<Integer, LexerATNSimulator>
public void pushMode(int m) {
if ( LexerATNSimulator.debug ) System.out.println("pushMode "+m);
getInterpreter().tracePushMode(m);
_modeStack.push(_mode);
mode(m);
}
@ -191,7 +211,6 @@ public abstract class Lexer extends Recognizer<Integer, LexerATNSimulator>
public int popMode() {
if ( _modeStack.isEmpty() ) throw new EmptyStackException();
if ( LexerATNSimulator.debug ) System.out.println("popMode back to "+ _modeStack.peek());
getInterpreter().tracePopMode();
mode( _modeStack.pop() );
return _mode;
}
@ -201,6 +220,10 @@ public abstract class Lexer extends Recognizer<Integer, LexerATNSimulator>
this._factory = factory;
}
public TokenFactory<? extends Token> getTokenFactory() {
return _factory;
}
/** Set the char stream and reset the lexer */
@Override
public void setInputStream(IntStream input) {
@ -219,13 +242,12 @@ public abstract class Lexer extends Recognizer<Integer, LexerATNSimulator>
return _input;
}
/** Currently does not support multiple emits per nextToken invocation
* for efficiency reasons. Subclass and override this method and
* nextToken (to push tokens into a list and pull from that list rather
* than a single variable as this implementation does).
/** By default does not support multiple emits per nextToken invocation
* for efficiency reasons. Subclass and override this method, nextToken,
* and getToken (to push tokens into a list and pull from that list
* rather than a single variable as this implementation does).
*/
public void emit(Token token) {
getInterpreter().traceEmit(token);
//System.err.println("emit "+token);
this._token = token;
}
@ -243,7 +265,7 @@ public abstract class Lexer extends Recognizer<Integer, LexerATNSimulator>
return t;
}
public Token anEOF() {
public Token emitEOF() {
int cpos = getCharPositionInLine();
// The character position for EOF is one beyond the position of
// the previous token's last character
@ -253,6 +275,7 @@ public abstract class Lexer extends Recognizer<Integer, LexerATNSimulator>
}
Token eof = _factory.create(this, Token.EOF, null, Token.DEFAULT_CHANNEL, _input.index(), _input.index()-1,
getLine(), cpos);
emit(eof);
return eof;
}
@ -266,6 +289,14 @@ public abstract class Lexer extends Recognizer<Integer, LexerATNSimulator>
return getInterpreter().getCharPositionInLine();
}
public void setLine(int line) {
getInterpreter().setLine(line);
}
public void setCharPositionInLine(int charPositionInLine) {
getInterpreter().setCharPositionInLine(charPositionInLine);
}
/** What is the index of the current character of lookahead? */
public int getCharIndex() {
return _input.index();
@ -279,7 +310,6 @@ public abstract class Lexer extends Recognizer<Integer, LexerATNSimulator>
return _text;
}
return getInterpreter().getText(_input);
// return ((CharStream)input).substring(tokenStartCharIndex,getCharIndex()-1);
}
/** Set the complete text of this token; it wipes any previous
@ -289,6 +319,29 @@ public abstract class Lexer extends Recognizer<Integer, LexerATNSimulator>
this._text = text;
}
/** Override if emitting multiple tokens. */
public Token getToken() { return _token; }
public void setToken(Token _token) {
this._token = _token;
}
public void setType(int ttype) {
_type = ttype;
}
public int getType() {
return _type;
}
public void setChannel(int channel) {
_channel = channel;
}
public int getChannel() {
return _channel;
}
public String[] getModeNames() {
return null;
}
@ -302,17 +355,32 @@ public abstract class Lexer extends Recognizer<Integer, LexerATNSimulator>
return null;
}
/** Return a list of all Token objects in input char stream.
* Forces load of all tokens. Does not include EOF token.
*/
public List<? extends Token> getAllTokens() {
List<Token> tokens = new ArrayList<Token>();
Token t = nextToken();
while ( t.getType()!=Token.EOF ) {
tokens.add(t);
t = nextToken();
}
return tokens;
}
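
A usage sketch for the new helper; MyLexer and the input string are hypothetical:

    // Sketch: force-tokenize a whole input; the EOF token is not included.
    MyLexer lexer = new MyLexer(new ANTLRInputStream("a b c"));
    for (Token t : lexer.getAllTokens()) {
        System.out.println(t);
    }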
public void recover(LexerNoViableAltException e) {
getInterpreter().consume(_input); // skip a char and try again
if (_input.LA(1) != IntStream.EOF) {
// skip a char and try again
getInterpreter().consume(_input);
}
}
public void notifyListeners(LexerNoViableAltException e) {
String msg = "token recognition error at: '"+
_input.substring(_tokenStartCharIndex, _input.index())+"'";
List<? extends ANTLRErrorListener<? super Integer>> listeners = getErrorListeners();
for (ANTLRErrorListener<? super Integer> listener : listeners) {
listener.error(this, null, _tokenStartLine, _tokenStartCharPositionInLine, msg, e);
}
_input.getText(Interval.of(_tokenStartCharIndex, _input.index()))+"'";
ANTLRErrorListener listener = getErrorListenerDispatch();
listener.syntaxError(this, null, _tokenStartLine, _tokenStartCharPositionInLine, msg, e);
}
public String getCharErrorDisplay(int c) {

View File

@ -30,24 +30,37 @@
package org.antlr.v4.runtime;
import org.antlr.v4.runtime.atn.ATNConfigSet;
import org.antlr.v4.runtime.misc.Interval;
import org.antlr.v4.runtime.misc.NotNull;
import org.antlr.v4.runtime.misc.Nullable;
import org.antlr.v4.runtime.misc.Utils;
public class LexerNoViableAltException extends RecognitionException {
/** Matching attempted at what input index? */
public int startIndex;
private final int startIndex;
/** Which configurations did we try at input.index() that couldn't match input.LA(1)? */
public ATNConfigSet deadEndConfigs;
@Nullable
private final ATNConfigSet deadEndConfigs;
public LexerNoViableAltException(Lexer lexer,
CharStream input,
public LexerNoViableAltException(@Nullable Lexer lexer,
@NotNull CharStream input,
int startIndex,
ATNConfigSet deadEndConfigs) {
@Nullable ATNConfigSet deadEndConfigs) {
super(lexer, input, null);
this.startIndex = startIndex;
this.deadEndConfigs = deadEndConfigs;
}
public int getStartIndex() {
return startIndex;
}
@Nullable
public ATNConfigSet getDeadEndConfigs() {
return deadEndConfigs;
}
@Override
public CharStream getInputStream() {
return (CharStream)super.getInputStream();
@ -56,11 +69,11 @@ public class LexerNoViableAltException extends RecognitionException {
@Override
public String toString() {
String symbol = "";
if (startIndex >= 0 && startIndex < input.size()) {
symbol = getInputStream().substring(startIndex, startIndex);
if (startIndex >= 0 && startIndex < getInputStream().size()) {
symbol = getInputStream().getText(Interval.of(startIndex,startIndex));
symbol = Utils.escapeWhitespace(symbol, false);
}
return "NoViableAltException('" + symbol + "')";
return String.format("%s('%s')", LexerNoViableAltException.class.getSimpleName(), symbol);
}
}

View File

@ -29,39 +29,57 @@
package org.antlr.v4.runtime;
import org.antlr.v4.runtime.atn.ATNConfigSet;
import org.antlr.v4.runtime.misc.NotNull;
import org.antlr.v4.runtime.misc.Nullable;
/** The parser could not decide which path in the decision to take based
* upon the remaining input.
/** Indicates that the parser could not decide which of two or more paths
* to take based upon the remaining input. It tracks the starting token
* of the offending input and also knows where the parser was
 *  in the various paths when the error occurred. Reported by reportNoViableAlternative().
*/
public class NoViableAltException extends RecognitionException {
/** Which configurations did we try at input.index() that couldn't match input.LT(1)? */
public ATNConfigSet deadEndConfigs;
@Nullable
private final ATNConfigSet deadEndConfigs;
/** The token object at the start index; the input stream might
* not be buffering tokens so get a reference to it. (At the
* time the error occurred, of course the stream needs to keep a
	 *  buffer of all the tokens, but later we might not have access to those.)
*/
public Token startToken;
*/
@NotNull
private final Token startToken;
public <Symbol extends Token> NoViableAltException(Parser recognizer) { // LL(1) error
this(recognizer,recognizer.getInputStream(),
public NoViableAltException(@NotNull Parser recognizer) { // LL(1) error
this(recognizer,
recognizer.getInputStream(),
recognizer.getCurrentToken(),
recognizer.getCurrentToken(),
null,
recognizer._ctx);
}
public <Symbol> NoViableAltException(Parser recognizer,
SymbolStream<Symbol> input,
Token startToken,
Token offendingToken,
ATNConfigSet deadEndConfigs,
ParserRuleContext<?> ctx)
public NoViableAltException(@NotNull Parser recognizer,
@NotNull TokenStream input,
@NotNull Token startToken,
@NotNull Token offendingToken,
@Nullable ATNConfigSet deadEndConfigs,
@NotNull ParserRuleContext ctx)
{
super(recognizer, input, ctx);
this.deadEndConfigs = deadEndConfigs;
this.startToken = startToken;
this.offendingToken = offendingToken;
this.setOffendingToken(offendingToken);
}
@NotNull
public Token getStartToken() {
return startToken;
}
@Nullable
public ATNConfigSet getDeadEndConfigs() {
return deadEndConfigs;
}
}

View File

@ -1,18 +0,0 @@
package org.antlr.v4.runtime;
/** We must distinguish between listeners triggered during the parse
* from listeners triggered during a subsequent tree walk. During
* the parse, the ctx object arg for enter methods don't have any labels set.
* We can only access the general ParserRuleContext<Symbol> ctx.
* Also, we can only call exit methods for left-recursive rules. Let's
* make the interface clear these semantics up. If you need the ctx,
* use Parser.getRuleContext().
*/
public interface ParseListener<Symbol extends Token> {
void visitTerminal(ParserRuleContext<Symbol> parent, Symbol token);
/** Enter all but left-recursive rules */
void enterNonLRRule(ParserRuleContext<Symbol> ctx);
void exitEveryRule(ParserRuleContext<Symbol> ctx);
}

View File

@ -36,49 +36,59 @@ import org.antlr.v4.runtime.atn.RuleTransition;
import org.antlr.v4.runtime.dfa.DFA;
import org.antlr.v4.runtime.misc.IntervalSet;
import org.antlr.v4.runtime.misc.Nullable;
import org.antlr.v4.runtime.tree.ErrorNode;
import org.antlr.v4.runtime.tree.ParseTreeListener;
import org.antlr.v4.runtime.tree.ParseTreeWalker;
import org.antlr.v4.runtime.tree.TerminalNode;
import java.util.ArrayList;
import java.util.List;
/** This is all the parsing support code essentially; most of it is error recovery stuff. */
public abstract class Parser extends Recognizer<Token, ParserATNSimulator<Token>> {
public class TraceListener implements ParseListener<Token> {
public abstract class Parser extends Recognizer<Token, ParserATNSimulator> {
public class TraceListener implements ParseTreeListener {
@Override
public void enterNonLRRule(ParserRuleContext<Token> ctx) {
System.out.println("enter " + getRuleNames()[ctx.ruleIndex] + ", LT(1)=" + _input.LT(1).getText());
public void enterEveryRule(ParserRuleContext ctx) {
System.out.println("enter " + getRuleNames()[ctx.getRuleIndex()] +
", LT(1)=" + _input.LT(1).getText());
}
@Override
public void exitEveryRule(ParserRuleContext<Token> ctx) {
System.out.println("exit "+getRuleNames()[ctx.ruleIndex]+", LT(1)="+_input.LT(1).getText());
public void visitTerminal(TerminalNode node) {
System.out.println("consume "+node.getSymbol()+" rule "+
getRuleNames()[_ctx.getRuleIndex()]+
" alt="+_ctx.altNum);
}
@Override
public void visitTerminal(ParserRuleContext<Token> parent, Token token) {
System.out.println("consume "+token+" rule "+
getRuleNames()[parent.ruleIndex]+
" alt="+parent.altNum);
public void visitErrorNode(ErrorNode node) {
}
@Override
public void exitEveryRule(ParserRuleContext ctx) {
System.out.println("exit "+getRuleNames()[ctx.getRuleIndex()]+
", LT(1)="+_input.LT(1).getText());
}
}
public static class TrimToSizeListener implements ParseListener<Token> {
public static class TrimToSizeListener implements ParseTreeListener {
public static final TrimToSizeListener INSTANCE = new TrimToSizeListener();
@Override
public void visitTerminal(ParserRuleContext<Token> parent, Token token) {
}
public void enterEveryRule(ParserRuleContext ctx) { }
@Override
public void enterNonLRRule(ParserRuleContext<Token> ctx) {
}
public void visitTerminal(TerminalNode node) { }
@Override
public void exitEveryRule(ParserRuleContext<Token> ctx) {
public void visitErrorNode(ErrorNode node) { }
@Override
public void exitEveryRule(ParserRuleContext ctx) {
if (ctx.children instanceof ArrayList) {
((ArrayList<?>)ctx.children).trimToSize();
}
}
}
protected ANTLRErrorStrategy _errHandler = new DefaultErrorStrategy();
@ -90,9 +100,9 @@ public abstract class Parser extends Recognizer<Token, ParserATNSimulator<Token>
* When somebody calls the start rule, this gets set to the
* root context.
*/
protected ParserRuleContext<Token> _ctx;
protected ParserRuleContext _ctx;
protected boolean _buildParseTrees;
protected boolean _buildParseTrees = true;
protected TraceListener _tracer;
@ -100,9 +110,11 @@ public abstract class Parser extends Recognizer<Token, ParserATNSimulator<Token>
* *during* the parse. This is typically done only when not building
* parse trees for later visiting. We either trigger events during
* the parse or during tree walks later. Both could be done.
* Not intended for tree parsing but would work.
* Not intended for average user!!! Most people should use
* ParseTreeListener with ParseTreeWalker.
* @see ParseTreeWalker
*/
protected List<ParseListener<Token>> _parseListeners;
protected List<ParseTreeListener> _parseListeners;
/** Did the recognizer encounter a syntax error? Track how many. */
protected int _syntaxErrors = 0;
@ -130,7 +142,7 @@ public abstract class Parser extends Recognizer<Token, ParserATNSimulator<Token>
*/
public Token match(int ttype) throws RecognitionException {
Token t = getCurrentToken();
if ( getInputStream().LA(1)==ttype ) {
if ( t.getType()==ttype ) {
_errHandler.endErrorCondition(this);
consume();
}
@ -145,6 +157,24 @@ public abstract class Parser extends Recognizer<Token, ParserATNSimulator<Token>
return t;
}
public Token matchWildcard() throws RecognitionException {
Token t = getCurrentToken();
if (t.getType() > 0) {
_errHandler.endErrorCondition(this);
consume();
}
else {
t = _errHandler.recoverInline(this);
if (_buildParseTrees && t.getTokenIndex() == -1) {
// we must have conjured up a new token during single token insertion
// if it's not the current symbol
_ctx.addErrorNode(t);
}
}
return t;
}
/** Track the RuleContext objects during the parse and hook them up
* using the children list so that it forms a parse tree.
* The RuleContext returned from the start rule represents the root
@ -191,7 +221,6 @@ public abstract class Parser extends Recognizer<Token, ParserATNSimulator<Token>
}
/**
*
* @return {@code true} if the {@link ParserRuleContext#children} list is trimmed
* using the default {@link Parser.TrimToSizeListener} during the parse process.
*/
@ -208,19 +237,28 @@ public abstract class Parser extends Recognizer<Token, ParserATNSimulator<Token>
// return traceATNStates;
// }
public List<ParseListener<Token>> getParseListeners() {
public List<ParseTreeListener> getParseListeners() {
return _parseListeners;
}
public void addParseListener(ParseListener<Token> listener) {
/** Provide a listener that gets notified about token matches,
* and rule entry/exit events DURING the parse. It's a little bit
* weird for left recursive rule entry events but it's
* deterministic.
*
* THIS IS ONLY FOR ADVANCED USERS. Please give your
* ParseTreeListener to a ParseTreeWalker instead of giving it to
* the parser!!!!
*/
public void addParseListener(ParseTreeListener listener) {
if ( listener==null ) return;
if ( _parseListeners==null ) {
_parseListeners = new ArrayList<ParseListener<Token>>();
_parseListeners = new ArrayList<ParseTreeListener>();
}
this._parseListeners.add(listener);
}
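
A sketch of a during-the-parse listener using the four ParseTreeListener methods this diff settles on; parser is any generated parser instance, and the body is illustrative:

    // Sketch: observe rule entry/exit and token matches while the parse runs.
    parser.addParseListener(new ParseTreeListener() {
        @Override public void enterEveryRule(ParserRuleContext ctx) {
            System.out.println("enter rule index " + ctx.getRuleIndex());
        }
        @Override public void exitEveryRule(ParserRuleContext ctx) { }
        @Override public void visitTerminal(TerminalNode node) { }
        @Override public void visitErrorNode(ErrorNode node) { }
    });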
public void removeParseListener(ParseListener<Token> l) {
public void removeParseListener(ParseTreeListener l) {
if ( l==null ) return;
if ( _parseListeners!=null ) {
_parseListeners.remove(l);
@ -234,17 +272,27 @@ public abstract class Parser extends Recognizer<Token, ParserATNSimulator<Token>
_parseListeners = null;
}
	/** Notify any parse listeners (implemented as ParseTreeListeners)
* of an enter rule event. This is not involved with
* parse tree walking in any way; it's just reusing the
* ParseTreeListener interface. This is not for the average user.
*/
public void triggerEnterRuleEvent() {
for (ParseListener<Token> l : _parseListeners) {
l.enterNonLRRule(_ctx);
for (ParseTreeListener l : _parseListeners) {
l.enterEveryRule(_ctx);
_ctx.enterRule(l);
}
}
	/** Notify any parse listeners (implemented as ParseTreeListeners)
* of an exit rule event. This is not involved with
* parse tree walking in any way; it's just reusing the
* ParseTreeListener interface. This is not for the average user.
*/
public void triggerExitRuleEvent() {
// reverse order walk of listeners
for (int i = _parseListeners.size()-1; i >= 0; i--) {
ParseListener<Token> l = _parseListeners.get(i);
ParseTreeListener l = _parseListeners.get(i);
_ctx.exitRule(l);
l.exitEveryRule(_ctx);
}
@ -295,18 +343,6 @@ public abstract class Parser extends Recognizer<Token, ParserATNSimulator<Token>
this._input = input;
}
public String getInputString(int start) {
return getInputString(start, getInputStream().index());
}
public String getInputString(int start, int stop) {
SymbolStream<Token> input = getInputStream();
if ( input instanceof TokenStream ) {
return ((TokenStream)input).toString(start,stop);
}
return "n/a";
}
/** Match needs to return the current input symbol, which gets put
* into the label for the associated token ref; e.g., x=ID.
*/
@ -323,14 +359,11 @@ public abstract class Parser extends Recognizer<Token, ParserATNSimulator<Token>
{
int line = -1;
int charPositionInLine = -1;
if (offendingToken instanceof Token) {
line = ((Token) offendingToken).getLine();
charPositionInLine = ((Token) offendingToken).getCharPositionInLine();
}
List<? extends ANTLRErrorListener<? super Token>> listeners = getErrorListeners();
for (ANTLRErrorListener<? super Token> listener : listeners) {
listener.error(this, offendingToken, line, charPositionInLine, msg, e);
}
line = offendingToken.getLine();
charPositionInLine = offendingToken.getCharPositionInLine();
ANTLRErrorListener listener = getErrorListenerDispatch();
listener.syntaxError(this, offendingToken, line, charPositionInLine, msg, e);
}
/** Consume the current symbol and return it. E.g., given the following
@ -349,22 +382,30 @@ public abstract class Parser extends Recognizer<Token, ParserATNSimulator<Token>
public Token consume() {
Token o = getCurrentToken();
getInputStream().consume();
if (_buildParseTrees) {
// TODO: tree parsers?
boolean hasListener = _parseListeners != null && !_parseListeners.isEmpty();
if (_buildParseTrees || hasListener) {
if ( _errHandler.inErrorRecoveryMode(this) ) {
// System.out.println("consume in error recovery mode for "+o);
_ctx.addErrorNode(o);
ErrorNode node = _ctx.addErrorNode(o);
if (_parseListeners != null) {
for (ParseTreeListener listener : _parseListeners) {
listener.visitErrorNode(node);
}
}
}
else {
TerminalNode node = _ctx.addChild(o);
if (_parseListeners != null) {
for (ParseTreeListener listener : _parseListeners) {
listener.visitTerminal(node);
}
}
}
else _ctx.addChild(o);
}
if ( _parseListeners != null) {
for (ParseListener<Token> l : _parseListeners) l.visitTerminal(_ctx, o);
}
return o;
}
protected void addContextToParseTree() {
ParserRuleContext<?> parent = (ParserRuleContext<?>)_ctx.parent;
ParserRuleContext parent = (ParserRuleContext)_ctx.parent;
// add current context to parent if we have a parent
if ( parent!=null ) {
parent.addChild(_ctx);
@ -378,47 +419,69 @@ public abstract class Parser extends Recognizer<Token, ParserATNSimulator<Token>
* This is flexible because users do not have to regenerate parsers
* to get trace facilities.
*/
public void enterRule(ParserRuleContext<Token> localctx, int ruleIndex) {
public void enterRule(ParserRuleContext localctx, int ruleIndex) {
_ctx = localctx;
_ctx.start = _input.LT(1);
_ctx.ruleIndex = ruleIndex;
if (_buildParseTrees) addContextToParseTree();
if ( _parseListeners != null) triggerEnterRuleEvent();
}
public void exitRule() {
_ctx.stop = _input.LT(-1);
// trigger event on _ctx, before it reverts to parent
if ( _parseListeners != null) triggerExitRuleEvent();
_ctx = (ParserRuleContext<Token>)_ctx.parent;
_ctx = (ParserRuleContext)_ctx.parent;
}
public void enterOuterAlt(ParserRuleContext<Token> localctx, int altNum) {
public void enterOuterAlt(ParserRuleContext localctx, int altNum) {
// if we have new localctx, make sure we replace existing ctx
// that is previous child of parse tree
if ( _buildParseTrees && _ctx != localctx ) {
ParserRuleContext<?> parent = (ParserRuleContext<?>)_ctx.parent;
parent.removeLastChild();
if ( parent!=null ) parent.addChild(localctx);
ParserRuleContext parent = (ParserRuleContext)_ctx.parent;
if ( parent!=null ) {
parent.removeLastChild();
parent.addChild(localctx);
}
}
_ctx = localctx;
_ctx.altNum = altNum;
}
/* like enterRule but for recursive rules; no enter events for recursive rules. */
public void pushNewRecursionContext(ParserRuleContext<Token> localctx, int ruleIndex) {
public void enterRecursionRule(ParserRuleContext localctx, int ruleIndex) {
_ctx = localctx;
_ctx.start = _input.LT(1);
_ctx.ruleIndex = ruleIndex;
if (_parseListeners != null) {
triggerEnterRuleEvent(); // simulates rule entry for left-recursive rules
}
}
public void unrollRecursionContexts(ParserRuleContext<Token> _parentctx) {
ParserRuleContext<Token> retctx = _ctx; // save current ctx (return value)
/* like enterRule but for recursive rules */
public void pushNewRecursionContext(ParserRuleContext localctx, int state, int ruleIndex) {
ParserRuleContext previous = _ctx;
previous.parent = localctx;
previous.invokingState = state;
previous.stop = _input.LT(-1);
_ctx = localctx;
_ctx.start = previous.start;
if (_buildParseTrees) {
_ctx.addChild(previous);
}
if ( _parseListeners != null ) {
triggerEnterRuleEvent(); // simulates rule entry for left-recursive rules
}
}
public void unrollRecursionContexts(ParserRuleContext _parentctx) {
_ctx.stop = _input.LT(-1);
ParserRuleContext retctx = _ctx; // save current ctx (return value)
// unroll so _ctx is as it was before call to recursive method
if ( _parseListeners != null ) {
while ( _ctx != _parentctx ) {
triggerExitRuleEvent();
_ctx = (ParserRuleContext<Token>)_ctx.parent;
_ctx = (ParserRuleContext)_ctx.parent;
}
}
else {
@ -429,16 +492,16 @@ public abstract class Parser extends Recognizer<Token, ParserATNSimulator<Token>
if (_buildParseTrees) _parentctx.addChild(retctx); // add return ctx into invoking rule's tree
}
public ParserRuleContext<Token> getInvokingContext(int ruleIndex) {
ParserRuleContext<Token> p = _ctx;
public ParserRuleContext getInvokingContext(int ruleIndex) {
ParserRuleContext p = _ctx;
while ( p!=null ) {
if ( p.getRuleIndex() == ruleIndex ) return p;
p = (ParserRuleContext<Token>)p.parent;
p = (ParserRuleContext)p.parent;
}
return null;
}
public ParserRuleContext<Token> getContext() {
public ParserRuleContext getContext() {
return _ctx;
}
@ -450,7 +513,7 @@ public abstract class Parser extends Recognizer<Token, ParserATNSimulator<Token>
public boolean isExpectedToken(int symbol) {
// return getInterpreter().atn.nextTokens(_ctx);
ATN atn = getInterpreter().atn;
ParserRuleContext<?> ctx = _ctx;
ParserRuleContext ctx = _ctx;
ATNState s = atn.states.get(ctx.s);
IntervalSet following = atn.nextTokens(s);
if (following.contains(symbol)) {
@ -467,7 +530,7 @@ public abstract class Parser extends Recognizer<Token, ParserATNSimulator<Token>
return true;
}
ctx = (ParserRuleContext<?>)ctx.parent;
ctx = (ParserRuleContext)ctx.parent;
}
if ( following.contains(Token.EPSILON) && symbol == Token.EOF ) {
@ -482,7 +545,7 @@ public abstract class Parser extends Recognizer<Token, ParserATNSimulator<Token>
*/
public IntervalSet getExpectedTokens() {
ATN atn = getInterpreter().atn;
ParserRuleContext<?> ctx = _ctx;
ParserRuleContext ctx = _ctx;
ATNState s = atn.states.get(ctx.s);
IntervalSet following = atn.nextTokens(s);
// System.out.println("following "+s+"="+following);
@ -496,7 +559,7 @@ public abstract class Parser extends Recognizer<Token, ParserATNSimulator<Token>
following = atn.nextTokens(rt.followState);
expected.addAll(following);
expected.remove(Token.EPSILON);
ctx = (ParserRuleContext<?>)ctx.parent;
ctx = (ParserRuleContext)ctx.parent;
}
if ( following.contains(Token.EPSILON) ) {
expected.add(Token.EOF);
@ -520,7 +583,7 @@ public abstract class Parser extends Recognizer<Token, ParserATNSimulator<Token>
// return atn.nextTokens(s, ctx);
// }
public ParserRuleContext<Token> getRuleContext() { return _ctx; }
public ParserRuleContext getRuleContext() { return _ctx; }
/** Return List<String> of the rule names in your parser instance
* leading up to a call to the current rule. You could override if
@ -538,34 +601,40 @@ public abstract class Parser extends Recognizer<Token, ParserATNSimulator<Token>
List<String> stack = new ArrayList<String>();
while ( p!=null ) {
// compute what follows who invoked us
stack.add(ruleNames[p.getRuleIndex()]);
int ruleIndex = p.getRuleIndex();
if ( ruleIndex<0 ) stack.add("n/a");
else stack.add(ruleNames[ruleIndex]);
p = p.parent;
}
return stack;
}
/** For debugging and other purposes */
public List<String> getDFAStrings() {
List<String> s = new ArrayList<String>();
for (int d = 0; d < _interp.decisionToDFA.length; d++) {
DFA dfa = _interp.decisionToDFA[d];
s.add( dfa.toString(getTokenNames()) );
}
return s;
public List<String> getDFAStrings() {
synchronized (_interp.decisionToDFA) {
List<String> s = new ArrayList<String>();
for (int d = 0; d < _interp.decisionToDFA.length; d++) {
DFA dfa = _interp.decisionToDFA[d];
s.add( dfa.toString(getTokenNames()) );
}
return s;
}
}
/** For debugging and other purposes */
public void dumpDFA() {
boolean seenOne = false;
for (int d = 0; d < _interp.decisionToDFA.length; d++) {
DFA dfa = _interp.decisionToDFA[d];
if ( dfa!=null ) {
if ( seenOne ) System.out.println();
System.out.println("Decision " + dfa.decision + ":");
System.out.print(dfa.toString(getTokenNames()));
seenOne = true;
}
}
/** For debugging and other purposes */
public void dumpDFA() {
synchronized (_interp.decisionToDFA) {
boolean seenOne = false;
for (int d = 0; d < _interp.decisionToDFA.length; d++) {
DFA dfa = _interp.decisionToDFA[d];
if ( dfa!=null ) {
if ( seenOne ) System.out.println();
System.out.println("Decision " + dfa.decision + ":");
System.out.print(dfa.toString(getTokenNames()));
seenOne = true;
}
}
}
}
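
A quick debugging sketch; startRule is a hypothetical entry rule:

    // Sketch: inspect the cached decision DFAs after a parse.
    parser.startRule();
    parser.dumpDFA();   // prints each non-null decision DFA using the token names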
public String getSourceName() {
@ -597,7 +666,7 @@ public abstract class Parser extends Recognizer<Token, ParserATNSimulator<Token>
// if ( traceATNStates ) _ctx.trace(atnState);
}
/** During a parse is extremely useful to listen in on the rule entry and exit
/** During a parse it is sometimes useful to listen in on the rule entry and exit
* events as well as token matches. This is for quick and dirty debugging.
*/
public void setTrace(boolean trace) {

View File

@ -28,30 +28,32 @@
*/
package org.antlr.v4.runtime;
import org.antlr.v4.runtime.atn.ATN;
import org.antlr.v4.runtime.atn.ATNState;
import org.antlr.v4.runtime.misc.NotNull;
import org.antlr.v4.runtime.misc.Interval;
import org.antlr.v4.runtime.misc.Nullable;
import org.antlr.v4.runtime.tree.ErrorNode;
import org.antlr.v4.runtime.tree.ErrorNodeImpl;
import org.antlr.v4.runtime.tree.ParseTree;
import org.antlr.v4.runtime.tree.ParseTreeListener;
import org.antlr.v4.runtime.tree.TerminalNode;
import org.antlr.v4.runtime.tree.TerminalNodeImpl;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
/** A rule invocation record for parsing and tree parsing.
/** A rule invocation record for parsing.
*
* Contains all of the information about the current rule not stored in the
* RuleContext. It handles the parse tree children list, any ATN state
* tracing, and the default values available for rule invocations:
* start, stop, ST, rule index, current alt number, current
* start, stop, rule index, current alt number, current
* ATN state.
*
* Subclasses made for each rule and grammar track the parameters,
* return values, locals, and labels specific to that rule. These
* are the objects that are returned from rules.
*
* Note text is not an actual property of the return value, it is computed
* Note text is not an actual field of a rule return value; it is computed
* from start and stop using the input stream's toString() method. I
* could add a ctor to this so that we can pass in and store the input
* stream, but I'm not sure we want to do that. It would seem to be undefined
@ -62,12 +64,10 @@ import java.util.List;
* group values such as this aggregate. The getters/setters are there to
* satisfy the superclass interface.
*/
public class ParserRuleContext<Symbol extends Token> extends RuleContext {
public static final ParserRuleContext<Token> EMPTY = new ParserRuleContext<Token>();
public class ParserRuleContext extends RuleContext {
/** If we are debugging or building a parse tree for a visitor,
* we need to track all of the tokens and rule invocations associated
* with this rule's context. This is empty for normal parsing
* with this rule's context. This is empty when parsing without tree construction
* because we don't need to track the details about
* how we parse this rule.
*/
@ -100,18 +100,21 @@ public class ParserRuleContext<Symbol extends Token> extends RuleContext {
*/
public int s = -1;
public Symbol start, stop;
/** Set during parsing to identify which rule parser is in. */
public int ruleIndex;
public Token start, stop;
/** Set during parsing to identify which alt of rule parser is in. */
public int altNum;
/**
* The exception which forced this rule to return. If the rule successfully
* completed, this is {@code null}.
*/
public RecognitionException exception;
public ParserRuleContext() { }
/** COPY a ctx (I'm deliberately not using copy constructor) */
public void copyFrom(ParserRuleContext<Symbol> ctx) {
public void copyFrom(ParserRuleContext ctx) {
// from RuleContext
this.parent = ctx.parent;
this.s = ctx.s;
@ -119,37 +122,33 @@ public class ParserRuleContext<Symbol extends Token> extends RuleContext {
this.start = ctx.start;
this.stop = ctx.stop;
this.ruleIndex = ctx.ruleIndex;
}
public ParserRuleContext(@Nullable ParserRuleContext<Symbol> parent, int invokingStateNumber, int stateNumber) {
public ParserRuleContext(@Nullable ParserRuleContext parent, int invokingStateNumber, int stateNumber) {
super(parent, invokingStateNumber);
this.s = stateNumber;
}
public ParserRuleContext(@Nullable ParserRuleContext<Symbol> parent, int stateNumber) {
public ParserRuleContext(@Nullable ParserRuleContext parent, int stateNumber) {
this(parent, parent!=null ? parent.s : -1 /* invoking state */, stateNumber);
}
// Double dispatch methods for listeners
// parse listener
public void enterRule(ParseListener<Symbol> listener) { }
public void exitRule(ParseListener<Symbol> listener) { }
public void enterRule(ParseTreeListener listener) { }
public void exitRule(ParseTreeListener listener) { }
// parse tree listener
public void enterRule(ParseTreeListener<Symbol> listener) { }
public void exitRule(ParseTreeListener<Symbol> listener) { }
/** Does not set parent link; other add methods do */
public void addChild(TerminalNode<Symbol> t) {
/** Does not set parent link; other add methods do that */
public TerminalNode addChild(TerminalNode t) {
if ( children==null ) children = new ArrayList<ParseTree>();
children.add(t);
return t;
}
public void addChild(RuleContext ruleInvocation) {
public RuleContext addChild(RuleContext ruleInvocation) {
if ( children==null ) children = new ArrayList<ParseTree>();
children.add(ruleInvocation);
return ruleInvocation;
}
/** Used by enterOuterAlt to toss out a RuleContext previously added as
@ -167,14 +166,15 @@ public class ParserRuleContext<Symbol extends Token> extends RuleContext {
// states.add(s);
// }
public void addChild(Symbol matchedToken) {
TerminalNodeImpl<Symbol> t = new TerminalNodeImpl<Symbol>(matchedToken);
public TerminalNode addChild(Token matchedToken) {
TerminalNodeImpl t = new TerminalNodeImpl(matchedToken);
addChild(t);
t.parent = this;
return t;
}
public ErrorNode<Symbol> addErrorNode(Symbol badToken) {
ErrorNodeImpl<Symbol> t = new ErrorNodeImpl<Symbol>(badToken);
public ErrorNode addErrorNode(Token badToken) {
ErrorNodeImpl t = new ErrorNodeImpl(badToken);
addChild(t);
t.parent = this;
return t;
@ -182,8 +182,8 @@ public class ParserRuleContext<Symbol extends Token> extends RuleContext {
@Override
/** Override to make type more specific */
public ParserRuleContext<Symbol> getParent() {
return (ParserRuleContext<Symbol>)super.getParent();
public ParserRuleContext getParent() {
return (ParserRuleContext)super.getParent();
}
@Override
@ -208,16 +208,15 @@ public class ParserRuleContext<Symbol extends Token> extends RuleContext {
return null;
}
@SuppressWarnings("checked")
public TerminalNode<Symbol> getToken(int ttype, int i) {
public TerminalNode getToken(int ttype, int i) {
if ( children==null || i < 0 || i >= children.size() ) {
return null;
}
int j = -1; // what token with ttype have we found?
for (ParseTree o : children) {
if ( o instanceof TerminalNode<?> ) {
TerminalNode<Symbol> tnode = (TerminalNode<Symbol>)o;
if ( o instanceof TerminalNode ) {
TerminalNode tnode = (TerminalNode)o;
Token symbol = tnode.getSymbol();
if ( symbol.getType()==ttype ) {
j++;
@ -231,20 +230,19 @@ public class ParserRuleContext<Symbol extends Token> extends RuleContext {
return null;
}
@SuppressWarnings("checked")
public List<TerminalNode<Symbol>> getTokens(int ttype) {
public List<TerminalNode> getTokens(int ttype) {
if ( children==null ) {
return Collections.emptyList();
}
List<TerminalNode<Symbol>> tokens = null;
List<TerminalNode> tokens = null;
for (ParseTree o : children) {
if ( o instanceof TerminalNode<?> ) {
TerminalNode<Symbol> tnode = (TerminalNode<Symbol>)o;
if ( o instanceof TerminalNode ) {
TerminalNode tnode = (TerminalNode)o;
Token symbol = tnode.getSymbol();
if ( symbol.getType()==ttype ) {
if ( tokens==null ) {
tokens = new ArrayList<TerminalNode<Symbol>>();
tokens = new ArrayList<TerminalNode>();
}
tokens.add(tnode);
}
@ -258,11 +256,11 @@ public class ParserRuleContext<Symbol extends Token> extends RuleContext {
return tokens;
}
public <T extends ParserRuleContext<?>> T getRuleContext(Class<? extends T> ctxType, int i) {
public <T extends ParserRuleContext> T getRuleContext(Class<? extends T> ctxType, int i) {
return getChild(ctxType, i);
}
public <T extends ParserRuleContext<?>> List<? extends T> getRuleContexts(Class<? extends T> ctxType) {
public <T extends ParserRuleContext> List<T> getRuleContexts(Class<? extends T> ctxType) {
if ( children==null ) {
return Collections.emptyList();
}
@ -289,29 +287,14 @@ public class ParserRuleContext<Symbol extends Token> extends RuleContext {
public int getChildCount() { return children!=null ? children.size() : 0; }
@Override
public int getRuleIndex() { return ruleIndex; }
public Symbol getStart() { return start; }
public Symbol getStop() { return stop; }
@Override
public String toString(@NotNull Recognizer<?,?> recog, RuleContext stop) {
if ( recog==null ) return super.toString(recog, stop);
StringBuilder buf = new StringBuilder();
ParserRuleContext<?> p = this;
buf.append("[");
while ( p != null && p != stop ) {
ATN atn = recog.getATN();
ATNState s = atn.states.get(p.s);
String ruleName = recog.getRuleNames()[s.ruleIndex];
buf.append(ruleName);
if ( p.parent != null ) buf.append(" ");
p = (ParserRuleContext<?>)p.parent;
}
buf.append("]");
return buf.toString();
public Interval getSourceInterval() {
if ( start==null || stop==null ) return Interval.INVALID;
return Interval.of(start.getTokenIndex(), stop.getTokenIndex());
}
public Token getStart() { return start; }
public Token getStop() { return stop; }
/** Used for rule context info debugging during parse-time, not so much for ATN debugging */
public String toInfoString(Parser recognizer) {
List<String> rules = recognizer.getRuleInvocationStack(this);

View File

@ -0,0 +1,96 @@
/*
[The "BSD license"]
Copyright (c) 2012 Terence Parr
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. The name of the author may not be used to endorse or promote products
derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
package org.antlr.v4.runtime;
import org.antlr.v4.runtime.atn.ATNConfigSet;
import org.antlr.v4.runtime.dfa.DFA;
import java.util.BitSet;
import java.util.Collection;
/**
* @author Sam Harwell
*/
public class ProxyErrorListener implements ANTLRErrorListener {
private final Collection<? extends ANTLRErrorListener> delegates;
public ProxyErrorListener(Collection<? extends ANTLRErrorListener> delegates) {
this.delegates = delegates;
}
@Override
public void syntaxError(Recognizer<?, ?> recognizer,
Object offendingSymbol,
int line,
int charPositionInLine,
String msg,
RecognitionException e)
{
for (ANTLRErrorListener listener : delegates) {
listener.syntaxError(recognizer, offendingSymbol, line, charPositionInLine, msg, e);
}
}
@Override
public void reportAmbiguity(Parser recognizer,
DFA dfa,
int startIndex,
int stopIndex,
BitSet ambigAlts,
ATNConfigSet configs)
{
for (ANTLRErrorListener listener : delegates) {
listener.reportAmbiguity(recognizer, dfa, startIndex, stopIndex, ambigAlts, configs);
}
}
@Override
public void reportAttemptingFullContext(Parser recognizer,
DFA dfa,
int startIndex,
int stopIndex,
ATNConfigSet configs)
{
for (ANTLRErrorListener listener : delegates) {
listener.reportAttemptingFullContext(recognizer, dfa, startIndex, stopIndex, configs);
}
}
@Override
public void reportContextSensitivity(Parser recognizer,
DFA dfa,
int startIndex,
int stopIndex,
ATNConfigSet configs)
{
for (ANTLRErrorListener listener : delegates) {
listener.reportContextSensitivity(recognizer, dfa, startIndex, stopIndex, configs);
}
}
}
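// Usage sketch (illustrative; Recognizer.getErrorListenerDispatch() later in this
// diff builds exactly this proxy over the registered listeners):
//
//   Recognizer<?, ?> recognizer = ...; // any lexer or parser
//   ANTLRErrorListener dispatch =
//       new ProxyErrorListener(recognizer.getErrorListeners());
//   dispatch.syntaxError(recognizer, null, 1, 0, "msg", null); // fans out to every delegate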

View File

@ -39,26 +39,23 @@ import org.antlr.v4.runtime.misc.Nullable;
*/
public class RecognitionException extends RuntimeException {
/** Who threw the exception? */
protected Recognizer<?, ?> recognizer;
private Recognizer<?, ?> recognizer;
// TODO: make a dummy recognizer for the interpreter to use?
// Next two (ctx,input) should be what is in recognizer, but
// won't work when interpreting
protected RuleContext ctx;
private RuleContext ctx;
protected IntStream input;
/** What is index of token/char were we looking at when the error occurred? */
// public int offendingTokenIndex;
private IntStream input;
/** The current Token when an error occurred. Since not all streams
* can retrieve the ith Token, we have to track the Token object.
* For parsers; even when it's a tree parser, the token might be set.
*/
protected Token offendingToken;
private Token offendingToken;
protected int offendingState;
private int offendingState;
public RecognitionException(@Nullable Recognizer<?, ?> recognizer, IntStream input,
@Nullable ParserRuleContext ctx)
@ -69,13 +66,29 @@ public class RecognitionException extends RuntimeException {
if ( ctx!=null ) this.offendingState = ctx.s;
}
public RecognitionException(String message, @Nullable Recognizer<?, ?> recognizer, IntStream input,
@Nullable ParserRuleContext ctx)
{
super(message);
this.recognizer = recognizer;
this.input = input;
this.ctx = ctx;
if ( ctx!=null ) this.offendingState = ctx.s;
}
/** Where was the parser in the ATN when the error occurred?
* For no-viable-alternative exceptions, this is the decision state number.
* For others, it is the state whose emanating edge we couldn't match.
* This will help us tie into the grammar and syntax diagrams in
* ANTLRWorks v2.
*/
public int getOffendingState() { return offendingState; }
public int getOffendingState() {
return offendingState;
}
protected final void setOffendingState(int offendingState) {
this.offendingState = offendingState;
}
public IntervalSet getExpectedTokens() {
// TODO: do we really need this type check?
@ -97,6 +110,10 @@ public class RecognitionException extends RuntimeException {
return offendingToken;
}
protected final void setOffendingToken(Token offendingToken) {
this.offendingToken = offendingToken;
}
public Recognizer<?, ?> getRecognizer() {
return recognizer;
}

View File

@ -34,7 +34,6 @@ import org.antlr.v4.runtime.atn.ATNSimulator;
import org.antlr.v4.runtime.misc.NotNull;
import org.antlr.v4.runtime.misc.Nullable;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
@ -42,8 +41,10 @@ public abstract class Recognizer<Symbol, ATNInterpreter extends ATNSimulator> {
public static final int EOF=-1;
@NotNull
private List<ANTLRErrorListener<? super Symbol>> _listeners =
new CopyOnWriteArrayList<ANTLRErrorListener<? super Symbol>>() {{ add(ConsoleErrorListener.INSTANCE); }};
private List<ANTLRErrorListener> _listeners =
new CopyOnWriteArrayList<ANTLRErrorListener>() {{
add(ConsoleErrorListener.INSTANCE);
}};
protected ATNInterpreter _interp;
@ -66,8 +67,8 @@ public abstract class Recognizer<Symbol, ATNInterpreter extends ATNSimulator> {
/** What is the error header, normally line/character position information? */
public String getErrorHeader(RecognitionException e) {
int line = e.offendingToken.getLine();
int charPositionInLine = e.offendingToken.getCharPositionInLine();
int line = e.getOffendingToken().getLine();
int charPositionInLine = e.getOffendingToken().getCharPositionInLine();
return "line "+line+":"+charPositionInLine;
}
@ -99,7 +100,7 @@ public abstract class Recognizer<Symbol, ATNInterpreter extends ATNSimulator> {
/**
* @throws NullPointerException if {@code listener} is {@code null}.
*/
public void addErrorListener(@NotNull ANTLRErrorListener<? super Symbol> listener) {
public void addErrorListener(@NotNull ANTLRErrorListener listener) {
if (listener == null) {
throw new NullPointerException("listener cannot be null.");
}
@ -107,7 +108,7 @@ public abstract class Recognizer<Symbol, ATNInterpreter extends ATNSimulator> {
_listeners.add(listener);
}
public void removeErrorListener(@NotNull ANTLRErrorListener<? super Symbol> listener) {
public void removeErrorListener(@NotNull ANTLRErrorListener listener) {
_listeners.remove(listener);
}
@ -116,8 +117,12 @@ public abstract class Recognizer<Symbol, ATNInterpreter extends ATNSimulator> {
}
@NotNull
public List<? extends ANTLRErrorListener<? super Symbol>> getErrorListeners() {
return new ArrayList<ANTLRErrorListener<? super Symbol>>(_listeners);
public List<? extends ANTLRErrorListener> getErrorListeners() {
return _listeners;
}
public ANTLRErrorListener getErrorListenerDispatch() {
return new ProxyErrorListener(getErrorListeners());
}
// subclass needs to override these if there are sempreds or actions

View File

@ -32,11 +32,14 @@ import org.antlr.v4.runtime.misc.Interval;
import org.antlr.v4.runtime.misc.Nullable;
import org.antlr.v4.runtime.tree.ParseTree;
import org.antlr.v4.runtime.tree.ParseTreeVisitor;
import org.antlr.v4.runtime.tree.RuleNode;
import org.antlr.v4.runtime.tree.Trees;
import org.antlr.v4.runtime.tree.gui.TreeViewer;
import javax.print.PrintException;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
/** A rule context is a record of a single rule invocation. It knows
* which context invoked it, if any. If there is no parent context, then
@ -52,13 +55,15 @@ import java.io.IOException;
* The parent contexts are useful for computing lookahead sets and
* getting error information.
*
* These objects are used during lexing, parsing, and prediction.
* These objects are used during parsing and prediction.
* For the special case of parsers and tree parsers, we use the subclass
* ParserRuleContext.
*
* @see ParserRuleContext
*/
public class RuleContext implements ParseTree.RuleNode {
public class RuleContext implements RuleNode {
public static final ParserRuleContext EMPTY = new ParserRuleContext();
/** What context invoked this rule? */
public RuleContext parent;
@ -181,8 +186,8 @@ public class RuleContext implements ParseTree.RuleNode {
* that they are now going to track perfectly together. Once they
* converged on state 21, there is no way they can separate. In other
* words, the prior stack state is not consulted when computing where to
* go in the closure operation. ?$ and ??$ are considered the same stack.
* If ? is popped off then $ and ?$ remain; they are now an empty and
* go in the closure operation. x$ and xy$ are considered the same stack.
* If x is popped off then $ and y$ remain; they are now an empty and
* nonempty context comparison. So, if one stack is a suffix of
* another, then it will still degenerate to the simple empty stack
* comparison case.
@ -208,7 +213,12 @@ public class RuleContext implements ParseTree.RuleNode {
return invokingState == -1;
}
// satisfy the ParseTree interface
// satisfy the ParseTree / SyntaxTree interface
@Override
public Interval getSourceInterval() {
return Interval.INVALID;
}
@Override
public RuleContext getRuleContext() { return this; }
@ -219,6 +229,27 @@ public class RuleContext implements ParseTree.RuleNode {
@Override
public RuleContext getPayload() { return this; }
/** Return the combined text of all child nodes. This method only considers
* tokens which have been added to the parse tree.
* <p>
* Since tokens on hidden channels (e.g. whitespace or comments) are not
* added to the parse trees, they will not appear in the output of this
* method.
*/
@Override
public String getText() {
if (getChildCount() == 0) {
return "";
}
StringBuilder builder = new StringBuilder();
for (int i = 0; i < getChildCount(); i++) {
builder.append(getChild(i).getText());
}
return builder.toString();
}
public int getRuleIndex() { return -1; }
@Override
@ -231,28 +262,25 @@ public class RuleContext implements ParseTree.RuleNode {
return 0;
}
@Override
public Interval getSourceInterval() {
if ( getChildCount()==0 ) return Interval.INVALID;
int start = getChild(0).getSourceInterval().a;
int stop = getChild(getChildCount()-1).getSourceInterval().b;
return new Interval(start, stop);
}
@Override
public <T> T accept(ParseTreeVisitor<? extends T> visitor) { return visitor.visitChildren(this); }
/** Call this method to view a parse tree in a dialog box visually. */
public void inspect(Parser parser) {
TreeViewer viewer = new TreeViewer(parser, this);
viewer.open();
}
/** Save this tree in a postscript file */
public void save(Parser parser, String fileName)
throws IOException, PrintException
{
Trees.writePS(this, parser, fileName);
// TreeViewer viewer = new TreeViewer(parser, this);
// viewer.save(fileName);
Trees.writePS(this, parser, fileName); // parrt routine
}
/** Save this tree in a postscript file using a particular font name and size */
public void save(Parser parser, String fileName,
String fontName, int fontSize)
throws IOException
@ -264,32 +292,65 @@ public class RuleContext implements ParseTree.RuleNode {
* (root child1 .. childN). Print just a node if this is a leaf.
* We have to know the recognizer so we can get rule names.
*/
public String toStringTree(Parser recog) {
public String toStringTree(@Nullable Parser recog) {
return Trees.toStringTree(this, recog);
}
/** Print out a whole tree, not just a node, in LISP format
* (root child1 .. childN). Print just a node if this is a leaf.
*/
public String toStringTree(@Nullable List<String> ruleNames) {
return Trees.toStringTree(this, ruleNames);
}
@Override
public String toStringTree() { return toStringTree(null); }
public String toStringTree() {
return toStringTree((List<String>)null);
}
@Override
public String toString() {
return toString(null);
return toString((List<String>)null, (RuleContext)null);
}
public String toString(@Nullable Recognizer<?,?> recog) {
public final String toString(@Nullable Recognizer<?,?> recog) {
return toString(recog, ParserRuleContext.EMPTY);
}
public final String toString(@Nullable List<String> ruleNames) {
return toString(ruleNames, null);
}
// recog null unless ParserRuleContext, in which case we use subclass toString(...)
public String toString(@Nullable Recognizer<?,?> recog, RuleContext stop) {
public String toString(@Nullable Recognizer<?,?> recog, @Nullable RuleContext stop) {
String[] ruleNames = recog != null ? recog.getRuleNames() : null;
List<String> ruleNamesList = ruleNames != null ? Arrays.asList(ruleNames) : null;
return toString(ruleNamesList, stop);
}
public String toString(@Nullable List<String> ruleNames, @Nullable RuleContext stop) {
StringBuilder buf = new StringBuilder();
RuleContext p = this;
buf.append("[");
while ( p != null && p != stop ) {
if ( !p.isEmpty() ) buf.append(p.invokingState);
if ( p.parent != null && !p.parent.isEmpty() ) buf.append(" ");
while (p != null && p != stop) {
if (ruleNames == null) {
if (!p.isEmpty()) {
buf.append(p.invokingState);
}
}
else {
int ruleIndex = p.getRuleIndex();
String ruleName = ruleIndex >= 0 && ruleIndex < ruleNames.size() ? ruleNames.get(ruleIndex) : Integer.toString(ruleIndex);
buf.append(ruleName);
}
if (p.parent != null && (ruleNames != null || !p.parent.isEmpty())) {
buf.append(" ");
}
p = p.parent;
}
buf.append("]");
return buf.toString();
}
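// For a context chain expr -> stat -> prog with rule names available, this yields
// "[expr stat prog]"; without rule names it falls back to invoking-state numbers,
// e.g. "[21 7]". Values are illustrative.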

View File

@ -35,23 +35,15 @@ package org.antlr.v4.runtime;
*/
public interface Token {
public static final int INVALID_TYPE = 0;
// public static final Token INVALID_TOKEN = new CommonToken(INVALID_TYPE);
public static final int MIN_TOKEN_TYPE = 1;
/** During lookahead operations, this "token" signifies we hit rule end ATN state
* and did not follow it despite needing to.
*/
public static final int EPSILON = -2;
/** imaginary tree navigation type; traverse "get child" link */
public static final int DOWN = 1;
public static final int MIN_USER_TOKEN_TYPE = 1;
/** imaginary tree navigation type; finish with a child list */
public static final int UP = 2;
public static final int MIN_USER_TOKEN_TYPE = UP+1;
public static final int EOF = CharStream.EOF;
public static final int EOF = IntStream.EOF;
/** All tokens go to the parser (unless skip() is called in that rule)
* on a particular "channel". The parser tunes to a particular channel
@ -62,7 +54,7 @@ public interface Token {
/** Anything on different channel than DEFAULT_CHANNEL is not parsed
* by parser.
*/
public static final int HIDDEN_CHANNEL = 99;
public static final int HIDDEN_CHANNEL = 1;
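// Illustrative check (not part of this change): tokens routed to the hidden
// channel stay in a buffered token stream but are invisible to the parser.
//
//   Token t = tokens.get(i); // i is a hypothetical token index
//   boolean seenByParser = t.getChannel() == Token.DEFAULT_CHANNEL;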
/** Get the text of the token */
String getText();

View File

@ -65,4 +65,7 @@ public interface TokenSource {
/** Optional method that lets users set factory in lexer or other source */
public void setTokenFactory(TokenFactory<?> factory);
/** Gets the factory used for constructing tokens. */
public TokenFactory<?> getTokenFactory();
}

View File

@ -1,78 +1,172 @@
/*
[The "BSD license"]
Copyright (c) 2011 Terence Parr
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. The name of the author may not be used to endorse or promote products
derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
* [The "BSD license"]
* Copyright (c) 2012 Terence Parr
* Copyright (c) 2012 Sam Harwell
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* 3. The name of the author may not be used to endorse or promote products
* derived from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
* IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
* NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
* THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
package org.antlr.v4.runtime;
/** A stream of tokens accessing tokens from a TokenSource */
public interface TokenStream extends SymbolStream<Token> {
/** Get Token at current input pointer + i ahead where i=1 is next Token.
* i&lt;0 indicates tokens in the past. So -1 is previous token and -2 is
* two tokens ago. LT(0) is undefined. For i>=n, return Token.EOFToken.
* Return null for LT(0) and any index that results in an absolute address
* that is negative.
* TODO (Sam): Throw exception for invalid k?
*/
@Override
public Token LT(int k);
import org.antlr.v4.runtime.misc.Interval;
import org.antlr.v4.runtime.misc.NotNull;
/** How far ahead has the stream been asked to look? The return
* value is a valid index from 0..n-1.
/**
* An {@link IntStream} whose symbols are {@link Token} instances.
*/
public interface TokenStream extends IntStream {
/**
* Get the {@link Token} instance associated with the value returned by
* {@link #LA LA(k)}. This method has the same pre- and post-conditions as
* {@link IntStream#LA}. In addition, when the preconditions of this method
* are met, the return value is non-null and the value of
* {@code LT(k).getType()==LA(k)}.
*
* @see IntStream#LA
*/
// int range();
@NotNull
public Token LT(int k);
/** Get a token at an absolute index i; 0..n-1. This is really only
* needed for profiling and debugging and token stream rewriting.
* If you don't want to buffer up tokens, then this method makes no
* sense for you. Naturally you can't use the rewrite stream feature.
* I believe DebugTokenStream can easily be altered to not use
* this method, removing the dependency.
/**
* Gets the {@link Token} at the specified {@code index} in the stream. When
* the preconditions of this method are met, the return value is non-null.
* <p/>
* The preconditions for this method are the same as the preconditions of
* {@link IntStream#seek}. If the behavior of {@code seek(index)} is
* unspecified for the current state and given {@code index}, then the
* behavior of this method is also unspecified.
* <p/>
* The symbol referred to by {@code index} differs from {@code seek()} only
* in the case of filtering streams where {@code index} lies before the end
* of the stream. Unlike {@code seek()}, this method does not adjust
* {@code index} to point to a non-ignored symbol.
*
* @throws IllegalArgumentException if {@code index} is less than 0
* @throws UnsupportedOperationException if the stream does not support
* retrieving the token at the specified index
*/
@Override
public Token get(int i);
@NotNull
public Token get(int index);
/** Where is this stream pulling tokens from? This is not the name, but
* the object that provides Token objects.
/**
* Gets the underlying {@link TokenSource} which provides tokens for this
* stream.
*/
@NotNull
public TokenSource getTokenSource();
/** Return the text of all tokens from start to stop, inclusive.
* If the stream does not buffer all the tokens then it can just
* return "" or null; Users should not access $ruleLabel.text in
* an action of course in that case.
/**
* Return the text of all tokens within the specified {@code interval}. This
* method behaves like the following code (including potential exceptions
* for violating preconditions of {@link #get}), but may be optimized by the
* specific implementation.
*
* <pre>
* TokenStream stream = ...;
* String text = "";
* for (int i = interval.a; i <= interval.b; i++) {
* text += stream.get(i).getText();
* }
* </pre>
*
* @param interval The interval of tokens within this stream to get text
* for.
* @return The text of all tokens within the specified interval in this
* stream.
*
* @throws NullPointerException if {@code interval} is {@code null}
*/
public String toString(int start, int stop);
@NotNull
public String getText(@NotNull Interval interval);
/** Because the user is not required to use a token with an index stored
* in it, we must provide a means for two token objects themselves to
* indicate the start/end location. Most often this will just delegate
* to the other toString(int,int). This is also parallel with
* the TreeNodeStream.toString(Object,Object).
/**
* Return the text of all tokens in the stream. This method behaves like the
* following code, including potential exceptions from the calls to
* {@link IntStream#size} and {@link #getText(Interval)}, but may be
* optimized by the specific implementation.
*
* <pre>
* TokenStream stream = ...;
* String text = stream.getText(new Interval(0, stream.size()));
* </pre>
*
* @return The text of all tokens in the stream.
*/
public String toString(Token start, Token stop);
@NotNull
public String getText();
/**
* Return the text of all tokens in the source interval of the specified
* context. This method behaves like the following code, including potential
* exceptions from the call to {@link #getText(Interval)}, but may be
* optimized by the specific implementation.
* <p/>
* If {@code ctx.getSourceInterval()} does not return a valid interval of
* tokens provided by this stream, the behavior is unspecified.
*
* <pre>
* TokenStream stream = ...;
* String text = stream.getText(ctx.getSourceInterval());
* </pre>
*
* @param ctx The context providing the source interval of tokens to get
* text for.
* @return The text of all tokens within the source interval of {@code ctx}.
*/
@NotNull
public String getText(@NotNull RuleContext ctx);
/**
* Return the text of all tokens in this stream between {@code start} and
* {@code stop} (inclusive).
* <p/>
* If the specified {@code start} or {@code stop} token was not provided by
* this stream, or if the {@code stop} occurred before the {@code start}
* token, the behavior is unspecified.
* <p/>
* For streams which ensure that the {@link Token#getTokenIndex} method is
* accurate for all of its provided tokens, this method behaves like the
* following code. Other streams may implement this method in other ways
* provided the behavior is consistent with this at a high level.
*
* <pre>
* TokenStream stream = ...;
* String text = "";
* for (int i = start.getTokenIndex(); i <= stop.getTokenIndex(); i++) {
* text += stream.get(i).getText();
* }
* </pre>
*
* @param start The first token in the interval to get text for.
* @param stop The last token in the interval to get text for (inclusive).
* @return The text of all tokens lying between the specified {@code start}
* and {@code stop} tokens.
*
* @throws UnsupportedOperationException if this stream does not support
* this method for the specified tokens
*/
@NotNull
public String getText(@NotNull Token start, @NotNull Token stop);
}
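// Minimal sketch of the contract above ("MyLexer" is a hypothetical generated
// lexer; "input" is an existing CharStream):
//
//   TokenStream tokens = new CommonTokenStream(new MyLexer(input));
//   Token t = tokens.LT(1);               // next token as an object
//   assert t.getType() == tokens.LA(1);   // LT(k).getType() == LA(k), per LT's javadoc
//   String everything = tokens.getText(); // text of all tokens in the stream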

View File

@ -1,6 +1,6 @@
/*
[The "BSD license"]
Copyright (c) 2011 Terence Parr
Copyright (c) 2012 Terence Parr
All rights reserved.
Redistribution and use in source and binary forms, with or without
@ -28,23 +28,38 @@
*/
package org.antlr.v4.runtime;
import org.antlr.v4.runtime.misc.Interval;
import org.antlr.v4.runtime.misc.Nullable;
import java.util.*;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
/** Useful for dumping out the input stream after doing some
* augmentation or other manipulations.
/** Useful for rewriting out a buffered input token stream after doing some
* augmentation or other manipulations on it.
*
* You can insert stuff, replace, and delete chunks. Note that the
* operations are done lazily--only if you convert the buffer to a
* String. This is very efficient because you are not moving data around
* all the time. As the buffer of tokens is converted to strings, the
* toString() method(s) check to see if there is an operation at the
* current index. If so, the operation is done and then normal String
* String with getText(). This is very efficient because you are not moving
* data around all the time. As the buffer of tokens is converted to strings,
* the getText() method(s) scan the input token stream and check
* to see if there is an operation at the current index.
* If so, the operation is done and then normal String
* rendering continues on the buffer. This is like having multiple Turing
* machine instruction streams (programs) operating on a single input tape. :)
*
* Since the operations are done lazily at toString-time, operations do not
* This rewriter makes no modifications to the token stream. It does not
* ask the stream to fill itself up nor does it advance the input cursor.
* The token stream index() will return the same value before and after
* any getText() call.
*
* The rewriter only works on tokens that you have in the buffer and
* ignores the current input cursor. If you are buffering tokens on-demand,
* calling getText() halfway through the input will only do rewrites
* for those tokens in the first half of the file.
*
* Since the operations are done lazily at getText-time, operations do not
* screw up the token index values. That is, an insert operation at token
* index i does not change the index values for tokens i+1..n-1.
*
@ -56,19 +71,18 @@ import java.util.*;
*
* CharStream input = new ANTLRFileStream("input");
* TLexer lex = new TLexer(input);
* TokenRewriteStream tokens = new TokenRewriteStream(lex);
* CommonTokenStream tokens = new CommonTokenStream(lex);
* T parser = new T(tokens);
* TokenStreamRewriter rewriter = new TokenStreamRewriter(tokens);
* parser.startRule();
*
* Then in the rules, you can execute
* Then in the rules, you can execute (assuming rewriter is visible):
* Token t,u;
* ...
* input.insertAfter(t, "text to put after t");}
* input.insertAfter(u, "text after u");}
* rewriter.insertAfter(t, "text to put after t");
* rewriter.insertAfter(u, "text after u");
* System.out.println(tokens.toString());
*
* Actually, you have to cast the 'input' to a TokenRewriteStream. :(
*
* You can also have multiple "instruction streams" and get multiple
* rewrites from a single pass over the input. Just name the instruction
* streams and use that name again when printing the buffer. This could be
@ -83,7 +97,7 @@ import java.util.*;
* If you don't use named rewrite streams, a "default" stream is used as
* the first example shows.
*/
public class TokenRewriteStream extends CommonTokenStream {
public class TokenStreamRewriter {
public static final String DEFAULT_PROGRAM_NAME = "default";
public static final int PROGRAM_INIT_SIZE = 100;
public static final int MIN_TOKEN_INDEX = 0;
@ -164,29 +178,28 @@ public class TokenRewriteStream extends CommonTokenStream {
}
}
/** Our source stream */
protected final TokenStream tokens;
/** You may have multiple, named streams of rewrite operations.
* I'm calling these things "programs."
* Maps String (name) -> rewrite (List)
*/
protected Map<String, List<RewriteOperation>> programs = null;
protected final Map<String, List<RewriteOperation>> programs;
/** Map String (program name) -> Integer index */
protected Map<String, Integer> lastRewriteTokenIndexes = null;
protected final Map<String, Integer> lastRewriteTokenIndexes;
protected void init() {
public TokenStreamRewriter(TokenStream tokens) {
this.tokens = tokens;
programs = new HashMap<String, List<RewriteOperation>>();
programs.put(DEFAULT_PROGRAM_NAME, new ArrayList<RewriteOperation>(PROGRAM_INIT_SIZE));
programs.put(DEFAULT_PROGRAM_NAME,
new ArrayList<RewriteOperation>(PROGRAM_INIT_SIZE));
lastRewriteTokenIndexes = new HashMap<String, Integer>();
}
public TokenRewriteStream(TokenSource tokenSource) {
super(tokenSource);
init();
}
public TokenRewriteStream(TokenSource tokenSource, int channel) {
super(tokenSource, channel);
init();
public final TokenStream getTokenStream() {
return tokens;
}
public void rollback(int instructionIndex) {
@ -336,44 +349,37 @@ public class TokenRewriteStream extends CommonTokenStream {
return is;
}
public String toOriginalString() {
fill();
return toOriginalString(MIN_TOKEN_INDEX, size()-1);
/** Return the text from the original tokens altered per the
* instructions given to this rewriter.
*/
public String getText() {
return getText(DEFAULT_PROGRAM_NAME, Interval.of(0,tokens.size()-1));
}
public String toOriginalString(int start, int end) {
StringBuilder buf = new StringBuilder();
for (int i=start; i>=MIN_TOKEN_INDEX && i<=end && i<tokens.size(); i++) {
if ( get(i).getType()!=Token.EOF ) buf.append(get(i).getText());
}
return buf.toString();
/** Return the text associated with the tokens in the interval from the
* original token stream but with the alterations given to this rewriter.
* The interval refers to the indexes in the original token stream.
* We do not alter the token stream in any way, so the indexes
* and intervals are still consistent. Includes any operations done
* to the first and last token in the interval. So, if you did an
* insertBefore on the first token, you would get that insertion.
* The same is true if you do an insertAfter on the stop token.
*/
public String getText(Interval interval) {
return getText(DEFAULT_PROGRAM_NAME, interval);
}
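// Edge-rule sketch for the comment above (illustrative; token indexes 0..2
// hold "a", "b", "c"):
//
//   rewriter.insertBefore(0, "<");
//   rewriter.insertAfter(2, ">");
//   String out = rewriter.getText(Interval.of(0, 2)); // "<abc>": both edge inserts included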
@Override
public String toString() {
fill();
return toString(MIN_TOKEN_INDEX, size()-1);
}
public String toString(String programName) {
fill();
return toString(programName, MIN_TOKEN_INDEX, size()-1);
}
@Override
public String toString(int start, int end) {
return toString(DEFAULT_PROGRAM_NAME, start, end);
}
public String toString(String programName, int start, int end) {
public String getText(String programName, Interval interval) {
List<RewriteOperation> rewrites = programs.get(programName);
int start = interval.a;
int stop = interval.b;
// ensure start/end are in range
if ( end>tokens.size()-1 ) end = tokens.size()-1;
if ( stop>tokens.size()-1 ) stop = tokens.size()-1;
if ( start<0 ) start = 0;
if ( rewrites==null || rewrites.isEmpty() ) {
return toOriginalString(start,end); // no instructions to execute
return tokens.getText(interval); // no instructions to execute
}
StringBuilder buf = new StringBuilder();
@ -382,7 +388,7 @@ public class TokenRewriteStream extends CommonTokenStream {
// Walk buffer, executing instructions and emitting tokens
int i = start;
while ( i <= end && i < tokens.size() ) {
while ( i <= stop && i < tokens.size() ) {
RewriteOperation op = indexToOp.get(i);
indexToOp.remove(i); // remove so any left have index size-1
Token t = tokens.get(i);
@ -399,12 +405,10 @@ public class TokenRewriteStream extends CommonTokenStream {
// include stuff after end if it's last index in buffer
// So, if they did an insertAfter(lastValidIndex, "foo"), include
// foo if end==lastValidIndex.
if ( end==tokens.size()-1 ) {
if ( stop==tokens.size()-1 ) {
// Scan any remaining operations after last token
// should be included (they will be inserts).
Iterator<RewriteOperation> it = indexToOp.values().iterator();
while (it.hasNext()) {
RewriteOperation op = it.next();
for (RewriteOperation op : indexToOp.values()) {
if ( op.index >= tokens.size()-1 ) buf.append(op.text);
}
}
@ -565,10 +569,6 @@ public class TokenRewriteStream extends CommonTokenStream {
return x+y;
}
protected <T extends RewriteOperation> List<? extends T> getKindOfOps(List<? extends RewriteOperation> rewrites, Class<T> kind) {
return getKindOfOps(rewrites, kind, rewrites.size());
}
/** Get all operations before an index of a particular kind */
protected <T extends RewriteOperation> List<? extends T> getKindOfOps(List<? extends RewriteOperation> rewrites, Class<T> kind, int before) {
List<T> ops = new ArrayList<T>();
@ -576,21 +576,10 @@ public class TokenRewriteStream extends CommonTokenStream {
RewriteOperation op = rewrites.get(i);
if ( op==null ) continue; // ignore deleted
if ( kind.isInstance(op) ) {
ops.add((T)op);
ops.add(kind.cast(op));
}
}
return ops;
}
public String toDebugString() {
return toDebugString(MIN_TOKEN_INDEX, size()-1);
}
public String toDebugString(int start, int end) {
StringBuilder buf = new StringBuilder();
for (int i=start; i>=MIN_TOKEN_INDEX && i<=end && i<tokens.size(); i++) {
buf.append(get(i));
}
return buf.toString();
}
}
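// Named-program sketch, per the "instruction streams" note in the class comment.
// Assumes the program-name overloads carried over from TokenRewriteStream;
// program names are illustrative.
//
//   TokenStreamRewriter rewriter = new TokenStreamRewriter(tokens);
//   rewriter.insertAfter("pass1", 0, "A");
//   rewriter.insertAfter("pass2", 0, "B");
//   String v1 = rewriter.getText("pass1", Interval.of(0, tokens.size()-1)); // sees only "A"
//   String v2 = rewriter.getText("pass2", Interval.of(0, tokens.size()-1)); // sees only "B"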

View File

@ -29,100 +29,170 @@
package org.antlr.v4.runtime;
import org.antlr.v4.runtime.misc.Interval;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
/** Do not buffer up the entire char stream. It does keep a small buffer
* for efficiency and also buffers while a mark exists (set by the
* lookahead prediction in parser). "Unbuffered" here refers to the fact
* that it doesn't buffer all data, not that it loads characters on demand.
*/
public class UnbufferedCharStream implements CharStream {
/** A buffer of the data being scanned */
/**
* A moving window buffer of the data being scanned. While there's a marker,
* we keep adding to buffer. Otherwise, {@link #consume consume()} resets so
* we start filling at index 0 again.
*/
protected char[] data;
/** How many characters are actually in the buffer */
/**
* The number of characters currently in {@link #data data}.
* <p/>
* This is not the buffer capacity; that's {@code data.length}.
*/
protected int n;
/** 0..n-1 index into string of next char */
/**
* 0..n-1 index into {@link #data data} of next character.
* <p/>
* The {@code LA(1)} character is {@code data[p]}. If {@code p == n}, we are
* out of buffered characters.
*/
protected int p=0;
protected int earliestMarker = -1;
/**
* Count up with {@link #mark mark()} and down with
* {@link #release release()}. When we {@code release()} the last mark,
* {@code numMarkers} reaches 0 and we reset the buffer. Copy
* {@code data[p]..data[n-1]} to {@code data[0]..data[(n-1)-p]}.
*/
protected int numMarkers = 0;
/** Absolute char index. It's the index of the char about to be
* read via LA(1). Goes from 0 to numchar-1.
/**
* This is the {@code LA(-1)} character for the current position.
*/
protected int lastChar = -1;
/**
* When {@code numMarkers > 0}, this is the {@code LA(-1)} character for the
* first character in {@link #data data}. Otherwise, this is unspecified.
*/
protected int lastCharBufferStart;
/**
* Absolute character index. It's the index of the character about to be
* read via {@code LA(1)}. Goes from 0 to the number of characters in the
* entire stream, although the stream size is unknown before the end is
* reached.
*/
protected int currentCharIndex = 0;
/** Buf is window into stream. This is absolute index of data[0] */
protected int bufferStartIndex = 0;
protected Reader input;
/** What is name or source of this char stream? */
/** The name or source of this char stream. */
public String name;
public UnbufferedCharStream(InputStream input) {
this(input, 256);
}
/** Useful for subclasses that pull char from other than this.input. */
public UnbufferedCharStream() {
this(256);
}
public UnbufferedCharStream(Reader input) {
this(input, 256);
}
public UnbufferedCharStream(InputStream input, int bufferSize) {
this.input = new InputStreamReader(input);
data = new char[bufferSize];
}
public UnbufferedCharStream(Reader input, int bufferSize) {
this.input = input;
data = new char[bufferSize];
}
public void reset() {
p = 0;
earliestMarker = -1;
currentCharIndex = 0;
bufferStartIndex = 0;
/** Useful for subclasses that pull char from other than this.input. */
public UnbufferedCharStream(int bufferSize) {
n = 0;
data = new char[bufferSize];
}
public UnbufferedCharStream(InputStream input) {
this(input, 256);
}
public UnbufferedCharStream(Reader input) {
this(input, 256);
}
public UnbufferedCharStream(InputStream input, int bufferSize) {
this(bufferSize);
this.input = new InputStreamReader(input);
fill(1); // prime
}
public UnbufferedCharStream(Reader input, int bufferSize) {
this(bufferSize);
this.input = input;
fill(1); // prime
}
@Override
public void consume() {
p++;
currentCharIndex++;
// have we hit end of buffer when no markers?
if ( p==n && earliestMarker < 0 ) {
// if so, it's an opportunity to start filling at index 0 again
// System.out.println("p=="+n+", no marker; reset buf start index="+currentCharIndex);
p = 0;
n = 0;
bufferStartIndex = currentCharIndex;
}
}
if (LA(1) == CharStream.EOF) {
throw new IllegalStateException("cannot consume EOF");
}
/** Make sure we have 'need' elements from current position p. Last valid
* p index is data.size()-1. p+need-1 is the data index 'need' elements
* ahead. If we need 1 element, (p+1-1)==p must be < data.size().
// buf always has at least data[p==0] in this method due to ctor
lastChar = data[p]; // track last char for LA(-1)
if (p == n-1 && numMarkers==0) {
n = 0;
p = -1; // p++ will leave this at 0
lastCharBufferStart = lastChar;
}
p++;
currentCharIndex++;
sync(1);
}
/**
* Make sure we have 'need' elements from current position {@link #p p}.
* Last valid {@code p} index is {@code data.length-1}. {@code p+need-1} is
* the char index 'need' elements ahead. If we need 1 element,
* {@code (p+1-1)==p} must be less than {@code data.length}.
*/
protected void sync(int want) {
int need = (p+want-1) - n + 1; // how many more elements we need?
if ( need > 0 ) fill(need); // out of elements?
}
/** add n elements to buffer */
public void fill(int n) {
for (int i=1; i<=n; i++) {
try {
int c = input.read();
add(c);
}
catch (IOException ioe) {
throw new RuntimeException(ioe);
}
if ( need > 0 ) {
fill(need);
}
}
protected void add(int c) {
if ( n>=data.length ) {
/**
* Add {@code n} characters to the buffer. Returns the number of characters
* actually added to the buffer. If the return value is less than {@code n},
* then EOF was reached before {@code n} characters could be added.
*/
protected int fill(int n) {
for (int i=0; i<n; i++) {
if (this.n > 0 && data[this.n - 1] == CharStream.EOF) {
return i;
}
try {
int c = nextChar();
add(c);
}
catch (IOException ioe) {
throw new RuntimeException(ioe);
}
}
return n;
}
/**
* Override to provide different source of characters than
* {@link #input input}.
*/
protected int nextChar() throws IOException {
return input.read();
}
protected void add(int c) {
if ( n>=data.length ) {
char[] newdata = new char[data.length*2]; // resize
System.arraycopy(data, 0, newdata, 0, data.length);
data = newdata;
@ -132,56 +202,92 @@ public class UnbufferedCharStream implements CharStream {
@Override
public int LA(int i) {
if ( i==-1 ) return lastChar; // special case
sync(i);
int index = p + i - 1;
if ( index < 0 ) throw new IndexOutOfBoundsException();
if ( index > n ) return CharStream.EOF;
if ( index > n ) return IntStream.EOF;
int c = data[index];
if ( c==(char)CharStream.EOF ) return CharStream.EOF;
if ( c==(char)IntStream.EOF ) return IntStream.EOF;
return c;
}
/** Return a marker that we can release later. Marker happens to be
* index into buffer (not index()).
*/
/**
* Return a marker that we can release later.
* <p/>
* The specific marker value used for this class allows for some level of
* protection against misuse where {@code seek()} is called on a mark or
* {@code release()} is called in the wrong order.
*/
@Override
public int mark() {
int m = p;
if ( p < earliestMarker) {
// they must have done seek to before min marker
throw new IllegalArgumentException("can't set marker earlier than previous existing marker: "+p+"<"+ earliestMarker);
}
if ( earliestMarker < 0 ) earliestMarker = m; // set first marker
return m;
if (numMarkers == 0) {
lastCharBufferStart = lastChar;
}
int mark = -numMarkers - 1;
numMarkers++;
return mark;
}
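// Bookkeeping sketch for the scheme above: the first mark() returns -1, a nested
// mark() returns -2, and release() must be passed the most recent value.
//
//   int outer = stream.mark();  // -1
//   int inner = stream.mark();  // -2
//   stream.release(inner);      // ok: matches expected -numMarkers == -2
//   stream.release(outer);      // ok: releasing the last mark may compact the buffer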
/** Decrement number of markers, resetting buffer if we hit 0.
* @param marker
*/
@Override
public void release(int marker) {
// release is noop unless we remove earliest. then we don't need to
// keep anything in buffer. We only care about earliest. Releasing
// marker other than earliest does nothing as we can just keep in
// buffer.
if ( marker < earliestMarker || marker >= n ) {
throw new IllegalArgumentException("invalid marker: "+
marker+" not in "+0+".."+n);
}
if ( marker == earliestMarker) earliestMarker = -1;
int expectedMark = -numMarkers;
if ( marker!=expectedMark ) {
throw new IllegalStateException("release() called with an invalid marker.");
}
numMarkers--;
if ( numMarkers==0 && p > 0 ) { // release buffer when we can, but don't do unnecessary work
// Copy data[p]..data[n-1] to data[0]..data[(n-1)-p], reset ptrs
// p is last valid char; move nothing if p==n as we have no valid char
System.arraycopy(data, p, data, 0, n - p); // shift n-p char from p to 0
n = n - p;
p = 0;
lastCharBufferStart = lastChar;
}
}
@Override
public int index() {
return p + bufferStartIndex;
return currentCharIndex;
}
/** Seek to absolute character index, which might not be in the current
* sliding window. Move {@code p} to {@code index-bufferStartIndex}.
*/
@Override
public void seek(int index) {
if (index == currentCharIndex) {
return;
}
if (index > currentCharIndex) {
sync(index - currentCharIndex);
index = Math.min(index, getBufferStartIndex() + n - 1);
}
// index == to bufferStartIndex should set p to 0
int i = index - bufferStartIndex;
if ( i < 0 || i >= n ) {
int i = index - getBufferStartIndex();
if ( i < 0 ) {
throw new IllegalArgumentException("cannot seek to negative index " + index);
}
else if (i >= n) {
throw new UnsupportedOperationException("seek to index outside buffer: "+
index+" not in "+bufferStartIndex+".."+(bufferStartIndex+n));
index+" not in "+getBufferStartIndex()+".."+(getBufferStartIndex()+n));
}
p = i;
p = i;
currentCharIndex = index;
if (p == 0) {
lastChar = lastCharBufferStart;
}
else {
lastChar = data[p-1];
}
}
@Override
@ -191,11 +297,32 @@ public class UnbufferedCharStream implements CharStream {
@Override
public String getSourceName() {
return name;
}
return name;
}
@Override
public String substring(int start, int stop) {
return null; // map to buffer indexes
}
@Override
public String getText(Interval interval) {
if (interval.a < 0 || interval.b < interval.a - 1) {
throw new IllegalArgumentException("invalid interval");
}
int bufferStartIndex = getBufferStartIndex();
if (n > 0 && data[n - 1] == Character.MAX_VALUE) {
if (interval.a + interval.length() > bufferStartIndex + n) {
throw new IllegalArgumentException("the interval extends past the end of the stream");
}
}
if (interval.a < bufferStartIndex || interval.b >= bufferStartIndex + n) {
throw new UnsupportedOperationException("interval "+interval+" outside buffer: "+
bufferStartIndex+".."+(bufferStartIndex+n));
}
// convert from absolute to local index
int i = interval.a - bufferStartIndex;
return new String(data, i, interval.length());
}
protected final int getBufferStartIndex() {
return currentCharIndex - p;
}
}
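// Usage sketch (illustrative): lex a large file without holding it all in memory.
// "MyLexer" is a hypothetical generated lexer; the copy-text token factory mode
// is assumed so token text survives after the window slides past it.
//
//   CharStream input = new UnbufferedCharStream(new java.io.FileReader("big-input.txt"));
//   MyLexer lexer = new MyLexer(input);
//   lexer.setTokenFactory(new CommonTokenFactory(true)); // copy text out of the moving window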

View File

@ -1,100 +1,336 @@
/*
[The "BSD license"]
Copyright (c) 2011 Terence Parr
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. The name of the author may not be used to endorse or promote products
derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
* [The "BSD license"]
* Copyright (c) 2012 Terence Parr
* Copyright (c) 2012 Sam Harwell
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* 3. The name of the author may not be used to endorse or promote products
* derived from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
* IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
* NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
* THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
package org.antlr.v4.runtime;
import org.antlr.v4.runtime.misc.LookaheadStream;
import org.antlr.v4.runtime.misc.Interval;
import org.antlr.v4.runtime.misc.NotNull;
/** A token stream that pulls tokens from the source on-demand and
* without tracking a complete buffer of the tokens. This stream buffers
* the minimum number of tokens possible.
*
* You can't use this stream if you pass whitespace or other off-channel
* tokens to the parser. The stream can't ignore off-channel tokens.
*
* You can only look backwards 1 token: LT(-1).
*
* Use this when you need to read from a socket or other infinite stream.
*
* @see BufferedTokenStream
* @see CommonTokenStream
*/
public class UnbufferedTokenStream<T extends Token>
extends LookaheadStream<T>
implements TokenStream
{
public class UnbufferedTokenStream<T extends Token> implements TokenStream {
protected TokenSource tokenSource;
protected int tokenIndex = 0; // simple counter to set token index in tokens
/** Skip tokens on any channel but this one; this is how we skip whitespace... */
protected int channel = Token.DEFAULT_CHANNEL;
/**
* A moving window buffer of the data being scanned. While there's a marker,
* we keep adding to buffer. Otherwise, {@link #consume consume()} resets so
* we start filling at index 0 again.
*/
protected Token[] tokens;
/**
* The number of tokens currently in {@link #tokens tokens}.
* <p/>
* This is not the buffer capacity, that's {@code tokens.length}.
*/
protected int n;
/**
* 0..n-1 index into {@link #tokens tokens} of next token.
* <p/>
* The {@code LT(1)} token is {@code tokens[p]}. If {@code p == n}, we are
* out of buffered tokens.
*/
protected int p=0;
/**
* Count up with {@link #mark mark()} and down with
* {@link #release release()}. When we {@code release()} the last mark,
* {@code numMarkers} reaches 0 and we reset the buffer. Copy
* {@code tokens[p]..tokens[n-1]} to {@code tokens[0]..tokens[(n-1)-p]}.
*/
protected int numMarkers = 0;
/**
* This is the {@code LT(-1)} token for the current position.
*/
protected Token lastToken;
/**
* When {@code numMarkers > 0}, this is the {@code LT(-1)} token for the
* first token in {@link #tokens}. Otherwise, this is {@code null}.
*/
protected Token lastTokenBufferStart;
/**
* Absolute token index. It's the index of the token about to be read via
* {@code LT(1)}. Goes from 0 to the number of tokens in the entire stream,
* although the stream size is unknown before the end is reached.
* <p/>
* This value is used to set the token indexes if the stream provides tokens
* that implement {@link WritableToken}.
*/
protected int currentTokenIndex = 0;
public UnbufferedTokenStream(TokenSource tokenSource) {
this(tokenSource, 256);
}
public UnbufferedTokenStream(TokenSource tokenSource, int bufferSize) {
this.tokenSource = tokenSource;
tokens = new Token[bufferSize];
n = 0;
fill(1); // prime the pump
}
@Override
public T nextElement() {
T t = (T)tokenSource.nextToken();
if ( t instanceof WritableToken ) {
((WritableToken)t).setTokenIndex(tokenIndex);
}
tokenIndex++;
return t;
@Override
public Token get(int i) { // get absolute index
int bufferStartIndex = getBufferStartIndex();
if (i < bufferStartIndex || i >= bufferStartIndex + n) {
throw new IndexOutOfBoundsException("get("+i+") outside buffer: "+
bufferStartIndex+".."+(bufferStartIndex+n));
}
return tokens[i - bufferStartIndex];
}
@Override
public boolean isEOF(Token o) {
return false;
}
@Override
public Token LT(int i) {
if ( i==-1 ) {
return lastToken;
}
@Override
public TokenSource getTokenSource() { return tokenSource; }
sync(i);
int index = p + i - 1;
if ( index < 0 ) {
throw new IndexOutOfBoundsException("LT("+i+") gives negative index");
}
@Override
public String toString(int start, int stop) {
throw new UnsupportedOperationException("unbuffered stream can't give strings");
}
if ( index >= n ) {
assert n > 0 && tokens[n-1].getType() == Token.EOF;
return tokens[n-1];
}
@Override
public String toString(Token start, Token stop) {
throw new UnsupportedOperationException("unbuffered stream can't give strings");
}
return tokens[index];
}
@Override
public int LA(int i) { return LT(i).getType(); }
@Override
public int LA(int i) {
return LT(i).getType();
}
@Override
public T get(int i) {
throw new UnsupportedOperationException("Absolute token indexes are meaningless in an unbuffered stream");
}
@Override
public TokenSource getTokenSource() {
return tokenSource;
}
@Override
public String getSourceName() { return tokenSource.getSourceName(); }
@NotNull
@Override
public String getText() {
return "";
}
@NotNull
@Override
public String getText(RuleContext ctx) {
return getText(ctx.getSourceInterval());
}
@NotNull
@Override
public String getText(Token start, Token stop) {
return getText(Interval.of(start.getTokenIndex(), stop.getTokenIndex()));
}
@Override
public void consume() {
if (LA(1) == Token.EOF) {
throw new IllegalStateException("cannot consume EOF");
}
// buf always has at least one token, tokens[p], in this method because the ctor primes it with fill(1)
lastToken = tokens[p]; // track last token for LT(-1)
// if we're at last token and no markers, opportunity to flush buffer
if ( p == n-1 && numMarkers==0 ) {
n = 0;
p = -1; // p++ will leave this at 0
lastTokenBufferStart = lastToken;
}
p++;
currentTokenIndex++;
sync(1);
}
/** Make sure we have {@code want} elements from current position {@link #p p}. Last valid
* {@code p} index is {@code tokens.length-1}. {@code p+want-1} is the tokens index {@code want} elements
* ahead. If we want 1 element, {@code (p+1-1)==p} must be less than {@code tokens.length}.
*/
protected void sync(int want) {
int need = (p+want-1) - n + 1; // how many more elements we need?
if ( need > 0 ) {
fill(need);
}
}
/**
* Add {@code n} elements to the buffer. Returns the number of tokens
* actually added to the buffer. If the return value is less than {@code n},
* then EOF was reached before {@code n} tokens could be added.
*/
protected int fill(int n) {
for (int i=0; i<n; i++) {
if (this.n > 0 && tokens[this.n-1].getType() == Token.EOF) {
return i;
}
Token t = tokenSource.nextToken();
add(t);
}
return n;
}
protected void add(@NotNull Token t) {
if ( n>=tokens.length ) {
Token[] newtokens = new Token[tokens.length*2]; // resize
System.arraycopy(tokens, 0, newtokens, 0, tokens.length);
tokens = newtokens;
}
if (t instanceof WritableToken) {
((WritableToken)t).setTokenIndex(getBufferStartIndex() + n);
}
tokens[n++] = t;
}
/**
* Return a marker that we can release later.
* <p/>
* The specific marker value used for this class allows for some level of
* protection against misuse where {@code seek()} is called on a mark or
* {@code release()} is called in the wrong order.
*/
@Override
public int mark() {
if (numMarkers == 0) {
lastTokenBufferStart = lastToken;
}
int mark = -numMarkers - 1;
numMarkers++;
return mark;
}
@Override
public void release(int marker) {
int expectedMark = -numMarkers;
if ( marker!=expectedMark ) {
throw new IllegalStateException("release() called with an invalid marker.");
}
numMarkers--;
if ( numMarkers==0 ) { // can we release buffer?
if (p > 0) {
// Copy tokens[p]..tokens[n-1] to tokens[0]..tokens[(n-1)-p], reset ptrs
// p is last valid token; move nothing if p==n as we have no valid token
System.arraycopy(tokens, p, tokens, 0, n - p); // shift n-p tokens from p to 0
n = n - p;
p = 0;
}
lastTokenBufferStart = lastToken;
}
}
@Override
public int index() {
return currentTokenIndex;
}
@Override
public void seek(int index) { // seek to absolute index
if (index == currentTokenIndex) {
return;
}
if (index > currentTokenIndex) {
sync(index - currentTokenIndex);
index = Math.min(index, getBufferStartIndex() + n - 1);
}
int bufferStartIndex = getBufferStartIndex();
int i = index - bufferStartIndex;
if ( i < 0 ) {
throw new IllegalArgumentException("cannot seek to negative index " + index);
}
else if (i >= n) {
throw new UnsupportedOperationException("seek to index outside buffer: "+
index+" not in "+ bufferStartIndex +".."+(bufferStartIndex +n));
}
p = i;
currentTokenIndex = index;
if (p == 0) {
lastToken = lastTokenBufferStart;
}
else {
lastToken = tokens[p-1];
}
}
@Override
public int size() {
throw new UnsupportedOperationException("Unbuffered stream cannot know its size");
}
@Override
public String getSourceName() {
return tokenSource.getSourceName();
}
@NotNull
@Override
public String getText(Interval interval) {
int bufferStartIndex = getBufferStartIndex();
int bufferStopIndex = bufferStartIndex + tokens.length - 1;
int start = interval.a;
int stop = interval.b;
if (start < bufferStartIndex || stop > bufferStopIndex) {
throw new UnsupportedOperationException("interval "+interval+" not in token buffer window: "+
bufferStartIndex+".."+bufferStopIndex);
}
int a = start - bufferStartIndex;
int b = stop - bufferStartIndex;
StringBuilder buf = new StringBuilder();
for (int i = a; i <= b; i++) {
Token t = tokens[i];
buf.append(t.getText());
}
return buf.toString();
}
protected final int getBufferStartIndex() {
return currentTokenIndex - p;
}
}
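
A hedged wiring sketch for the class above. MyLexer, MyParser, and startRule are hypothetical generated names; any grammar whose lexer uses "-> skip" (rather than channels) for whitespace will do, since this stream cannot filter off-channel tokens.

import java.net.Socket;

import org.antlr.v4.runtime.CommonToken;
import org.antlr.v4.runtime.CommonTokenFactory;
import org.antlr.v4.runtime.UnbufferedCharStream;
import org.antlr.v4.runtime.UnbufferedTokenStream;

public class SocketParseSketch {
    public static void main(String[] args) throws Exception {
        Socket socket = new Socket("localhost", 9000);     // illustrative infinite source
        UnbufferedCharStream chars =
                new UnbufferedCharStream(socket.getInputStream());
        MyLexer lexer = new MyLexer(chars);                // hypothetical generated lexer
        // copy text out of the sliding char buffer so tokens keep their text
        lexer.setTokenFactory(new CommonTokenFactory(true));
        UnbufferedTokenStream<CommonToken> tokens =
                new UnbufferedTokenStream<CommonToken>(lexer);
        MyParser parser = new MyParser(tokens);            // hypothetical generated parser
        parser.startRule();                                // hypothetical entry rule
    }
}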

View File

@ -71,6 +71,7 @@ public class ATN {
// runtime for lexer only
public int[] ruleToTokenType;
public int[] ruleToActionIndex;
@NotNull
public final List<TokensStartState> modeToStartState = new ArrayList<TokensStartState>();
@ -119,7 +120,7 @@ public class ATN {
}
public DecisionState getDecisionState(int decision) {
if ( decisionToState.size()>0 ) {
if ( !decisionToState.isEmpty() ) {
return decisionToState.get(decision);
}
return null;

View File

@ -1,46 +1,45 @@
/*
[The "BSD license"]
Copyright (c) 2011 Terence Parr
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. The name of the author may not be used to endorse or promote products
derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
* [The "BSD license"]
* Copyright (c) 2012 Terence Parr
* Copyright (c) 2012 Sam Harwell
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* 3. The name of the author may not be used to endorse or promote products
* derived from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
* IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
* NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
* THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
package org.antlr.v4.runtime.atn;
import org.antlr.v4.runtime.Recognizer;
import org.antlr.v4.runtime.RuleContext;
import org.antlr.v4.runtime.misc.NotNull;
import org.antlr.v4.runtime.misc.Nullable;
/** An ATN state, predicted alt, and syntactic/semantic context.
* The syntactic context is a pointer into the rule invocation
/** A tuple: (ATN state, predicted alt, syntactic, semantic context).
* The syntactic context is a graph-structured stack node whose
* path(s) to the root is the rule invocation(s)
* chain used to arrive at the state. The semantic context is
* the unordered set semantic predicates encountered before reaching
* the tree of semantic predicates encountered before reaching
* an ATN state.
*
* (state, alt, rule context, semantic context)
*/
public class ATNConfig {
/** The ATN state associated with this configuration */
@ -55,7 +54,7 @@ public class ATNConfig {
* execution of the ATN simulator.
*/
@Nullable
public RuleContext context;
public PredictionContext context;
/**
* We cannot execute predicates dependent upon local context unless
@ -70,22 +69,27 @@ public class ATNConfig {
*/
public int reachesIntoOuterContext;
/** Capture lexer action we traverse */
public int lexerActionIndex = -1; // TODO: move to subclass
@NotNull
public final SemanticContext semanticContext;
public ATNConfig(ATNConfig old) { // dup
this.state = old.state;
this.alt = old.alt;
this.context = old.context;
this.semanticContext = old.semanticContext;
this.reachesIntoOuterContext = old.reachesIntoOuterContext;
}
public ATNConfig(@NotNull ATNState state,
int alt,
@Nullable RuleContext context)
@Nullable PredictionContext context)
{
this(state, alt, context, SemanticContext.NONE);
}
public ATNConfig(@NotNull ATNState state,
int alt,
@Nullable RuleContext context,
@Nullable PredictionContext context,
@NotNull SemanticContext semanticContext)
{
this.state = state;
@ -98,23 +102,33 @@ public class ATNConfig {
this(c, state, c.context, c.semanticContext);
}
public ATNConfig(@NotNull ATNConfig c, @NotNull ATNState state, @NotNull SemanticContext semanticContext) {
this(c, state, c.context, semanticContext);
}
public ATNConfig(@NotNull ATNConfig c, @NotNull ATNState state,
@NotNull SemanticContext semanticContext)
{
this(c, state, c.context, semanticContext);
}
public ATNConfig(@NotNull ATNConfig c, @NotNull ATNState state, @Nullable RuleContext context) {
public ATNConfig(@NotNull ATNConfig c,
@NotNull SemanticContext semanticContext)
{
this(c, c.state, c.context, semanticContext);
}
public ATNConfig(@NotNull ATNConfig c, @NotNull ATNState state,
@Nullable PredictionContext context)
{
this(c, state, context, c.semanticContext);
}
public ATNConfig(@NotNull ATNConfig c, @NotNull ATNState state, @Nullable RuleContext context,
public ATNConfig(@NotNull ATNConfig c, @NotNull ATNState state,
@Nullable PredictionContext context,
@NotNull SemanticContext semanticContext)
{
this.state = state;
this.alt = c.alt;
this.context = context;
this.semanticContext = semanticContext;
this.reachesIntoOuterContext = c.reachesIntoOuterContext;
this.semanticContext = semanticContext;
this.lexerActionIndex = c.lexerActionIndex;
}
/** An ATN configuration is equal to another if both have
@ -171,8 +185,9 @@ public class ATNConfig {
buf.append(alt);
}
if ( context!=null ) {
buf.append(",");
buf.append(context.toString(recog));
buf.append(",[");
buf.append(context.toString());
buf.append("]");
}
if ( semanticContext!=null && semanticContext != SemanticContext.NONE ) {
buf.append(",");

View File

@ -1,54 +1,307 @@
/*
[The "BSD license"]
Copyright (c) 2011 Terence Parr
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. The name of the author may not be used to endorse or promote products
derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
* [The "BSD license"]
* Copyright (c) 2012 Terence Parr
* Copyright (c) 2012 Sam Harwell
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* 3. The name of the author may not be used to endorse or promote products
* derived from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
* IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
* NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
* THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
package org.antlr.v4.runtime.atn;
import org.antlr.v4.runtime.misc.IntervalSet;
import org.antlr.v4.runtime.misc.OrderedHashSet;
import org.antlr.v4.runtime.misc.Array2DHashSet;
import org.antlr.v4.runtime.misc.DoubleKeyMap;
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
/** Specialized OrderedHashSet that can track info about the set.
* Might be able to optimize later w/o affecting code that uses this set.
histogram of lexer DFA configset size:
206 30 <- 206 sets with size 30
47 1
17 31
12 2
10 3
7 32
4 4
3 35
2 9
2 6
2 5
2 34
1 7
1 33
1 29
1 12
1 119 <- max size
322 set size for SLL parser java.* in DFA states:
888 1
411 54
365 88
304 56
206 80
182 16
167 86
166 78
158 84
131 2
121 20
120 8
119 112
82 10
73 6
53 174
47 90
45 4
39 12
38 122
37 89
37 62
34 3
34 18
32 81
31 87
28 45
27 144
25 41
24 132
22 91
22 7
21 82
21 28
21 27
17 9
16 29
16 155
15 51
15 118
14 146
14 114
13 5
13 38
12 48
11 64
11 50
11 22
11 134
11 131
10 79
10 76
10 59
10 58
10 55
10 39
10 116
9 74
9 47
9 310
...
javalr, java.* configs with # preds histogram:
4569 0
57 1
27 27
5 76
4 28
3 72
3 38
3 30
2 6
2 32
1 9
1 2
javalr, java.* all atnconfigsets; max size = 322, num sets = 269088
114186 1 <-- optimize
35712 6
28081 78
15252 54
14171 56
13159 12
11810 88
6873 86
6158 80
5169 4
3773 118
2350 16
1002 112
915 28
898 44
734 2
632 62
575 8
566 59
474 20
388 84
343 48
333 55
328 47
311 41
306 38
277 81
263 79
255 66
245 90
245 87
234 50
224 10
220 60
194 64
186 32
184 82
150 18
125 7
121 132
116 30
103 51
95 114
84 36
82 40
78 22
77 89
55 9
53 174
48 152
44 67
44 5
42 115
41 58
38 122
37 134
34 13
34 116
29 45
29 3
29 24
27 144
26 146
25 91
24 113
20 27
...
number with 1-9 elements:
114186 1
35712 6
5169 4
734 2
575 8
125 7
55 9
44 5
29 3
Can cover 60% of sizes with size up to 6
Can cover 44% of sizes with size up to 4
Can cover 42% of sizes with size up to 1
*/
public class ATNConfigSet extends OrderedHashSet<ATNConfig> {
public class ATNConfigSet implements Set<ATNConfig> {
/*
We need this because we don't want the hash map to use
the standard hash code and equals. We need all configurations with the same
(s,i,_,semctx) to be equal. Unfortunately, this key effectively doubles
the number of objects associated with ATNConfigs. The other solution is to
use a hash table that lets us specify the equals/hashcode operation.
*/
public static class ConfigHashSet extends Array2DHashSet<ATNConfig> {
public ConfigHashSet() {
super(16,2);
}
@Override
public int hashCode(ATNConfig o) {
int hashCode = 7;
hashCode = 31 * hashCode + o.state.stateNumber;
hashCode = 31 * hashCode + o.alt;
hashCode = 31 * hashCode + o.semanticContext.hashCode();
return hashCode;
}
@Override
public boolean equals(ATNConfig a, ATNConfig b) {
if ( a==b ) return true;
if ( a==null || b==null ) return false;
if ( hashCode(a) != hashCode(b) ) return false;
return a.state.stateNumber==b.state.stateNumber
&& a.alt==b.alt
&& a.semanticContext.equals(b.semanticContext);
}
}
/** Indicates that the set of configurations is read-only. Do not
* allow any code to manipulate the set; DFA states will point at
* the sets and they must not change. This does not protect the other
* fields; in particular, conflictingAlts is set after
* we've made this readonly.
*/
protected boolean readonly = false;
/** All configs, hashed by (s, i, _, pi), not including context. Wiped out
* when we go readonly as this set becomes a DFA state.
*/
public ConfigHashSet configLookup;
/** Track the elements as they are added to the set; supports get(i) */
public final ArrayList<ATNConfig> configs = new ArrayList<ATNConfig>(7);
// TODO: these fields make me pretty uncomfortable but nice to pack up info together, saves recomputation
// TODO: can we track conflicts as they are added to save scanning configs later?
public int uniqueAlt;
public IntervalSet conflictingAlts;
protected BitSet conflictingAlts;
// Used in parser and lexer. In lexer, it indicates we hit a pred
// while computing a closure operation. Don't make a DFA state from this.
public boolean hasSemanticContext;
public boolean dipsIntoOuterContext;
public ATNConfigSet() { }
/** Indicates that this configuration set is part of a full context
* LL prediction. It will be used to determine how to merge $. With SLL
* it's a wildcard whereas it is not for LL context merge.
*/
public final boolean fullCtx;
public ATNConfigSet(boolean fullCtx) {
configLookup = new ConfigHashSet();
this.fullCtx = fullCtx;
}
public ATNConfigSet() { this(true); }
public ATNConfigSet(ATNConfigSet old) {
this(old.fullCtx);
addAll(old);
this.uniqueAlt = old.uniqueAlt;
this.conflictingAlts = old.conflictingAlts;
@ -56,22 +309,185 @@ public class ATNConfigSet extends OrderedHashSet<ATNConfig> {
this.dipsIntoOuterContext = old.dipsIntoOuterContext;
}
@Override
public boolean add(ATNConfig config) {
return add(config, null);
}
/** Adding a new config means merging contexts with existing configs for
* (s, i, pi, _). We use (s,i,pi) as the key.
*/
public boolean add(
ATNConfig config,
DoubleKeyMap<PredictionContext,PredictionContext,PredictionContext> mergeCache)
{
if ( readonly ) throw new IllegalStateException("This set is readonly");
if ( config.semanticContext!=SemanticContext.NONE ) {
hasSemanticContext = true;
}
ATNConfig existing = configLookup.absorb(config);
if ( existing==config ) { // we added this new one
configs.add(config); // track order here
return true;
}
// a previous (s,i,pi,_), merge with it and save result
boolean rootIsWildcard = !fullCtx;
PredictionContext merged =
PredictionContext.merge(existing.context, config.context, rootIsWildcard, mergeCache);
// no need to check for existing.context, config.context in cache
// since only way to create new graphs is "call rule" and here. We
// cache at both places.
existing.reachesIntoOuterContext =
Math.max(existing.reachesIntoOuterContext, config.reachesIntoOuterContext);
existing.context = merged; // replace context; no need to alt mapping
return true;
}
/** Return a List holding the configs */
public List<ATNConfig> elements() { return configs; }
public Set<ATNState> getStates() {
Set<ATNState> states = new HashSet<ATNState>();
for (ATNConfig c : this.elements) {
for (ATNConfig c : configs) {
states.add(c.state);
}
return states;
}
public List<SemanticContext> getPredicates() {
List<SemanticContext> preds = new ArrayList<SemanticContext>();
for (ATNConfig c : configs) {
if ( c.semanticContext!=SemanticContext.NONE ) {
preds.add(c.semanticContext);
}
}
return preds;
}
public ATNConfig get(int i) { return configs.get(i); }
// TODO: very expensive, used in lexer to kill after wildcard config
public void remove(int i) {
if ( readonly ) throw new IllegalStateException("This set is readonly");
ATNConfig c = elements().get(i);
configLookup.remove(c);
configs.remove(c); // slow linear search. ugh but not worse than it was
}
public void optimizeConfigs(ATNSimulator interpreter) {
if ( readonly ) throw new IllegalStateException("This set is readonly");
if ( configLookup.isEmpty() ) return;
for (ATNConfig config : configs) {
// int before = PredictionContext.getAllContextNodes(config.context).size();
config.context = interpreter.getCachedContext(config.context);
// int after = PredictionContext.getAllContextNodes(config.context).size();
// System.out.println("configs "+before+"->"+after);
}
}
public boolean addAll(Collection<? extends ATNConfig> coll) {
for (ATNConfig c : coll) add(c);
return false;
}
@Override
public boolean equals(Object o) {
// System.out.print("equals " + this + ", " + o+" = ");
ATNConfigSet other = (ATNConfigSet)o;
boolean same = configs!=null &&
configs.equals(other.configs) && // includes stack context
this.fullCtx == other.fullCtx &&
this.uniqueAlt == other.uniqueAlt &&
this.conflictingAlts == other.conflictingAlts &&
this.hasSemanticContext == other.hasSemanticContext &&
this.dipsIntoOuterContext == other.dipsIntoOuterContext;
// System.out.println(same);
return same;
}
@Override
public int hashCode() {
return configs.hashCode();
}
@Override
public int size() {
return configs.size();
}
@Override
public boolean isEmpty() {
return configs.isEmpty();
}
@Override
public boolean contains(Object o) {
if ( o instanceof ATNConfig ) {
return configLookup.contains(o);
}
return false;
}
@Override
public Iterator<ATNConfig> iterator() {
return configs.iterator();
}
@Override
public void clear() {
if ( readonly ) throw new IllegalStateException("This set is readonly");
configs.clear();
configLookup.clear();
}
public void setReadonly(boolean readonly) {
this.readonly = readonly;
configLookup = null; // can't mod, no need for lookup cache
}
@Override
public String toString() {
StringBuilder buf = new StringBuilder();
buf.append(super.toString());
if ( hasSemanticContext ) buf.append(",hasSemanticContext="+hasSemanticContext);
if ( uniqueAlt!=ATN.INVALID_ALT_NUMBER ) buf.append(",uniqueAlt="+uniqueAlt);
if ( conflictingAlts!=null ) buf.append(",conflictingAlts="+conflictingAlts);
buf.append(elements().toString());
if ( hasSemanticContext ) buf.append(",hasSemanticContext=").append(hasSemanticContext);
if ( uniqueAlt!=ATN.INVALID_ALT_NUMBER ) buf.append(",uniqueAlt=").append(uniqueAlt);
if ( conflictingAlts!=null ) buf.append(",conflictingAlts=").append(conflictingAlts);
if ( dipsIntoOuterContext ) buf.append(",dipsIntoOuterContext");
return buf.toString();
}
// satisfy interface
@Override
public Object[] toArray() {
return configLookup.toArray();
}
@Override
public <T> T[] toArray(T[] a) {
return configLookup.toArray(a);
}
@Override
public boolean remove(Object o) {
throw new UnsupportedOperationException();
}
@Override
public boolean containsAll(Collection<?> c) {
throw new UnsupportedOperationException();
}
@Override
public boolean retainAll(Collection<?> c) {
throw new UnsupportedOperationException();
}
@Override
public boolean removeAll(Collection<?> c) {
throw new UnsupportedOperationException();
}
}
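
A deliberately simplified, hypothetical model of the merge-on-add idea above, for illustration only: plain strings stand in for PredictionContext graphs, and a string key stands in for ConfigHashSet's (state, alt, semantic context) hashing. It is not the runtime's implementation.

import java.util.LinkedHashMap;
import java.util.Map;

public class MergeOnAddSketch {
    static final Map<String, String> lookup = new LinkedHashMap<String, String>();

    static void add(int state, int alt, String semCtx, String context) {
        String key = state + "|" + alt + "|" + semCtx;   // the (s,i,pi) key
        String existing = lookup.get(key);
        if (existing == null) {
            lookup.put(key, context);                    // first (s,i,pi): just track it
        } else {
            lookup.put(key, merge(existing, context));   // same key: merge the stacks
        }
    }

    static String merge(String a, String b) {            // placeholder for PredictionContext.merge
        return a.equals(b) ? a : "merge(" + a + "," + b + ")";
    }

    public static void main(String[] args) {
        add(7, 1, "NONE", "[20 $]");
        add(7, 1, "NONE", "[31 $]");  // collides with the first: contexts merged, no duplicate
        System.out.println(lookup);   // {7|1|NONE=merge([20 $],[31 $])}
    }
}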

View File

@ -1,65 +1,113 @@
/*
[The "BSD license"]
Copyright (c) 2011 Terence Parr
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. The name of the author may not be used to endorse or promote products
derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
* [The "BSD license"]
* Copyright (c) 2012 Terence Parr
* Copyright (c) 2012 Sam Harwell
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* 3. The name of the author may not be used to endorse or promote products
* derived from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
* IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
* NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
* THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
package org.antlr.v4.runtime.atn;
import org.antlr.v4.runtime.Token;
import org.antlr.v4.runtime.dfa.DFAState;
import org.antlr.v4.runtime.misc.IntervalSet;
import org.antlr.v4.runtime.misc.NotNull;
import org.antlr.v4.runtime.misc.Pair;
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
public abstract class ATNSimulator {
public static final int SERIALIZED_NON_GREEDY_MASK = 0x8000;
public static final int SERIALIZED_STATE_TYPE_MASK = 0x7FFF;
/** Must distinguish between missing edge and edge we know leads nowhere */
@NotNull
public static final DFAState ERROR;
@NotNull
public final ATN atn;
/** The context cache maps all PredictionContext objects that are equals()
* to a single cached copy. This cache is shared across all contexts
* in all ATNConfigs in all DFA states. We rebuild each ATNConfigSet
* to use only cached nodes/graphs in addDFAState(). We don't want to
* fill this during closure() since there are lots of contexts that
* pop up but are not used ever again. It also greatly slows down closure().
*
* This cache makes a huge difference in memory and a little bit in speed.
* For the Java grammar on java.*, it dropped the memory requirements
* at the end from 25M to 16M. We don't store any of the full context
* graphs in the DFA because they are limited to local context only,
* but apparently there's a lot of repetition there as well. We optimize
* the config contexts before storing the config set in the DFA states
* by literally rebuilding them with cached subgraphs only.
*
* I tried a cache for use during closure operations, that was
* whacked after each adaptivePredict(). It cost a little bit
* more time I think and doesn't save on the overall footprint
* so it's not worth the complexity.
*/
protected final PredictionContextCache sharedContextCache;
static {
ERROR = new DFAState(new ATNConfigSet());
ERROR.stateNumber = Integer.MAX_VALUE;
}
public ATNSimulator(@NotNull ATN atn) {
public ATNSimulator(@NotNull ATN atn,
@NotNull PredictionContextCache sharedContextCache)
{
this.atn = atn;
this.sharedContextCache = sharedContextCache;
}
public abstract void reset();
public PredictionContext getCachedContext(PredictionContext context) {
if ( sharedContextCache==null ) return context;
IdentityHashMap<PredictionContext, PredictionContext> visited =
new IdentityHashMap<PredictionContext, PredictionContext>();
return PredictionContext.getCachedContext(context,
sharedContextCache,
visited);
}
public static ATN deserialize(@NotNull char[] data) {
ATN atn = new ATN();
List<IntervalSet> sets = new ArrayList<IntervalSet>();
int p = 0;
atn.grammarType = toInt(data[p++]);
atn.maxTokenType = toInt(data[p++]);
//
// STATES
//
List<Pair<LoopEndState, Integer>> loopBackStateNumbers = new ArrayList<Pair<LoopEndState, Integer>>();
List<Pair<BlockStartState, Integer>> endStateNumbers = new ArrayList<Pair<BlockStartState, Integer>>();
int nstates = toInt(data[p++]);
for (int i=1; i<=nstates; i++) {
int stype = toInt(data[p++]);
@ -68,13 +116,37 @@ public abstract class ATNSimulator {
atn.addState(null);
continue;
}
boolean nonGreedy = (stype & SERIALIZED_NON_GREEDY_MASK) != 0;
stype &= SERIALIZED_STATE_TYPE_MASK;
ATNState s = stateFactory(stype, i);
if (s instanceof DecisionState) {
((DecisionState)s).nonGreedy = nonGreedy;
}
s.ruleIndex = toInt(data[p++]);
if ( stype == ATNState.LOOP_END ) { // special case
((LoopEndState)s).loopBackStateNumber = toInt(data[p++]);
int loopBackStateNumber = toInt(data[p++]);
loopBackStateNumbers.add(new Pair<LoopEndState, Integer>((LoopEndState)s, loopBackStateNumber));
}
else if (s instanceof BlockStartState) {
int endStateNumber = toInt(data[p++]);
endStateNumbers.add(new Pair<BlockStartState, Integer>((BlockStartState)s, endStateNumber));
}
atn.addState(s);
}
// delay the assignment of loop back and end states until we know all the state instances have been initialized
for (Pair<LoopEndState, Integer> pair : loopBackStateNumbers) {
pair.a.loopBackState = atn.states.get(pair.b);
}
for (Pair<BlockStartState, Integer> pair : endStateNumbers) {
pair.a.endState = (BlockEndState)atn.states.get(pair.b);
}
//
// RULES
//
int nrules = toInt(data[p++]);
if ( atn.grammarType == ATN.LEXER ) {
atn.ruleToTokenType = new int[nrules];
@ -92,11 +164,30 @@ public abstract class ATNSimulator {
atn.ruleToActionIndex[i] = actionIndex;
}
}
atn.ruleToStopState = new RuleStopState[nrules];
for (ATNState state : atn.states) {
if (!(state instanceof RuleStopState)) {
continue;
}
RuleStopState stopState = (RuleStopState)state;
atn.ruleToStopState[state.ruleIndex] = stopState;
atn.ruleToStartState[state.ruleIndex].stopState = stopState;
}
//
// MODES
//
int nmodes = toInt(data[p++]);
for (int i=0; i<nmodes; i++) {
int s = toInt(data[p++]);
atn.modeToStartState.add((TokensStartState)atn.states.get(s));
}
//
// SETS
//
int nsets = toInt(data[p++]);
for (int i=1; i<=nsets; i++) {
int nintervals = toInt(data[p]);
@ -108,6 +199,10 @@ public abstract class ATNSimulator {
p += 2;
}
}
//
// EDGES
//
int nedges = toInt(data[p++]);
for (int i=1; i<=nedges; i++) {
int src = toInt(data[p]);
@ -125,18 +220,122 @@ public abstract class ATNSimulator {
srcState.addTransition(trans);
p += 6;
}
// edges for rule stop states can be derived, so they aren't serialized
for (ATNState state : atn.states) {
for (int i = 0; i < state.getNumberOfTransitions(); i++) {
Transition t = state.transition(i);
if (!(t instanceof RuleTransition)) {
continue;
}
RuleTransition ruleTransition = (RuleTransition)t;
atn.ruleToStopState[ruleTransition.target.ruleIndex].addTransition(new EpsilonTransition(ruleTransition.followState));
}
}
for (ATNState state : atn.states) {
if (state instanceof BlockStartState) {
// we need to know the end state to set its start state
if (((BlockStartState)state).endState == null) {
throw new IllegalStateException();
}
// block end states can only be associated to a single block start state
if (((BlockStartState)state).endState.startState != null) {
throw new IllegalStateException();
}
((BlockStartState)state).endState.startState = (BlockStartState)state;
}
if (state instanceof PlusLoopbackState) {
PlusLoopbackState loopbackState = (PlusLoopbackState)state;
for (int i = 0; i < loopbackState.getNumberOfTransitions(); i++) {
ATNState target = loopbackState.transition(i).target;
if (target instanceof PlusBlockStartState) {
((PlusBlockStartState)target).loopBackState = loopbackState;
}
}
}
else if (state instanceof StarLoopbackState) {
StarLoopbackState loopbackState = (StarLoopbackState)state;
for (int i = 0; i < loopbackState.getNumberOfTransitions(); i++) {
ATNState target = loopbackState.transition(i).target;
if (target instanceof StarLoopEntryState) {
((StarLoopEntryState)target).loopBackState = loopbackState;
}
}
}
}
//
// DECISIONS
//
int ndecisions = toInt(data[p++]);
for (int i=1; i<=ndecisions; i++) {
int s = toInt(data[p++]);
int isGreedy = toInt(data[p++]);
DecisionState decState = (DecisionState)atn.states.get(s);
atn.decisionToState.add(decState);
decState.decision = i-1;
decState.isGreedy = isGreedy==1;
}
verifyATN(atn);
return atn;
}
private static void verifyATN(ATN atn) {
// verify assumptions
for (ATNState state : atn.states) {
if (state == null) {
continue;
}
if (state instanceof PlusBlockStartState) {
if (((PlusBlockStartState)state).loopBackState == null) {
throw new IllegalStateException();
}
}
if (state instanceof StarLoopEntryState) {
if (((StarLoopEntryState)state).loopBackState == null) {
throw new IllegalStateException();
}
}
if (state instanceof LoopEndState) {
if (((LoopEndState)state).loopBackState == null) {
throw new IllegalStateException();
}
}
if (state instanceof RuleStartState) {
if (((RuleStartState)state).stopState == null) {
throw new IllegalStateException();
}
}
if (state instanceof BlockStartState) {
if (((BlockStartState)state).endState == null) {
throw new IllegalStateException();
}
}
if (state instanceof BlockEndState) {
if (((BlockEndState)state).startState == null) {
throw new IllegalStateException();
}
}
if (state instanceof DecisionState) {
DecisionState decisionState = (DecisionState)state;
if (decisionState.getNumberOfTransitions() > 1 && decisionState.decision < 0) {
throw new IllegalStateException();
}
}
}
}
public static int toInt(char c) {
return c==65535 ? -1 : c;
}

View File

@ -31,7 +31,12 @@ package org.antlr.v4.runtime.atn;
import org.antlr.v4.runtime.misc.IntervalSet;
import java.util.*;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class ATNState {
public static final int INITIAL_NUM_TRANSITIONS = 4;
@ -112,12 +117,22 @@ public class ATNState {
return false;
}
public boolean isNonGreedyExitState() {
return false;
}
@Override
public String toString() {
return String.valueOf(stateNumber);
}
public int getNumberOfTransitions() { return transitions.size(); }
public Transition[] getTransitions() {
return transitions.toArray(new Transition[transitions.size()]);
}
public int getNumberOfTransitions() {
return transitions.size();
}
public void addTransition(Transition e) {
if (transitions.isEmpty()) {
@ -137,6 +152,10 @@ public class ATNState {
transitions.set(i, e);
}
public Transition removeTransition(int index) {
return transitions.remove(index);
}
public int getStateType() {
return serializationTypes.get(this.getClass());
}

View File

@ -31,7 +31,7 @@ package org.antlr.v4.runtime.atn;
import org.antlr.v4.runtime.misc.NotNull;
public class ActionTransition extends Transition {
public final class ActionTransition extends Transition {
public final int ruleIndex;
public final int actionIndex;
public final boolean isCtxDependent; // e.g., $i ref in action
@ -47,11 +47,21 @@ public class ActionTransition extends Transition {
this.isCtxDependent = isCtxDependent;
}
@Override
public int getSerializationType() {
return ACTION;
}
@Override
public boolean isEpsilon() {
return true; // we are to be ignored by analysis 'cept for predicates
}
@Override
public boolean matches(int symbol, int minVocabSymbol, int maxVocabSymbol) {
return false;
}
@Override
public String toString() {
return "action_"+ruleIndex+":"+actionIndex;

View File

@ -0,0 +1,163 @@
package org.antlr.v4.runtime.atn;
import org.antlr.v4.runtime.misc.DoubleKeyMap;
import java.util.Arrays;
import java.util.Iterator;
public class ArrayPredictionContext extends PredictionContext {
/** Parent can be null only if full ctx mode and we make an array
* from EMPTY and non-empty. We merge EMPTY by using null parent and
* returnState == EMPTY_FULL_RETURN_STATE
*/
public final PredictionContext[] parents;
/** Sorted for merge, no duplicates; if present,
* EMPTY_FULL_RETURN_STATE is always first
*/
public final int[] returnStates;
public ArrayPredictionContext(SingletonPredictionContext a) {
this(new PredictionContext[] {a.parent}, new int[] {a.returnState});
}
public ArrayPredictionContext(PredictionContext[] parents, int[] returnStates) {
super(calculateHashCode(parents, returnStates));
assert parents!=null && parents.length>0;
assert returnStates!=null && returnStates.length>0;
// System.err.println("CREATE ARRAY: "+Arrays.toString(parents)+", "+Arrays.toString(returnStates));
this.parents = parents;
this.returnStates = returnStates;
}
//ArrayPredictionContext(@NotNull PredictionContext[] parents, int[] returnStates, int parentHashCode, int returnStateHashCode) {
// super(calculateHashCode(parentHashCode, returnStateHashCode));
// assert parents.length == returnStates.length;
// assert returnStates.length > 1 || returnStates[0] != EMPTY_FULL_STATE_KEY : "Should be using PredictionContext.EMPTY instead.";
//
// this.parents = parents;
// this.returnStates = returnStates;
// }
//
//ArrayPredictionContext(@NotNull PredictionContext[] parents, int[] returnStates, int hashCode) {
// super(hashCode);
// assert parents.length == returnStates.length;
// assert returnStates.length > 1 || returnStates[0] != EMPTY_FULL_STATE_KEY : "Should be using PredictionContext.EMPTY instead.";
//
// this.parents = parents;
// this.returnStates = returnStates;
// }
protected static int calculateHashCode(PredictionContext[] parents, int[] returnStates) {
return calculateHashCode(calculateParentHashCode(parents),
calculateReturnStatesHashCode(returnStates));
}
protected static int calculateParentHashCode(PredictionContext[] parents) {
int hashCode = 1;
for (PredictionContext p : parents) {
if ( p!=null ) { // can be null for full ctx stack in ArrayPredictionContext
hashCode = hashCode * 31 ^ p.hashCode();
}
}
return hashCode;
}
protected static int calculateReturnStatesHashCode(int[] returnStates) {
int hashCode = 1;
for (int state : returnStates) {
hashCode = hashCode * 31 ^ state;
}
return hashCode;
}
@Override
public Iterator<SingletonPredictionContext> iterator() {
return new Iterator<SingletonPredictionContext>() {
int i = 0;
@Override
public boolean hasNext() { return i < parents.length; }
@Override
public SingletonPredictionContext next() {
SingletonPredictionContext ctx =
SingletonPredictionContext.create(parents[i], returnStates[i]);
i++;
return ctx;
}
@Override
public void remove() { throw new UnsupportedOperationException(); }
};
}
@Override
public boolean isEmpty() {
return size()==1 &&
returnStates[0]==EMPTY_RETURN_STATE;
}
@Override
public int size() {
return returnStates.length;
}
@Override
public PredictionContext getParent(int index) {
return parents[index];
}
@Override
public int getReturnState(int index) {
return returnStates[index];
}
// @Override
// public int findReturnState(int returnState) {
// return Arrays.binarySearch(returnStates, returnState);
// }
@Override
public boolean equals(Object o) {
if (this == o) {
return true;
}
else if ( !(o instanceof ArrayPredictionContext) ) {
return false;
}
if ( this.hashCode() != o.hashCode() ) {
return false; // can't be same if hash is different
}
ArrayPredictionContext a = (ArrayPredictionContext)o;
return Arrays.equals(returnStates, a.returnStates) &&
Arrays.equals(parents, a.parents);
}
@Override
public String toString() {
if ( isEmpty() ) return "[]";
StringBuilder buf = new StringBuilder();
buf.append("[");
for (int i=0; i<returnStates.length; i++) {
if ( i>0 ) buf.append(", ");
if ( returnStates[i]==EMPTY_RETURN_STATE ) {
buf.append("$");
continue;
}
buf.append(returnStates[i]);
if ( parents[i]!=null ) {
buf.append(' ');
buf.append(parents[i].toString());
}
else {
buf.append("null");
}
}
buf.append("]");
return buf.toString();
}
}
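
A small sketch of how these arrays arise, using the API shown in this commit (a null merge cache is permitted): merging two singleton stacks that share a parent but have different return states yields one ArrayPredictionContext.

import org.antlr.v4.runtime.atn.ArrayPredictionContext;
import org.antlr.v4.runtime.atn.PredictionContext;
import org.antlr.v4.runtime.atn.SingletonPredictionContext;

public class ContextMergeSketch {
    public static void main(String[] args) {
        PredictionContext a = SingletonPredictionContext.create(PredictionContext.EMPTY, 5);
        PredictionContext b = SingletonPredictionContext.create(PredictionContext.EMPTY, 9);
        // rootIsWildcard=true corresponds to SLL-style merging; the cache may be null
        PredictionContext merged = PredictionContext.merge(a, b, true, null);
        System.out.println(merged instanceof ArrayPredictionContext);  // true
        System.out.println(merged);  // returnStates [5, 9], both parents $
    }
}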

View File

@ -29,11 +29,11 @@
package org.antlr.v4.runtime.atn;
import org.antlr.v4.runtime.misc.NotNull;
import org.antlr.v4.runtime.misc.IntervalSet;
import org.antlr.v4.runtime.misc.NotNull;
/** TODO: make all transitions sets? no, should remove set edges */
public class AtomTransition extends Transition {
public final class AtomTransition extends Transition {
/** The token type or character value; or, signifies special label. */
public final int label;
@ -42,10 +42,20 @@ public class AtomTransition extends Transition {
this.label = label;
}
@Override
public int getSerializationType() {
return ATOM;
}
@Override
@NotNull
public IntervalSet label() { return IntervalSet.of(label); }
@Override
public boolean matches(int symbol, int minVocabSymbol, int maxVocabSymbol) {
return label == symbol;
}
@Override
@NotNull
public String toString() {

View File

@ -31,4 +31,5 @@ package org.antlr.v4.runtime.atn;
/** Terminal node of a simple (a|b|c) block */
public class BlockEndState extends ATNState {
public BlockStartState startState;
}

View File

@ -31,6 +31,5 @@ package org.antlr.v4.runtime.atn;
public class DecisionState extends ATNState {
public int decision = -1;
public boolean isGreedy = true;
public boolean nonGreedy;
}

View File

@ -0,0 +1,34 @@
package org.antlr.v4.runtime.atn;
public class EmptyPredictionContext extends SingletonPredictionContext {
public EmptyPredictionContext() {
super(null, EMPTY_RETURN_STATE);
}
public boolean isEmpty() { return true; }
@Override
public int size() {
return 1;
}
@Override
public PredictionContext getParent(int index) {
return null;
}
@Override
public int getReturnState(int index) {
return returnState;
}
@Override
public boolean equals(Object o) {
return this == o;
}
@Override
public String toString() {
return "$";
}
}
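
The identity-based equals above is safe only if $ is canonical. A hedged sketch, assuming SingletonPredictionContext.create returns the shared EMPTY instance for a null parent plus EMPTY_RETURN_STATE (which the identity equals implies):

import org.antlr.v4.runtime.atn.PredictionContext;
import org.antlr.v4.runtime.atn.SingletonPredictionContext;

public class EmptyContextSketch {
    public static void main(String[] args) {
        PredictionContext p =
                SingletonPredictionContext.create(null, PredictionContext.EMPTY_RETURN_STATE);
        System.out.println(p == PredictionContext.EMPTY);  // true: $ is one shared object
        System.out.println(p + " isEmpty=" + p.isEmpty()); // $ isEmpty=true
    }
}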

View File

@ -31,12 +31,22 @@ package org.antlr.v4.runtime.atn;
import org.antlr.v4.runtime.misc.NotNull;
public class EpsilonTransition extends Transition {
public final class EpsilonTransition extends Transition {
public EpsilonTransition(@NotNull ATNState target) { super(target); }
@Override
public int getSerializationType() {
return EPSILON;
}
@Override
public boolean isEpsilon() { return true; }
@Override
public boolean matches(int symbol, int minVocabSymbol, int maxVocabSymbol) {
return false;
}
@Override
@NotNull
public String toString() {

View File

@ -29,7 +29,7 @@
package org.antlr.v4.runtime.atn;
import org.antlr.v4.runtime.ParserRuleContext;
import org.antlr.v4.runtime.IntStream;
import org.antlr.v4.runtime.RuleContext;
import org.antlr.v4.runtime.Token;
import org.antlr.v4.runtime.misc.IntervalSet;
@ -40,10 +40,10 @@ import java.util.HashSet;
import java.util.Set;
public class LL1Analyzer {
/** Used during LOOK to detect computation cycles. E.g., ()* causes
* infinite loop without it. If we get to same state would be infinite
* loop.
/** Special value added to the lookahead sets to indicate that we hit
* a predicate during analysis if seeThruPreds==false.
*/
public static final int HIT_PRED = Token.INVALID_TYPE;
@NotNull
public final ATN atn;
@ -63,35 +63,46 @@ public class LL1Analyzer {
Set<ATNConfig> lookBusy = new HashSet<ATNConfig>();
boolean seeThruPreds = false; // fail to get lookahead upon pred
_LOOK(s.transition(alt - 1).target,
ParserRuleContext.EMPTY,
look[alt], lookBusy, seeThruPreds);
if ( look[alt].size()==0 ) look[alt] = null;
PredictionContext.EMPTY,
look[alt], lookBusy, seeThruPreds, false);
// Wipe out lookahead for this alternative if we found nothing
// or we had a predicate when we !seeThruPreds
if ( look[alt].size()==0 || look[alt].contains(HIT_PRED) ) {
look[alt] = null;
}
}
return look;
}
/** Get lookahead, using ctx if we reach end of rule. If ctx is EMPTY, don't chase FOLLOW.
* If ctx is null, EPSILON is in set if we can reach end of rule.
*/
/**
* Get lookahead, using {@code ctx} if we reach end of rule. If {@code ctx}
* is {@code null} or {@link RuleContext#EMPTY EMPTY}, don't chase FOLLOW.
* If {@code ctx} is {@code null}, {@link Token#EPSILON EPSILON} is in set
* if we can reach end of rule. If {@code ctx} is
* {@link RuleContext#EMPTY EMPTY}, {@link IntStream#EOF EOF} is in set if
* we can reach end of rule.
*/
@NotNull
public IntervalSet LOOK(@NotNull ATNState s, @Nullable RuleContext ctx) {
IntervalSet r = new IntervalSet();
boolean seeThruPreds = true; // ignore preds; get all lookahead
_LOOK(s, ctx, r, new HashSet<ATNConfig>(), seeThruPreds);
PredictionContext lookContext = ctx != null ? PredictionContext.fromRuleContext(s.atn, ctx) : null;
_LOOK(s, lookContext,
r, new HashSet<ATNConfig>(), seeThruPreds, true);
return r;
}
/** Computer set of tokens that can come next. If the context is EMPTY,
/** Compute set of tokens that can come next. If the context is EMPTY,
* then we don't go anywhere when we hit the end of the rule. We have
* the correct set. If the context is null, that means that we did not want
* any tokens following this rule--just the tokens that could be found within this
rule. Add EPSILON to the set indicating we reached the end of the rule without having
* to match a token.
*/
protected void _LOOK(@NotNull ATNState s, @Nullable RuleContext ctx,
protected void _LOOK(@NotNull ATNState s, @Nullable PredictionContext ctx,
@NotNull IntervalSet look,
@NotNull Set<ATNConfig> lookBusy,
boolean seeThruPreds)
boolean seeThruPreds, boolean addEOF)
{
// System.out.println("_LOOK("+s.stateNumber+", ctx="+ctx);
ATNConfig c = new ATNConfig(s, 0, ctx);
@ -101,42 +112,54 @@ public class LL1Analyzer {
if ( ctx==null ) {
look.add(Token.EPSILON);
return;
}
if ( ctx.invokingState!=-1 ) {
ATNState invokingState = atn.states.get(ctx.invokingState);
RuleTransition rt = (RuleTransition)invokingState.transition(0);
ATNState retState = rt.followState;
// System.out.println("popping back to "+retState);
_LOOK(retState, ctx.parent, look, lookBusy, seeThruPreds);
return;
}
} else if (ctx.isEmpty() && addEOF) {
look.add(Token.EOF);
return;
}
if ( ctx != PredictionContext.EMPTY ) {
// run thru all possible stack tops in ctx
for (SingletonPredictionContext p : ctx) {
ATNState returnState = atn.states.get(p.returnState);
// System.out.println("popping back to "+retState);
_LOOK(returnState, p.parent, look, lookBusy, seeThruPreds, addEOF);
}
return;
}
}
int n = s.getNumberOfTransitions();
for (int i=0; i<n; i++) {
Transition t = s.transition(i);
if ( t.getClass() == RuleTransition.class ) {
RuleContext newContext =
new RuleContext(ctx, s.stateNumber);
_LOOK(t.target, newContext, look, lookBusy, seeThruPreds);
}
else if ( t.isEpsilon() && seeThruPreds ) {
_LOOK(t.target, ctx, look, lookBusy, seeThruPreds);
}
else if ( t.getClass() == WildcardTransition.class ) {
look.addAll( IntervalSet.of(Token.MIN_USER_TOKEN_TYPE, atn.maxTokenType) );
}
else {
Transition t = s.transition(i);
if ( t.getClass() == RuleTransition.class ) {
PredictionContext newContext =
SingletonPredictionContext.create(ctx, ((RuleTransition)t).followState.stateNumber);
_LOOK(t.target, newContext, look, lookBusy, seeThruPreds, addEOF);
}
else if ( t instanceof PredicateTransition ) {
if ( seeThruPreds ) {
_LOOK(t.target, ctx, look, lookBusy, seeThruPreds, addEOF);
}
else {
look.add(HIT_PRED);
}
}
else if ( t.isEpsilon() ) {
_LOOK(t.target, ctx, look, lookBusy, seeThruPreds, addEOF);
}
else if ( t.getClass() == WildcardTransition.class ) {
look.addAll( IntervalSet.of(Token.MIN_USER_TOKEN_TYPE, atn.maxTokenType) );
}
else {
// System.out.println("adding "+ t);
IntervalSet set = t.label();
if (set != null) {
if (t instanceof NotSetTransition) {
set = set.complement(IntervalSet.of(Token.MIN_USER_TOKEN_TYPE, atn.maxTokenType));
}
look.addAll(set);
}
}
}
}
IntervalSet set = t.label();
if (set != null) {
if (t instanceof NotSetTransition) {
set = set.complement(IntervalSet.of(Token.MIN_USER_TOKEN_TYPE, atn.maxTokenType));
}
look.addAll(set);
}
}
}
}
}
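
A hedged sketch of driving the analyzer above; atn is assumed to be a deserialized ATN and s one of its states. A null context asks for lookahead within the rule only, so Token.EPSILON in the result marks "can reach end of rule".

import org.antlr.v4.runtime.atn.ATN;
import org.antlr.v4.runtime.atn.ATNState;
import org.antlr.v4.runtime.atn.LL1Analyzer;
import org.antlr.v4.runtime.misc.IntervalSet;

public class LookSketch {
    static IntervalSet lookWithinRule(ATN atn, ATNState s) {
        // null ctx: don't chase FOLLOW; EPSILON appears if end of rule is reachable
        return new LL1Analyzer(atn).LOOK(s, null);
    }
}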

View File

@ -0,0 +1,113 @@
/*
* [The "BSD license"]
* Copyright (c) 2012 Terence Parr
* Copyright (c) 2012 Sam Harwell
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* 3. The name of the author may not be used to endorse or promote products
* derived from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
* IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
* NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
* THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
package org.antlr.v4.runtime.atn;
import org.antlr.v4.runtime.misc.NotNull;
import org.antlr.v4.runtime.misc.Nullable;
public class LexerATNConfig extends ATNConfig {
/** Capture lexer action we traverse */
public int lexerActionIndex = -1;
private final boolean passedThroughNonGreedyDecision;
public LexerATNConfig(@NotNull ATNState state,
int alt,
@Nullable PredictionContext context)
{
super(state, alt, context, SemanticContext.NONE);
this.passedThroughNonGreedyDecision = false;
}
public LexerATNConfig(@NotNull ATNState state,
int alt,
@Nullable PredictionContext context,
int actionIndex)
{
super(state, alt, context, SemanticContext.NONE);
this.lexerActionIndex = actionIndex;
this.passedThroughNonGreedyDecision = false;
}
public LexerATNConfig(@NotNull LexerATNConfig c, @NotNull ATNState state) {
super(c, state, c.context, c.semanticContext);
this.lexerActionIndex = c.lexerActionIndex;
this.passedThroughNonGreedyDecision = checkNonGreedyDecision(c, state);
}
public LexerATNConfig(@NotNull LexerATNConfig c, @NotNull ATNState state,
int actionIndex)
{
super(c, state, c.context, c.semanticContext);
this.lexerActionIndex = actionIndex;
this.passedThroughNonGreedyDecision = checkNonGreedyDecision(c, state);
}
public LexerATNConfig(@NotNull LexerATNConfig c, @NotNull ATNState state,
@Nullable PredictionContext context) {
super(c, state, context, c.semanticContext);
this.lexerActionIndex = c.lexerActionIndex;
this.passedThroughNonGreedyDecision = checkNonGreedyDecision(c, state);
}
public final boolean hasPassedThroughNonGreedyDecision() {
return passedThroughNonGreedyDecision;
}
@Override
public int hashCode() {
int hashCode = super.hashCode();
hashCode = 35 * hashCode ^ (passedThroughNonGreedyDecision ? 1 : 0);
return hashCode;
}
@Override
public boolean equals(ATNConfig other) {
if (this == other) {
return true;
}
else if (!(other instanceof LexerATNConfig)) {
return false;
}
LexerATNConfig lexerOther = (LexerATNConfig)other;
if (passedThroughNonGreedyDecision != lexerOther.passedThroughNonGreedyDecision) {
return false;
}
return super.equals(other);
}
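/** A config inherits the non-greedy marker from its source config and also
 *  picks it up when the target state is itself a non-greedy decision; once
 *  set, the marker sticks for the rest of the closure walk. */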
private static boolean checkNonGreedyDecision(LexerATNConfig source, ATNState target) {
return source.passedThroughNonGreedyDecision
|| target instanceof DecisionState && ((DecisionState)target).nonGreedy;
}
}

View File

@ -31,5 +31,5 @@ package org.antlr.v4.runtime.atn;
/** Mark the end of a * or + loop */
public class LoopEndState extends ATNState {
public int loopBackStateNumber;
public ATNState loopBackState;
}

View File

@ -33,11 +33,23 @@ import org.antlr.v4.runtime.misc.IntervalSet;
import org.antlr.v4.runtime.misc.NotNull;
import org.antlr.v4.runtime.misc.Nullable;
public class NotSetTransition extends SetTransition {
public final class NotSetTransition extends SetTransition {
public NotSetTransition(@NotNull ATNState target, @Nullable IntervalSet set) {
super(target, set);
}
@Override
public int getSerializationType() {
return NOT_SET;
}
@Override
public boolean matches(int symbol, int minVocabSymbol, int maxVocabSymbol) {
return symbol >= minVocabSymbol
&& symbol <= maxVocabSymbol
&& !super.matches(symbol, minVocabSymbol, maxVocabSymbol);
}
@Override
public String toString() {
return '~'+super.toString();

View File

@ -0,0 +1,57 @@
/*
* [The "BSD license"]
* Copyright (c) 2012 Terence Parr
* Copyright (c) 2012 Sam Harwell
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* 3. The name of the author may not be used to endorse or promote products
* derived from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
* IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
* NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
* THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
package org.antlr.v4.runtime.atn;
/**
*
* @author Sam Harwell
*/
public class OrderedATNConfigSet extends ATNConfigSet {
public OrderedATNConfigSet() {
this.configLookup = new LexerConfigHashSet();
}
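/** Presumably the point of this override: key on the full ATNConfig
 *  hashCode/equals here, unlike the parser's ConfigHashSet, which ignores the
 *  prediction context so contexts can be merged; the ordered lexer set keeps
 *  configs with different contexts distinct. */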
protected static class LexerConfigHashSet extends ConfigHashSet {
@Override
public int hashCode(ATNConfig o) {
return o.hashCode();
}
@Override
public boolean equals(ATNConfig a, ATNConfig b) {
return a.equals(b);
}
}
}

View File

@ -29,160 +29,146 @@
package org.antlr.v4.runtime.atn;
import org.antlr.v4.runtime.Parser;
import org.antlr.v4.runtime.ParserRuleContext;
import org.antlr.v4.runtime.RuleContext;
import org.antlr.v4.runtime.Token;
import org.antlr.v4.runtime.TokenStream;
import org.antlr.v4.runtime.misc.IntervalSet;
import org.antlr.v4.runtime.misc.NotNull;
import org.antlr.v4.runtime.misc.Nullable;
import org.antlr.v4.runtime.tree.TraceTree;
public class ParserATNPathFinder /*extends ParserATNSimulator*/ {
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
public class ParserATNPathFinder extends ParserATNSimulator<Token> {
public ParserATNPathFinder(@Nullable Parser parser, @NotNull ATN atn) {
super(parser, atn);
}
/** Given an input sequence, as a subset of the input stream, trace the path through the
* ATN starting at s. The path returned includes s and the final target of the last input
* symbol. If there are multiple paths through the ATN to the final state, it uses the first
* one the method finds. This is used to figure out how an input sequence is matched in more
* than one way between the alternatives of a decision. It's only that decision we are
* concerned with, and so if there are ambiguous decisions further along, we will ignore them
* for the purposes of computing the path to the final state. To figure out multiple paths
* for a decision, use this method on the left edge of the alternatives of the decision in
* question.
*
* TODO: I haven't figured out what to do with nongreedy decisions yet
* TODO: preds. unless i create rule specific ctxs, i can't eval preds. also must eval args!
*/
public TraceTree trace(@NotNull ATNState s, @Nullable RuleContext ctx,
TokenStream input, int start, int stop)
{
System.out.println("REACHES "+s.stateNumber+" start state");
List<TraceTree> leaves = new ArrayList<TraceTree>();
HashSet<ATNState>[] busy = new HashSet[stop-start+1];
for (int i = 0; i < busy.length; i++) {
busy[i] = new HashSet<ATNState>();
}
TraceTree path = _trace(s, ctx, ctx, input, start, start, stop, leaves, busy);
if ( path!=null ) path.leaves = leaves;
return path;
}
/** Returns true if we found path */
public TraceTree _trace(@NotNull ATNState s, RuleContext initialContext, RuleContext ctx,
TokenStream input, int start, int i, int stop,
List<TraceTree> leaves, @NotNull Set<ATNState>[] busy)
{
TraceTree root = new TraceTree(s);
if ( i>stop ) {
leaves.add(root); // track final states
System.out.println("leaves=" + leaves);
return root;
}
if ( !busy[i-start].add(s) ) {
System.out.println("already visited "+s.stateNumber+" at input "+i+"="+input.get(i).getText());
return null;
}
busy[i-start].add(s);
System.out.println("TRACE "+s.stateNumber+" at input "+input.get(i).getText());
if ( s instanceof RuleStopState) {
// We hit rule end. If we have context info, use it
if ( ctx!=null && !ctx.isEmpty() ) {
System.out.println("stop state "+s.stateNumber+", ctx="+ctx);
ATNState invokingState = atn.states.get(ctx.invokingState);
RuleTransition rt = (RuleTransition)invokingState.transition(0);
ATNState retState = rt.followState;
return _trace(retState, initialContext, ctx.parent, input, start, i, stop, leaves, busy);
}
else {
// else if we have no context info, just chase follow links (if greedy)
System.out.println("FALLING off rule "+getRuleName(s.ruleIndex));
}
}
int n = s.getNumberOfTransitions();
boolean aGoodPath = false;
TraceTree found = null;
for (int j=0; j<n; j++) {
Transition t = s.transition(j);
if ( t.getClass() == RuleTransition.class ) {
RuleContext newContext =
new RuleContext(ctx, s.stateNumber);
found = _trace(t.target, initialContext, newContext, input, start, i, stop, leaves, busy);
if ( found!=null ) {aGoodPath=true; root.addChild(found);}
continue;
}
if ( t instanceof PredicateTransition ) {
found = predTransition(initialContext, ctx, input, start, i, stop, leaves, busy, root, t);
if ( found!=null ) {aGoodPath=true; root.addChild(found);}
continue;
}
if ( t.isEpsilon() ) {
found = _trace(t.target, initialContext, ctx, input, start, i, stop, leaves, busy);
if ( found!=null ) {aGoodPath=true; root.addChild(found);}
continue;
}
if ( t.getClass() == WildcardTransition.class ) {
System.out.println("REACHES " + t.target.stateNumber + " matching input " + input.get(i).getText());
found = _trace(t.target, initialContext, ctx, input, start, i+1, stop, leaves, busy);
if ( found!=null ) {aGoodPath=true; root.addChild(found);}
continue;
}
IntervalSet set = t.label();
if ( set!=null ) {
if ( t instanceof NotSetTransition ) {
if ( !set.contains(input.get(i).getType()) ) {
System.out.println("REACHES " + t.target.stateNumber + " matching input " + input.get(i).getText());
found = _trace(t.target, initialContext, ctx, input, start, i+1, stop, leaves, busy);
if ( found!=null ) {aGoodPath=true; root.addChild(found);}
}
}
else {
if ( set.contains(input.get(i).getType()) ) {
System.out.println("REACHES " + t.target.stateNumber + " matching input " + input.get(i).getText());
found = _trace(t.target, initialContext, ctx, input, start, i+1, stop, leaves, busy);
if ( found!=null ) {aGoodPath=true; root.addChild(found);}
}
}
}
}
if ( aGoodPath ) return root; // found at least one transition leading to success
return null;
}
public TraceTree predTransition(RuleContext initialContext, RuleContext ctx, TokenStream input, int start,
int i, int stop, List<TraceTree> leaves, Set<ATNState>[] busy,
TraceTree root, Transition t)
{
SemanticContext.Predicate pred = ((PredicateTransition) t).getPredicate();
boolean pass = false;
if ( pred.isCtxDependent ) {
if ( ctx instanceof ParserRuleContext && ctx==initialContext ) {
System.out.println("eval pred "+pred+"="+pred.eval(parser, ctx));
pass = pred.eval(parser, ctx);
}
else {
pass = true; // see thru ctx dependent when out of context
}
}
else {
System.out.println("eval pred "+pred+"="+pred.eval(parser, initialContext));
pass = pred.eval(parser, ctx);
}
if ( pass ) {
return _trace(t.target, initialContext, ctx, input, start, i, stop, leaves, busy);
}
return null;
}
// public ParserATNPathFinder(@Nullable Parser parser, @NotNull ATN atn, @NotNull DFA[] decisionToDFA) {
// super(parser, atn, decisionToDFA);
// }
//
// /** Given an input sequence, as a subset of the input stream, trace the path through the
// * ATN starting at s. The path returned includes s and the final target of the last input
// * symbol. If there are multiple paths through the ATN to the final state, it uses the first
// * one the method finds. This is used to figure out how an input sequence is matched in more
// * than one way between the alternatives of a decision. It's only that decision we are
// * concerned with, and so if there are ambiguous decisions further along, we will ignore them
// * for the purposes of computing the path to the final state. To figure out multiple paths
// * for a decision, use this method on the left edge of the alternatives of the decision in
// * question.
// *
// * TODO: I haven't figured out what to do with nongreedy decisions yet
// * TODO: preds. unless i create rule specific ctxs, i can't eval preds. also must eval args!
// */
// public TraceTree trace(@NotNull ATNState s, @Nullable RuleContext ctx,
// TokenStream input, int start, int stop)
// {
// System.out.println("REACHES "+s.stateNumber+" start state");
// List<TraceTree> leaves = new ArrayList<TraceTree>();
// HashSet<ATNState>[] busy = new HashSet[stop-start+1];
// for (int i = 0; i < busy.length; i++) {
// busy[i] = new HashSet<ATNState>();
// }
// TraceTree path = _trace(s, ctx, ctx, input, start, start, stop, leaves, busy);
// if ( path!=null ) path.leaves = leaves;
// return path;
// }
//
// /** Returns true if we found path */
// public TraceTree _trace(@NotNull ATNState s, RuleContext initialContext, RuleContext ctx,
// TokenStream input, int start, int i, int stop,
// List<TraceTree> leaves, @NotNull Set<ATNState>[] busy)
// {
// TraceTree root = new TraceTree(s);
// if ( i>stop ) {
// leaves.add(root); // track final states
// System.out.println("leaves=" + leaves);
// return root;
// }
//
// if ( !busy[i-start].add(s) ) {
// System.out.println("already visited "+s.stateNumber+" at input "+i+"="+input.get(i).getText());
// return null;
// }
// busy[i-start].add(s);
//
// System.out.println("TRACE "+s.stateNumber+" at input "+input.get(i).getText());
//
// if ( s instanceof RuleStopState) {
// // We hit rule end. If we have context info, use it
// if ( ctx!=null && !ctx.isEmpty() ) {
// System.out.println("stop state "+s.stateNumber+", ctx="+ctx);
// ATNState invokingState = atn.states.get(ctx.invokingState);
// RuleTransition rt = (RuleTransition)invokingState.transition(0);
// ATNState retState = rt.followState;
// return _trace(retState, initialContext, ctx.parent, input, start, i, stop, leaves, busy);
// }
// else {
// // else if we have no context info, just chase follow links (if greedy)
// System.out.println("FALLING off rule "+getRuleName(s.ruleIndex));
// }
// }
//
// int n = s.getNumberOfTransitions();
// boolean aGoodPath = false;
// TraceTree found;
// for (int j=0; j<n; j++) {
// Transition t = s.transition(j);
// if ( t.getClass() == RuleTransition.class ) {
// RuleContext newContext =
// new RuleContext(ctx, s.stateNumber);
// found = _trace(t.target, initialContext, newContext, input, start, i, stop, leaves, busy);
// if ( found!=null ) {aGoodPath=true; root.addChild(found);}
// continue;
// }
// if ( t instanceof PredicateTransition ) {
// found = predTransition(initialContext, ctx, input, start, i, stop, leaves, busy, root, t);
// if ( found!=null ) {aGoodPath=true; root.addChild(found);}
// continue;
// }
// if ( t.isEpsilon() ) {
// found = _trace(t.target, initialContext, ctx, input, start, i, stop, leaves, busy);
// if ( found!=null ) {aGoodPath=true; root.addChild(found);}
// continue;
// }
// if ( t.getClass() == WildcardTransition.class ) {
// System.out.println("REACHES " + t.target.stateNumber + " matching input " + input.get(i).getText());
// found = _trace(t.target, initialContext, ctx, input, start, i+1, stop, leaves, busy);
// if ( found!=null ) {aGoodPath=true; root.addChild(found);}
// continue;
// }
// IntervalSet set = t.label();
// if ( set!=null ) {
// if ( t instanceof NotSetTransition ) {
// if ( !set.contains(input.get(i).getType()) ) {
// System.out.println("REACHES " + t.target.stateNumber + " matching input " + input.get(i).getText());
// found = _trace(t.target, initialContext, ctx, input, start, i+1, stop, leaves, busy);
// if ( found!=null ) {aGoodPath=true; root.addChild(found);}
// }
// }
// else {
// if ( set.contains(input.get(i).getType()) ) {
// System.out.println("REACHES " + t.target.stateNumber + " matching input " + input.get(i).getText());
// found = _trace(t.target, initialContext, ctx, input, start, i+1, stop, leaves, busy);
// if ( found!=null ) {aGoodPath=true; root.addChild(found);}
// }
// }
// }
// }
// if ( aGoodPath ) return root; // found at least one transition leading to success
// return null;
// }
//
// public TraceTree predTransition(RuleContext initialContext, RuleContext ctx, TokenStream input, int start,
// int i, int stop, List<TraceTree> leaves, Set<ATNState>[] busy,
// TraceTree root, Transition t)
// {
// SemanticContext.Predicate pred = ((PredicateTransition) t).getPredicate();
// boolean pass;
// if ( pred.isCtxDependent ) {
// if ( ctx instanceof ParserRuleContext && ctx==initialContext ) {
// System.out.println("eval pred "+pred+"="+pred.eval(parser, ctx));
// pass = pred.eval(parser, ctx);
// }
// else {
// pass = true; // see thru ctx dependent when out of context
// }
// }
// else {
// System.out.println("eval pred "+pred+"="+pred.eval(parser, initialContext));
// pass = pred.eval(parser, ctx);
// }
// if ( pass ) {
// return _trace(t.target, initialContext, ctx, input, start, i, stop, leaves, busy);
// }
// return null;
// }
}

View File

@ -37,7 +37,7 @@ import org.antlr.v4.runtime.misc.NotNull;
* may have to combine a bunch of them as it collects predicates from
* multiple ATN configurations into a single DFA state.
*/
public class PredicateTransition extends Transition {
public final class PredicateTransition extends Transition {
public final int ruleIndex;
public final int predIndex;
public final boolean isCtxDependent; // e.g., $i ref in pred
@ -49,9 +49,19 @@ public class PredicateTransition extends Transition {
this.isCtxDependent = isCtxDependent;
}
@Override
public int getSerializationType() {
return PREDICATE;
}
@Override
public boolean isEpsilon() { return true; }
@Override
public boolean matches(int symbol, int minVocabSymbol, int maxVocabSymbol) {
return false;
}
public SemanticContext.Predicate getPredicate() {
return new SemanticContext.Predicate(ruleIndex, predIndex, isCtxDependent);
}

View File

@ -0,0 +1,629 @@
package org.antlr.v4.runtime.atn;
import org.antlr.v4.runtime.Recognizer;
import org.antlr.v4.runtime.RuleContext;
import org.antlr.v4.runtime.misc.DoubleKeyMap;
import org.antlr.v4.runtime.misc.NotNull;
import org.antlr.v4.runtime.misc.Nullable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
public abstract class PredictionContext implements Iterable<SingletonPredictionContext>,
Comparable<PredictionContext> // to sort node lists by id
{
/** Represents $ in local ctx prediction, which means wildcard. *+x = *. */
public static final EmptyPredictionContext EMPTY = new EmptyPredictionContext();
/** Represents $ in an array in full ctx mode, when $ doesn't mean wildcard:
* $ + x = [$,x]. Here, $ = EMPTY_RETURN_STATE.
*/
public static final int EMPTY_RETURN_STATE = Integer.MAX_VALUE;
public static int globalNodeCount = 0;
public final int id = globalNodeCount++;
public final int cachedHashCode;
protected PredictionContext(int cachedHashCode) {
this.cachedHashCode = cachedHashCode;
}
/** Convert a RuleContext tree to a PredictionContext graph.
* Return EMPTY if outerContext is empty or null.
*/
public static PredictionContext fromRuleContext(@NotNull ATN atn, RuleContext outerContext) {
if ( outerContext==null ) outerContext = RuleContext.EMPTY;
// if we are in RuleContext of start rule, s, then PredictionContext
// is EMPTY. Nobody called us. (if we are empty, return empty)
if ( outerContext.parent==null || outerContext==RuleContext.EMPTY ) {
return PredictionContext.EMPTY;
}
// If we have a parent, convert it to a PredictionContext graph
PredictionContext parent = EMPTY;
if ( outerContext.parent != null ) {
parent = PredictionContext.fromRuleContext(atn, outerContext.parent);
}
ATNState state = atn.states.get(outerContext.invokingState);
RuleTransition transition = (RuleTransition)state.transition(0);
return SingletonPredictionContext.create(parent, transition.followState.stateNumber);
}
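// Hypothetical example: for an invocation chain s -> a -> b, where returning
// from b resumes at state 20 in a and returning from a resumes at state 10
// in s, the resulting graph is 20 -> 10 -> EMPTY (innermost return state first).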
@Override
public abstract Iterator<SingletonPredictionContext> iterator();
public abstract int size();
public abstract PredictionContext getParent(int index);
public abstract int getReturnState(int index);
/** This means only the EMPTY context is in the set */
public boolean isEmpty() {
return this == EMPTY;
}
public boolean hasEmptyPath() {
return getReturnState(size() - 1) == EMPTY_RETURN_STATE;
}
@Override
public int compareTo(PredictionContext o) { // used for toDotString to print nodes in order
return id - o.id;
}
@Override
public int hashCode() {
return cachedHashCode;
}
protected static int calculateHashCode(int parentHashCode, int returnStateHashCode) {
return 5 * 5 * 7 + 5 * parentHashCode + returnStateHashCode;
}
/** Two contexts conflict() if they are equals() or one is a stack suffix
* of the other. For example, contexts [21 12 $] and [21 9 $] do not
* conflict, but [21 $] and [21 12 $] do conflict. Note that I should
* probably not show the $ in this case. There is a dummy node for each
* stack that just means empty; $ is just a marker, that's all.
*
* This is used in relation to checking conflicts associated with a
* single NFA state's configurations within a single DFA state.
* If there are configurations s and t within a DFA state such that
* s.state=t.state && s.alt != t.alt && s.ctx conflicts t.ctx then
* the DFA state predicts more than a single alt--it's nondeterministic.
* Two contexts conflict if they are the same or if one is a suffix
* of the other.
*
* When comparing contexts, if one context has a stack and the other
* does not then they should be considered the same context. The only
* way for an NFA state p to have an empty context and a nonempty context
* is the case when closure falls off end of rule without a call stack
* and re-enters the rule with a context. This resolves the issue I
* discussed with Sriram Srinivasan Feb 28, 2005 about not terminating
* fast enough upon nondeterminism.
*
* UPDATE FOR GRAPH STACK; no suffix
*/
// public boolean conflictsWith(PredictionContext other) {
// return this.equals(other);
// }
// dispatch
public static PredictionContext merge(
PredictionContext a, PredictionContext b,
boolean rootIsWildcard,
DoubleKeyMap<PredictionContext,PredictionContext,PredictionContext> mergeCache)
{
// share same graph if both same
if ( (a==null&&b==null) || a==b || (a!=null&&a.equals(b)) ) return a;
if ( a instanceof SingletonPredictionContext && b instanceof SingletonPredictionContext) {
return mergeSingletons((SingletonPredictionContext)a,
(SingletonPredictionContext)b,
rootIsWildcard, mergeCache);
}
// At least one of a or b is array
// If one is $ and rootIsWildcard, return $ as * wildcard
if ( rootIsWildcard ) {
if ( a instanceof EmptyPredictionContext ) return a;
if ( b instanceof EmptyPredictionContext ) return b;
}
// convert singleton so both are arrays to normalize
if ( a instanceof SingletonPredictionContext ) {
a = new ArrayPredictionContext((SingletonPredictionContext)a);
}
if ( b instanceof SingletonPredictionContext) {
b = new ArrayPredictionContext((SingletonPredictionContext)b);
}
return mergeArrays((ArrayPredictionContext) a, (ArrayPredictionContext) b,
rootIsWildcard, mergeCache);
}
// http://www.antlr.org/wiki/download/attachments/32014352/singleton-merge.png
public static PredictionContext mergeSingletons(
SingletonPredictionContext a,
SingletonPredictionContext b,
boolean rootIsWildcard,
DoubleKeyMap<PredictionContext,PredictionContext,PredictionContext> mergeCache)
{
if ( mergeCache!=null ) {
PredictionContext previous = mergeCache.get(a,b);
if ( previous!=null ) return previous;
previous = mergeCache.get(b,a);
if ( previous!=null ) return previous;
}
PredictionContext rootMerge = mergeRoot(a, b, rootIsWildcard);
if ( rootMerge!=null ) {
if ( mergeCache!=null ) mergeCache.put(a, b, rootMerge);
return rootMerge;
}
if ( a.returnState==b.returnState ) { // a == b
PredictionContext parent = merge(a.parent, b.parent, rootIsWildcard, mergeCache);
// if parent is same as existing a or b parent or reduced to a parent, return it
if ( parent == a.parent ) return a; // ax + bx = ax, if a=b
if ( parent == b.parent ) return b; // ax + bx = bx, if a=b
// else: ax + ay = a'[x,y]
// merge parents x and y, giving array node with x,y then remainders
// of those graphs. dup a, a' points at merged array
// new joined parent so create new singleton pointing to it, a'
PredictionContext a_ = SingletonPredictionContext.create(parent, a.returnState);
if ( mergeCache!=null ) mergeCache.put(a, b, a_);
return a_;
}
else { // a != b payloads differ
// see if we can collapse parents due to $+x parents if local ctx
PredictionContext singleParent = null;
if ( a==b || (a.parent!=null && a.parent.equals(b.parent)) ) { // ax + bx = [a,b]x
singleParent = a.parent;
}
if ( singleParent!=null ) { // parents are same
// sort payloads and use same parent
int[] payloads = {a.returnState, b.returnState};
if ( a.returnState > b.returnState ) {
payloads[0] = b.returnState;
payloads[1] = a.returnState;
}
PredictionContext[] parents = {singleParent, singleParent};
PredictionContext a_ = new ArrayPredictionContext(parents, payloads);
if ( mergeCache!=null ) mergeCache.put(a, b, a_);
return a_;
}
// parents differ and can't merge them. Just pack together
// into array; can't merge.
// ax + by = [ax,by]
int[] payloads = {a.returnState, b.returnState};
PredictionContext[] parents = {a.parent, b.parent};
if ( a.returnState > b.returnState ) { // sort by payload
payloads[0] = b.returnState;
payloads[1] = a.returnState;
parents = new PredictionContext[] {b.parent, a.parent};
}
PredictionContext a_ = new ArrayPredictionContext(parents, payloads);
if ( mergeCache!=null ) mergeCache.put(a, b, a_);
return a_;
}
}
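// Worked example with hypothetical payloads: merging (x,9)+(x,9) returns the
// left operand unchanged (ax + ax = ax). Merging (x,7)+(x,9) with the same
// parent x yields one array node [7,9] sharing parent x (ax + bx = [a,b]x,
// payloads sorted). Merging (x,7)+(y,9) with unmergeable parents packs both
// into the array [(x,7),(y,9)].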
// http://www.antlr.org/wiki/download/attachments/32014352/local-ctx-root-merge.png
// http://www.antlr.org/wiki/download/attachments/32014352/full-ctx-root-merge.png
/** Handle case where at least one of a or b is $ (EMPTY) */
public static PredictionContext mergeRoot(SingletonPredictionContext a,
SingletonPredictionContext b,
boolean rootIsWildcard)
{
if ( rootIsWildcard ) {
if ( a == EMPTY ) return EMPTY; // * + b = *
if ( b == EMPTY ) return EMPTY; // a + * = *
}
else {
if ( a == EMPTY && b == EMPTY ) return EMPTY; // $ + $ = $
if ( a == EMPTY ) { // $ + x = [$,x]
int[] payloads = {b.returnState, EMPTY_RETURN_STATE};
PredictionContext[] parents = {b.parent, null};
PredictionContext joined =
new ArrayPredictionContext(parents, payloads);
return joined;
}
if ( b == EMPTY ) { // x + $ = [$,x] ($ is always first if present)
int[] payloads = {a.returnState, EMPTY_RETURN_STATE};
PredictionContext[] parents = {a.parent, null};
PredictionContext joined =
new ArrayPredictionContext(parents, payloads);
return joined;
}
}
return null;
}
// http://www.antlr.org/wiki/download/attachments/32014352/array-merge.png
public static PredictionContext mergeArrays(
ArrayPredictionContext a,
ArrayPredictionContext b,
boolean rootIsWildcard,
DoubleKeyMap<PredictionContext,PredictionContext,PredictionContext> mergeCache)
{
if ( mergeCache!=null ) {
PredictionContext previous = mergeCache.get(a,b);
if ( previous!=null ) return previous;
previous = mergeCache.get(b,a);
if ( previous!=null ) return previous;
}
// merge sorted payloads a + b => M
int i = 0; // walks a
int j = 0; // walks b
int k = 0; // walks target M array
int[] mergedReturnStates =
new int[a.returnStates.length + b.returnStates.length];
PredictionContext[] mergedParents =
new PredictionContext[a.returnStates.length + b.returnStates.length];
// walk and merge to yield mergedParents, mergedReturnStates
while ( i<a.returnStates.length && j<b.returnStates.length ) {
PredictionContext a_parent = a.parents[i];
PredictionContext b_parent = b.parents[j];
if ( a.returnStates[i]==b.returnStates[j] ) {
// same payload (stack tops are equal), must yield merged singleton
int payload = a.returnStates[i];
// $+$ = $
boolean both$ = payload == EMPTY_RETURN_STATE &&
a_parent == null && b_parent == null;
boolean ax_ax = (a_parent!=null && b_parent!=null) &&
a_parent.equals(b_parent); // ax+ax -> ax
if ( both$ || ax_ax ) {
mergedParents[k] = a_parent; // choose left
mergedReturnStates[k] = payload;
}
else { // ax+ay -> a'[x,y]
PredictionContext mergedParent =
merge(a_parent, b_parent, rootIsWildcard, mergeCache);
mergedParents[k] = mergedParent;
mergedReturnStates[k] = payload;
}
i++; // hop over left one as usual
j++; // but also skip one in right side since we merge
}
else if ( a.returnStates[i]<b.returnStates[j] ) { // copy a[i] to M
mergedParents[k] = a_parent;
mergedReturnStates[k] = a.returnStates[i];
i++;
}
else { // b > a, copy b[j] to M
mergedParents[k] = b_parent;
mergedReturnStates[k] = b.returnStates[j];
j++;
}
k++;
}
// copy over any payloads remaining in either array
if (i < a.returnStates.length) {
for (int p = i; p < a.returnStates.length; p++) {
mergedParents[k] = a.parents[p];
mergedReturnStates[k] = a.returnStates[p];
k++;
}
}
else {
for (int p = j; p < b.returnStates.length; p++) {
mergedParents[k] = b.parents[p];
mergedReturnStates[k] = b.returnStates[p];
k++;
}
}
// trim merged if we combined a few that had same stack tops
if ( k < mergedParents.length ) { // write index < last position; trim
if ( k == 1 ) { // for just one merged element, return singleton top
PredictionContext a_ =
SingletonPredictionContext.create(mergedParents[0],
mergedReturnStates[0]);
if ( mergeCache!=null ) mergeCache.put(a,b,a_);
return a_;
}
mergedParents = Arrays.copyOf(mergedParents, k);
mergedReturnStates = Arrays.copyOf(mergedReturnStates, k);
}
PredictionContext M =
new ArrayPredictionContext(mergedParents, mergedReturnStates);
// if we created same array as a or b, return that instead
// TODO: track whether this is possible above during merge sort for speed
if ( M.equals(a) ) {
if ( mergeCache!=null ) mergeCache.put(a,b,a);
return a;
}
if ( M.equals(b) ) {
if ( mergeCache!=null ) mergeCache.put(a,b,b);
return b;
}
combineCommonParents(mergedParents);
if ( mergeCache!=null ) mergeCache.put(a,b,M);
return M;
}
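// Hypothetical example: a=[(x,1),(y,3)] and b=[(x,1),(z,5)] merge to
// [(x,1),(y,3),(z,5)]; the shared payload 1 with equal parents collapses to
// a single slot, and the remaining entries are copied over in sorted order,
// after which the oversized arrays are trimmed to the write index.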
/** make pass over all M parents; merge any equals() ones */
protected static void combineCommonParents(PredictionContext[] parents) {
Map<PredictionContext, PredictionContext> uniqueParents =
new HashMap<PredictionContext, PredictionContext>();
for (int p = 0; p < parents.length; p++) {
PredictionContext parent = parents[p];
if ( !uniqueParents.containsKey(parent) ) { // don't replace
uniqueParents.put(parent, parent);
}
}
for (int p = 0; p < parents.length; p++) {
parents[p] = uniqueParents.get(parents[p]);
}
}
public static String toDOTString(PredictionContext context) {
if ( context==null ) return "";
StringBuilder buf = new StringBuilder();
buf.append("digraph G {\n");
buf.append("rankdir=LR;\n");
List<PredictionContext> nodes = getAllContextNodes(context);
Collections.sort(nodes);
for (PredictionContext current : nodes) {
if ( current instanceof SingletonPredictionContext ) {
String s = String.valueOf(current.id);
buf.append(" s").append(s);
String returnState = String.valueOf(current.getReturnState(0));
if ( current instanceof EmptyPredictionContext ) returnState = "$";
buf.append(" [label=\"").append(returnState).append("\"];\n");
continue;
}
ArrayPredictionContext arr = (ArrayPredictionContext)current;
buf.append(" s").append(arr.id);
buf.append(" [shape=box, label=\"");
buf.append("[");
boolean first = true;
for (int inv : arr.returnStates) {
if ( !first ) buf.append(", ");
if ( inv == EMPTY_RETURN_STATE ) buf.append("$");
else buf.append(inv);
first = false;
}
buf.append("]");
buf.append("\"];\n");
}
for (PredictionContext current : nodes) {
if ( current==EMPTY ) continue;
for (int i = 0; i < current.size(); i++) {
if ( current.getParent(i)==null ) continue;
String s = String.valueOf(current.id);
buf.append(" s").append(s);
buf.append("->");
buf.append("s");
buf.append(current.getParent(i).id);
if ( current.size()>1 ) buf.append(" [label=\"parent["+i+"]\"];\n");
else buf.append(";\n");
}
}
buf.append("}\n");
return buf.toString();
}
// From Sam
public static PredictionContext getCachedContext(
@NotNull PredictionContext context,
@NotNull PredictionContextCache contextCache,
@NotNull IdentityHashMap<PredictionContext, PredictionContext> visited)
{
if (context.isEmpty()) {
return context;
}
PredictionContext existing = visited.get(context);
if (existing != null) {
return existing;
}
synchronized (contextCache) {
existing = contextCache.get(context);
if (existing != null) {
visited.put(context, existing);
return existing;
}
}
boolean changed = false;
PredictionContext[] parents = new PredictionContext[context.size()];
for (int i = 0; i < parents.length; i++) {
PredictionContext parent = getCachedContext(context.getParent(i), contextCache, visited);
if (changed || parent != context.getParent(i)) {
if (!changed) {
parents = new PredictionContext[context.size()];
for (int j = 0; j < context.size(); j++) {
parents[j] = context.getParent(j);
}
changed = true;
}
parents[i] = parent;
}
}
if (!changed) {
synchronized (contextCache) {
contextCache.add(context);
}
visited.put(context, context);
return context;
}
PredictionContext updated;
if (parents.length == 0) {
updated = EMPTY;
}
else if (parents.length == 1) {
updated = SingletonPredictionContext.create(parents[0], context.getReturnState(0));
}
else {
ArrayPredictionContext arrayPredictionContext = (ArrayPredictionContext)context;
updated = new ArrayPredictionContext(parents, arrayPredictionContext.returnStates);
}
synchronized (contextCache) {
contextCache.add(updated);
}
visited.put(updated, updated);
visited.put(context, updated);
return updated;
}
// // extra structures, but cut/paste/morphed works, so leave it.
// // seems to do a breadth-first walk
// public static List<PredictionContext> getAllNodes(PredictionContext context) {
// Map<PredictionContext, PredictionContext> visited =
// new IdentityHashMap<PredictionContext, PredictionContext>();
// Deque<PredictionContext> workList = new ArrayDeque<PredictionContext>();
// workList.add(context);
// visited.put(context, context);
// List<PredictionContext> nodes = new ArrayList<PredictionContext>();
// while (!workList.isEmpty()) {
// PredictionContext current = workList.pop();
// nodes.add(current);
// for (int i = 0; i < current.size(); i++) {
// PredictionContext parent = current.getParent(i);
// if ( parent!=null && visited.put(parent, parent) == null) {
// workList.push(parent);
// }
// }
// }
// return nodes;
// }
// ter's recursive version of Sam's getAllNodes()
public static List<PredictionContext> getAllContextNodes(PredictionContext context) {
List<PredictionContext> nodes = new ArrayList<PredictionContext>();
Map<PredictionContext, PredictionContext> visited =
new IdentityHashMap<PredictionContext, PredictionContext>();
getAllContextNodes_(context, nodes, visited);
return nodes;
}
public static void getAllContextNodes_(PredictionContext context,
List<PredictionContext> nodes,
Map<PredictionContext, PredictionContext> visited)
{
if ( context==null || visited.containsKey(context) ) return;
visited.put(context, context);
nodes.add(context);
for (int i = 0; i < context.size(); i++) {
getAllContextNodes_(context.getParent(i), nodes, visited);
}
}
public String toString(@Nullable Recognizer<?,?> recog) {
return toString();
// return toString(recog, ParserRuleContext.EMPTY);
}
// recog null unless ParserRuleContext, in which case we use subclass toString(...)
public String toString(@Nullable Recognizer<?,?> recog, RuleContext stop) {
StringBuilder buf = new StringBuilder();
PredictionContext p = this;
buf.append("[");
// while ( p != null && p != stop ) {
// if ( !p.isEmpty() ) buf.append(p.returnState);
// if ( p.parent != null && !p.parent.isEmpty() ) buf.append(" ");
// p = p.parent;
// }
buf.append("]");
return buf.toString();
}
public String[] toStrings(Recognizer<?, ?> recognizer, int currentState) {
return toStrings(recognizer, EMPTY, currentState);
}
// FROM SAM
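// Enumerates one rendered string per path through the merged graph: each value
// of 'perm' encodes, a few bits per array node, which parent branch to follow
// at that node; the loop stops once a permutation has taken the highest-index
// branch at every node along its path.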
public String[] toStrings(Recognizer<?, ?> recognizer, PredictionContext stop, int currentState) {
List<String> result = new ArrayList<String>();
outer:
for (int perm = 0; ; perm++) {
int offset = 0;
boolean last = true;
PredictionContext p = this;
int stateNumber = currentState;
StringBuilder localBuffer = new StringBuilder();
localBuffer.append("[");
while ( !p.isEmpty() && p != stop ) {
int index = 0;
if (p.size() > 0) {
int bits = 1;
while ((1 << bits) < p.size()) {
bits++;
}
int mask = (1 << bits) - 1;
index = (perm >> offset) & mask;
last &= index >= p.size() - 1;
if (index >= p.size()) {
continue outer;
}
offset += bits;
}
if ( recognizer!=null ) {
if (localBuffer.length() > 1) {
// first char is '[', if more than that this isn't the first rule
localBuffer.append(' ');
}
ATN atn = recognizer.getATN();
ATNState s = atn.states.get(stateNumber);
String ruleName = recognizer.getRuleNames()[s.ruleIndex];
localBuffer.append(ruleName);
}
else if ( p.getReturnState(index)!= EMPTY_RETURN_STATE) {
if ( !p.isEmpty() ) {
if (localBuffer.length() > 1) {
// first char is '[', if more than that this isn't the first rule
localBuffer.append(' ');
}
localBuffer.append(p.getReturnState(index));
}
}
stateNumber = p.getReturnState(index);
p = p.getParent(index);
}
localBuffer.append("]");
result.add(localBuffer.toString());
if (last) {
break;
}
}
return result.toArray(new String[result.size()]);
}
}

View File

@ -0,0 +1,36 @@
package org.antlr.v4.runtime.atn;
import java.util.HashMap;
import java.util.Map;
/** Used to cache PredictionContext objects. It's used for the shared
* context cache associated with contexts in DFA states. This cache
* can be used for both lexers and parsers.
*/
public class PredictionContextCache {
protected Map<PredictionContext, PredictionContext> cache =
new HashMap<PredictionContext, PredictionContext>();
/** Add a context to the cache and return it. If the context already exists,
* return that one instead and do not add a new context to the cache.
* Protect shared cache from unsafe thread access.
*/
public PredictionContext add(PredictionContext ctx) {
if ( ctx==PredictionContext.EMPTY ) return PredictionContext.EMPTY;
PredictionContext existing = cache.get(ctx);
if ( existing!=null ) {
// System.out.println(name+" reuses "+existing);
return existing;
}
cache.put(ctx, ctx);
return ctx;
}
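// Usage sketch (ctx and copyOfCtx are hypothetical, equal context graphs):
// PredictionContextCache cache = new PredictionContextCache();
// PredictionContext c1 = cache.add(ctx);       // caches and returns ctx
// PredictionContext c2 = cache.add(copyOfCtx); // returns the already-cached ctx
// assert c1 == c2; // equal graphs now share one object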
public PredictionContext get(PredictionContext ctx) {
return cache.get(ctx);
}
public int size() {
return cache.size();
}
}

View File

@ -0,0 +1,408 @@
package org.antlr.v4.runtime.atn;
import org.antlr.v4.runtime.misc.FlexibleHashMap;
import org.antlr.v4.runtime.misc.NotNull;
import java.util.BitSet;
import java.util.Collection;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
public enum PredictionMode {
/** Do only local context prediction (SLL style) and using
* heuristic which almost always works but is much faster
* than precise answer.
*/
SLL,
/** Full LL(*) that always gets right answer. For speed
* reasons, we terminate the prediction process when we know for
* sure which alt to predict. We don't always know what
* the ambiguity is in this mode.
*/
LL,
/** Tell the full LL prediction algorithm to pursue lookahead until
* it has uniquely predicted an alternative without conflict or it's
* certain that it's found an ambiguous input sequence. In this
* mode, the prediction process will
* continue looking for the exact ambiguous sequence even if
* it has already figured out which alternative to predict.
*/
LL_EXACT_AMBIG_DETECTION;
/** A Map that uses just the state and the stack context as the key. */
static class AltAndContextMap extends FlexibleHashMap<ATNConfig,BitSet> {
/** Code is a function of (s, _, ctx, _) */
@Override
public int hashCode(ATNConfig o) {
int hashCode = 7;
hashCode = 31 * hashCode + o.state.stateNumber;
hashCode = 31 * hashCode + o.context.hashCode();
return hashCode;
}
@Override
public boolean equals(ATNConfig a, ATNConfig b) {
if ( a==b ) return true;
if ( a==null || b==null ) return false;
if ( hashCode(a) != hashCode(b) ) return false;
return a.state.stateNumber==b.state.stateNumber
&& a.context.equals(b.context);
}
}
/**
SLL prediction termination.
There are two cases: the usual combined SLL+LL parsing and
pure SLL parsing that has no fail over to full LL.
COMBINED SLL+LL PARSING
SLL can decide to give up any point, even immediately,
failing over to full LL. To be as efficient as possible,
though, SLL should fail over only when it's positive it can't get
anywhere on more lookahead without seeing a conflict.
Assuming combined SLL+LL parsing, an SLL config set with only
conflicting subsets should failover to full LL, even if the
config sets don't resolve to the same alternative like {1,2}
and {3,4}. If there is at least one nonconflicting set of
configs, SLL could continue with the hopes that more lookahead
will resolve via one of those nonconflicting configs.
Here's the prediction termination rule then: SLL (for SLL+LL
parsing) stops when it sees only conflicting config subsets.
In contrast, full LL keeps going when there is uncertainty.
HEURISTIC
As a heuristic, we stop prediction when we see any conflicting subset
unless we see a state that only has one alternative associated with
it. The single-alt-state check lets prediction continue for rules
like the following (otherwise, it would admit defeat too soon):
// [12|1|[], 6|2|[], 12|2|[]].
s : (ID | ID ID?) ';' ;
When the ATN simulation reaches the state before ';', it has a DFA
state that looks like: [12|1|[], 6|2|[], 12|2|[]]. Naturally 12|1|[]
and 12|2|[] conflict, but we cannot stop processing this node because
alternative two has another way to continue, via [6|2|[]].
It also lets us continue for this rule:
// [1|1|[], 1|2|[], 8|3|[]]
a : A | A | A B ;
After matching input A, we reach the stop state for rule A, state 1.
State 8 is the state right before B. Clearly alternatives 1 and 2
conflict and no amount of further lookahead will separate the two.
However, alternative 3 will be able to continue and so we do not stop
working on this state. In the previous example, we're concerned with
states associated with the conflicting alternatives. Here alt 3 is not
associated with the conflicting configs, but since we can reasonably
continue consuming input, we don't declare the state done.
PURE SLL PARSING
To handle pure SLL parsing, all we have to do is make sure that we
combine stack contexts for configurations that differ only by semantic
predicate. From there, we can do the usual SLL termination heuristic.
PREDICATES IN SLL+LL PARSING
SLL decisions don't evaluate predicates until after they reach DFA
stop states because they need to create the DFA cache that
works in all (semantic) situations. (In contrast, full LL
evaluates predicates collected during start state computation
so it can ignore predicates thereafter.) This means that SLL
termination detection can totally ignore semantic predicates.
Of course, implementation-wise, ATNConfigSets combine stack
contexts but not semantic predicate contexts so we might see
two configs like this:
(s, 1, x, {}), (s, 1, x', {p})
Before testing these configurations against others, we have
to merge x and x' (w/o modifying the existing configs). For
example, we test (x+x')==x'' when looking for conflicts in
the following configs.
(s, 1, x, {}), (s, 1, x', {p}), (s, 2, x'', {})
If the configuration set has predicates, which we can test
quickly, this algorithm makes a copy of the configs and
strips out all of the predicates so that a standard
ATNConfigSet will merge everything ignoring
predicates.
*/
public static boolean hasSLLConflictTerminatingPrediction(PredictionMode mode, @NotNull ATNConfigSet configs) {
// pure SLL mode parsing
if ( mode == PredictionMode.SLL ) {
// Don't bother with combining configs from different semantic
// contexts if we can fail over to full LL; costs more time
// since we'll often fail over anyway.
if ( configs.hasSemanticContext ) {
// dup configs, tossing out semantic predicates
ATNConfigSet dup = new ATNConfigSet();
for (ATNConfig c : configs) {
c = new ATNConfig(c,SemanticContext.NONE);
dup.add(c);
}
configs = dup;
}
// now we have combined contexts for configs with dissimilar preds
}
// pure SLL or combined SLL+LL mode parsing
Collection<BitSet> altsets = getConflictingAltSubsets(configs);
boolean heuristic =
hasConflictingAltSet(altsets) && !hasStateAssociatedWithOneAlt(configs);
return heuristic;
}
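// Worked example with hypothetical configs: {(s,1,x), (s,2,x), (s',3,y)}
// yields alt subsets {{1,2},{3}}; a conflicting subset exists, but state s'
// is associated with the single alt 3, so prediction continues. With only
// {(s,1,x),(s,2,x)}, every subset conflicts and no state has exactly one
// alt, so this method returns true and SLL stops (or fails over to full LL).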
/**
Full LL prediction termination.
Can we stop looking ahead during ATN simulation or is there some
uncertainty as to which alternative we will ultimately pick, after
consuming more input? Even if there are partial conflicts, we might
know that everything is going to resolve to the same minimum
alt. That means we can stop since no more lookahead will change that
fact. On the other hand, there might be multiple conflicts that
resolve to different minimums. That means we need more lookahead to
decide which of those alternatives we should predict.
The basic idea is to split the set of configurations, C, into
conflicting (s, _, ctx, _) subsets and singleton subsets with
non-conflicting configurations. Two configs conflict if they have
identical state and rule stack contexts but different alternative
numbers: (s, i, ctx, _), (s, j, ctx, _) for i!=j.
Reduce these config subsets to the set of possible alternatives. You
can compute the alternative subsets in one go as follows:
A_s,ctx = {i | (s, i, ctx, _) exists in C, holding s and ctx fixed}
Or in pseudo-code:
for c in C:
map[c] U= c.alt # map hash/equals uses s and x, not alt and not pred
Then map.values is the set of A_s,ctx sets.
If |A_s,ctx|=1 then there is no conflict associated with s and ctx.
Reduce the subsets to singletons by choosing a minimum of each subset.
If the union of these alternatives sets is a singleton, then no amount
of more lookahead will help us. We will always pick that
alternative. If, however, there is more than one alternative, then we
are uncertain which alt to predict and must continue looking for
resolution. We may or may not discover an ambiguity in the future,
even if there are no conflicting subsets this round.
The biggest sin is to terminate early because it means we've made a
decision but were uncertain as to the eventual outcome. We haven't
used enough lookahead. On the other hand, announcing a conflict too
late is no big deal; you will still have the conflict. It's just
inefficient. It might even look all the way to the end of the file.
Semantic predicates for full LL aren't involved in this decision
because the predicates are evaluated during start state computation.
This set of configurations was derived from the initial subset with
configurations holding false predicate stripped out.
CONFLICTING CONFIGS
Two configurations, (s, i, x) and (s, j, x'), conflict when i!=j but
x = x'. Because we merge all (s, i, _) configurations together, that
means that there are at most n configurations associated with state s
for n possible alternatives in the decision. The merged stacks
complicate the comparison of config contexts, x and x'. Sam checks to
see if one is a subset of the other by calling merge and checking to
see if the merged result is either x or x'. If the x associated with
lowest alternative i is the superset, then i is the only possible
prediction since the others resolve to min i as well. If, however, x
is associated with j>i then at least one stack configuration for j is
not in conflict with alt i. The algorithm should keep going, looking
for more lookahead due to the uncertainty.
For simplicity, I'm doing an equality check between x and x' that lets
the algorithm continue to consume lookahead longer than necessary.
The reason I like the equality is of course the simplicity but also
because that is the test you need to detect the alternatives that are
actually in conflict.
CONTINUE/STOP RULE
Continue if union of resolved alt sets from nonconflicting and
conflicting alt subsets has more than one alt. We are uncertain about
which alternative to predict.
The complete set of alternatives, [i for (_,i,_)], tells us
which alternatives are still in the running for the amount of input
we've consumed at this point. The conflicting sets let us strip
away configurations that won't lead to more states (because we
resolve conflicts to the configuration with a minimum alternate for
given conflicting set.)
CASES:
* no conflicts & > 1 alt in set => continue
* (s, 1, x), (s, 2, x), (s, 3, z)
(s', 1, y), (s', 2, y)
yields nonconflicting set {3} U conflicting sets min({1,2}) U min({1,2}) = {1,3}
=> continue
* (s, 1, x), (s, 2, x),
(s', 1, y), (s', 2, y)
(s'', 1, z)
yields nonconflicting set {1} U conflicting sets min({1,2}) U min({1,2}) = {1}
=> stop and predict 1
* (s, 1, x), (s, 2, x),
(s', 1, y), (s', 2, y)
yields conflicting, reduced sets {1} U {1} = {1}
=> stop and predict 1, can announce ambiguity {1,2}
* (s, 1, x), (s, 2, x)
(s', 2, y), (s', 3, y)
yields conflicting, reduced sets {1} U {2} = {1,2}
=> continue
* (s, 1, x), (s, 2, x)
(s', 3, y), (s', 4, y)
yields conflicting, reduced sets {1} U {3} = {1,3}
=> continue
EXACT AMBIGUITY DETECTION
If all states report the same conflicting alt set, then we know we
have the real ambiguity set:
|A_i|>1 and A_i = A_j for all i, j.
In other words, we continue examining lookahead until all A_i have
more than one alt and all A_i are the same. If A={{1,2}, {1,3}}, then
regular LL prediction would terminate because the resolved set is
{1}. To determine what the real ambiguity is, we have to know whether
the ambiguity is between one and two or one and three so we keep
going. We can only stop prediction when we need exact ambiguity
detection when the sets look like A={{1,2}} or {{1,2},{1,2}} etc...
*/
public static int resolvesToJustOneViableAlt(Collection<BitSet> altsets) {
return getSingleViableAlt(altsets);
}
public static boolean allSubsetsConflict(Collection<BitSet> altsets) {
return !hasNonConflictingAltSet(altsets);
}
/** return (there exists len(A_i)==1 for some A_i in altsets A) */
public static boolean hasNonConflictingAltSet(Collection<BitSet> altsets) {
for (BitSet alts : altsets) {
if ( alts.cardinality()==1 ) {
return true;
}
}
return false;
}
/** return (there exists len(A_i)>1 for some A_i in altsets A) */
public static boolean hasConflictingAltSet(Collection<BitSet> altsets) {
for (BitSet alts : altsets) {
if ( alts.cardinality()>1 ) {
return true;
}
}
return false;
}
public static boolean allSubsetsEqual(Collection<BitSet> altsets) {
Iterator<BitSet> it = altsets.iterator();
BitSet first = it.next();
while ( it.hasNext() ) {
BitSet next = it.next();
if ( !next.equals(first) ) return false;
}
return true;
}
public static int getUniqueAlt(Collection<BitSet> altsets) {
BitSet all = getAlts(altsets);
if ( all.cardinality()==1 ) return all.nextSetBit(0);
return ATN.INVALID_ALT_NUMBER;
}
public static BitSet getAlts(Collection<BitSet> altsets) {
BitSet all = new BitSet();
for (BitSet alts : altsets) {
all.or(alts);
}
return all;
}
/**
* This function gets the conflicting alt subsets from a configuration set.
* for c in configs:
* map[c] U= c.alt # map hash/equals uses s and x, not alt and not pred
*/
public static Collection<BitSet> getConflictingAltSubsets(ATNConfigSet configs) {
AltAndContextMap configToAlts = new AltAndContextMap();
for (ATNConfig c : configs) {
BitSet alts = configToAlts.get(c);
if ( alts==null ) {
alts = new BitSet();
configToAlts.put(c, alts);
}
alts.set(c.alt);
}
return configToAlts.values();
}
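// Hypothetical example: configs {(s,1,x), (s,2,x), (s',2,y)} map to
// {(s,_,x): {1,2}, (s',_,y): {2}}, so the returned subsets are {{1,2},{2}}.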
/** Get a map from state to alt subset from a configuration set.
* for c in configs:
* map[c.state] U= c.alt
*/
public static Map<ATNState, BitSet> getStateToAltMap(ATNConfigSet configs) {
Map<ATNState, BitSet> m = new HashMap<ATNState, BitSet>();
for (ATNConfig c : configs) {
BitSet alts = m.get(c.state);
if ( alts==null ) {
alts = new BitSet();
m.put(c.state, alts);
}
alts.set(c.alt);
}
return m;
}
public static boolean hasStateAssociatedWithOneAlt(ATNConfigSet configs) {
Map<ATNState, BitSet> x = getStateToAltMap(configs);
for (BitSet alts : x.values()) {
if ( alts.cardinality()==1 ) return true;
}
return false;
}
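// Example: altsets {{1,2},{1,3}} have minimums {1}, so alt 1 is the single
// viable alt; altsets {{1,2},{3,4}} have minimums {1,3}, so there is no
// single viable alt and ATN.INVALID_ALT_NUMBER is returned.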
public static int getSingleViableAlt(Collection<BitSet> altsets) {
BitSet viableAlts = new BitSet();
for (BitSet alts : altsets) {
int minAlt = alts.nextSetBit(0);
viableAlts.set(minAlt);
if ( viableAlts.cardinality()>1 ) { // more than 1 viable alt
return ATN.INVALID_ALT_NUMBER;
}
}
return viableAlts.nextSetBit(0);
}
}

View File

@ -29,10 +29,10 @@
package org.antlr.v4.runtime.atn;
import org.antlr.v4.runtime.misc.NotNull;
import org.antlr.v4.runtime.misc.IntervalSet;
import org.antlr.v4.runtime.misc.NotNull;
public class RangeTransition extends Transition {
public final class RangeTransition extends Transition {
public final int from;
public final int to;
@ -42,10 +42,20 @@ public class RangeTransition extends Transition {
this.to = to;
}
@Override
public int getSerializationType() {
return RANGE;
}
@Override
@NotNull
public IntervalSet label() { return IntervalSet.of(from, to); }
@Override
public boolean matches(int symbol, int minVocabSymbol, int maxVocabSymbol) {
return symbol >= from && symbol <= to;
}
@Override
@NotNull
public String toString() {

View File

@ -32,7 +32,7 @@ package org.antlr.v4.runtime.atn;
import org.antlr.v4.runtime.misc.NotNull;
/** */
public class RuleTransition extends Transition {
public final class RuleTransition extends Transition {
/** Ptr to the rule definition object for this rule ref */
public final int ruleIndex; // no Rule object at runtime
@ -49,6 +49,16 @@ public class RuleTransition extends Transition {
this.followState = followState;
}
@Override
public int getSerializationType() {
return RULE;
}
@Override
public boolean isEpsilon() { return true; }
@Override
public boolean matches(int symbol, int minVocabSymbol, int maxVocabSymbol) {
return false;
}
}

View File

@ -64,8 +64,6 @@ public abstract class SemanticContext {
*/
public abstract boolean eval(Recognizer<?,?> parser, RuleContext outerContext);
public SemanticContext optimize() { return this; }
public static class Predicate extends SemanticContext {
public final int ruleIndex;
public final int predIndex;
@ -125,7 +123,7 @@ public abstract class SemanticContext {
}
@Override
public boolean equals(@NotNull Object obj) {
public boolean equals(Object obj) {
if ( this==obj ) return true;
if ( !(obj instanceof AND) ) return false;
AND other = (AND)obj;
@ -162,7 +160,7 @@ public abstract class SemanticContext {
}
@Override
public boolean equals(@NotNull Object obj) {
public boolean equals(Object obj) {
if ( this==obj ) return true;
if ( !(obj instanceof OR) ) return false;
OR other = (OR)obj;

View File

@ -29,10 +29,10 @@
package org.antlr.v4.runtime.atn;
import org.antlr.v4.runtime.misc.NotNull;
import org.antlr.v4.runtime.misc.Nullable;
import org.antlr.v4.runtime.Token;
import org.antlr.v4.runtime.misc.IntervalSet;
import org.antlr.v4.runtime.misc.NotNull;
import org.antlr.v4.runtime.misc.Nullable;
/** A transition containing a set of values */
public class SetTransition extends Transition {
@ -46,10 +46,20 @@ public class SetTransition extends Transition {
this.set = set;
}
@Override
public int getSerializationType() {
return SET;
}
@Override
@NotNull
public IntervalSet label() { return set; }
@Override
public boolean matches(int symbol, int minVocabSymbol, int maxVocabSymbol) {
return set.contains(symbol);
}
@Override
@NotNull
public String toString() {

View File

@ -0,0 +1,89 @@
package org.antlr.v4.runtime.atn;
import org.antlr.v4.runtime.misc.DoubleKeyMap;
import java.util.Iterator;
public class SingletonPredictionContext extends PredictionContext {
public final PredictionContext parent;
public final int returnState;
SingletonPredictionContext(PredictionContext parent, int returnState) {
super(calculateHashCode(parent!=null ? 31 ^ parent.hashCode() : 1,
31 ^ returnState));
assert returnState!=ATNState.INVALID_STATE_NUMBER;
this.parent = parent;
this.returnState = returnState;
}
public static SingletonPredictionContext create(PredictionContext parent, int returnState) {
if ( returnState == EMPTY_RETURN_STATE && parent == null ) {
// someone can pass in the bits of an array ctx that mean $
return EMPTY;
}
return new SingletonPredictionContext(parent, returnState);
}
@Override
public Iterator<SingletonPredictionContext> iterator() {
final SingletonPredictionContext self = this;
return new Iterator<SingletonPredictionContext>() {
int i = 0;
@Override
public boolean hasNext() { return i==0; }
@Override
public SingletonPredictionContext next() { i++; return self; }
@Override
public void remove() { throw new UnsupportedOperationException(); }
};
}
@Override
public int size() {
return 1;
}
@Override
public PredictionContext getParent(int index) {
assert index == 0;
return parent;
}
@Override
public int getReturnState(int index) {
assert index == 0;
return returnState;
}
@Override
public boolean equals(Object o) {
if (this == o) {
return true;
}
else if ( !(o instanceof SingletonPredictionContext) ) {
return false;
}
if ( this.hashCode() != o.hashCode() ) {
return false; // can't be same if hash is different
}
SingletonPredictionContext s = (SingletonPredictionContext)o;
return returnState == s.returnState &&
(parent==null ? s.parent==null : parent.equals(s.parent));
}
@Override
public String toString() {
String up = parent!=null ? parent.toString() : "";
if ( up.length()==0 ) {
if ( returnState == EMPTY_RETURN_STATE ) {
return "$";
}
return String.valueOf(returnState);
}
return String.valueOf(returnState)+" "+up;
}
}

View File

@ -30,4 +30,7 @@
package org.antlr.v4.runtime.atn;
public class StarLoopbackState extends ATNState {
public final StarLoopEntryState getLoopEntryState() {
return (StarLoopEntryState)transition(0).target;
}
}

View File

@ -30,5 +30,5 @@
package org.antlr.v4.runtime.atn;
/** The Tokens rule start state linking to each lexer rule start state */
public class TokensStartState extends BlockStartState {
public class TokensStartState extends DecisionState {
}

View File

@ -33,7 +33,11 @@ import org.antlr.v4.runtime.misc.IntervalSet;
import org.antlr.v4.runtime.misc.NotNull;
import org.antlr.v4.runtime.misc.Nullable;
import java.util.*;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
/** An ATN transition between any two ATN states. Subclasses define
* atom, set, epsilon, action, predicate, rule transitions.
@ -91,13 +95,21 @@ public abstract class Transition {
@NotNull
public ATNState target;
protected Transition(@NotNull ATNState target) { this.target = target; }
protected Transition(@NotNull ATNState target) {
if (target == null) {
throw new NullPointerException("target cannot be null.");
}
this.target = target;
}
public int getSerializationType() { return 0; }
public abstract int getSerializationType();
/** Are we epsilon, action, sempred? */
public boolean isEpsilon() { return false; }
@Nullable
public IntervalSet label() { return null; }
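// For example (assumed subclasses): AtomTransition matches exactly its token
// type, SetTransition consults its IntervalSet, while epsilon-like transitions
// (rule, action, predicate) never match a symbol and report isEpsilon() true.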
public abstract boolean matches(int symbol, int minVocabSymbol, int maxVocabSymbol);
}

View File

@ -31,9 +31,19 @@ package org.antlr.v4.runtime.atn;
import org.antlr.v4.runtime.misc.NotNull;
public class WildcardTransition extends Transition {
public final class WildcardTransition extends Transition {
public WildcardTransition(@NotNull ATNState target) { super(target); }
@Override
public int getSerializationType() {
return WILDCARD;
}
@Override
public boolean matches(int symbol, int minVocabSymbol, int maxVocabSymbol) {
return symbol >= minVocabSymbol && symbol <= maxVocabSymbol;
}
@Override
@NotNull
public String toString() {

View File

@ -29,11 +29,19 @@
package org.antlr.v4.runtime.dfa;
import org.antlr.v4.runtime.TokenStream;
import org.antlr.v4.runtime.atn.*;
import org.antlr.v4.runtime.atn.ATNState;
import org.antlr.v4.runtime.atn.DecisionState;
import org.antlr.v4.runtime.atn.ParserATNSimulator;
import org.antlr.v4.runtime.atn.Transition;
import org.antlr.v4.runtime.misc.NotNull;
import org.antlr.v4.runtime.misc.Nullable;
import java.util.*;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
public class DFA {
/** A set of all DFA states. Use Map so we can get old state back
@ -96,7 +104,7 @@ public class DFA {
List<Set<ATNState>> atnStates = new ArrayList<Set<ATNState>>();
int i = start;
for (DFAState D : dfaStates) {
Set<ATNState> fullSet = D.configset.getStates();
Set<ATNState> fullSet = D.configs.getStates();
Set<ATNState> statesInvolved = new HashSet<ATNState>();
for (ATNState astate : fullSet) {
Transition t = astate.transition(0);

Some files were not shown because too many files have changed in this diff