Merge branch 'master' into vs2017projects

This commit is contained in:
Nicolas 2017-12-15 19:57:07 +13:00
commit e7b6521431
56 changed files with 967 additions and 166 deletions

View File

@ -8,7 +8,7 @@
<parent>
<groupId>org.antlr</groupId>
<artifactId>antlr4-master</artifactId>
<version>4.7.1-SNAPSHOT</version>
<version>4.7.2-SNAPSHOT</version>
</parent>
<artifactId>antlr4-maven-plugin</artifactId>
<packaging>maven-plugin</packaging>

View File

@ -174,4 +174,7 @@ YYYY/MM/DD, github id, Full name, email
2017/11/02, jasonmoo, Jason Mooberry, jason.mooberry@gmail.com
2017/11/05, ajaypanyala, Ajay Panyala, ajay.panyala@gmail.com
2017/11/24, zqlu.cn, Zhiqiang Lu, zqlu.cn@gmail.com
2017/11/28, niccroad, Nicolas Croad, nic.croad@gmail.com
2017/12/01, DavidMoraisFerreira, David Morais Ferreira, david.moraisferreira@gmail.com
2017/12/01, SebastianLng, Sebastian Lang, sebastian.lang@outlook.com
2017/12/03, oranoran, Oran Epelbaum, oran / epelbaum me

View File

@ -0,0 +1,78 @@
# Case-Insensitive Lexing
In some languages, keywords are case-insensitive, meaning that `BeGiN` means the same thing as `begin` or `BEGIN`. ANTLR has two mechanisms to support building grammars for such languages:
1. Build lexical rules that match either upper or lower case.
* **Advantage**: no changes required to ANTLR; makes it clear in the grammar that the language is case-insensitive.
* **Disadvantage**: might have a small efficiency cost, and the grammar is more verbose and more of a hassle to write.
2. Build lexical rules that match keywords in all uppercase and then parse with a custom [character stream](https://github.com/antlr/antlr4/blob/master/runtime/Java/src/org/antlr/v4/runtime/CharStream.java) that presents every character to the lexer as uppercase (via the `LA()` method). Care must be taken not to alter the text of the stream itself, because characters within strings and comments should be unaffected. All we really want is to trick the lexer into thinking the input is all uppercase.
* **Advantage**: Could have a speed advantage depending on implementation; no change required to the grammar.
* **Disadvantage**: Requires that the case-insensitive stream and grammar are used correctly in conjunction with each other; makes all characters appear as uppercase (or lowercase) to the lexer, even though some grammars are case sensitive outside of keywords; and requires a new case-insensitive stream for each language target (Java, C#, C++, ...).
For the 4.7.1 release, we discussed both approaches in [detail](https://github.com/antlr/antlr4/pull/2046), and even possibly altering the ANTLR metalanguage to directly support case-insensitive lexing. We discussed including the case-insensitive streams in the runtime, but not all targets would be immediately supported. I decided to simply write documentation that clearly states how to handle this and include the appropriate snippets that people can cut-and-paste into their grammars.
## Case-insensitive grammars
As a prime example of a grammar that specifically describes case insensitive keywords, see the
[SQLite grammar](https://github.com/antlr/grammars-v4/blob/master/sqlite/SQLite.g4). To match a case insensitive keyword, there are rules such as
```
K_UPDATE : U P D A T E;
```
that matches `UpdaTE`, `upDATE`, etc. as the `update` keyword. This rule makes use of some generically useful fragment rules that you can cut-and-paste into your grammars:
```
fragment A : [aA]; // match either an 'a' or 'A'
fragment B : [bB];
fragment C : [cC];
fragment D : [dD];
fragment E : [eE];
fragment F : [fF];
fragment G : [gG];
fragment H : [hH];
fragment I : [iI];
fragment J : [jJ];
fragment K : [kK];
fragment L : [lL];
fragment M : [mM];
fragment N : [nN];
fragment O : [oO];
fragment P : [pP];
fragment Q : [qQ];
fragment R : [rR];
fragment S : [sS];
fragment T : [tT];
fragment U : [uU];
fragment V : [vV];
fragment W : [wW];
fragment X : [xX];
fragment Y : [yY];
fragment Z : [zZ];
```
No special streams are required to use this mechanism for case insensitivity.
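For example, with these fragments in the grammar, no wrapper stream is involved at all; a minimal sketch in Java (assuming `SQLiteLexer` is the lexer generated from the SQLite grammar linked above):
```java
CharStream input = CharStreams.fromString("UpdaTE t SET x = 1;");
SQLiteLexer lexer = new SQLiteLexer(input);       // plain stream, no wrapper
CommonTokenStream tokens = new CommonTokenStream(lexer);
tokens.fill();                                    // "UpdaTE" is matched by K_UPDATE
```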
## Custom character streams approach
The other approach is to use lexical rules that match either all uppercase or all lowercase, such as:
```
K_UPDATE : 'UPDATE';
```
Then, when creating the character stream to parse from, we need a custom class that overrides methods used by the lexer. Below you will find custom character streams for a number of the targets that you can copy into your projects, but here is how to use the streams in Java as an example:
```java
CharStream s = CharStreams.fromPath(Paths.get("test.sql"));
CaseChangingCharStream upper = new CaseChangingCharStream(s, true);
Lexer lexer = new SomeSQLLexer(upper);
```
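From there, the wrapped stream plugs into the usual pipeline unchanged, and `getText()` on the resulting tokens still returns the original mixed-case input. A sketch, where `SomeSQLParser` and its `parse` entry rule are placeholders for your generated parser:
```java
CommonTokenStream tokens = new CommonTokenStream(lexer);
SomeSQLParser parser = new SomeSQLParser(tokens);
ParseTree tree = parser.parse();
```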
Here are implementations of `CaseChangingCharStream` in various target languages:
* [Java](https://github.com/parrt/antlr4/blob/case-insensitivity-doc/doc/resources/CaseChangingCharStream.java)
* [JavaScript](https://github.com/parrt/antlr4/blob/case-insensitivity-doc/doc/resources/CaseInsensitiveInputStream.js)
* [Go](https://github.com/parrt/antlr4/blob/case-insensitivity-doc/doc/resources/case_changing_stream.go)
* [C#](https://github.com/parrt/antlr4/blob/case-insensitivity-doc/doc/resources/CaseChangingCharStream.cs)

View File

@ -6,7 +6,7 @@ Hi and welcome to the version 4 release of ANTLR! It's named after the fearless
ANTLR is really two things: a tool that translates your grammar to a parser/lexer in Java (or other target language) and the runtime needed by the generated parsers/lexers. Even if you are using the ANTLR Intellij plug-in or ANTLRWorks to run the ANTLR tool, the generated code will still need the runtime library.
The first thing you should do is probably download and install a development tool plug-in. Even if you only use such tools for editing, they are great. Then, follow the instructions below to get the runtime environment available to your system to run generated parsers/lexers. In what follows, I talk about antlr-4.7-complete.jar, which has the tool and the runtime and any other support libraries (e.g., ANTLR v4 is written in v3).
The first thing you should do is probably download and install a development tool plug-in. Even if you only use such tools for editing, they are great. Then, follow the instructions below to get the runtime environment available to your system to run generated parsers/lexers. In what follows, I talk about antlr-4.7.1-complete.jar, which has the tool and the runtime and any other support libraries (e.g., ANTLR v4 is written in v3).
If you are going to integrate ANTLR into your existing build system using mvn, ant, or want to get ANTLR into your IDE such as eclipse or intellij, see Integrating ANTLR into Development Systems.
@ -16,21 +16,21 @@ If you are going to integrate ANTLR into your existing build system using mvn, a
1. Download
```
$ cd /usr/local/lib
$ curl -O http://www.antlr.org/download/antlr-4.7-complete.jar
$ curl -O http://www.antlr.org/download/antlr-4.7.1-complete.jar
```
Or just download in browser from website:
[http://www.antlr.org/download.html](http://www.antlr.org/download.html)
and put it somewhere rational like `/usr/local/lib`.
2. Add `antlr-4.7-complete.jar` to your `CLASSPATH`:
2. Add `antlr-4.7.1-complete.jar` to your `CLASSPATH`:
```
$ export CLASSPATH=".:/usr/local/lib/antlr-4.7-complete.jar:$CLASSPATH"
$ export CLASSPATH=".:/usr/local/lib/antlr-4.7.1-complete.jar:$CLASSPATH"
```
It's also a good idea to put this in your `.bash_profile` or whatever your startup script is.
3. Create aliases for the ANTLR Tool, and `TestRig`.
```
$ alias antlr4='java -Xmx500M -cp "/usr/local/lib/antlr-4.7-complete.jar:$CLASSPATH" org.antlr.v4.Tool'
$ alias antlr4='java -Xmx500M -cp "/usr/local/lib/antlr-4.7.1-complete.jar:$CLASSPATH" org.antlr.v4.Tool'
$ alias grun='java org.antlr.v4.gui.TestRig'
```
@ -45,7 +45,7 @@ Save to your directory for 3rd party Java libraries, say `C:\Javalib`
* Permanently: Using System Properties dialog > Environment variables > Create or append to `CLASSPATH` variable
* Temporarily, at command line:
```
SET CLASSPATH=.;C:\Javalib\antlr-4.7-complete.jar;%CLASSPATH%
SET CLASSPATH=.;C:\Javalib\antlr-4.7.1-complete.jar;%CLASSPATH%
```
3. Create short convenient commands for the ANTLR Tool, and TestRig, using batch files or doskey commands:
* Batch files (in directory in system PATH) antlr4.bat and grun.bat
@ -67,7 +67,7 @@ Either launch org.antlr.v4.Tool directly:
```
$ java org.antlr.v4.Tool
ANTLR Parser Generator Version 4.7
ANTLR Parser Generator Version 4.7.1
-o ___ specify output directory where all output is generated
-lib ___ specify location of .tokens files
...
@ -76,8 +76,8 @@ ANTLR Parser Generator Version 4.7
or use -jar option on java:
```
$ java -jar /usr/local/lib/antlr-4.7-complete.jar
ANTLR Parser Generator Version 4.7
$ java -jar /usr/local/lib/antlr-4.7.1-complete.jar
ANTLR Parser Generator Version 4.7.1
-o ___ specify output directory where all output is generated
-lib ___ specify location of .tokens files
...

View File

@ -8,7 +8,7 @@ Notes:
<li>Copyright © 2012, The Pragmatic Bookshelf. Pragmatic Bookshelf grants a nonexclusive, irrevocable, royalty-free, worldwide license to reproduce, distribute, prepare derivative works, and otherwise use this contribution as part of the ANTLR project and associated documentation.</li>
<li>This text was copied with permission from the <a href=http://pragprog.com/book/tpantlr2/the-definitive-antlr-4-reference>The Definitive ANTLR 4 Reference</a>, though it is being morphed over time as the tool changes.</li>
<li>Much of this text was copied with permission from the <a href=http://pragprog.com/book/tpantlr2/the-definitive-antlr-4-reference>The Definitive ANTLR 4 Reference</a>, though it is being morphed over time as the tool changes.</li>
</ul>
Links in the documentation refer to various sections of the book but have been redirected to the general book page on the publisher's site. There are two excerpts on the publisher's website that might be useful to you without having to purchase the book: [Let's get Meta](http://media.pragprog.com/titles/tpantlr2/picture.pdf) and [Building a Translator with a Listener](http://media.pragprog.com/titles/tpantlr2/listener.pdf). You should also consider reading the following books (the vid describes the reference book):
@ -55,6 +55,8 @@ This documentation is a reference and summarizes grammar syntax and the key sema
* [Parsing binary streams](parsing-binary-files.md)
* [Case-Insensitive Lexing](case-insensitive-lexing.md)
* [Parser and lexer interpreters](interpreters.md)
* [Resources](resources.md)

View File

@ -159,6 +159,28 @@ With JDK 1.7 (not 6 or 8), do this:
mvn release:prepare -Darguments="-DskipTests"
```
Hm...per https://github.com/keybase/keybase-issues/issues/1712 we need this to make gpg work:
```bash
export GPG_TTY=$(tty)
```
Side note to set jdk 1.7 on os x:
```bash
alias java='/Library/Java/JavaVirtualMachines/jdk1.7.0_21.jdk/Contents/Home/bin/java'
alias javac='/Library/Java/JavaVirtualMachines/jdk1.7.0_21.jdk/Contents/Home/bin/javac'
alias javadoc='/Library/Java/JavaVirtualMachines/jdk1.7.0_21.jdk/Contents/Home/bin/javadoc'
alias jar='/Library/Java/JavaVirtualMachines/jdk1.7.0_21.jdk/Contents/Home/bin/jar'
```
You should see 0x33 in generated .class files after 0xCAFEBABE; see [Java SE 7 = 51 (0x33 hex)](https://en.wikipedia.org/wiki/Java_class_file):
```bash
beast:/tmp/org/antlr/v4 $ od -h Tool.class |head -1
0000000 feca beba 0000 3300 fa04 0207 0ab8 0100
```
It will start out by asking you the version number:
```
@ -244,7 +266,9 @@ popd
### CSharp
*Publishing to Nuget from Windows*
AppVeyor now creates the [build artifacts](https://ci.appveyor.com/project/parrt/antlr4/build/artifacts). Go to [nuget](https://www.nuget.org/packages/manage/upload) to upload the `.nupkg`.
### Publishing to Nuget from Windows
**Install the pre-requisites**
@ -310,13 +334,12 @@ index-servers =
pypitest
[pypi]
repository: https://pypi.python.org/pypi
username: parrt
password: XXX
password: xxx
[pypitest]
repository: https://testpypi.python.org/pypi
username: parrt
password: xxx
```
Then run the usual python set up stuff:
@ -324,8 +347,7 @@ Then run the usual python set up stuff:
```bash
cd ~/antlr/code/antlr4/runtime/Python2
# assume you have ~/.pypirc set up
python setup.py register -r pypi
python setup.py sdist bdist_wininst upload -r pypi
python2 setup.py sdist upload
```
and do again for Python 3 target
@ -333,8 +355,7 @@ and do again for Python 3 target
```bash
cd ~/antlr/code/antlr4/runtime/Python3
# assume you have ~/.pypirc set up
python setup.py register -r pypi
python setup.py sdist bdist_wininst upload -r pypi
python3 setup.py sdist upload
```
There are links to the artifacts in [download.html](http://www.antlr.org/download.html) already.
@ -368,12 +389,12 @@ cd runtime/Cpp
cp antlr4-cpp-runtime-source.zip ~/antlr/sites/website-antlr4/download/antlr4-cpp-runtime-4.7-source.zip
```
On a Windows machine the build script checks if VS 2013 and/or VS 2015 are installed and builds binaries for each, if found. This script requires 7z to be installed (http://7-zip.org).
On a Windows machine the build script checks if VS 2013 and/or VS 2015 are installed and builds binaries for each, if found. This script requires 7z to be installed (http://7-zip.org; then do `set PATH=%PATH%;C:\Program Files\7-Zip\` from DOS, not PowerShell).
```bash
cd runtime/Cpp
deploy-windows.cmd
cp antlr4-cpp-runtime-vs2015.zip ~/antlr/sites/website-antlr4/download/antlr4-cpp-runtime-4.7-vs2015.zip
cp runtime\bin\vs-2015\x64\Release DLL\antlr4-cpp-runtime-vs2015.zip ~/antlr/sites/website-antlr4/download/antlr4-cpp-runtime-4.7-vs2015.zip
```
Move target to website (**_rename to a specific ANTLR version first if needed_**):

View File

@ -0,0 +1,105 @@
/* Copyright (c) 2012-2017 The ANTLR Project. All rights reserved.
* Use of this file is governed by the BSD 3-clause license that
* can be found in the LICENSE.txt file in the project root.
*/
using System;
using Antlr4.Runtime.Misc;
namespace Antlr4.Runtime
{
/// <summary>
/// This class supports case-insensitive lexing by wrapping an existing
/// <see cref="ICharStream"/> and forcing the lexer to see either upper or
/// lowercase characters. Grammar literals should then be either upper or
/// lower case such as 'BEGIN' or 'begin'. The text of the character
/// stream is unaffected. Example: input 'BeGiN' would match lexer rule
/// 'BEGIN' if constructor parameter upper=true but getText() would return
/// 'BeGiN'.
/// </summary>
public class CaseChangingCharStream : ICharStream
{
private ICharStream stream;
private bool upper;
/// <summary>
/// Constructs a new CaseChangingCharStream wrapping the given <paramref name="stream"/> forcing
/// all characters to upper case or lower case.
/// </summary>
/// <param name="stream">The stream to wrap.</param>
/// <param name="upper">If true force each symbol to upper case, otherwise force to lower.</param>
public CaseChangingCharStream(ICharStream stream, bool upper)
{
this.stream = stream;
this.upper = upper;
}
public int Index
{
get
{
return stream.Index;
}
}
public int Size
{
get
{
return stream.Size;
}
}
public string SourceName
{
get
{
return stream.SourceName;
}
}
public void Consume()
{
stream.Consume();
}
[return: NotNull]
public string GetText(Interval interval)
{
return stream.GetText(interval);
}
public int LA(int i)
{
int c = stream.LA(i);
if (c <= 0)
{
return c;
}
char o = (char)c;
if (upper)
{
return (int)char.ToUpperInvariant(o);
}
return (int)char.ToLowerInvariant(o);
}
public int Mark()
{
return stream.Mark();
}
public void Release(int marker)
{
stream.Release(marker);
}
public void Seek(int index)
{
stream.Seek(index);
}
}
}

View File

@ -0,0 +1,81 @@
package org.antlr.v4.runtime;
import org.antlr.v4.runtime.misc.Interval;
/**
* This class supports case-insensitive lexing by wrapping an existing
* {@link CharStream} and forcing the lexer to see either upper or
* lowercase characters. Grammar literals should then be either upper or
* lower case such as 'BEGIN' or 'begin'. The text of the character
* stream is unaffected. Example: input 'BeGiN' would match lexer rule
* 'BEGIN' if constructor parameter upper=true but getText() would return
* 'BeGiN'.
*/
public class CaseChangingCharStream implements CharStream {
final CharStream stream;
final boolean upper;
/**
* Constructs a new CaseChangingCharStream wrapping the given {@link CharStream} forcing
* all characters to upper case or lower case.
* @param stream The stream to wrap.
* @param upper If true force each symbol to upper case, otherwise force to lower.
*/
public CaseChangingCharStream(CharStream stream, boolean upper) {
this.stream = stream;
this.upper = upper;
}
@Override
public String getText(Interval interval) {
return stream.getText(interval);
}
@Override
public void consume() {
stream.consume();
}
@Override
public int LA(int i) {
int c = stream.LA(i);
if (c <= 0) {
return c;
}
if (upper) {
return Character.toUpperCase(c);
}
return Character.toLowerCase(c);
}
@Override
public int mark() {
return stream.mark();
}
@Override
public void release(int marker) {
stream.release(marker);
}
@Override
public int index() {
return stream.index();
}
@Override
public void seek(int index) {
stream.seek(index);
}
@Override
public int size() {
return stream.size();
}
@Override
public String getSourceName() {
return stream.getSourceName();
}
}

View File

@ -0,0 +1,54 @@
//
/* Copyright (c) 2012-2017 The ANTLR Project. All rights reserved.
* Use of this file is governed by the BSD 3-clause license that
* can be found in the LICENSE.txt file in the project root.
*/
//
function CaseInsensitiveInputStream(stream, upper) {
this._stream = stream;
this._case = upper ? String.prototype.toUpperCase : String.prototype.toLowerCase;
return this;
}
CaseInsensitiveInputStream.prototype.LA = function (offset) {
var c = this._stream.LA(offset);
if (c <= 0) {
return c;
}
return this._case.call(String.fromCodePoint(c)).codePointAt(0);
};
CaseInsensitiveInputStream.prototype.reset = function() {
return this._stream.reset();
};
CaseInsensitiveInputStream.prototype.consume = function() {
return this._stream.consume();
};
CaseInsensitiveInputStream.prototype.LT = function(offset) {
return this._stream.LT(offset);
};
CaseInsensitiveInputStream.prototype.mark = function() {
return this._stream.mark();
};
CaseInsensitiveInputStream.prototype.release = function(marker) {
return this._stream.release(marker);
};
CaseInsensitiveInputStream.prototype.seek = function(_index) {
return this._stream.seek(_index);
};
CaseInsensitiveInputStream.prototype.getText = function(start, stop) {
return this._stream.getText(start, stop);
};
CaseInsensitiveInputStream.prototype.toString = function() {
return this._stream.toString();
};
exports.CaseInsensitiveInputStream = CaseInsensitiveInputStream;

View File

@ -0,0 +1,37 @@
package antlr
import (
"unicode"
)
// CaseChangingStream wraps an existing CharStream and upper-cases or
// lower-cases the input before it is tokenized.
type CaseChangingStream struct {
CharStream
upper bool
}
// NewCaseChangingStream returns a new CaseChangingStream that forces
// all tokens read from the underlying stream to be either upper case
// or lower case based on the upper argument.
func NewCaseChangingStream(in CharStream, upper bool) *CaseChangingStream {
return &CaseChangingStream{
in, upper,
}
}
// LA gets the value of the symbol at offset from the current position
// from the underlying CharStream and converts it to either upper case
// or lower case.
func (is *CaseChangingStream) LA(offset int) int {
in := is.CharStream.LA(offset)
if in < 0 {
// Such as antlr.TokenEOF which is -1
return in
}
if is.upper {
return int(unicode.ToUpper(rune(in)))
}
return int(unicode.ToLower(rune(in)))
}

View File

@ -4,7 +4,7 @@ If you invoke the ANTLR tool without command line arguments, you'll get a help
```bash
$ antlr4
ANTLR Parser Generator Version 4.5
ANTLR Parser Generator Version 4.7.1
-o ___ specify output directory where all output is generated
-lib ___ specify location of grammars, tokens files
-atn generate rule augmented transition network diagrams
@ -23,6 +23,7 @@ ANTLR Parser Generator Version 4.5
-XdbgSTWait wait for STViz to close before continuing
-Xforce-atn use the ATN simulator for all predictions
-Xlog dump lots of logging info to antlr-timestamp.log
-Xexact-output-dir all output goes into -o dir regardless of paths/package
```
Here are more details on the options:
@ -159,3 +160,175 @@ This option creates a log file containing lots of information messages from ANTL
$ antlr4 -Xlog T.g4
wrote ./antlr-2012-09-06-17.56.19.log
```
## `-Xexact-output-dir`
(*See the [discussion](https://github.com/antlr/antlr4/pull/2065)*).
All output goes into `-o` dir regardless of paths/package.
* Output `-o` directory specifier is the exact directory containing the output. Previously it would include the relative path specified on the grammar itself for the purposes of packages.
**new**: `-o /tmp subdir/T.g4` => `/tmp/subdir/T.java`
**old**: `-o /tmp subdir/T.g4` => `/tmp/T.java`
* Previously we looked for the tokens vocab file in the `-lib` dir or in the output dir. **New**: also look in the directory containing the grammar, particularly if it is specified with a path.
### Example for the output directory (4.7)
Here is the existing 4.7 functionality.
(For these examples, assume a4.7 and a4.7.1 are aliases to the right version of ANTLR's `org.antlr.v4.Tool`.)
```bash
$ cd /tmp/parrt
$ tree
.
├── B.g4
└── src
└── pkg
└── A.g4
$ a4.7 -o /tmp/build src/pkg/A.g4
$ tree /tmp/build
/tmp/build/
└── src
└── pkg
├── A.tokens
├── ABaseListener.java
├── ALexer.java
├── ALexer.tokens
├── AListener.java
└── AParser.java
```
Now, let's build a grammar that sits in the current directory:
```bash
$ a4.7 -o /tmp/build B.g4
$ tree /tmp/build
/tmp/build
├── B.tokens
├── BBaseListener.java
├── BLexer.java
├── BLexer.tokens
├── BListener.java
├── BParser.java
└── src
└── pkg
├── A.tokens
├── ABaseListener.java
├── ALexer.java
├── ALexer.tokens
├── AListener.java
└── AParser.java
```
Finally, if we didn't specify the output directory, the tool paid attention to the relative path specified on the input grammar:
```bash
$ a4.7 src/pkg/A.g4
$ tree
.
├── B.g4
└── src
└── pkg
├── A.g4
├── A.tokens
├── ABaseListener.java
├── ALexer.java
├── ALexer.tokens
├── AListener.java
└── AParser.java
```
### Example for the output directory (4.7.1 with -Xexact-output-dir)
Now, the output directory is the exact directory where output is generated, regardless of relative paths on the grammar:
```bash
$ cd /tmp/parrt
$ a4.7.1 -Xexact-output-dir -o /tmp/build src/pkg/A.g4
$ tree /tmp/build
/tmp/build
├── A.tokens
├── ABaseListener.java
├── ALexer.java
├── ALexer.tokens
├── AListener.java
└── AParser.java
```
If you use the package option, it still does not change where the output is generated when you use `-o`:
```bash
$ a4.7.1 -Xexact-output-dir -package pkg -o /tmp/build src/pkg/A.g4
$ tree /tmp/build
/tmp/build
├── A.tokens
├── ABaseListener.java
├── ALexer.java
├── ALexer.tokens
├── AListener.java
└── AParser.java
```
4.7.1 does however add the package specification into the generated files:
```bash
$ grep package /tmp/build/A*.java
/tmp/build/ABaseListener.java:package pkg;
/tmp/build/ALexer.java:package pkg;
/tmp/build/AListener.java:package pkg;
/tmp/build/AParser.java:package pkg;
```
Compare this to 4.7:
```bash
$ a4.7 -package pkg -o /tmp/build src/pkg/A.g4
beast:/tmp/parrt $ tree /tmp/build
/tmp/build
└── src
└── pkg
├── A.tokens
├── ABaseListener.java
├── ALexer.java
├── ALexer.tokens
├── AListener.java
└── AParser.java
```
### Example of where it looks for tokens vocab
In 4.7, we got an error for an obvious case that should work:
```bash
$ cd /tmp/parrt
$ tree
.
└── src
└── pkg
├── L.g4
└── P.g4
$ a4.7 -o /tmp/build src/pkg/*.g4
error(160): P.g4:2:21: cannot find tokens file /tmp/build/L.tokens
warning(125): P.g4:3:4: implicit definition of token A in parser
```
In 4.7.1 it looks in the directory containing the grammars as well:
```bash
$ a4.7.1 -o /tmp/build src/pkg/*.g4
$ tree /tmp/build
/tmp/build
├── L.java
├── L.tokens
├── P.java
├── P.tokens
├── PBaseListener.java
├── PListener.java
└── src
└── pkg
├── L.java
└── L.tokens
```
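If you drive code generation from a build script or test harness rather than the shell, the same flags can be passed to the tool's Java entry point; a minimal sketch (class name and paths are examples):
```java
import org.antlr.v4.Tool;

public class Generate {
    public static void main(String[] args) {
        // equivalent to: antlr4 -Xexact-output-dir -package pkg -o /tmp/build src/pkg/A.g4
        Tool antlr = new Tool(new String[] {
            "-Xexact-output-dir", "-package", "pkg", "-o", "/tmp/build", "src/pkg/A.g4"
        });
        antlr.processGrammarsOnCommandLine();
    }
}
```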

View File

@ -13,7 +13,7 @@
</parent>
<groupId>org.antlr</groupId>
<artifactId>antlr4-master</artifactId>
<version>4.7.1-SNAPSHOT</version>
<version>4.7.2-SNAPSHOT</version>
<packaging>pom</packaging>
<name>ANTLR 4</name>

View File

@ -9,7 +9,7 @@
<parent>
<groupId>org.antlr</groupId>
<artifactId>antlr4-master</artifactId>
<version>4.7.1-SNAPSHOT</version>
<version>4.7.2-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>
<artifactId>antlr4-runtime-test-annotations</artifactId>

View File

@ -10,7 +10,7 @@
<parent>
<groupId>org.antlr</groupId>
<artifactId>antlr4-master</artifactId>
<version>4.7.1-SNAPSHOT</version>
<version>4.7.2-SNAPSHOT</version>
</parent>
<artifactId>antlr4-runtime-testsuite</artifactId>
<name>ANTLR 4 Runtime Tests (2nd generation)</name>

View File

@ -9,7 +9,7 @@
<parent>
<groupId>org.antlr</groupId>
<artifactId>antlr4-master</artifactId>
<version>4.7.1-SNAPSHOT</version>
<version>4.7.2-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>
<artifactId>antlr4-runtime-test-annotation-processors</artifactId>

View File

@ -43,7 +43,7 @@ See the docs and the book to learn about writing lexer and parser grammars.
### Step 4: Generate the C# code
This can be done either from the cmd line, or by adding a custom pre-build command in your project.
At a minimum, the command line should look as follows: ``java -jar antlr4-4.7.jar -Dlanguage=CSharp grammar.g4``
At a minimum, the command line should look as follows: ``java -jar antlr4-4.7.1.jar -Dlanguage=CSharp grammar.g4``
This will generate the files, which you can then integrate in your project.
This is just a quick start. The tool has many useful options to control generation; please refer to its documentation.

View File

@ -42,8 +42,8 @@ using System.Runtime.InteropServices;
// You can specify all the values or you can default the Build and Revision Numbers
// by using the '*' as shown below:
// [assembly: AssemblyVersion("1.0.*")]
[assembly: AssemblyVersion("4.7")]
[assembly: AssemblyVersion("4.7.1")]
#if !COMPACT
[assembly: AssemblyFileVersion("4.7")]
[assembly: AssemblyInformationalVersion("4.7")]
[assembly: AssemblyFileVersion("4.7.1")]
[assembly: AssemblyInformationalVersion("4.7.1")]
#endif

View File

@ -1 +1 @@
4.7
4.7.1

View File

@ -66,7 +66,7 @@ set(ANTLR4CPP_EXTERNAL_ROOT ${CMAKE_BINARY_DIR}/externals/antlr4cpp)
# external repository
# GIT_REPOSITORY https://github.com/antlr/antlr4.git
set(ANTLR4CPP_EXTERNAL_REPO "https://github.com/antlr/antlr4.git")
set(ANTLR4CPP_EXTERNAL_TAG "4.7")
set(ANTLR4CPP_EXTERNAL_TAG "4.7.1")
if(NOT EXISTS "${ANTLR4CPP_JAR_LOCATION}")
message(FATAL_ERROR "Unable to find antlr tool. ANTLR4CPP_JAR_LOCATION:${ANTLR4CPP_JAR_LOCATION}")

View File

@ -6,7 +6,7 @@
:: Download the ANTLR jar and place it in the same folder as this script (or adjust the LOCATION var accordingly).
set LOCATION=antlr-4.7-complete.jar
set LOCATION=antlr-4.7.1-complete.jar
java -jar %LOCATION% -Dlanguage=Cpp -listener -visitor -o generated/ -package antlrcpptest TLexer.g4 TParser.g4
::java -jar %LOCATION% -Dlanguage=Cpp -listener -visitor -o generated/ -package antlrcpptest -XdbgST TLexer.g4 TParser.g4
::java -jar %LOCATION% -Dlanguage=Java -listener -visitor -o generated/ -package antlrcpptest TLexer.g4 TParser.g4

View File

@ -7,7 +7,7 @@
using namespace antlr4;
const std::string RuntimeMetaData::VERSION = "4.7";
const std::string RuntimeMetaData::VERSION = "4.7.1";
std::string RuntimeMetaData::getRuntimeVersion() {
return VERSION;

View File

@ -231,10 +231,10 @@ func (c *CommonTokenStream) previousTokenOnChannel(i, channel int) int {
return i
}
// getHiddenTokensToRight collects all tokens on a specified channel to the
// GetHiddenTokensToRight collects all tokens on a specified channel to the
// right of the current token up until we see a token on DEFAULT_TOKEN_CHANNEL
// or EOF. If channel is -1, it finds any non-default channel token.
func (c *CommonTokenStream) getHiddenTokensToRight(tokenIndex, channel int) []Token {
func (c *CommonTokenStream) GetHiddenTokensToRight(tokenIndex, channel int) []Token {
c.lazyInit()
if tokenIndex < 0 || tokenIndex >= len(c.tokens) {
@ -256,10 +256,10 @@ func (c *CommonTokenStream) getHiddenTokensToRight(tokenIndex, channel int) []To
return c.filterForChannel(from, to, channel)
}
// getHiddenTokensToLeft collects all tokens on channel to the left of the
// GetHiddenTokensToLeft collects all tokens on channel to the left of the
// current token until we see a token on DEFAULT_TOKEN_CHANNEL. If channel is
// -1, it finds any non default channel token.
func (c *CommonTokenStream) getHiddenTokensToLeft(tokenIndex, channel int) []Token {
func (c *CommonTokenStream) GetHiddenTokensToLeft(tokenIndex, channel int) []Token {
c.lazyInit()
if tokenIndex < 0 || tokenIndex >= len(c.tokens) {

View File

@ -81,42 +81,42 @@ func TestCommonTokenStreamFetchOffChannel(t *testing.T) {
tokens := NewCommonTokenStream(lexEngine, TokenDefaultChannel)
tokens.Fill()
assert.Nil(tokens.getHiddenTokensToLeft(0, -1))
assert.Nil(tokens.getHiddenTokensToRight(0, -1))
assert.Nil(tokens.GetHiddenTokensToLeft(0, -1))
assert.Nil(tokens.GetHiddenTokensToRight(0, -1))
assert.Equal("[[@0,0:0=' ',<1>,channel=1,0:-1]]", tokensToString(tokens.getHiddenTokensToLeft(1, -1)))
assert.Equal("[[@2,0:0=' ',<1>,channel=1,0:-1]]", tokensToString(tokens.getHiddenTokensToRight(1, -1)))
assert.Equal("[[@0,0:0=' ',<1>,channel=1,0:-1]]", tokensToString(tokens.GetHiddenTokensToLeft(1, -1)))
assert.Equal("[[@2,0:0=' ',<1>,channel=1,0:-1]]", tokensToString(tokens.GetHiddenTokensToRight(1, -1)))
assert.Nil(tokens.getHiddenTokensToLeft(2, -1))
assert.Nil(tokens.getHiddenTokensToRight(2, -1))
assert.Nil(tokens.GetHiddenTokensToLeft(2, -1))
assert.Nil(tokens.GetHiddenTokensToRight(2, -1))
assert.Equal("[[@2,0:0=' ',<1>,channel=1,0:-1]]", tokensToString(tokens.getHiddenTokensToLeft(3, -1)))
assert.Nil(tokens.getHiddenTokensToRight(3, -1))
assert.Equal("[[@2,0:0=' ',<1>,channel=1,0:-1]]", tokensToString(tokens.GetHiddenTokensToLeft(3, -1)))
assert.Nil(tokens.GetHiddenTokensToRight(3, -1))
assert.Nil(tokens.getHiddenTokensToLeft(4, -1))
assert.Nil(tokens.GetHiddenTokensToLeft(4, -1))
assert.Equal("[[@5,0:0=' ',<1>,channel=1,0:-1], [@6,0:0=' ',<1>,channel=1,0:-1]]",
tokensToString(tokens.getHiddenTokensToRight(4, -1)))
tokensToString(tokens.GetHiddenTokensToRight(4, -1)))
assert.Nil(tokens.getHiddenTokensToLeft(5, -1))
assert.Nil(tokens.GetHiddenTokensToLeft(5, -1))
assert.Equal("[[@6,0:0=' ',<1>,channel=1,0:-1]]",
tokensToString(tokens.getHiddenTokensToRight(5, -1)))
tokensToString(tokens.GetHiddenTokensToRight(5, -1)))
assert.Equal("[[@5,0:0=' ',<1>,channel=1,0:-1]]",
tokensToString(tokens.getHiddenTokensToLeft(6, -1)))
assert.Nil(tokens.getHiddenTokensToRight(6, -1))
tokensToString(tokens.GetHiddenTokensToLeft(6, -1)))
assert.Nil(tokens.GetHiddenTokensToRight(6, -1))
assert.Equal("[[@5,0:0=' ',<1>,channel=1,0:-1], [@6,0:0=' ',<1>,channel=1,0:-1]]",
tokensToString(tokens.getHiddenTokensToLeft(7, -1)))
tokensToString(tokens.GetHiddenTokensToLeft(7, -1)))
assert.Equal("[[@8,0:0=' ',<1>,channel=1,0:-1], [@9,0:0='\\n',<1>,channel=1,0:-1]]",
tokensToString(tokens.getHiddenTokensToRight(7, -1)))
tokensToString(tokens.GetHiddenTokensToRight(7, -1)))
assert.Nil(tokens.getHiddenTokensToLeft(8, -1))
assert.Nil(tokens.GetHiddenTokensToLeft(8, -1))
assert.Equal("[[@9,0:0='\\n',<1>,channel=1,0:-1]]",
tokensToString(tokens.getHiddenTokensToRight(8, -1)))
tokensToString(tokens.GetHiddenTokensToRight(8, -1)))
assert.Equal("[[@8,0:0=' ',<1>,channel=1,0:-1]]",
tokensToString(tokens.getHiddenTokensToLeft(9, -1)))
assert.Nil(tokens.getHiddenTokensToRight(9, -1))
tokensToString(tokens.GetHiddenTokensToLeft(9, -1)))
assert.Nil(tokens.GetHiddenTokensToRight(9, -1))
}

View File

@ -49,7 +49,7 @@ var tokenTypeMapCache = make(map[string]int)
var ruleIndexMapCache = make(map[string]int)
func (b *BaseRecognizer) checkVersion(toolVersion string) {
runtimeVersion := "4.7"
runtimeVersion := "4.7.1"
if runtimeVersion != toolVersion {
fmt.Println("ANTLR runtime and generated code versions disagree: " + runtimeVersion + "!=" + toolVersion)
}

View File

@ -103,11 +103,11 @@ func TreesfindAllRuleNodes(t ParseTree, ruleIndex int) []ParseTree {
func TreesfindAllNodes(t ParseTree, index int, findTokens bool) []ParseTree {
nodes := make([]ParseTree, 0)
TreesFindAllNodes(t, index, findTokens, nodes)
treesFindAllNodes(t, index, findTokens, &nodes)
return nodes
}
func TreesFindAllNodes(t ParseTree, index int, findTokens bool, nodes []ParseTree) {
func treesFindAllNodes(t ParseTree, index int, findTokens bool, nodes *[]ParseTree) {
// check this node (the root) first
t2, ok := t.(TerminalNode)
@ -115,16 +115,16 @@ func TreesFindAllNodes(t ParseTree, index int, findTokens bool, nodes []ParseTre
if findTokens && ok {
if t2.GetSymbol().GetTokenType() == index {
nodes = append(nodes, t2)
*nodes = append(*nodes, t2)
}
} else if !findTokens && ok2 {
if t3.GetRuleIndex() == index {
nodes = append(nodes, t3)
*nodes = append(*nodes, t3)
}
}
// check children
for i := 0; i < t.GetChildCount(); i++ {
TreesFindAllNodes(t.GetChild(i).(ParseTree), index, findTokens, nodes)
treesFindAllNodes(t.GetChild(i).(ParseTree), index, findTokens, nodes)
}
}

View File

@ -353,6 +353,34 @@ func PrintArrayJavaStyle(sa []string) string {
return buffer.String()
}
// The following routines were lifted from bits.rotate* available in Go 1.9.
const uintSize = 32 << (^uint(0) >> 32 & 1) // 32 or 64
// rotateLeft returns the value of x rotated left by (k mod UintSize) bits.
// To rotate x right by k bits, call rotateLeft(x, -k).
func rotateLeft(x uint, k int) uint {
if uintSize == 32 {
return uint(rotateLeft32(uint32(x), k))
}
return uint(rotateLeft64(uint64(x), k))
}
// rotateLeft32 returns the value of x rotated left by (k mod 32) bits.
func rotateLeft32(x uint32, k int) uint32 {
const n = 32
s := uint(k) & (n - 1)
return x<<s | x>>(n-s)
}
// rotateLeft64 returns the value of x rotated left by (k mod 64) bits.
func rotateLeft64(x uint64, k int) uint64 {
const n = 64
s := uint(k) & (n - 1)
return x<<s | x>>(n-s)
}
// murmur hash
const (
c1_32 uint = 0xCC9E2D51
@ -367,11 +395,11 @@ func murmurInit(seed int) int {
func murmurUpdate(h1 int, k1 int) int {
var k1u uint
k1u = uint(k1) * c1_32
k1u = (k1u << 15) | (k1u >> 17) // rotl32(k1u, 15)
k1u = rotateLeft(k1u, 15)
k1u *= c2_32
var h1u = uint(h1) ^ k1u
h1u = (h1u << 13) | (h1u >> 19) // rotl32(h1u, 13)
h1u = rotateLeft(h1u, 13)
h1u = h1u*5 + 0xe6546b64
return int(h1u)
}
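These rotate helpers keep the Go port in step with the Java runtime's `org.antlr.v4.runtime.misc.MurmurHash`, which performs the same computation; for comparison, a sketch of the equivalent Java-side calls (values are arbitrary):
```java
import org.antlr.v4.runtime.misc.MurmurHash;

int hash = MurmurHash.initialize(0);   // murmurInit(0)
hash = MurmurHash.update(hash, 11);    // murmurUpdate, once per word
hash = MurmurHash.update(hash, 22);
hash = MurmurHash.finish(hash, 2);     // murmurFinish(hash, numberOfWords)
```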

View File

@ -9,7 +9,7 @@
<parent>
<groupId>org.antlr</groupId>
<artifactId>antlr4-master</artifactId>
<version>4.7.1-SNAPSHOT</version>
<version>4.7.2-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>
<artifactId>antlr4-runtime</artifactId>

View File

@ -67,7 +67,7 @@ public class RuntimeMetaData {
* omitted.</li>
* </ul>
*/
public static final String VERSION = "4.7";
public static final String VERSION = "4.7.1";
/**
* Gets the currently executing version of the ANTLR 4 runtime library.

View File

@ -21,7 +21,7 @@ Recognizer.ruleIndexMapCache = {};
Recognizer.prototype.checkVersion = function(toolVersion) {
var runtimeVersion = "4.7";
var runtimeVersion = "4.7.1";
if (runtimeVersion!==toolVersion) {
console.log("ANTLR runtime and generated code versions disagree: "+runtimeVersion+"!="+toolVersion);
}

View File

@ -1,6 +1,6 @@
{
"name": "antlr4",
"version": "4.7.0",
"version": "4.7.1",
"description": "JavaScript runtime for ANTLR4",
"main": "src/antlr4/index.js",
"repository": "antlr/antlr4.git",

View File

@ -1,13 +1,13 @@
from distutils.core import setup
from setuptools import setup
setup(
name='antlr4-python2-runtime',
version='4.7',
packages=['antlr4', 'antlr4.atn', 'antlr4.dfa', 'antlr4.tree', 'antlr4.error', 'antlr4.xpath'],
package_dir={'': 'src'},
version='4.7.1',
url='http://www.antlr.org',
license='BSD',
packages=['antlr4', 'antlr4.atn', 'antlr4.dfa', 'antlr4.tree', 'antlr4.error', 'antlr4.xpath'],
package_dir={'': 'src'},
author='Eric Vergnaud, Terence Parr, Sam Harwell',
author_email='eric.vergnaud@wanadoo.fr',
description='ANTLR 4.7 runtime for Python 2.7.6'
description='ANTLR 4.7.1 runtime for Python 2.7.12'
)

View File

@ -30,7 +30,7 @@ class Recognizer(object):
return major, minor
def checkVersion(self, toolVersion):
runtimeVersion = "4.7"
runtimeVersion = "4.7.1"
rvmajor, rvminor = self.extractVersion(runtimeVersion)
tvmajor, tvminor = self.extractVersion(toolVersion)
if rvmajor!=tvmajor or rvminor!=tvminor:

View File

@ -39,7 +39,7 @@ class TestLexer(Lexer):
def __init__(self, input=None):
super(TestLexer, self).__init__(input)
self.checkVersion("4.7")
self.checkVersion("4.7.1")
self._interp = LexerATNSimulator(self, self.atn, self.decisionsToDFA, PredictionContextCache())
self._actions = None
self._predicates = None
@ -95,7 +95,7 @@ class TestLexer2(Lexer):
def __init__(self, input=None):
super(TestLexer2, self).__init__(input)
self.checkVersion("4.7")
self.checkVersion("4.7.1")
self._interp = LexerATNSimulator(self, self.atn, self.decisionsToDFA, PredictionContextCache())
self._actions = None
self._predicates = None

View File

@ -1,13 +1,13 @@
from distutils.core import setup
from setuptools import setup
setup(
name='antlr4-python3-runtime',
version='4.7',
version='4.7.1',
packages=['antlr4', 'antlr4.atn', 'antlr4.dfa', 'antlr4.tree', 'antlr4.error', 'antlr4.xpath'],
package_dir={'': 'src'},
url='http://www.antlr.org',
license='BSD',
author='Eric Vergnaud, Terence Parr, Sam Harwell',
author_email='eric.vergnaud@wanadoo.fr',
description='ANTLR 4.7 runtime for Python 3.4.0'
description='ANTLR 4.7.1 runtime for Python 3.6.3'
)

View File

@ -33,7 +33,7 @@ class Recognizer(object):
return major, minor
def checkVersion(self, toolVersion):
runtimeVersion = "4.7"
runtimeVersion = "4.7.1"
rvmajor, rvminor = self.extractVersion(runtimeVersion)
tvmajor, tvminor = self.extractVersion(toolVersion)
if rvmajor!=tvmajor or rvminor!=tvminor:

View File

@ -119,7 +119,7 @@ class XPathLexer(Lexer):
def __init__(self, input=None):
super().__init__(input)
self.checkVersion("4.7")
self.checkVersion("4.7.1")
self._interp = LexerATNSimulator(self, self.atn, self.decisionsToDFA, PredictionContextCache())
self._actions = None
self._predicates = None

View File

@ -792,7 +792,7 @@ class CLexer(Lexer):
def __init__(self, input=None):
super().__init__(input)
self.checkVersion("4.7")
self.checkVersion("4.7.1")
self._interp = LexerATNSimulator(self, self.atn, self.decisionsToDFA, PredictionContextCache())
self._actions = None
self._predicates = None

View File

@ -915,7 +915,7 @@ class CParser ( Parser ):
def __init__(self, input:TokenStream):
super().__init__(input)
self.checkVersion("4.7")
self.checkVersion("4.7.1")
self._interp = ParserATNSimulator(self, self.atn, self.decisionsToDFA, self.sharedContextCache)
self._predicates = None

View File

@ -63,7 +63,7 @@ public class RuntimeMetaData {
/// omitted, the `-` (hyphen-minus) appearing before it is also
/// omitted.
///
public static let VERSION: String = "4.7"
public static let VERSION: String = "4.7.1"
///
/// Gets the currently executing version of the ANTLR 4 runtime library.

View File

@ -0,0 +1,64 @@
# Get github issues / PR for a release
# Exec with "python github_release_notes.py YOUR_GITHUB_API_ACCESS_TOKEN 4.7.1"
from github import Github
from collections import Counter
import sys
TARGETS = ['csharp', 'cpp', 'go', 'java', 'javascript', 'python2', 'python3', 'swift']
TOKEN=sys.argv[1]
MILESTONE=sys.argv[2]
g = Github(login_or_token=TOKEN)
# Then play with your Github objects:
org = g.get_organization("antlr")
repo = org.get_repo("antlr4")
milestone = [x for x in repo.get_milestones() if x.title==MILESTONE]
milestone = milestone[0]
issues = repo.get_issues(state="closed", milestone=milestone, sort="created", direction="desc")
# # dump bugs fixed
# print()
# print("## Issues fixed")
# for x in issues:
# labels = [l.name for l in x.labels]
# if x.pull_request is None and not ("type:improvement" in labels or "type:feature" in labels):
# print("* [%s](%s) (%s)" % (x.title, x.html_url, ", ".join([l.name for l in x.labels])))
#
#
# print()
# # dump improvements closed for this release (issues or pulls)
# print("## Improvements, features")
# for x in issues:
# labels = [l.name for l in x.labels]
# if ("type:improvement" in labels or "type:feature" in labels):
# print("* [%s](%s) (%s)" % (x.title, x.html_url, ", ".join(labels)))
#
# print()
#
#
# # dump PRs closed for this release by target
# print("## Pull requests grouped by target")
# for target in TARGETS:
# print()
# print(f"### {target} target")
# for x in issues:
# labels = [l.name for l in x.labels]
# if x.pull_request is not None and f"target:{target}" in labels:
# print("* [%s](%s) (%s)" % (x.title, x.html_url, ", ".join(labels)))
#
# dump contributors
print()
print("## Contributors")
user_counts = Counter([x.user.login for x in issues])
users = {x.user.login:x.user for x in issues}
for login,count in user_counts.most_common(10000):
name = users[login].name
logins = f" ({users[login].login})"
if name is None:
name = users[login].login
logins = ""
print(f"* {count:3d} items: [{name}]({users[login].html_url}){logins}")

View File

@ -10,7 +10,7 @@
<parent>
<groupId>org.antlr</groupId>
<artifactId>antlr4-master</artifactId>
<version>4.7.1-SNAPSHOT</version>
<version>4.7.2-SNAPSHOT</version>
</parent>
<artifactId>antlr4-tool-testsuite</artifactId>
<name>ANTLR 4 Tool Tests</name>

View File

@ -401,4 +401,39 @@ public class TestSymbolIssues extends BaseJavaToolTest {
testErrors(test, false);
}
@Test public void testUnreachableTokens() {
String[] test = {
"lexer grammar Test;\n" +
"TOKEN1: 'as' 'df' | 'qwer';\n" +
"TOKEN2: [0-9];\n" +
"TOKEN3: 'asdf';\n" +
"TOKEN4: 'q' 'w' 'e' 'r' | A;\n" +
"TOKEN5: 'aaaa';\n" +
"TOKEN6: 'asdf';\n" +
"TOKEN7: 'qwer'+;\n" +
"TOKEN8: 'a' 'b' | 'b' | 'a' 'b';\n" +
"fragment\n" +
"TOKEN9: 'asdf' | 'qwer' | 'qwer';\n" +
"TOKEN10: '\\r\\n' | '\\r\\n';\n" +
"TOKEN11: '\\r\\n';\n" +
"\n" +
"mode MODE1;\n" +
"TOKEN12: 'asdf';\n" +
"\n" +
"fragment A: 'A';",
"warning(" + ErrorType.TOKEN_UNREACHABLE.code + "): Test.g4:4:0: One of the token TOKEN3 values unreachable. asdf is always overlapped by token TOKEN1\n" +
"warning(" + ErrorType.TOKEN_UNREACHABLE.code + "): Test.g4:5:0: One of the token TOKEN4 values unreachable. qwer is always overlapped by token TOKEN1\n" +
"warning(" + ErrorType.TOKEN_UNREACHABLE.code + "): Test.g4:7:0: One of the token TOKEN6 values unreachable. asdf is always overlapped by token TOKEN1\n" +
"warning(" + ErrorType.TOKEN_UNREACHABLE.code + "): Test.g4:7:0: One of the token TOKEN6 values unreachable. asdf is always overlapped by token TOKEN3\n" +
"warning(" + ErrorType.TOKEN_UNREACHABLE.code + "): Test.g4:9:0: One of the token TOKEN8 values unreachable. ab is always overlapped by token TOKEN8\n" +
"warning(" + ErrorType.TOKEN_UNREACHABLE.code + "): Test.g4:11:0: One of the token TOKEN9 values unreachable. qwer is always overlapped by token TOKEN9\n" +
"warning(" + ErrorType.TOKEN_UNREACHABLE.code + "): Test.g4:12:0: One of the token TOKEN10 values unreachable. \\r\\n is always overlapped by token TOKEN10\n" +
"warning(" + ErrorType.TOKEN_UNREACHABLE.code + "): Test.g4:13:0: One of the token TOKEN11 values unreachable. \\r\\n is always overlapped by token TOKEN10\n" +
"warning(" + ErrorType.TOKEN_UNREACHABLE.code + "): Test.g4:13:0: One of the token TOKEN11 values unreachable. \\r\\n is always overlapped by token TOKEN10\n"
};
testErrors(test, false);
}
}

View File

@ -269,11 +269,11 @@ public class TestToolSyntaxErrors extends BaseJavaToolTest {
"grammar A;\n" +
"tokens{Foo}\n" +
"b : Foo ;\n" +
"X : 'foo' -> popmode;\n" + // "meant" to use -> popMode
"Y : 'foo' -> token(Foo);", // "meant" to use -> type(Foo)
"X : 'foo1' -> popmode;\n" + // "meant" to use -> popMode
"Y : 'foo2' -> token(Foo);", // "meant" to use -> type(Foo)
"error(" + ErrorType.INVALID_LEXER_COMMAND.code + "): A.g4:4:13: lexer command popmode does not exist or is not supported by the current target\n" +
"error(" + ErrorType.INVALID_LEXER_COMMAND.code + "): A.g4:5:13: lexer command token does not exist or is not supported by the current target\n"
"error(" + ErrorType.INVALID_LEXER_COMMAND.code + "): A.g4:4:14: lexer command popmode does not exist or is not supported by the current target\n" +
"error(" + ErrorType.INVALID_LEXER_COMMAND.code + "): A.g4:5:14: lexer command token does not exist or is not supported by the current target\n"
};
super.testErrors(pair, true);
}
@ -283,11 +283,11 @@ public class TestToolSyntaxErrors extends BaseJavaToolTest {
"grammar A;\n" +
"tokens{Foo}\n" +
"b : Foo ;\n" +
"X : 'foo' -> popMode(Foo);\n" + // "meant" to use -> popMode
"Y : 'foo' -> type;", // "meant" to use -> type(Foo)
"X : 'foo1' -> popMode(Foo);\n" + // "meant" to use -> popMode
"Y : 'foo2' -> type;", // "meant" to use -> type(Foo)
"error(" + ErrorType.UNWANTED_LEXER_COMMAND_ARGUMENT.code + "): A.g4:4:13: lexer command popMode does not take any arguments\n" +
"error(" + ErrorType.MISSING_LEXER_COMMAND_ARGUMENT.code + "): A.g4:5:13: missing argument for lexer command type\n"
"error(" + ErrorType.UNWANTED_LEXER_COMMAND_ARGUMENT.code + "): A.g4:4:14: lexer command popMode does not take any arguments\n" +
"error(" + ErrorType.MISSING_LEXER_COMMAND_ARGUMENT.code + "): A.g4:5:14: missing argument for lexer command type\n"
};
super.testErrors(pair, true);
}

View File

@ -9,7 +9,7 @@
<parent>
<groupId>org.antlr</groupId>
<artifactId>antlr4-master</artifactId>
<version>4.7.1-SNAPSHOT</version>
<version>4.7.2-SNAPSHOT</version>
</parent>
<artifactId>antlr4</artifactId>
<name>ANTLR 4 Tool</name>

View File

@ -821,6 +821,12 @@ function <lexer.name>(input) {
<lexer.name>.prototype = Object.create(<if(superClass)><superClass><else>antlr4.Lexer<endif>.prototype);
<lexer.name>.prototype.constructor = <lexer.name>;
Object.defineProperty(<lexer.name>.prototype, "atn", {
get : function() {
return atn;
}
});
<lexer.name>.EOF = antlr4.Token.EOF;
<lexer.tokens:{k | <lexer.name>.<k> = <lexer.tokens.(k)>;}; separator="\n", wrap, anchor>

View File

@ -354,7 +354,7 @@ func getVocabulary() -> Vocabulary {
override <accessLevelNotOpen(parser)>
init(_ input:TokenStream) throws {
RuntimeMetaData.checkVersion("4.7", RuntimeMetaData.VERSION)
RuntimeMetaData.checkVersion("4.7.1", RuntimeMetaData.VERSION)
try super.init(input)
_interp = ParserATNSimulator(self,<p.name>._ATN,<p.name>._decisionToDFA, <parser.name>._sharedContextCache)
}

View File

@ -28,7 +28,7 @@ public class CSharpTarget extends Target {
@Override
public String getVersion() {
return "4.7";
return "4.7.1";
}
@Override

View File

@ -50,7 +50,7 @@ public class CppTarget extends Target {
}
public String getVersion() {
return "4.7";
return "4.7.1";
}
public boolean needsHeader() { return true; }

View File

@ -71,7 +71,7 @@ public class GoTarget extends Target {
@Override
public String getVersion() {
return "4.7";
return "4.7.1";
}
public Set<String> getBadWords() {

View File

@ -51,7 +51,7 @@ public class JavaScriptTarget extends Target {
@Override
public String getVersion() {
return "4.7";
return "4.7.1";
}
public Set<String> getBadWords() {

View File

@ -94,7 +94,7 @@ public class Python2Target extends Target {
@Override
public String getVersion() {
return "4.7";
return "4.7.1";
}
public Set<String> getBadWords() {

View File

@ -96,7 +96,7 @@ public class Python3Target extends Target {
@Override
public String getVersion() {
return "4.7";
return "4.7.1";
}
/** Avoid grammar symbols in this set to prevent conflicts in gen'd code. */

View File

@ -87,7 +87,7 @@ public class SwiftTarget extends Target {
@Override
public String getVersion() {
return "4.7"; // Java and tool versions move in lock step
return "4.7.1"; // Java and tool versions move in lock step
}
public Set<String> getBadWords() {

View File

@ -108,6 +108,7 @@ public class SemanticPipeline {
}
symcheck.checkForModeConflicts(g);
symcheck.checkForUnreachableTokens(g);
assignChannelTypes(g, collector.channelDefs);

View File

@ -7,7 +7,9 @@
package org.antlr.v4.semantics;
import org.antlr.runtime.tree.CommonTree;
import org.antlr.runtime.tree.Tree;
import org.antlr.v4.automata.LexerATNFactory;
import org.antlr.v4.parse.ANTLRLexer;
import org.antlr.v4.parse.ANTLRParser;
import org.antlr.v4.runtime.Token;
import org.antlr.v4.tool.Alternative;
@ -23,6 +25,7 @@ import org.antlr.v4.tool.LexerGrammar;
import org.antlr.v4.tool.Rule;
import org.antlr.v4.tool.ast.AltAST;
import org.antlr.v4.tool.ast.GrammarAST;
import org.antlr.v4.tool.ast.TerminalAST;
import java.util.ArrayList;
import java.util.Collection;
@ -39,42 +42,31 @@ import java.util.Set;
* Side-effect: strip away redef'd rules.
*/
public class SymbolChecks {
Grammar g;
SymbolCollector collector;
Map<String, Rule> nameToRuleMap = new HashMap<String, Rule>();
Grammar g;
SymbolCollector collector;
Map<String, Rule> nameToRuleMap = new HashMap<String, Rule>();
Set<String> tokenIDs = new HashSet<String>();
Map<String, Set<String>> actionScopeToActionNames = new HashMap<String, Set<String>>();
// DoubleKeyMap<String, String, GrammarAST> namedActions =
// new DoubleKeyMap<String, String, GrammarAST>();
Map<String, Set<String>> actionScopeToActionNames = new HashMap<String, Set<String>>();
public ErrorManager errMgr;
protected final Set<String> reservedNames = new HashSet<String>();
{
reservedNames.addAll(LexerATNFactory.getCommonConstants());
}
public SymbolChecks(Grammar g, SymbolCollector collector) {
this.g = g;
this.collector = collector;
public SymbolChecks(Grammar g, SymbolCollector collector) {
this.g = g;
this.collector = collector;
this.errMgr = g.tool.errMgr;
for (GrammarAST tokenId : collector.tokenIDRefs) {
tokenIDs.add(tokenId.getText());
}
/*
System.out.println("rules="+collector.rules);
System.out.println("rulerefs="+collector.rulerefs);
System.out.println("tokenIDRefs="+collector.tokenIDRefs);
System.out.println("terminals="+collector.terminals);
System.out.println("strings="+collector.strings);
System.out.println("tokensDef="+collector.tokensDefs);
System.out.println("actions="+collector.actions);
System.out.println("scopes="+collector.scopes);
*/
}
for (GrammarAST tokenId : collector.tokenIDRefs) {
tokenIDs.add(tokenId.getText());
}
}
public void process() {
public void process() {
// methods affect fields, but no side-effects outside this object
// So, call order sensitive
// First collect all rules for later use in checkForLabelConflict()
@ -83,7 +75,6 @@ public class SymbolChecks {
}
checkReservedNames(g.rules.values());
checkActionRedefinitions(collector.namedActions);
checkForTokenConflicts(collector.tokenIDRefs); // sets tokenIDs
checkForLabelConflicts(g.rules.values());
}
@ -116,21 +107,14 @@ public class SymbolChecks {
}
}
public void checkForTokenConflicts(List<GrammarAST> tokenIDRefs) {
// for (GrammarAST a : tokenIDRefs) {
// Token t = a.token;
// String ID = t.getText();
// tokenIDs.add(ID);
// }
}
/** Make sure a label doesn't conflict with another symbol.
* Labels must not conflict with: rules, tokens, scope names,
* return values, parameters, and rule-scope dynamic attributes
* defined in surrounding rule. Also they must have same type
* for repeated defs.
*/
public void checkForLabelConflicts(Collection<Rule> rules) {
/**
* Make sure a label doesn't conflict with another symbol.
* Labels must not conflict with: rules, tokens, scope names,
* return values, parameters, and rule-scope dynamic attributes
* defined in surrounding rule. Also they must have same type
* for repeated defs.
*/
public void checkForLabelConflicts(Collection<Rule> rules) {
for (Rule r : rules) {
checkForAttributeConflicts(r);
@ -213,7 +197,7 @@ public class SymbolChecks {
// Such behavior refers to the fact that the warning is typically reported on the actual label redefinition,
// but for left-recursive rules the warning is reported on the enclosing rule.
org.antlr.runtime.Token token = r instanceof LeftRecursiveRule
? ((GrammarAST) r.ast.getChild(0)).getToken()
? ((GrammarAST) r.ast.getChild(0)).getToken()
: labelPair.label.token;
errMgr.grammarError(
ErrorType.LABEL_TYPE_CONFLICT,
@ -227,7 +211,7 @@ public class SymbolChecks {
(labelPair.type.equals(LabelType.RULE_LABEL) || labelPair.type.equals(LabelType.RULE_LIST_LABEL))) {
org.antlr.runtime.Token token = r instanceof LeftRecursiveRule
? ((GrammarAST) r.ast.getChild(0)).getToken()
? ((GrammarAST) r.ast.getChild(0)).getToken()
: labelPair.label.token;
String prevLabelOp = prevLabelPair.type.equals(LabelType.RULE_LIST_LABEL) ? "+=" : "=";
String labelOp = labelPair.type.equals(LabelType.RULE_LIST_LABEL) ? "+=" : "=";
@ -291,11 +275,11 @@ public class SymbolChecks {
for (Attribute attribute : attributes.attributes.values()) {
if (ruleNames.contains(attribute.name)) {
errMgr.grammarError(
errorType,
g.fileName,
attribute.token != null ? attribute.token : ((GrammarAST)r.ast.getChild(0)).token,
attribute.name,
r.name);
errorType,
g.fileName,
attribute.token != null ? attribute.token : ((GrammarAST) r.ast.getChild(0)).token,
attribute.name,
r.name);
}
}
}
@ -308,11 +292,11 @@ public class SymbolChecks {
Set<String> conflictingKeys = attributes.intersection(referenceAttributes);
for (String key : conflictingKeys) {
errMgr.grammarError(
errorType,
g.fileName,
attributes.get(key).token != null ? attributes.get(key).token : ((GrammarAST) r.ast.getChild(0)).token,
key,
r.name);
errorType,
g.fileName,
attributes.get(key).token != null ? attributes.get(key).token : ((GrammarAST)r.ast.getChild(0)).token,
key,
r.name);
}
}
@ -341,8 +325,122 @@ public class SymbolChecks {
}
}
// CAN ONLY CALL THE TWO NEXT METHODS AFTER GRAMMAR HAS RULE DEFS (see semanticpipeline)
/**
* Algorithm steps:
* 1. Collect all simple string literals (i.e. 'asdf', 'as' 'df', but not [a-z]+ or 'a'..'z')
* for all lexer rules in each mode, except for autogenerated tokens ({@link #getSingleTokenValues(Rule) getSingleTokenValues}).
* 2. Compare every string literal with each other ({@link #checkForOverlap(Grammar, Rule, Rule, List, List) checkForOverlap})
* and report a TOKEN_UNREACHABLE warning if the same string is found.
* Complexity: O(m * n^2 / 2), approximately O(n^2),
* where m is the number of modes and n is the average number of lexer rules per mode.
* See also the testUnreachableTokens unit test for details.
*/
public void checkForUnreachableTokens(Grammar g) {
if (g.isLexer()) {
LexerGrammar lexerGrammar = (LexerGrammar)g;
for (List<Rule> rules : lexerGrammar.modes.values()) {
// Collect string literal lexer rules for each mode
List<Rule> stringLiteralRules = new ArrayList<>();
List<List<String>> stringLiteralValues = new ArrayList<>();
for (int i = 0; i < rules.size(); i++) {
Rule rule = rules.get(i);
List<String> ruleStringAlts = getSingleTokenValues(rule);
if (ruleStringAlts != null && ruleStringAlts.size() > 0) {
stringLiteralRules.add(rule);
stringLiteralValues.add(ruleStringAlts);
}
}
// Check string sets intersection
for (int i = 0; i < stringLiteralRules.size(); i++) {
List<String> firstTokenStringValues = stringLiteralValues.get(i);
Rule rule1 = stringLiteralRules.get(i);
checkForOverlap(g, rule1, rule1, firstTokenStringValues, stringLiteralValues.get(i));
// Check fragment rules only against themselves
if (!rule1.isFragment()) {
for (int j = i + 1; j < stringLiteralRules.size(); j++) {
Rule rule2 = stringLiteralRules.get(j);
if (!rule2.isFragment()) {
checkForOverlap(g, rule1, stringLiteralRules.get(j), firstTokenStringValues, stringLiteralValues.get(j));
}
}
}
}
}
}
}
/**
* @return the list of simple string literals for rule {@code rule}
*/
private List<String> getSingleTokenValues(Rule rule)
{
List<String> values = new ArrayList<>();
for (Alternative alt : rule.alt) {
if (alt != null) {
// select first alt if token has a command
Tree rootNode = alt.ast.getChildCount() == 2 &&
alt.ast.getChild(0) instanceof AltAST && alt.ast.getChild(1) instanceof GrammarAST
? alt.ast.getChild(0)
: alt.ast;
if (rootNode.getTokenStartIndex() == -1) {
continue; // ignore autogenerated tokens from combined grammars that start with T__
}
// Ignore the alt if it contains anything other than string literals (e.g. repetition, optional)
boolean ignore = false;
StringBuilder currentValue = new StringBuilder();
for (int i = 0; i < rootNode.getChildCount(); i++) {
Tree child = rootNode.getChild(i);
if (!(child instanceof TerminalAST)) {
ignore = true;
break;
}
TerminalAST terminalAST = (TerminalAST)child;
if (terminalAST.token.getType() != ANTLRLexer.STRING_LITERAL) {
ignore = true;
break;
}
else {
String text = terminalAST.token.getText();
currentValue.append(text.substring(1, text.length() - 1));
}
}
if (!ignore) {
values.add(currentValue.toString());
}
}
}
return values;
}
/**
* For the same rule, compare values starting from the next index:
* TOKEN_WITH_SAME_VALUES: 'asdf' | 'asdf';
* For different rules, compare from the first value:
* TOKEN1: 'asdf';
* TOKEN2: 'asdf';
*/
private void checkForOverlap(Grammar g, Rule rule1, Rule rule2, List<String> firstTokenStringValues, List<String> secondTokenStringValues) {
for (int i = 0; i < firstTokenStringValues.size(); i++) {
int secondTokenInd = rule1 == rule2 ? i + 1 : 0;
String str1 = firstTokenStringValues.get(i);
for (int j = secondTokenInd; j < secondTokenStringValues.size(); j++) {
String str2 = secondTokenStringValues.get(j);
if (str1.equals(str2)) {
errMgr.grammarError(ErrorType.TOKEN_UNREACHABLE, g.fileName,
((GrammarAST) rule2.ast.getChild(0)).token, rule2.name, str2, rule1.name);
}
}
}
}
// CAN ONLY CALL THE TWO NEXT METHODS AFTER GRAMMAR HAS RULE DEFS (see semanticpipeline)
public void checkRuleArgs(Grammar g, List<GrammarAST> rulerefs) {
if ( rulerefs==null ) return;
for (GrammarAST ref : rulerefs) {
@ -351,12 +449,12 @@ public class SymbolChecks {
GrammarAST arg = (GrammarAST)ref.getFirstChildWithType(ANTLRParser.ARG_ACTION);
if ( arg!=null && (r==null || r.args==null) ) {
errMgr.grammarError(ErrorType.RULE_HAS_NO_ARGS,
g.fileName, ref.token, ruleName);
g.fileName, ref.token, ruleName);
}
else if ( arg==null && (r!=null&&r.args!=null) ) {
else if ( arg==null && (r!=null && r.args!=null) ) {
errMgr.grammarError(ErrorType.MISSING_RULE_ARGS,
g.fileName, ref.token, ruleName);
g.fileName, ref.token, ruleName);
}
}
}
@ -365,18 +463,18 @@ public class SymbolChecks {
for (GrammarAST dot : qualifiedRuleRefs) {
GrammarAST grammar = (GrammarAST)dot.getChild(0);
GrammarAST rule = (GrammarAST)dot.getChild(1);
g.tool.log("semantics", grammar.getText()+"."+rule.getText());
g.tool.log("semantics", grammar.getText()+"."+rule.getText());
Grammar delegate = g.getImportedGrammar(grammar.getText());
if ( delegate==null ) {
errMgr.grammarError(ErrorType.NO_SUCH_GRAMMAR_SCOPE,
g.fileName, grammar.token, grammar.getText(),
rule.getText());
g.fileName, grammar.token, grammar.getText(),
rule.getText());
}
else {
if ( g.getRule(grammar.getText(), rule.getText())==null ) {
errMgr.grammarError(ErrorType.NO_SUCH_RULE_IN_SCOPE,
g.fileName, rule.token, grammar.getText(),
rule.getText());
g.fileName, rule.token, grammar.getText(),
rule.getText());
}
}
}

View File

@ -1074,6 +1074,21 @@ public enum ErrorType {
"unicode property escapes not allowed in lexer charset range: <arg>",
ErrorSeverity.ERROR),
/**
* Compiler Warning 184.
*
* <p>A token value is overlapped by another token or by itself.</p>
*
* <pre>
* TOKEN1: 'value';
* TOKEN2: 'value'; // warning
* </pre>
*/
TOKEN_UNREACHABLE(
184,
"One of the token <arg> values unreachable. <arg2> is always overlapped by token <arg3>",
ErrorSeverity.WARNING),
/*
* Backward incompatibility errors
*/