more doc
This commit is contained in:
parent
1c3b4515ea
commit
e1073410f8
|
@ -0,0 +1,242 @@
|
|||
# Integrating ANTLR JavaScript parsers with ACE editor
|
||||
|
||||
Having the ability to parse code other than JavaScript is great, but nowadays users expect to be able to edit code with nice edit features such as keyword highlighting, indentation and brace matching, and advanced ones such as syntax checking.
|
||||
|
||||
I have been through the process of integrating an ANTLR parser with ACE, the dominant code editor for web based code editing. Information about ACE can be found on their web site.
|
||||
|
||||
This page describes my experience, and humbly aims to help you get started. It is not however a reference guide, and no support is provided.
|
||||
|
||||
## Architecture
|
||||
|
||||
The ACE editor is organized as follows
|
||||
|
||||
1. The editor itself is a <div> which once initialized comprises a number of elements. This UI element is responsible for the display, and the generation of edit events.
|
||||
1. The editor relies on a Session, which manages events and configuration.
|
||||
1. The code itself is stored in a Document. Any insertion or deletion of text is reflected in the Document.
|
||||
1. Keyword highlighting, indentation and brace matching are delegated to a mode. There is no direct equivalent of an ACE mode in ANTLR. While keywords are the equivalent of ANTLR lexer tokens, indentation and brace matching are edit tasks, not parsing ones. A given ACE editor can only have one mode, which corresponds to the language being edited. There is no need for ANTLR integration to support keyword highlighting, indentation and brace matching.
|
||||
1. Syntax checking is delegated to a worker. This is where ANTLR integration is needed. If syntax checking is enabled, ACE asks the mode to create a worker. In JavaScript, workers run in complete isolation i.e. they don't share code or variables with other workers, or with the HTML page itself.
|
||||
1. The below diagram describes how the whole system works. In green are the components *you* need to provide. You'll notice that there is no need to load ANTLR in the HTML page itself. You'll also notice that ACE maintains a document in each thread. This is done through low level events sent by the ACE session to the worker which describe the delta. Once applied to the worker document, a high level event is triggered, which is easy to handle since at this point the worker document is a perfect copy of the UI document.
|
||||
|
||||
<img src=images/ACE-Architecture.001.png>
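
To make the document mirroring a bit more concrete, here is a minimal, self-contained sketch of the idea. It is not ACE's actual code: the `applyDelta` helper and the simplified delta shape (`action`, `start`, `end`, `lines`) are assumptions made for illustration only.

```javascript
// Hypothetical helper: applies one simplified delta to an array of lines.
function applyDelta(lines, delta) {
  var row = delta.start.row;
  var col = delta.start.column;
  if (delta.action === "insert") {
    var line = lines[row] || "";
    var inserted = delta.lines.slice();
    // glue the text before and after the insertion point onto the new lines
    inserted[0] = line.slice(0, col) + inserted[0];
    inserted[inserted.length - 1] += line.slice(col);
    lines.splice.apply(lines, [row, 1].concat(inserted));
  } else if (delta.action === "remove") {
    var endRow = delta.end.row;
    var merged = lines[row].slice(0, col) + lines[endRow].slice(delta.end.column);
    lines.splice(row, endRow - row + 1, merged);
  }
  return lines;
}

// One copy lives in the UI thread, the other one in the worker.
var uiDocument = ["if x then"];
var workerDocument = ["if x then"];

var deltas = [
  { action: "insert", start: { row: 0, column: 9 }, lines: ["", "  print"] },
  { action: "remove", start: { row: 0, column: 2 }, end: { row: 0, column: 4 } }
];

// The same deltas are applied on both sides, so the copies never diverge.
deltas.forEach(function (delta) {
  applyDelta(uiDocument, delta);
  applyDelta(workerDocument, delta);
});
```

Because only the deltas travel between threads, the worker never needs to receive the full text on every keystroke.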
|
||||
|
||||
## Step-by-step guide
|
||||
|
||||
The first thing to do is to create an editor in your html page. This is thoroughly described in the ACE documentation, so we'll just sum it up here:
|
||||
|
||||
```xml
|
||||
<script src="../js/ace/ace.js" type="text/javascript" charset="utf-8"></script>
|
||||
<script>
|
||||
var editor = ace.edit("editor");
|
||||
</script>
|
||||
```
|
||||
|
||||
This should give you a working editor. You may want to control its sizing using CSS. I personally load the editor in an iframe and set its style to position: absolute, top: 0, left: 0 etc... but I'm sure you know better than me how to achieve results.
|
||||
|
||||
The second thing to do is to configure the ACE editor to use your mode, i.e. your language configuration. A good place to start is to inherit from the built-in TextMode. The following is a very simple example, which only caters for comments, literals, and a limited subset of separators and keywords:
|
||||
|
||||
```javascript
|
||||
ace.define('ace/mode/my-mode',["require","exports","module","ace/lib/oop","ace/mode/text","ace/mode/text_highlight_rules", "ace/worker/worker_client" ], function(require, exports, module) {
|
||||
var oop = require("ace/lib/oop");
|
||||
var TextMode = require("ace/mode/text").Mode;
|
||||
var TextHighlightRules = require("ace/mode/text_highlight_rules").TextHighlightRules;
|
||||
|
||||
var MyHighlightRules = function() {
|
||||
var keywordMapper = this.createKeywordMapper({
|
||||
"keyword.control": "if|then|else",
|
||||
"keyword.operator": "and|or|not",
|
||||
"keyword.other": "class",
|
||||
"storage.type": "int|float|text",
|
||||
"storage.modifier": "private|public",
|
||||
"support.function": "print|sort",
|
||||
"constant.language": "true|false"
|
||||
}, "identifier");
|
||||
this.$rules = {
|
||||
"start": [
|
||||
{ token : "comment", regex : "\\/\\/.*$" },
|
||||
{ token : "string", regex : '["](?:(?:\\\\.)|(?:[^"\\\\]))*?["]' },
|
||||
{ token : "constant.numeric", regex : "0[xX][0-9a-fA-F]+\\b" },
|
||||
{ token : "constant.numeric", regex: "[+-]?\\d+(?:(?:\\.\\d*)?(?:[eE][+-]?\\d+)?)?\\b" },
|
||||
{ token : "keyword.operator", regex : "!|%|\\\\|/|\\*|\\-|\\+|~=|==|<>|!=|<=|>=|=|<|>|&&|\\|\\|" },
|
||||
{ token : "punctuation.operator", regex : "\\?|\\:|\\,|\\;|\\." },
|
||||
{ token : "paren.lparen", regex : "[[({]" },
|
||||
{ token : "paren.rparen", regex : "[\\])}]" },
|
||||
{ token : "text", regex : "\\s+" },
|
||||
{ token: keywordMapper, regex: "[a-zA-Z_$][a-zA-Z0-9_$]*\\b" }
|
||||
]
|
||||
};
|
||||
};
|
||||
oop.inherits(MyHighlightRules, TextHighlightRules);
|
||||
|
||||
var MyMode = function() {
|
||||
this.HighlightRules = MyHighlightRules;
|
||||
};
|
||||
oop.inherits(MyMode, TextMode);
|
||||
|
||||
(function() {
|
||||
|
||||
this.$id = "ace/mode/my-mode";
|
||||
|
||||
}).call(MyMode.prototype);
|
||||
|
||||
exports.Mode = MyMode;
|
||||
});
|
||||
```
|
||||
|
||||
Now if you store the above in a file called "my-mode.js", setting the ACE Editor becomes straightforward:
|
||||
|
||||
```xml
|
||||
<script src="../js/ace/ace.js" type="text/javascript" charset="utf-8"></script>
|
||||
<script src="../js/my-mode.js" type="text/javascript" charset="utf-8"></script>
|
||||
<script>
|
||||
var editor = ace.edit("editor");
|
||||
editor.getSession().setMode("ace/mode/my-mode");
|
||||
</script>
|
||||
```
|
||||
|
||||
At this point you should have a working editor, able to highlight keywords. You may wonder why you need to set the tokens when you have already done so in your ANTLR lexer grammar. First, ACE expects a classification (control, operator, type...) which does not exist in ANTLR. Second, there is no need for ANTLR to achieve this, since ACE comes with its own lexer.
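
The classification ACE expects can be pictured with the following sketch. It is a simplified stand-in written for this page, not the real `createKeywordMapper` implementation; the name `createSimpleKeywordMapper` is hypothetical.

```javascript
// Hypothetical, simplified version of a keyword mapper:
// turns { tokenClass: "word1|word2" } into a lookup function.
function createSimpleKeywordMapper(map, defaultToken) {
  var lookup = Object.create(null);
  Object.keys(map).forEach(function (tokenClass) {
    map[tokenClass].split("|").forEach(function (word) {
      lookup[word] = tokenClass;
    });
  });
  return function (value) {
    return lookup[value] || defaultToken;
  };
}

var classify = createSimpleKeywordMapper({
  "keyword.control": "if|then|else",
  "storage.type": "int|float|text",
  "constant.language": "true|false"
}, "identifier");

console.log(classify("if"));     // a control keyword
console.log(classify("float"));  // a type
console.log(classify("myVar"));  // falls back to the default token
```

An ANTLR lexer would only tell you that `if` is an `IF` token; it has no notion of "control keyword" versus "type", which is why the mapping has to be spelled out for ACE.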
|
||||
|
||||
Ok, now that we have a working editor, it is time to add syntax validation. This is where the worker comes into the picture.
|
||||
|
||||
Creating the worker is the responsibility of the mode you provide. So you need to enhance it with something like the following:
|
||||
|
||||
```javascript
|
||||
var WorkerClient = require("ace/worker/worker_client").WorkerClient;
|
||||
this.createWorker = function(session) {
|
||||
this.$worker = new WorkerClient(["ace"], "ace/worker/my-worker", "MyWorker", "../js/my-worker.js");
|
||||
this.$worker.attachToDocument(session.getDocument());
|
||||
|
||||
this.$worker.on("errors", function(e) {
|
||||
session.setAnnotations(e.data);
|
||||
});
|
||||
|
||||
this.$worker.on("annotate", function(e) {
|
||||
session.setAnnotations(e.data);
|
||||
});
|
||||
|
||||
this.$worker.on("terminate", function() {
|
||||
session.clearAnnotations();
|
||||
});
|
||||
|
||||
return this.$worker;
|
||||
|
||||
};
|
||||
```
|
||||
|
||||
The above code needs to be placed in the existing mode, after:
|
||||
|
||||
```javascript
|
||||
this.$id = "ace/mode/my-mode";
|
||||
```
|
||||
|
||||
Please note that the mode code runs on the UI side, not the worker side. The event handlers here are for events sent by the worker, not to the worker.
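
The direction of these events can be illustrated with a small sketch. `FakeWorkerClient` and `fakeSession` are hypothetical stand-ins written for this page; they only mimic the `on`, `setAnnotations` and `clearAnnotations` calls used above and are not part of ACE.

```javascript
// Hypothetical stand-in for the WorkerClient, reduced to event registration.
function FakeWorkerClient() {
  this.handlers = {};
}
FakeWorkerClient.prototype.on = function (name, handler) {
  (this.handlers[name] = this.handlers[name] || []).push(handler);
};
// In ACE this would be triggered by a message arriving from the worker thread.
FakeWorkerClient.prototype.receiveFromWorker = function (name, data) {
  (this.handlers[name] || []).forEach(function (handler) {
    handler({ data: data });
  });
};

// Hypothetical stand-in for the edit session.
var fakeSession = {
  annotations: [],
  setAnnotations: function (annotations) { this.annotations = annotations; },
  clearAnnotations: function () { this.annotations = []; }
};

// UI side: the mode registers handlers for events coming FROM the worker.
var client = new FakeWorkerClient();
client.on("annotate", function (e) { fakeSession.setAnnotations(e.data); });
client.on("terminate", function () { fakeSession.clearAnnotations(); });

// Worker side (simulated): the worker reports one annotation.
client.receiveFromWorker("annotate", [
  { row: 0, column: 0, text: "unexpected token", type: "error" }
]);
```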
|
||||
|
||||
Obviously the above won't work out of the box, because you need to provide the "my-worker.js" file.
|
||||
|
||||
Creating a worker from scratch is not something I've tried. Simply put, your worker needs to handle all messages sent by ACE using the WorkerClient created by the mode. This is not a simple task, and is better delegated to existing ACE code, so we can focus on tasks specific to our language.
|
||||
|
||||
What I did was start from "mode-json.js", a rather simple worker which comes with ACE, strip all the JSON validation related code out of it, and save the remaining code in a file named "worker-base.js", which you can find [here](resources/worker-base.js). Once this was done, I was able to create a simple worker, as follows:
|
||||
|
||||
```javascript
|
||||
importScripts("worker-base.js");
|
||||
ace.define('ace/worker/my-worker',["require","exports","module","ace/lib/oop","ace/worker/mirror"], function(require, exports, module) {
|
||||
"use strict";
|
||||
|
||||
var oop = require("ace/lib/oop");
|
||||
var Mirror = require("ace/worker/mirror").Mirror;
|
||||
|
||||
var MyWorker = function(sender) {
|
||||
Mirror.call(this, sender);
|
||||
this.setTimeout(200);
|
||||
this.$dialect = null;
|
||||
};
|
||||
|
||||
oop.inherits(MyWorker, Mirror);
|
||||
|
||||
(function() {
|
||||
|
||||
this.onUpdate = function() {
|
||||
var value = this.doc.getValue();
|
||||
var annotations = validate(value);
|
||||
this.sender.emit("annotate", annotations);
|
||||
};
|
||||
|
||||
}).call(MyWorker.prototype);
|
||||
|
||||
exports.MyWorker = MyWorker;
|
||||
});
|
||||
|
||||
var validate = function(input) {
|
||||
return [ { row: 0, column: 0, text: "MyMode says Hello!", type: "error" } ];
|
||||
};
|
||||
```
|
||||
|
||||
At this point, you should have an editor which displays an error icon next to the first line. When you hover over the error icon, it should display: MyMode says Hello!. Is that not a friendly worker? Yum.
|
||||
|
||||
What remains to be done is have our validate function actually validate the input. Finally ANTLR comes in the picture!
|
||||
|
||||
To start with, let's load ANTLR and your parser, listener, etc. This looks easy, since you could write:
|
||||
|
||||
```
|
||||
var antlr4 = require('antlr4/index');
|
||||
```
|
||||
|
||||
This may work, but it's actually unreliable. The reason is that the require function used by ANTLR, which exactly mimics the NodeJS require function, uses a different syntax than the require function that comes with ACE. So we need to bring in a require function that conforms to the NodeJS syntax. I personally use one that comes from Torben Haase's Honey project, which you can find here. But hey, now we're going to have 2 'require' functions not compatible with each other! Indeed, this is why you need to take special care, as follows:
|
||||
|
||||
```javascript
// load nodejs compatible require
var ace_require = require;
require = undefined;
var Honey = { 'requirePath': ['..'] }; // walk up to js folder, see Honey docs
importScripts("../lib/require.js");
var antlr4_require = require;
require = ace_require;
```

Now it's safe to load antlr, and the parsers generated for your language. Assuming that your language files (generated or hand-built) are in a folder with an index.js file that calls require for each file, your parser loading code can be as simple as follows:

```javascript
// load antlr4 and myLanguage
var antlr4, mylanguage;
try {
    require = antlr4_require;
    antlr4 = require('antlr4/index');
    mylanguage = require('mylanguage/index');
} finally {
    require = ace_require;
}
```
|
||||
Please note the try-finally construct. ANTLR uses 'require' synchronously so it's perfectly safe to ignore the ACE 'require' while running ANTLR code. ACE itself does not guarantee synchronous execution, so you are much safer always switching 'require' back to 'ace_require'.
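
If you end up switching 'require' in more than one place, the same try-finally pattern can be wrapped in a helper. The sketch below is only an illustration under stated assumptions: `withTemporaryRequire` is a hypothetical name, and `scope` is a plain object standing in for the worker's global object.

```javascript
// Hypothetical helper: swaps scope.require for the duration of loader().
function withTemporaryRequire(scope, temporaryRequire, loader) {
  var saved = scope.require;
  scope.require = temporaryRequire;
  try {
    return loader();
  } finally {
    scope.require = saved; // always runs, even if loader() throws
  }
}

// Stand-ins for the two incompatible 'require' functions.
var scope = { require: function (name) { return "ace:" + name; } };
var nodeRequire = function (name) { return "node:" + name; };

var loaded = withTemporaryRequire(scope, nodeRequire, function () {
  return scope.require("antlr4/index");
});
```

Whatever happens inside the loader, the ACE 'require' is back in place once the helper returns.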
|
||||
Now detecting deep syntax errors in your code is a task for your ANTLR listener or visitor or whatever piece of code you've delegated this to. We're not going to describe this here, since it would require some knowledge of your language. However, detecting grammar syntax errors is something ANTLR does beautifully (isn't that why you went for ANTLR in the first place?). So what we will illustrate here is how to report grammar syntax errors. I have no doubt that from there, you will be able to extend the validator to suit your specific needs.
|
||||
Whenever ANTLR encounters an unexpected token, it fires an error. By default, the error is routed to an error listener which simply writes to the console.
|
||||
What we need to do is replace this listener with our own listener, so we can route errors to the ACE editor. First, let's create such a listener:
|
||||
```javascript
// class for gathering errors and posting them to ACE editor
var AnnotatingErrorListener = function(annotations) {
    antlr4.error.ErrorListener.call(this);
    this.annotations = annotations;
    return this;
};

AnnotatingErrorListener.prototype = Object.create(antlr4.error.ErrorListener.prototype);
AnnotatingErrorListener.prototype.constructor = AnnotatingErrorListener;

AnnotatingErrorListener.prototype.syntaxError = function(recognizer, offendingSymbol, line, column, msg, e) {
    this.annotations.push({
        row: line - 1,
        column: column,
        text: msg,
        type: "error"
    });
};
```
|
||||
|
||||
With this, all that remains to be done is plug the listener in when we parse the code. Here is how I do it:
|
||||
|
||||
```javascript
var validate = function(input) {
    var stream = new antlr4.InputStream(input);
    var lexer = new mylanguage.MyLexer(stream);
    var tokens = new antlr4.CommonTokenStream(lexer);
    var parser = new mylanguage.MyParser(tokens);
    var annotations = [];
    var listener = new AnnotatingErrorListener(annotations);
    parser.removeErrorListeners();
    parser.addErrorListener(listener);
    parser.parseMyRule();
    return annotations;
};
```
|
||||
You know what? That's it! You now have an ACE editor that does syntax validation using ANTLR! I hope you find this useful, and simple enough to get started.
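
One detail worth double checking is the `line - 1` conversion in the listener: ANTLR reports lines starting at 1, whereas ACE rows start at 0. The following sketch isolates that conversion; `toAnnotation` is a hypothetical helper and the reported errors are made up.

```javascript
// Hypothetical helper mirroring what the error listener pushes to ACE.
function toAnnotation(line, column, msg) {
  return { row: line - 1, column: column, text: msg, type: "error" };
}

var annotations = [];

// Pretend the parser reported two syntax errors (values are made up).
var reported = [
  { line: 1, column: 4, msg: "missing ';'" },
  { line: 3, column: 0, msg: "extraneous input 'foo'" }
];

reported.forEach(function (error) {
  annotations.push(toAnnotation(error.line, error.column, error.msg));
});
```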
|
||||
What I did not address here is packaging, not something I'm an expert at. The good news is that it makes development simple, since I don't have to run any compilation process. I just edit my code, reload my editor page, and check how it goes.
|
||||
Now wait, hey! How do you debug this? Well, as usual, using Chrome, since neither Firefox nor Safari is able to debug worker code. What a shame...
|
|
@ -0,0 +1,99 @@
|
|||
# C♯
|
||||
|
||||
See also [Sam Harwell's Alternative C# target](https://github.com/tunnelvisionlabs/antlr4cs)
|
||||
|
||||
### Which frameworks are supported?
|
||||
|
||||
The C# runtime is CLS compliant, and only requires a corresponding 3.5 .Net framework.
|
||||
|
||||
In practice, the runtime has been extensively tested against:
|
||||
|
||||
* Microsoft .Net 3.5 framework
|
||||
* Mono .Net 3.5 framework
|
||||
|
||||
No issue was found, so you should find that the runtime works pretty much against any recent .Net framework.
|
||||
|
||||
### How do I get started?
|
||||
|
||||
You will find full instructions on the Git web page for ANTLR C# runtime.
|
||||
|
||||
### How do I use the runtime from my project?
|
||||
|
||||
(i.e., How do I run the generated lexer and/or parser?)
|
||||
|
||||
Let's suppose that your grammar is named, as above, "MyGrammar".
|
||||
|
||||
Let's suppose this parser comprises a rule named "StartRule"
|
||||
|
||||
The tool will have generated for you the following files:
|
||||
|
||||
* MyGrammarLexer.cs
|
||||
* MyGrammarParser.cs
|
||||
* MyGrammarListener.cs (if you have not activated the -no-listener option)
|
||||
* MyGrammarBaseListener.cs (if you have not activated the -no-listener option)
* MyGrammarVisitor.cs (if you have activated the -visitor option)
* MyGrammarBaseVisitor.cs (if you have activated the -visitor option)
|
||||
|
||||
Now a fully functioning code might look like the following:
|
||||
|
||||
```csharp
using System;
using Antlr4.Runtime;
using Antlr4.Runtime.Tree;

public void MyParseMethod() {
    String input = "your text to parse here";
    AntlrInputStream stream = new AntlrInputStream(input);
    ITokenSource lexer = new MyGrammarLexer(stream);
    ITokenStream tokens = new CommonTokenStream(lexer);
    MyGrammarParser parser = new MyGrammarParser(tokens);
    parser.BuildParseTree = true;
    IParseTree tree = parser.StartRule();
}
```
|
||||
|
||||
This program will work. But it won't be useful unless you do one of the following:
|
||||
|
||||
* you visit the parse tree using a custom listener
|
||||
* you visit the parse tree using a custom visitor
|
||||
* your grammar comprises production code (like ANTLR 3)
|
||||
|
||||
(please note that production code is target specific, so you can't have multi target grammars that include production code)
|
||||
|
||||
### How do I create and run a custom listener?
|
||||
|
||||
Let's suppose your MyGrammar grammar comprises 2 rules: "key" and "value".
|
||||
|
||||
The antlr4 tool will have generated the following listener (only partial code shown here):
|
||||
|
||||
```csharp
interface IMyGrammarListener : IParseTreeListener {
    void EnterKey (MyGrammarParser.KeyContext context);
    void ExitKey (MyGrammarParser.KeyContext context);
    void EnterValue (MyGrammarParser.ValueContext context);
    void ExitValue (MyGrammarParser.ValueContext context);
}
```
|
||||
|
||||
In order to provide custom behavior, you might want to create the following class:
|
||||
|
||||
```csharp
class KeyPrinter : MyGrammarBaseListener {
    // override default listener behavior
    public override void ExitKey (MyGrammarParser.KeyContext context) {
        Console.WriteLine("Oh, a key!");
    }
}
```
|
||||
|
||||
In order to execute this listener, you would simply add the following lines to the above code:
|
||||
|
||||
|
||||
```
|
||||
// ...
IParseTree tree = parser.StartRule(); // only repeated here for reference
KeyPrinter printer = new KeyPrinter();
ParseTreeWalker.Default.Walk(printer, tree);
|
||||
```
|
||||
|
||||
Further information can be found in the book "The Definitive ANTLR 4 Reference".
|
||||
|
||||
The C# implementation of ANTLR is as close as possible to the Java one, so you shouldn't find it difficult to adapt the examples for C#.
|
Binary file not shown.
After Width: | Height: | Size: 87 KiB |
|
@ -49,7 +49,7 @@ This documentation is a reference and summarizes grammar syntax and the key sema
|
|||
|
||||
* [ANTLR Tool Command Line Options](tool-options.md)
|
||||
|
||||
* Runtime Libraries and Code Generation Targets
|
||||
* [Runtime Libraries and Code Generation Targets](targets.md)
|
||||
|
||||
* Parser and lexer interpreters
|
||||
|
||||
|
|
|
@ -0,0 +1,157 @@
|
|||
# JavaScript
|
||||
|
||||
## Which browsers are supported?
|
||||
|
||||
In theory, all browsers supporting ECMAScript 5.1.
|
||||
|
||||
In practice, this target has been extensively tested against:
|
||||
|
||||
* Firefox 34.0.5
|
||||
* Safari 8.0.2
|
||||
* Chrome 39.0.2171
|
||||
* Explorer 11.0.3
|
||||
|
||||
The tests were conducted using Selenium. No issue was found, so you should find that the runtime works pretty much against any recent JavaScript engine.
|
||||
|
||||
## Is NodeJS supported?
|
||||
|
||||
The runtime has also been extensively tested against Node.js 0.10.33. No issue was found.
|
||||
|
||||
## How to create a JavaScript lexer or parser?
|
||||
|
||||
This is pretty much the same as creating a Java lexer or parser, except you need to specify the language target, for example:
|
||||
|
||||
```bash
|
||||
$ antlr4 -Dlanguage=JavaScript MyGrammar.g4
|
||||
```
|
||||
|
||||
For a full list of antlr4 tool options, please visit the [tool documentation page](tool-options.md).
|
||||
|
||||
## Where can I get the runtime?
|
||||
|
||||
Once you've generated the lexer and/or parser code, you need to download the runtime.
|
||||
|
||||
The JavaScript runtime is available from the ANTLR web site [download section](http://www.antlr.org/download/index.html). The runtime is provided in the form of source code, so no additional installation is required.
|
||||
|
||||
We will not document here how to refer to the runtime from your project, since this would differ a lot depending on your project type and IDE.
|
||||
|
||||
## How do I get the runtime in my browser?
|
||||
|
||||
The runtime is quite big and is currently maintained in the form of around 50 scripts, which follow the same structure as the runtimes for other targets (Java, C#, Python...).
|
||||
|
||||
This structure is key in keeping code maintainable and consistent across targets.
|
||||
|
||||
However, this becomes a problem when it comes to loading the runtime into a browser. Nobody wants to write the following 50 times:
|
||||
|
||||
```
|
||||
<script src='lib/myscript.js'></script>
|
||||
```
|
||||
|
||||
In order to avoid having to do this, and also to have the exact same code for browsers and Node.js, we rely on a script which provides the equivalent of the Node.js 'require' function.
|
||||
|
||||
This script is provided by Torben Haase, and is NOT part of ANTLR JavaScript runtime, although the runtime heavily relies on it. Please note that syntax for 'require' in NodeJS is different from the one implemented by RequireJS and similar frameworks.
|
||||
|
||||
So in short, assuming you have at the root of your web site, both the 'antlr4' directory and a 'lib' directory with 'require.js' inside it, all you need to put in your HTML header is the following:
|
||||
|
||||
```xml
<script src='lib/require.js'></script>
<script>
    var antlr4 = require('antlr4/index');
</script>
```
|
||||
|
||||
This will load the runtime asynchronously.
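
The difference in 'require' syntax mentioned above can be illustrated with two toy loaders. This is only a sketch: `nodeStyleRequire`, `amdStyleDefine` and `amdStyleRequire` are hypothetical names written for this page, and neither loader is the script used by the runtime nor RequireJS itself.

```javascript
// Node.js style: require(name) synchronously returns the module's exports.
var nodeStyleModules = {
  greeting: function (module) { module.exports = { text: "hello" }; }
};
var nodeStyleCache = {};
function nodeStyleRequire(name) {
  if (!(name in nodeStyleCache)) {
    var module = { exports: {} };
    nodeStyleModules[name](module);
    nodeStyleCache[name] = module.exports;
  }
  return nodeStyleCache[name];
}

// AMD style (RequireJS and similar): dependencies are named up front
// and handed to a callback once they have been resolved.
var amdModules = {};
function amdStyleDefine(name, deps, factory) {
  amdModules[name] = { deps: deps, factory: factory };
}
function amdStyleRequire(deps, callback) {
  var resolved = deps.map(function (name) {
    var entry = amdModules[name];
    var value;
    amdStyleRequire(entry.deps, function () {
      value = entry.factory.apply(null, arguments);
    });
    return value;
  });
  callback.apply(null, resolved);
}

amdStyleDefine("greeting", [], function () { return { text: "hello" }; });

// Same module, two calling conventions.
var fromNodeStyle = nodeStyleRequire("greeting").text;
var fromAmdStyle;
amdStyleRequire(["greeting"], function (greeting) {
  fromAmdStyle = greeting.text;
});
```

Code written for the first convention expects a return value, so it cannot simply be handed the second kind of function.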
|
||||
|
||||
## How do I get the runtime in Node.js?
|
||||
|
||||
Right now, there is no npm package available, so you need to register a link instead. This can be done by running the following command from the antlr4 directory:
|
||||
|
||||
```bash
|
||||
$ npm link antlr4
|
||||
```
|
||||
|
||||
This will install antlr4 using the package.json descriptor that comes with the script.
|
||||
|
||||
## How do I run the generated lexer and/or parser?
|
||||
|
||||
Let's suppose that your grammar is named, as above, "MyGrammar". Let's suppose this parser comprises a rule named "StartRule". The tool will have generated for you the following files:
|
||||
|
||||
* MyGrammarLexer.js
|
||||
* MyGrammarParser.js
|
||||
* MyGrammarListener.js (if you have not activated the -no-listener option)
|
||||
* MyGrammarVisitor.js (if you have activated the -visitor option)
|
||||
|
||||
(Developers used to Java/C# ANTLR will notice that there is no base listener or visitor generated. This is because JavaScript has no support for interfaces, so the generated listener and visitor are fully fledged classes.)
|
||||
|
||||
Now a fully functioning script might look like the following:
|
||||
|
||||
```javascript
// assumes the runtime and the generated files are reachable through these paths
var antlr4 = require('antlr4/index');
var MyGrammarLexer = require('./MyGrammarLexer');
var MyGrammarParser = require('./MyGrammarParser');

var input = "your text to parse here";
var chars = new antlr4.InputStream(input);
var lexer = new MyGrammarLexer.MyGrammarLexer(chars);
var tokens = new antlr4.CommonTokenStream(lexer);
var parser = new MyGrammarParser.MyGrammarParser(tokens);
parser.buildParseTrees = true;
var tree = parser.StartRule();
```
|
||||
|
||||
This program will work. But it won't be useful unless you do one of the following:
|
||||
|
||||
* you visit the parse tree using a custom listener
|
||||
* you visit the parse tree using a custom visitor
|
||||
* your grammar comprises production code (like ANTLR 3)
|
||||
|
||||
(please note that production code is target specific, so you can't have multi target grammars that include production code)
|
||||
|
||||
## How do I create and run a custom listener?
|
||||
|
||||
Let's suppose your MyGrammar grammar comprises 2 rules: "key" and "value". The antlr4 tool will have generated the following listener:
|
||||
|
||||
```javascript
|
||||
MyGrammarListener = function(ParseTreeListener) {
|
||||
// some code here
|
||||
}
|
||||
// some code here
|
||||
MyGrammarListener.prototype.enterKey = function(ctx) {};
|
||||
MyGrammarListener.prototype.exitKey = function(ctx) {};
|
||||
MyGrammarListener.prototype.enterValue = function(ctx) {};
|
||||
MyGrammarListener.prototype.exitValue = function(ctx) {};
|
||||
```
|
||||
|
||||
In order to provide custom behavior, you might want to create the following class:
|
||||
|
||||
```javascript
|
||||
KeyPrinter = function() {
|
||||
MyGrammarListener.call(this); // inherit default listener
|
||||
return this;
|
||||
};
|
||||
|
||||
// inherit default listener
|
||||
KeyPrinter.prototype = Object.create(MyGrammarListener.prototype);
|
||||
KeyPrinter.prototype.constructor = KeyPrinter;
|
||||
|
||||
// override default listener behavior
|
||||
KeyPrinter.prototype.exitKey = function(ctx) {
|
||||
console.log("Oh, a key!");
|
||||
};
|
||||
```
|
||||
|
||||
In order to execute this listener, you would simply add the following lines to the above code:
|
||||
|
||||
```javascript
|
||||
// ...
var tree = parser.StartRule(); // only repeated here for reference
var printer = new KeyPrinter();
antlr4.tree.ParseTreeWalker.DEFAULT.walk(printer, tree);
|
||||
```
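
If you wonder what the walker does with your listener, the sketch below shows the general idea on a toy tree. It is a simplified illustration, not the runtime's `ParseTreeWalker`: the `walk` function, the `{ rule, children }` node shape and the `recordingListener` object are all assumptions made for this example.

```javascript
// Hypothetical, simplified walker: calls enterXxx before the children
// of a node and exitXxx after them, when the listener defines them.
function walk(listener, node) {
  var suffix = node.rule.charAt(0).toUpperCase() + node.rule.slice(1);
  if (typeof listener["enter" + suffix] === "function") {
    listener["enter" + suffix](node);
  }
  (node.children || []).forEach(function (child) {
    walk(listener, child);
  });
  if (typeof listener["exit" + suffix] === "function") {
    listener["exit" + suffix](node);
  }
}

// A listener that records which callbacks were invoked, and in what order.
var events = [];
var recordingListener = {
  enterKey: function () { events.push("enterKey"); },
  exitKey: function () { events.push("exitKey"); },
  exitValue: function () { events.push("exitValue"); }
};

// Toy tree standing in for the result of parsing a key followed by a value.
var toyTree = {
  rule: "startRule",
  children: [
    { rule: "key", children: [] },
    { rule: "value", children: [] }
  ]
};

walk(recordingListener, toyTree);
console.log(events.join(","));
```

Callbacks you do not override simply do nothing, which is why a listener such as KeyPrinter only needs to define the methods it cares about.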
|
||||
|
||||
## How do I integrate my parser with ACE editor?
|
||||
|
||||
This specific task is described in this [dedicated page](ace-javascript-target.md).
|
||||
|
||||
## How can I learn more about ANTLR?
|
||||
|
||||
|
||||
Further information can be found in the book "The Definitive ANTLR 4 Reference".
|
||||
|
||||
The JavaScript implementation of ANTLR is as close as possible to the Java one, so you shouldn't find it difficult to adapt the book's examples to JavaScript.
|
|
@ -0,0 +1,128 @@
|
|||
# Python (2 and 3)
|
||||
|
||||
The examples from the ANTLR 4 book converted to Python are [here](https://github.com/jszheng/py3antlr4book).
|
||||
|
||||
There are 2 Python targets: `Python2` and `Python3`. This is because there is only limited compatibility between those 2 versions of the language. Please refer to the [Python documentation](https://wiki.python.org/moin/Python2orPython3) for full details.
|
||||
|
||||
## How to create a Python lexer or parser?
|
||||
This is pretty much the same as creating a Java lexer or parser, except you need to specify the language target, for example:
|
||||
|
||||
```
|
||||
$ antlr4 -Dlanguage=Python2 MyGrammar.g4
|
||||
```
|
||||
|
||||
or
|
||||
|
||||
```
|
||||
$ antlr4 -Dlanguage=Python3 MyGrammar.g4
|
||||
```
|
||||
|
||||
For a full list of antlr4 tool options, please visit the [tool documentation page](tool-options.md).
|
||||
|
||||
## Where can I get the runtime?
|
||||
|
||||
Once you've generated the lexer and/or parser code, you need to download the runtime. The Python runtimes are available from PyPI:
|
||||
|
||||
* https://pypi.python.org/pypi/antlr4-python2-runtime/
|
||||
* https://pypi.python.org/pypi/antlr4-python3-runtime/
|
||||
|
||||
The runtimes are provided in the form of source code, so no additional installation is required.
|
||||
|
||||
We will not document here how to refer to the runtime from your Python project, since this would differ a lot depending on your project type and IDE.
|
||||
|
||||
## How do I run the generated lexer and/or parser?
|
||||
|
||||
Let's suppose that your grammar is named, as above, "MyGrammar". Let's suppose this parser comprises a rule named "StartRule". The tool will have generated for you the following files:
|
||||
|
||||
* MyGrammarLexer.py
|
||||
* MyGrammarParser.py
|
||||
* MyGrammarListener.py (if you have not activated the -no-listener option)
|
||||
* MyGrammarVisitor.py (if you have activated the -visitor option)
|
||||
|
||||
(Developers used to Java/C# ANTLR will notice that there is no base listener or visitor generated. This is because Python has no support for interfaces, so the generated listener and visitor are fully fledged classes.)
|
||||
|
||||
Now a fully functioning script might look like the following:
|
||||
|
||||
```python
import sys
from antlr4 import *
from MyGrammarLexer import MyGrammarLexer
from MyGrammarParser import MyGrammarParser

def main(argv):
    input = FileStream(argv[1])
    lexer = MyGrammarLexer(input)
    stream = CommonTokenStream(lexer)
    parser = MyGrammarParser(stream)
    tree = parser.StartRule()

if __name__ == '__main__':
    main(sys.argv)
```
|
||||
|
||||
This program will work. But it won't be useful unless you do one of the following:
|
||||
|
||||
* you visit the parse tree using a custom listener
|
||||
* you visit the parse tree using a custom visitor
|
||||
* your grammar comprises production code (like ANTLR3)
|
||||
|
||||
(please note that production code is target specific, so you can't have multi target grammars that include production code, except for very limited use cases, see below)
|
||||
|
||||
## How do I create and run a custom listener?
|
||||
|
||||
Let's suppose your MyGrammar grammar comprises 2 rules: "key" and "value". The antlr4 tool will have generated the following listener:
|
||||
|
||||
```python
class MyGrammarListener(ParseTreeListener):
    def enterKey(self, ctx):
        pass
    def exitKey(self, ctx):
        pass
    def enterValue(self, ctx):
        pass
    def exitValue(self, ctx):
        pass
```
|
||||
|
||||
In order to provide custom behavior, you might want to create the following class:
|
||||
|
||||
```python
class KeyPrinter(MyGrammarListener):
    def exitKey(self, ctx):
        print("Oh, a key!")
```
|
||||
|
||||
In order to execute this listener, you would simply add the following lines to the above code:
|
||||
|
||||
```
|
||||
...
|
||||
tree = parser.StartRule()  # only repeated here for reference
|
||||
printer = KeyPrinter()
|
||||
walker = ParseTreeWalker()
|
||||
walker.walk(printer, tree)
|
||||
```
|
||||
|
||||
Further information can be found in the book "The Definitive ANTLR 4 Reference".
|
||||
|
||||
The Python implementation of ANTLR is as close as possible to the Java one, so you shouldn't find it difficult to adapt the examples for Python.
|
||||
|
||||
## Target agnostic grammars
|
||||
|
||||
If your grammar is targeted to Python only, you may ignore the following. But if your goal is to get your Java parser to also run in Python, then you might find it useful.
|
||||
|
||||
1. Do not embed production code inside your grammar. This is not portable and will not be. Move all your code to listeners or visitors.
|
||||
1. The only production code absolutely required to sit with the grammar should be semantic predicates, like:
|
||||
```
|
||||
ID {$text.equals("test")}?
|
||||
```
|
||||
|
||||
Unfortunately, this is not portable, but you can work around it. The trick involves:
|
||||
|
||||
* deriving your parser from a parser you provide, such as BaseParser
|
||||
* implementing utility methods in this BaseParser, such as "isEqualText"
|
||||
* adding a "self" field to the Java/C# BaseParser, and initializing it with "this"
|
||||
|
||||
Thanks to the above, you should be able to rewrite the above semantic predicate as follows:
|
||||
|
||||
```
|
||||
ID {$self.isEqualText($text,"test")}?
|
||||
```
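
The base parser part of this trick can be sketched as follows, here in JavaScript to stay consistent with the other examples in this documentation. Everything in the sketch is hypothetical: `BaseParser` and `MyGrammarParserSketch` are illustrative names, and in a real project the base class would extend the ANTLR parser of the target in question.

```javascript
// Hypothetical base class holding the helpers used by semantic predicates.
function BaseParser() {
  // lets predicates written against "self" resolve to the parser itself
  this.self = this;
}

BaseParser.prototype.isEqualText = function (actual, expected) {
  return actual === expected;
};

// Stand-in for the generated parser, which derives from the base class.
function MyGrammarParserSketch() {
  BaseParser.call(this);
}
MyGrammarParserSketch.prototype = Object.create(BaseParser.prototype);
MyGrammarParserSketch.prototype.constructor = MyGrammarParserSketch;

var sketchParser = new MyGrammarParserSketch();
console.log(sketchParser.self.isEqualText("test", "test"));
console.log(sketchParser.self.isEqualText("other", "test"));
```

The comparison logic now lives in one helper per target, so the text of the predicate in the grammar can stay the same.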
|
|
@ -0,0 +1,24 @@
|
|||
# Runtime Libraries and Code Generation Targets
|
||||
|
||||
This page lists the available and upcoming ANTLR runtimes. Please note that you won't find here language specific code generators. This is because there is only one tool, written in Java, which is able to generate lexer and parser code for all targets, through command line options. The tool can be invoked from the command line, or any integration plugin to popular IDEs and build systems: Eclipse, IntelliJ, Visual Studio, Maven. So whatever your environment and target is, you should be able to run the tool and produce the code in the targeted language. As of writing, the available targets are the following:
|
||||
|
||||
* **Java**<br>
|
||||
The [ANTLR v4 book](http://pragprog.com/book/tpantlr2/the-definitive-antlr-4-reference) has a decent summary of the runtime library. We have added a useful XPath feature since the book was printed that lets you select bits of parse trees.
|
||||
<br>[Runtime API](http://www.antlr.org/api/Java/index.html)
|
||||
<br>See [Getting Started with ANTLR v4](getting-started.md)
|
||||
|
||||
* [C#](csharp-target.md)
|
||||
* [Python](python-target.md) (2 and 3)
|
||||
* [JavaScript](javascript-target.md)
|
||||
* Swift (not yet available)
|
||||
* C++ (not yet available)
|
||||
|
||||
## Target feature parity
|
||||
|
||||
New features generally appear in the Java target and then migrate to the other targets, but these other targets don't always get updated in the same overall tool release. This section tries to identify features added to Java that have not been added to the other targets.
|
||||
|
||||
|Feature|Java|C♯|JavaScript|Python2|Python3|Swift|C++|
|-|-|-|-|-|-|-|-|
|Ambiguous tree construction|4.5.1|-|-|-|-|-|-|
|
||||
|
||||
|