diff --git a/contributors.txt b/contributors.txt
index 7d3b44a41..fb6040c78 100644
--- a/contributors.txt
+++ b/contributors.txt
@@ -88,3 +88,4 @@ YYYY/MM/DD, github id, Full name, email
 2015/12/17, sebadur, Sebastian Badur, sebadur@users.noreply.github.com
 2015/12/23, pboyer, Peter Boyer, peter.b.boyer@gmail.com
 2015/12/24, dtymon, David Tymon, david.tymon@gmail.com
+2016/03/27, beardlybread, Bradley Steinbacher, bradley.j.steinbacher@gmail.com
diff --git a/doc/actions.md b/doc/actions.md
index 4ca01f9ec..91b6de1e4 100644
--- a/doc/actions.md
+++ b/doc/actions.md
@@ -2,7 +2,7 @@
 
 In Chapter 10, Attributes and Actions, we learned how to embed actions within grammars and looked at the most common token and rule attributes. This section summarizes the important syntax and semantics from that chapter and provides a complete list of all available attributes. (You can learn more about actions in the grammar from the free excerpt on listeners and actions.)
 
-Actions are blocks of text written in the target language and enclosed in curly braces. The recognizer triggers them according to their locations within the grammar. For example, the following rule emits found a decl after the parser has seen a valid declaration:
+Actions are blocks of text written in the target language and enclosed in curly braces. The recognizer triggers them according to their locations within the grammar. For example, the following rule emits "found a decl" after the parser has seen a valid declaration:
 
 ```
 decl: type ID ';' {System.out.println("found a decl");} ;
diff --git a/doc/grammars.md b/doc/grammars.md
index 3a8e77b60..c40d974b6 100644
--- a/doc/grammars.md
+++ b/doc/grammars.md
@@ -83,7 +83,7 @@ $ grun MyELang stat
 
 If there were any `tokens` specifications, the main grammar would merge the token sets. Any named actions such as `@members` would be merged. In general, you should avoid named actions and actions within rules in imported grammars since that limits their reuse. ANTLR also ignores any options in imported grammars.
 
-Imported grammars can also import other grammars. ANTLR pursues all imported grammars in a depth-first fashion. If two or more imported grammars define ruler, ANTLR chooses the first version of `r` it finds. In the following diagram, ANTLR examines grammars in the following order `Nested`, `G1`, `G3`, `G2`.
+Imported grammars can also import other grammars. ANTLR pursues all imported grammars in a depth-first fashion. If two or more imported grammars define rule `r`, ANTLR chooses the first version of `r` it finds. In the following diagram, ANTLR examines grammars in the following order `Nested`, `G1`, `G3`, `G2`.
diff --git a/doc/left-recursion.md b/doc/left-recursion.md
index a08f4357d..3430e10e9 100644
--- a/doc/left-recursion.md
+++ b/doc/left-recursion.md
@@ -1,6 +1,6 @@
 # Left-recursive rules
 
-The most natural expression of a some common language constructs is left recursive. For example C declarators and arithmetic expressions. Unfortunately, left recursive specifications of arithmetic expressions are typically ambiguous but much easier to write out than the multiple levels required in a typical top-down grammar. Here is a sample ANTLR 4 grammar with a left recursive expression rule:
+The most natural expression of some common language constructs is left recursive. For example C declarators and arithmetic expressions. Unfortunately, left recursive specifications of arithmetic expressions are typically ambiguous but much easier to write out than the multiple levels required in a typical top-down grammar. Here is a sample ANTLR 4 grammar with a left recursive expression rule:
 
 ```
 stat: expr '=' expr ';' // e.g., x=y; or x=f(x);
diff --git a/doc/lexer-rules.md b/doc/lexer-rules.md
index 2141b4303..adda9e8b0 100644
--- a/doc/lexer-rules.md
+++ b/doc/lexer-rules.md
@@ -171,7 +171,7 @@ error(126): P.g4:3:4: cannot create implicit token for string literal '&' in non
 
 ## Lexer Rule Actions
 
-An ANTLR lexer creates a Token object after matching a lexical rule. Each request for a token starts in Lexer.nextToken, which calls emit once it has identified a token.emit collects information from the current state of the lexer to build the token. It accesses fields `_type`, `_text`, `_channel`, `_tokenStartCharIndex`, `_tokenStartLine`, and `_tokenStartCharPositionInLine`. You can set the state of these with the various setter methods such as `setType`. For example, the following rule turns `enum` into an identifier if `enumIsKeyword` is false.
+An ANTLR lexer creates a Token object after matching a lexical rule. Each request for a token starts in `Lexer.nextToken`, which calls `emit` once it has identified a token. `emit` collects information from the current state of the lexer to build the token. It accesses fields `_type`, `_text`, `_channel`, `_tokenStartCharIndex`, `_tokenStartLine`, and `_tokenStartCharPositionInLine`. You can set the state of these with the various setter methods such as `setType`. For example, the following rule turns `enum` into an identifier if `enumIsKeyword` is false.
 
 ```
 ENUM : 'enum' {if (!enumIsKeyword) setType(Identifier);} ;
@@ -255,7 +255,8 @@ WS : [ \r\t\n]+ -> skip ;
 ```
 
 For multiple 'type()' commands, only the rightmost has an effect.
-channel()
+
+### channel()
 
 ```
 BLOCK_COMMENT
diff --git a/doc/options.md b/doc/options.md
index 1b2c2591b..36906db20 100644
--- a/doc/options.md
+++ b/doc/options.md
@@ -25,7 +25,7 @@ options {...}
 
 ## Rule Element Options
 
-Token options have the form `T` as we saw in Section 5.4, [Dealing with Precedence, Left Recursion, and Associativity](http://pragprog.com/book/tpantlr2/the-definitive-antlr-4-reference). The only token option is assocand it accepts values left and right. Here’s a sample grammar with a left-recursive expression rule that specifies a token option on the `^` exponent operator token:
+Token options have the form `T` as we saw in Section 5.4, [Dealing with Precedence, Left Recursion, and Associativity](http://pragprog.com/book/tpantlr2/the-definitive-antlr-4-reference). The only token option is `assoc`, and it accepts values `left` and `right`. Here’s a sample grammar with a left-recursive expression rule that specifies a token option on the `^` exponent operator token:
 
 ```
 grammar ExprLR;
@@ -40,7 +40,7 @@ INT : '0'..'9'+ ;
 WS : [ \n]+ -> skip ;
 ```
 
-Semantic predicates also accept an option, per [Catching failed semantic predicates](http://pragprog.com/book/tpantlr2/the-definitive-antlr-4-reference). The only valid option is the fail option, which takes either a string literal in double-quotes or an action that evaluates to a string. The string literal or string result from the action should be the message to emit upon predicate failure.
+Semantic predicates also accept an option, per [Catching failed semantic predicates](http://pragprog.com/book/tpantlr2/the-definitive-antlr-4-reference). The only valid option is the `fail` option, which takes either a string literal in double-quotes or an action that evaluates to a string. The string literal or string result from the action should be the message to emit upon predicate failure.
 
 ```
 ints[int max]
diff --git a/doc/predicates.md b/doc/predicates.md
index af1d278e7..09998c425 100644
--- a/doc/predicates.md
+++ b/doc/predicates.md
@@ -29,7 +29,7 @@ expr: {istype()}? ID '(' expr ')' // ctor-style typecast
     ;
 ```
 
-The parser will only predict an expr from stat when `istype()||isfunc()` evaluates to true. This makes sense because the parser should only choose to match an expression if the upcoming `ID` is a type name or function name. It wouldn't make sense to just test one of the predicates in this case. Note that, when the parser gets to expritself, the parsing decision tests the predicates individually, one for each alternative.
+The parser will only predict an expr from stat when `istype()||isfunc()` evaluates to true. This makes sense because the parser should only choose to match an expression if the upcoming `ID` is a type name or function name. It wouldn't make sense to just test one of the predicates in this case. Note that, when the parser gets to `expr` itself, the parsing decision tests the predicates individually, one for each alternative.
 
 If multiple predicates occur in a sequence, the parser joins them with the `&&` operator. For example, consider changing `stat` to include a predicate before the call `toexpr`:
@@ -72,7 +72,7 @@ stat: {System.out.println("goto"); allowgoto=true;} {java5}? 'goto' ID ';'
 
 If we can't execute the action during prediction, we shouldn't evaluate the `{java5}?` predicate because it depends on that action.
 
-The prediction process also can't see through token references. Token references have the side effect of advancing the input one symbol. A predicate that tested the current input symbol would find itself out of sync if the parser shifted it over the token reference. For example, in the following grammar, the predicates expectgetCurrentToken to return an ID token.
+The prediction process also can't see through token references. Token references have the side effect of advancing the input one symbol. A predicate that tested the current input symbol would find itself out of sync if the parser shifted it over the token reference. For example, in the following grammar, the predicates expect `getCurrentToken` to return an `ID` token.
 
 ```
 stat: '{' decl '}'
diff --git a/doc/tree-matching.md b/doc/tree-matching.md
index 5b7d94b58..f4c6d278c 100644
--- a/doc/tree-matching.md
+++ b/doc/tree-matching.md
@@ -29,7 +29,7 @@ ParseTreeMatch m = p.match(t);
 if ( m.succeeded() ) {...}
 ```
 
-We can also test for specific expressions or token values. For example, the following checks to see if t is an expression consisting of an identifier added to 0:
+We can also test for specific expressions or token values. For example, the following checks to see if `t` is an expression consisting of an identifier added to 0:
 
 ```java
 ParseTree t = ...; // assume t is an expression