v4: Add configurable performance unit test

[git-p4: depot-paths = "//depot/code/antlr4/main/": change = 9498]
sharwell 2011-11-30 09:37:12 -08:00
parent 49ea01136c
commit 1ba52d6f54
3 changed files with 2255 additions and 0 deletions

@@ -0,0 +1,929 @@
/*
[The "BSD licence"]
Copyright (c) 2007-2008 Terence Parr
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. The name of the author may not be used to endorse or promote products
derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
/** A Java 1.5 grammar for ANTLR v3 derived from the spec
*
* This is a very close representation of the spec; the changes
* are cosmetic (remove left recursion) and also fixes (the spec
* isn't exactly perfect). I have run this on the 1.4.2 source
* and some nasty looking enums from 1.5, but have not really
* tested for 1.5 compatibility.
*
* I built this with: java -Xmx100M org.antlr.Tool java.g
* and got two errors that are ok (for now):
* java.g:691:9: Decision can match input such as
* "'0'..'9'{'E', 'e'}{'+', '-'}'0'..'9'{'D', 'F', 'd', 'f'}"
* using multiple alternatives: 3, 4
* As a result, alternative(s) 4 were disabled for that input
* java.g:734:35: Decision can match input such as "{'$', 'A'..'Z',
* '_', 'a'..'z', '\u00C0'..'\u00D6', '\u00D8'..'\u00F6',
* '\u00F8'..'\u1FFF', '\u3040'..'\u318F', '\u3300'..'\u337F',
* '\u3400'..'\u3D2D', '\u4E00'..'\u9FFF', '\uF900'..'\uFAFF'}"
* using multiple alternatives: 1, 2
* As a result, alternative(s) 2 were disabled for that input
*
* You can turn enum on/off as a keyword :)
*
* Version 1.0 -- initial release July 5, 2006 (requires 3.0b2 or higher)
*
* Primary author: Terence Parr, July 2006
*
* Version 1.0.1 -- corrections by Koen Vanderkimpen & Marko van Dooren,
* October 25, 2006;
* fixed normalInterfaceDeclaration: now uses typeParameters instead
* of typeParameter (according to JLS, 3rd edition)
* fixed castExpression: no longer allows expression next to type
* (according to semantics in JLS, in contrast with syntax in JLS)
*
* Version 1.0.2 -- Terence Parr, Nov 27, 2006
* java spec I built this from had some bizarre for-loop control.
* Looked weird and so I looked elsewhere...Yep, it's messed up.
* simplified.
*
* Version 1.0.3 -- Chris Hogue, Feb 26, 2007
* Factored out an annotationName rule and used it in the annotation rule.
* Not sure why, but typeName wasn't recognizing references to inner
* annotations (e.g. @InterfaceName.InnerAnnotation())
* Factored out the elementValue section of an annotation reference. Created
* elementValuePair and elementValuePairs rules, then used them in the
* annotation rule. Allows it to recognize annotation references with
* multiple, comma separated attributes.
* Updated elementValueArrayInitializer so that it allows multiple elements.
* (It was only allowing 0 or 1 element).
* Updated localVariableDeclaration to allow annotations. Interestingly the JLS
* doesn't appear to indicate this is legal, but it does work as of at least
* JDK 1.5.0_06.
* Moved the Identifier portion of annotationTypeElementRest to annotationMethodRest.
* Because annotationConstantRest already references variableDeclarator which
* has the Identifier portion in it, the parser would fail on constants in
* annotation definitions because it expected two identifiers.
* Added optional trailing ';' to the alternatives in annotationTypeElementRest.
* Wouldn't handle an inner interface that has a trailing ';'.
* Swapped the expression and type rule reference order in castExpression to
* make it check for genericized casts first. It was failing to recognize a
* statement like "Class<Byte> TYPE = (Class<Byte>)...;" because it was seeing
* 'Class<Byte' in the cast expression as a less than expression, then failing
* on the '>'.
* Changed createdName to use typeArguments instead of nonWildcardTypeArguments.
* Changed the 'this' alternative in primary to allow 'identifierSuffix' rather than
* just 'arguments'. The case it couldn't handle was a call to an explicit
* generic method invocation (e.g. this.<E>doSomething()). Using identifierSuffix
* may be overly aggressive--perhaps should create a more constrained thisSuffix rule?
*
* Version 1.0.4 -- Hiroaki Nakamura, May 3, 2007
*
* Fixed formalParameterDecls, localVariableDeclaration, forInit,
* and forVarControl to use variableModifier* not 'final'? (annotation)?
*
* Version 1.0.5 -- Terence, June 21, 2007
* --a[i].foo didn't work. Fixed unaryExpression
*
* Version 1.0.6 -- John Ridgway, March 17, 2008
* Made "assert" a switchable keyword like "enum".
* Fixed compilationUnit to disallow "annotation importDeclaration ...".
* Changed "Identifier ('.' Identifier)*" to "qualifiedName" in more
* places.
* Changed modifier* and/or variableModifier* to classOrInterfaceModifiers,
* modifiers or variableModifiers, as appropriate.
* Renamed "bound" to "typeBound" to better match language in the JLS.
* Added "memberDeclaration" which rewrites to methodDeclaration or
* fieldDeclaration and pulled type into memberDeclaration. So we parse
* type and then move on to decide whether we're dealing with a field
* or a method.
* Modified "constructorDeclaration" to use "constructorBody" instead of
* "methodBody". constructorBody starts with explicitConstructorInvocation,
* then goes on to blockStatement*. Pulling explicitConstructorInvocation
* out of expressions allowed me to simplify "primary".
* Changed variableDeclarator to simplify it.
* Changed type to use classOrInterfaceType, thus simplifying it; of course
* I then had to add classOrInterfaceType, but it is used in several
* places.
* Fixed annotations; the old version allowed "@X(y,z)", which is illegal.
* Added optional comma to end of "elementValueArrayInitializer"; as per JLS.
* Changed annotationTypeElementRest to use normalClassDeclaration and
* normalInterfaceDeclaration rather than classDeclaration and
* interfaceDeclaration, thus getting rid of a couple of grammar ambiguities.
* Split localVariableDeclaration into localVariableDeclarationStatement
* (includes the terminating semi-colon) and localVariableDeclaration.
* This allowed me to use localVariableDeclaration in "forInit" clauses,
* simplifying them.
* Changed switchBlockStatementGroup to use multiple labels. This adds an
* ambiguity, but if one uses appropriately greedy parsing it yields the
* parse that is closest to the meaning of the switch statement.
* Renamed "forVarControl" to "enhancedForControl" -- JLS language.
* Added semantic predicates to test for shift operations rather than other
* things. Thus, for instance, the string "< <" will never be treated
* as a left-shift operator.
* In "creator" we rule out "nonWildcardTypeArguments" on arrayCreation,
* which are illegal.
* Moved "nonWildcardTypeArguments" into innerCreator.
* Removed 'super' superSuffix from explicitGenericInvocation, since that
* is only used in explicitConstructorInvocation at the beginning of a
* constructorBody. (This is part of the simplification of expressions
* mentioned earlier.)
* Simplified primary (got rid of those things that are only used in
* explicitConstructorInvocation).
* Lexer -- removed "Exponent?" from FloatingPointLiteral choice 4, since it
* led to an ambiguity.
*
* This grammar successfully parses every .java file in the JDK 1.5 source
* tree (excluding those whose file names include '-', which are not
* valid Java compilation units).
*
* June 26, 2008
*
* conditionalExpression had wrong precedence x?y:z.
*
* February 26, 2011
* added left-recursive expression rule
*
* Known remaining problems:
* "Letter" and "JavaIDDigit" are wrong. The actual specification of
* "Letter" should be "a character for which the method
* Character.isJavaIdentifierStart(int) returns true." A "Java
* letter-or-digit is a character for which the method
* Character.isJavaIdentifierPart(int) returns true."
*/
grammar Java;
options {backtrack=true; memoize=true;}
@lexer::members {
protected boolean enumIsKeyword = true;
protected boolean assertIsKeyword = true;
}
// starting point for parsing a java file
/* The annotations are separated out to make parsing faster, but must be associated with
a packageDeclaration or a typeDeclaration (and not an empty one). */
compilationUnit
: annotations
( packageDeclaration importDeclaration* typeDeclaration*
| classOrInterfaceDeclaration typeDeclaration*
)
| packageDeclaration? importDeclaration* typeDeclaration*
;
packageDeclaration
: 'package' qualifiedName ';'
;
importDeclaration
: 'import' 'static'? qualifiedName ('.' '*')? ';'
;
typeDeclaration
: classOrInterfaceDeclaration
| ';'
;
classOrInterfaceDeclaration
: classOrInterfaceModifiers (classDeclaration | interfaceDeclaration)
;
classOrInterfaceModifiers
: classOrInterfaceModifier*
;
classOrInterfaceModifier
: annotation // class or interface
| 'public' // class or interface
| 'protected' // class or interface
| 'private' // class or interface
| 'abstract' // class or interface
| 'static' // class or interface
| 'final' // class only -- does not apply to interfaces
| 'strictfp' // class or interface
;
modifiers
: modifier*
;
classDeclaration
: normalClassDeclaration
| enumDeclaration
;
normalClassDeclaration
: 'class' Identifier typeParameters?
('extends' type)?
('implements' typeList)?
classBody
;
typeParameters
: '<' typeParameter (',' typeParameter)* '>'
;
typeParameter
: Identifier ('extends' typeBound)?
;
typeBound
: type ('&' type)*
;
enumDeclaration
: ENUM Identifier ('implements' typeList)? enumBody
;
enumBody
: '{' enumConstants? ','? enumBodyDeclarations? '}'
;
enumConstants
: enumConstant (',' enumConstant)*
;
enumConstant
: annotations? Identifier arguments? classBody?
;
enumBodyDeclarations
: ';' (classBodyDeclaration)*
;
interfaceDeclaration
: normalInterfaceDeclaration
| annotationTypeDeclaration
;
normalInterfaceDeclaration
: 'interface' Identifier typeParameters? ('extends' typeList)? interfaceBody
;
typeList
: type (',' type)*
;
classBody
: '{' classBodyDeclaration* '}'
;
interfaceBody
: '{' interfaceBodyDeclaration* '}'
;
classBodyDeclaration
: ';'
| 'static'? block
| modifiers memberDecl
;
memberDecl
: genericMethodOrConstructorDecl
| memberDeclaration
| 'void' Identifier voidMethodDeclaratorRest
| Identifier constructorDeclaratorRest
| interfaceDeclaration
| classDeclaration
;
memberDeclaration
: type (methodDeclaration | fieldDeclaration)
;
genericMethodOrConstructorDecl
: typeParameters genericMethodOrConstructorRest
;
genericMethodOrConstructorRest
: (type | 'void') Identifier methodDeclaratorRest
| Identifier constructorDeclaratorRest
;
methodDeclaration
: Identifier methodDeclaratorRest
;
fieldDeclaration
: variableDeclarators ';'
;
interfaceBodyDeclaration
: modifiers interfaceMemberDecl
| ';'
;
interfaceMemberDecl
: interfaceMethodOrFieldDecl
| interfaceGenericMethodDecl
| 'void' Identifier voidInterfaceMethodDeclaratorRest
| interfaceDeclaration
| classDeclaration
;
interfaceMethodOrFieldDecl
: type Identifier interfaceMethodOrFieldRest
;
interfaceMethodOrFieldRest
: constantDeclaratorsRest ';'
| interfaceMethodDeclaratorRest
;
methodDeclaratorRest
: formalParameters ('[' ']')*
('throws' qualifiedNameList)?
( methodBody
| ';'
)
;
voidMethodDeclaratorRest
: formalParameters ('throws' qualifiedNameList)?
( methodBody
| ';'
)
;
interfaceMethodDeclaratorRest
: formalParameters ('[' ']')* ('throws' qualifiedNameList)? ';'
;
interfaceGenericMethodDecl
: typeParameters (type | 'void') Identifier
interfaceMethodDeclaratorRest
;
voidInterfaceMethodDeclaratorRest
: formalParameters ('throws' qualifiedNameList)? ';'
;
constructorDeclaratorRest
: formalParameters ('throws' qualifiedNameList)? constructorBody
;
constantDeclarator
: Identifier constantDeclaratorRest
;
variableDeclarators
: variableDeclarator (',' variableDeclarator)*
;
variableDeclarator
: variableDeclaratorId ('=' variableInitializer)?
;
constantDeclaratorsRest
: constantDeclaratorRest (',' constantDeclarator)*
;
constantDeclaratorRest
: ('[' ']')* '=' variableInitializer
;
variableDeclaratorId
: Identifier ('[' ']')*
;
variableInitializer
: arrayInitializer
| expression
;
arrayInitializer
: '{' (variableInitializer (',' variableInitializer)* (',')? )? '}'
;
modifier
: annotation
| 'public'
| 'protected'
| 'private'
| 'static'
| 'abstract'
| 'final'
| 'native'
| 'synchronized'
| 'transient'
| 'volatile'
| 'strictfp'
;
packageOrTypeName
: qualifiedName
;
enumConstantName
: Identifier
;
typeName
: qualifiedName
;
type
: classOrInterfaceType ('[' ']')*
| primitiveType ('[' ']')*
;
classOrInterfaceType
: Identifier typeArguments? ('.' Identifier typeArguments? )*
;
primitiveType
: 'boolean'
| 'char'
| 'byte'
| 'short'
| 'int'
| 'long'
| 'float'
| 'double'
;
variableModifier
: 'final'
| annotation
;
typeArguments
: '<' typeArgument (',' typeArgument)* '>'
;
typeArgument
: type
| '?' (('extends' | 'super') type)?
;
qualifiedNameList
: qualifiedName (',' qualifiedName)*
;
formalParameters
: '(' formalParameterDecls? ')'
;
formalParameterDecls
: variableModifiers type formalParameterDeclsRest
;
formalParameterDeclsRest
: variableDeclaratorId (',' formalParameterDecls)?
| '...' variableDeclaratorId
;
methodBody
: block
;
constructorBody
: '{' explicitConstructorInvocation? blockStatement* '}'
;
explicitConstructorInvocation
: nonWildcardTypeArguments? ('this' | 'super') arguments ';'
| expression '.' nonWildcardTypeArguments? 'super' arguments ';'
;
qualifiedName
: Identifier ('.' Identifier)*
;
literal
: integerLiteral
| FloatingPointLiteral
| CharacterLiteral
| StringLiteral
| booleanLiteral
| 'null'
;
integerLiteral
: HexLiteral
| OctalLiteral
| DecimalLiteral
;
booleanLiteral
: 'true'
| 'false'
;
// ANNOTATIONS
annotations
: annotation+
;
annotation
: '@' annotationName ( '(' ( elementValuePairs | elementValue )? ')' )?
;
annotationName
: Identifier ('.' Identifier)*
;
elementValuePairs
: elementValuePair (',' elementValuePair)*
;
elementValuePair
: Identifier '=' elementValue
;
elementValue
: expression
| annotation
| elementValueArrayInitializer
;
elementValueArrayInitializer
: '{' (elementValue (',' elementValue)*)? (',')? '}'
;
annotationTypeDeclaration
: '@' 'interface' Identifier annotationTypeBody
;
annotationTypeBody
: '{' (annotationTypeElementDeclaration)* '}'
;
annotationTypeElementDeclaration
: modifiers annotationTypeElementRest
;
annotationTypeElementRest
: type annotationMethodOrConstantRest ';'
| normalClassDeclaration ';'?
| normalInterfaceDeclaration ';'?
| enumDeclaration ';'?
| annotationTypeDeclaration ';'?
;
annotationMethodOrConstantRest
: annotationMethodRest
| annotationConstantRest
;
annotationMethodRest
: Identifier '(' ')' defaultValue?
;
annotationConstantRest
: variableDeclarators
;
defaultValue
: 'default' elementValue
;
// STATEMENTS / BLOCKS
block
: '{' blockStatement* '}'
;
blockStatement
: localVariableDeclarationStatement
| classOrInterfaceDeclaration
| statement
;
localVariableDeclarationStatement
: localVariableDeclaration ';'
;
localVariableDeclaration
: variableModifiers type variableDeclarators
;
variableModifiers
: variableModifier*
;
statement
: block
| ASSERT expression (':' expression)? ';'
| 'if' parExpression statement (options {k=1;}:'else' statement)?
| 'for' '(' forControl ')' statement
| 'while' parExpression statement
| 'do' statement 'while' parExpression ';'
| 'try' block
( catches 'finally' block
| catches
| 'finally' block
)
| 'switch' parExpression '{' switchBlockStatementGroups '}'
| 'synchronized' parExpression block
| 'return' expression? ';'
| 'throw' expression ';'
| 'break' Identifier? ';'
| 'continue' Identifier? ';'
| ';'
| statementExpression ';'
| Identifier ':' statement
;
catches
: catchClause (catchClause)*
;
catchClause
: 'catch' '(' formalParameter ')' block
;
formalParameter
: variableModifiers type variableDeclaratorId
;
switchBlockStatementGroups
: (switchBlockStatementGroup)*
;
/* The change here (switchLabel -> switchLabel+) technically makes this grammar
ambiguous; but with appropriately greedy parsing it yields the most
appropriate AST, one in which each group, except possibly the last one, has
labels and statements. */
switchBlockStatementGroup
: switchLabel+ blockStatement*
;
switchLabel
: 'case' constantExpression ':'
| 'case' enumConstantName ':'
| 'default' ':'
;
forControl
options {k=3;} // be efficient for common case: for (ID ID : ID) ...
: enhancedForControl
| forInit? ';' expression? ';' forUpdate?
;
forInit
: localVariableDeclaration
| expressionList
;
enhancedForControl
: variableModifiers type Identifier ':' expression
;
forUpdate
: expressionList
;
// EXPRESSIONS
parExpression
: '(' expression ')'
;
expressionList
: expression (',' expression)*
;
statementExpression
: expression
;
constantExpression
: expression
;
expression
: parExpression
| 'this'
| 'super'
| literal
| Identifier
| expression '.' Identifier
| expression '.' 'class' // should be type.class but causes backtracking
| expression '.' 'this'
| expression '.' 'super' '(' expressionList? ')'
| expression '.' 'super' '.' Identifier arguments?
| expression '.' 'new' Identifier '(' expressionList? ')'
| expression '.' explicitGenericInvocation
| 'new' creator
| expression '[' expression ']'
| '(' type ')' expression
| expression ('++' | '--')
| expression '(' expressionList? ')'
| ('+'|'-'|'++'|'--') expression
| ('~'|'!') expression
| expression ('*'|'/'|'%') expression
| expression ('+'|'-') expression
| expression ('<' '<' | '>' '>' '>' | '>' '>') expression
| expression ('<' '=' | '>' '=' | '>' | '<') expression
| expression 'instanceof' type
| expression ('==' | '!=') expression
| expression '&' expression
| expression '^'<assoc=right> expression
| expression '|' expression
| expression '&&' expression
| expression '||' expression
| expression '?' expression ':' expression
| expression
('='<assoc=right>
| '+='<assoc=right>
| '-='<assoc=right>
| '*='<assoc=right>
| '/='<assoc=right>
| '&='<assoc=right>
| '|='<assoc=right>
| '^='<assoc=right>
| '>' '>' '='<assoc=right>
| '>' '>' '>' '='<assoc=right>
| '<' '<' '='<assoc=right>
| '%='<assoc=right>) expression
;
creator
: nonWildcardTypeArguments createdName classCreatorRest
| createdName (arrayCreatorRest | classCreatorRest)
;
createdName
: classOrInterfaceType
| primitiveType
;
innerCreator
: nonWildcardTypeArguments? Identifier classCreatorRest
;
arrayCreatorRest
: '['
( ']' ('[' ']')* arrayInitializer
| expression ']' ('[' expression ']')* ('[' ']')*
)
;
classCreatorRest
: arguments classBody?
;
explicitGenericInvocation
: nonWildcardTypeArguments Identifier arguments
;
nonWildcardTypeArguments
: '<' typeList '>'
;
selector
: '.' Identifier arguments?
| '.' 'this'
| '.' 'super' superSuffix
| '.' 'new' innerCreator
| '[' expression ']'
;
superSuffix
: arguments
| '.' Identifier arguments?
;
arguments
: '(' expressionList? ')'
;
// LEXER
HexLiteral : '0' ('x'|'X') HexDigit+ IntegerTypeSuffix? ;
DecimalLiteral : ('0' | '1'..'9' '0'..'9'*) IntegerTypeSuffix? ;
OctalLiteral : '0' ('0'..'7')+ IntegerTypeSuffix? ;
fragment
HexDigit : ('0'..'9'|'a'..'f'|'A'..'F') ;
fragment
IntegerTypeSuffix : ('l'|'L') ;
FloatingPointLiteral
: ('0'..'9')+ '.' ('0'..'9')* Exponent? FloatTypeSuffix?
| '.' ('0'..'9')+ Exponent? FloatTypeSuffix?
| ('0'..'9')+ Exponent FloatTypeSuffix?
| ('0'..'9')+ FloatTypeSuffix
| '0' ('x'|'X')
( HexDigit+ '.' HexDigit* Exponent? FloatTypeSuffix?
| '.' HexDigit+ Exponent? FloatTypeSuffix?
| HexDigit+ Exponent FloatTypeSuffix?
| HexDigit+ FloatTypeSuffix
)
;
fragment
Exponent : ('e'|'E'|'p'|'P') ('+'|'-')? ('0'..'9')+ ;
fragment
FloatTypeSuffix : ('f'|'F'|'d'|'D') ;
CharacterLiteral
: '\'' ( EscapeSequence | ~('\''|'\\') ) '\''
;
StringLiteral
: '"' ( EscapeSequence | ~('\\'|'"') )* '"'
;
fragment
EscapeSequence
: '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
| UnicodeEscape
| OctalEscape
;
fragment
OctalEscape
: '\\' ('0'..'3') ('0'..'7') ('0'..'7')
| '\\' ('0'..'7') ('0'..'7')
| '\\' ('0'..'7')
;
fragment
UnicodeEscape
: '\\' 'u' HexDigit HexDigit HexDigit HexDigit
;
ENUM: 'enum' {if (!enumIsKeyword) $type=Identifier;}
;
ASSERT
: 'assert' {if (!assertIsKeyword) $type=Identifier;}
;
Identifier
: Letter (Letter|JavaIDDigit)*
;
/**I found this char range in JavaCC's grammar, but Letter and Digit overlap.
Still works, but...
*/
fragment
Letter
: '\u0024' |
'\u0041'..'\u005a' |
'\u005f' |
'\u0061'..'\u007a' |
'\u00c0'..'\u00d6' |
'\u00d8'..'\u00f6' |
'\u00f8'..'\u00ff' |
'\u0100'..'\u1fff' |
'\u3040'..'\u318f' |
'\u3300'..'\u337f' |
'\u3400'..'\u3d2d' |
'\u4e00'..'\u9fff' |
'\uf900'..'\ufaff'
;
fragment
JavaIDDigit
: '\u0030'..'\u0039' |
'\u0660'..'\u0669' |
'\u06f0'..'\u06f9' |
'\u0966'..'\u096f' |
'\u09e6'..'\u09ef' |
'\u0a66'..'\u0a6f' |
'\u0ae6'..'\u0aef' |
'\u0b66'..'\u0b6f' |
'\u0be7'..'\u0bef' |
'\u0c66'..'\u0c6f' |
'\u0ce6'..'\u0cef' |
'\u0d66'..'\u0d6f' |
'\u0e50'..'\u0e59' |
'\u0ed0'..'\u0ed9' |
'\u1040'..'\u1049'
;
WS : (' '|'\r'|'\t'|'\u000C'|'\n')+ {$channel=HIDDEN;}
;
COMMENT
: '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
;
LINE_COMMENT
: '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
;
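
The header comment of the grammar above lists a known problem: the `Letter` and `JavaIDDigit` fragments are fixed character ranges, whereas the JLS defines identifiers through `Character.isJavaIdentifierStart(int)` and `Character.isJavaIdentifierPart(int)`. The following is a small sketch, not part of the commit, that contrasts the two definitions. The class `IdentifierCheck` and its method names are hypothetical; the range table is copied from the `Letter` fragment.

```java
final class IdentifierCheck {
    // Inclusive code point ranges copied from the grammar's Letter fragment.
    private static final int[][] GRAMMAR_LETTER_RANGES = {
        {0x0024, 0x0024}, {0x0041, 0x005a}, {0x005f, 0x005f}, {0x0061, 0x007a},
        {0x00c0, 0x00d6}, {0x00d8, 0x00f6}, {0x00f8, 0x00ff}, {0x0100, 0x1fff},
        {0x3040, 0x318f}, {0x3300, 0x337f}, {0x3400, 0x3d2d}, {0x4e00, 0x9fff},
        {0xf900, 0xfaff}
    };

    /** True if the code point falls in one of the grammar's hard-coded Letter ranges. */
    static boolean inGrammarLetterRange(int codePoint) {
        for (int[] range : GRAMMAR_LETTER_RANGES) {
            if (codePoint >= range[0] && codePoint <= range[1]) {
                return true;
            }
        }
        return false;
    }

    /**
     * True if s is shaped like a Java identifier according to the
     * java.lang.Character methods. Keywords are not excluded.
     */
    static boolean isJavaIdentifier(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        // U+00B5 (micro sign) is a letter for java.lang.Character,
        // but lies outside every range listed in the Letter fragment.
        int micro = 0x00b5;
        System.out.println("Character API accepts U+00B5: " + Character.isJavaIdentifierStart(micro));
        System.out.println("Grammar range accepts U+00B5: " + inGrammarLetterRange(micro));
        System.out.println("isJavaIdentifier(\"$value_1\"): " + isJavaIdentifier("$value_1"));
        System.out.println("isJavaIdentifier(\"1abc\"): " + isJavaIdentifier("1abc"));
    }
}
```

A lexer that wanted the exact JLS behavior would need a check of this kind instead of fixed ranges; how that is best wired into a generated lexer is outside the scope of this sketch.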

File diff suppressed because it is too large

@@ -0,0 +1,297 @@
package org.antlr.v4.test;
import org.antlr.v4.runtime.*;
import org.antlr.v4.runtime.misc.Nullable;
import org.antlr.v4.runtime.tree.ParseTree;
import org.antlr.v4.runtime.tree.ParseTreeListener;
import org.antlr.v4.runtime.tree.ParseTreeWalker;
import org.junit.Assert;
import org.junit.Test;
import org.antlr.v4.runtime.atn.ParserATNSimulator;
import java.io.*;
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Collection;
public class TestPerformance extends BaseTest {
/** Parse all java files under this package within the JDK_SOURCE_ROOT. */
private static final String TOP_PACKAGE = "java";
/** True to load java files from sub-packages of {@link #TOP_PACKAGE}. */
private static final boolean RECURSIVE = true;
/**
* True to use the Java grammar with expressions in the v4 left-recursive syntax (Java-LR.g). False to use
* the standard grammar (Java.g). In either case, the grammar is renamed in the temporary directory to Java.g
* before compiling.
*/
private static final boolean USE_LR_GRAMMAR = false;
/**
* True to specify the -Xforceatn option when generating the grammar, forcing all decisions in JavaParser to
* be handled by {@link ParserATNSimulator#adaptivePredict}.
*/
private static final boolean FORCE_ATN = false;
/** Parse each file with JavaParser.compilationUnit */
private static final boolean RUN_PARSER = true;
/** True to use {@link BailErrorStrategy}, False to use {@link DefaultErrorStrategy} */
private static final boolean BAIL_ON_ERROR = false;
/** This value is passed to {@link BaseRecognizer#setBuildParseTree}. */
private static final boolean BUILD_PARSE_TREES = false;
/**
* Use ParseTreeWalker.DEFAULT.walk with the BlankJavaParserListener to show parse tree walking overhead.
* If {@link #BUILD_PARSE_TREES} is false, the listener will instead be called during the parsing process via
* {@link BaseRecognizer#setListener}.
*/
private static final boolean BLANK_LISTENER = false;
/**
* If true, a single JavaLexer will be used, and {@link Lexer#setInputStream} will be called to initialize it
* for each source file. In this mode, the cached DFA will be persisted throughout the lexing process.
*/
private static final boolean REUSE_LEXER = true;
/**
* If true, a single JavaParser will be used, and {@link Parser#setInputStream} will be called to initialize it
* for each source file. In this mode, the cached DFA will be persisted throughout the parsing process.
*/
private static final boolean REUSE_PARSER = true;
/**
* If true, the shared lexer and parser are discarded after each pass, so the next pass creates new instances. If false,
* all passes after the first run fully "warmed up", which makes them faster and lets them be compared against the first
* (warm-up) pass; the first pass itself, however, cannot separate bytecode load/JIT time from warm-up time.
*/
private static final boolean CLEAR_DFA = false;
/** Total number of passes to make over the source */
private static final int PASSES = 4;
private Lexer sharedLexer;
private Parser sharedParser;
@SuppressWarnings({"FieldCanBeLocal"})
private ParseTreeListener<Token> sharedListener;
private int tokenCount;
@Test
// @Ignore
public void compileJdk() throws IOException {
compileParser(USE_LR_GRAMMAR);
JavaParserFactory factory = getParserFactory();
String jdkSourceRoot = System.getenv("JDK_SOURCE_ROOT");
if (jdkSourceRoot == null) {
System.err.println("The JDK_SOURCE_ROOT environment variable must be set for performance testing.");
return;
}
if (!TOP_PACKAGE.isEmpty()) {
jdkSourceRoot = jdkSourceRoot + '/' + TOP_PACKAGE.replace('.', '/');
}
File directory = new File(jdkSourceRoot);
assertTrue(directory.isDirectory());
Collection<CharStream> sources = loadSources(directory, RECURSIVE);
System.out.format("Lex=true, Parse=%s, ForceAtn=%s, Bail=%s, BuildParseTree=%s, BlankListener=%s\n",
RUN_PARSER, FORCE_ATN, BAIL_ON_ERROR, BUILD_PARSE_TREES, BLANK_LISTENER);
parse1(factory, sources);
for (int i = 0; i < PASSES - 1; i++) {
if (CLEAR_DFA) {
sharedLexer = null;
sharedParser = null;
}
parse2(factory, sources);
}
}
/**
* This method is separate from {@link #parse2} so the first pass can be distinguished when analyzing
* profiler results.
*/
protected void parse1(JavaParserFactory factory, Collection<CharStream> sources) {
System.gc();
parseSources(factory, sources);
}
/**
* This method is separate from {@link #parse1} so the first pass can be distinguished when analyzing
* profiler results.
*/
protected void parse2(JavaParserFactory factory, Collection<CharStream> sources) {
System.gc();
parseSources(factory, sources);
}
protected Collection<CharStream> loadSources(File directory, boolean recursive) {
Collection<CharStream> result = new ArrayList<CharStream>();
loadSources(directory, recursive, result);
return result;
}
protected void loadSources(File directory, boolean recursive, Collection<CharStream> result) {
assert directory.isDirectory();
File[] sources = directory.listFiles(new FilenameFilter() {
@Override
public boolean accept(File dir, String name) {
return name.toLowerCase().endsWith(".java");
}
});
for (File file : sources) {
try {
CharStream input = new ANTLRFileStream(file.getAbsolutePath());
result.add(input);
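} catch (IOException ex) {
// skip unreadable files, but report them so the file count in the timing output can be interpreted
System.err.format("Skipping %s: %s\n", file.getAbsolutePath(), ex.getMessage());
}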
}
if (recursive) {
File[] children = directory.listFiles();
for (File child : children) {
if (child.isDirectory()) {
loadSources(child, true, result);
}
}
}
}
protected void parseSources(JavaParserFactory factory, Collection<CharStream> sources) {
long startTime = System.currentTimeMillis();
tokenCount = 0;
int inputSize = 0;
for (CharStream input : sources) {
input.seek(0);
inputSize += input.size();
// this incurred a great deal of overhead and was causing significant variations in performance results.
//System.out.format("Parsing file %s\n", file.getAbsolutePath());
try {
factory.parseFile(input);
} catch (IllegalStateException ex) {
ex.printStackTrace(System.out);
}
}
System.out.format("Total parse time for %d files (%d KB, %d tokens): %dms\n",
sources.size(),
inputSize / 1024,
tokenCount,
System.currentTimeMillis() - startTime);
}
protected void compileParser(boolean leftRecursive) throws IOException {
String grammarFileName = "Java.g";
String sourceName = leftRecursive ? "Java-LR.g" : "Java.g";
String body = load(sourceName, null);
@SuppressWarnings({"ConstantConditions"})
String[] extraOptions = FORCE_ATN ? new String[] {"-Xforceatn"} : new String[0];
boolean success = rawGenerateAndBuildRecognizer(grammarFileName, body, "JavaParser", "JavaLexer", false, extraOptions);
assertTrue(success);
}
protected String load(String fileName, @Nullable String encoding)
throws IOException
{
if ( fileName==null ) {
return null;
}
String fullFileName = getClass().getPackage().getName().replace('.', '/') + '/' + fileName;
int size = 65000;
InputStreamReader isr;
InputStream fis = getClass().getClassLoader().getResourceAsStream(fullFileName);
if ( encoding!=null ) {
isr = new InputStreamReader(fis, encoding);
}
else {
isr = new InputStreamReader(fis);
}
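try {
// a single read() may return fewer characters than the resource contains, so read until end of stream
StringBuilder result = new StringBuilder();
char[] data = new char[size];
int n;
while ((n = isr.read(data)) != -1) {
result.append(data, 0, n);
}
return result.toString();
}
finally {
isr.close();
}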
}
protected JavaParserFactory getParserFactory() {
try {
ClassLoader loader = new URLClassLoader(new URL[] { new File(tmpdir).toURI().toURL() }, ClassLoader.getSystemClassLoader());
@SuppressWarnings({"unchecked"})
final Class<? extends Lexer> lexerClass = (Class<? extends Lexer>)loader.loadClass("JavaLexer");
@SuppressWarnings({"unchecked"})
final Class<? extends Parser> parserClass = (Class<? extends Parser>)loader.loadClass("JavaParser");
@SuppressWarnings({"unchecked"})
final Class<? extends ParseTreeListener<Token>> listenerClass = (Class<? extends ParseTreeListener<Token>>)loader.loadClass("BlankJavaListener");
this.sharedListener = listenerClass.newInstance();
final Constructor<? extends Lexer> lexerCtor = lexerClass.getConstructor(CharStream.class);
final Constructor<? extends Parser> parserCtor = parserClass.getConstructor(TokenStream.class);
// construct initial instances of the lexer and parser to deserialize their ATNs
lexerCtor.newInstance(new ANTLRInputStream(""));
parserCtor.newInstance(new CommonTokenStream());
return new JavaParserFactory() {
@SuppressWarnings({"PointlessBooleanExpression"})
@Override
public void parseFile(CharStream input) {
try {
if (REUSE_LEXER && sharedLexer != null) {
sharedLexer.setInputStream(input);
} else {
sharedLexer = lexerCtor.newInstance(input);
}
CommonTokenStream tokens = new CommonTokenStream(sharedLexer);
tokens.fill();
tokenCount += tokens.size();
if (!RUN_PARSER) {
return;
}
if (REUSE_PARSER && sharedParser != null) {
sharedParser.setInputStream(tokens);
} else {
sharedParser = parserCtor.newInstance(tokens);
sharedParser.setBuildParseTree(BUILD_PARSE_TREES);
if (!BUILD_PARSE_TREES && BLANK_LISTENER) {
sharedParser.setListener(sharedListener);
}
if (BAIL_ON_ERROR) {
sharedParser.setErrorHandler(new BailErrorStrategy<Token>());
}
}
Method parseMethod = parserClass.getMethod("compilationUnit");
Object parseResult = parseMethod.invoke(sharedParser);
assert parseResult instanceof ParseTree;
if (BUILD_PARSE_TREES && BLANK_LISTENER) {
ParseTreeWalker.DEFAULT.walk(sharedListener, (ParseTree)parseResult);
}
} catch (Exception e) {
e.printStackTrace(System.out);
throw new IllegalStateException(e);
}
}
};
} catch (Exception e) {
e.printStackTrace(System.out);
lastTestFailed = true;
Assert.fail(e.getMessage());
throw new IllegalStateException(e);
}
}
protected interface JavaParserFactory {
void parseFile(CharStream input);
}
}
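
The pass structure used by `compileJdk` (one cold pass through `parse1`, then `PASSES - 1` further passes through `parse2`) can be illustrated in isolation. This is a minimal sketch, assuming an arbitrary `Runnable` workload in place of the lexer and parser. `PassTimer` and `timePasses` are hypothetical names and are not part of the commit or of ANTLR. It uses `System.nanoTime()` rather than `System.currentTimeMillis()` because the former is intended for measuring elapsed time.

```java
final class PassTimer {
    /**
     * Runs the workload the given number of times and returns the elapsed
     * time of each pass in nanoseconds. Index 0 is the cold (first) pass.
     */
    static long[] timePasses(int passes, Runnable workload) {
        if (passes < 1) {
            throw new IllegalArgumentException("passes must be >= 1");
        }
        long[] elapsed = new long[passes];
        for (int i = 0; i < passes; i++) {
            // Only a hint to the JVM; it mirrors the System.gc() calls in parse1/parse2.
            System.gc();
            long start = System.nanoTime();
            workload.run();
            elapsed[i] = System.nanoTime() - start;
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Stand-in workload: build and discard a string. A real measurement
        // would lex and parse the loaded sources here instead.
        Runnable workload = new Runnable() {
            @Override
            public void run() {
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < 200000; i++) {
                    sb.append(i % 10);
                }
                if (sb.length() != 200000) {
                    throw new IllegalStateException("unexpected length");
                }
            }
        };
        long[] times = timePasses(4, workload);
        for (int i = 0; i < times.length; i++) {
            System.out.format("pass %d: %d ms%n", i + 1, times[i] / 1000000);
        }
    }
}
```

Reporting each pass separately, instead of summing them, keeps the first pass (which includes class loading and JIT compilation) distinguishable from the later ones, which is the same reason the test keeps `parse1` and `parse2` as separate methods.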