Andreas Dangel edited this page Jan 23, 2023 · 34 revisions

⚠️ This page is outdated. Please refer to PMD 7 Tracking Issue #3898 for the current status.


This is a work-in-progress detailed document about individual AST changes.


Annotations

  • What: Annotations are consolidated into a single node. SingleMemberAnnotation, NormalAnnotation and MarkerAnnotation are removed in favour of Annotation. The Name node is removed, replaced by a ClassOrInterfaceType.
  • Why: Those node types encoded a purely syntactic distinction, so semantically equivalent annotations could have different representations. For example, @A and @A() are semantically equivalent, yet they were parsed as MarkerAnnotation and NormalAnnotation respectively. Similarly, @A("") and @A(value="") were parsed as SingleMemberAnnotation and NormalAnnotation respectively. The change also makes parsing much simpler. The nested ClassOrInterfaceType is used to share the disambiguation logic.
  • #2282 [java] Use single node for annotations
Code Old AST New AST
@A
+ Annotation
  + MarkerAnnotation
    + Name "A"
+ Annotation "A"
  + ClassOrInterfaceType "A"
@A()
+ Annotation
  + NormalAnnotation
    + Name "A"
+ Annotation "A"
  + ClassOrInterfaceType "A"
  + AnnotationMemberList
@A(value="v")
+ Annotation
  + NormalAnnotation
    + Name "A"
    + MemberValuePairs
      + MemberValuePair "value"
        + MemberValue
          + PrimaryExpression
            + PrimaryPrefix
              + Literal '"v"'
+ Annotation "A"
  + ClassOrInterfaceType "A"
  + AnnotationMemberList
    + MemberValuePair "value" [@Shorthand=false()]
      + StringLiteral '"v"'
@A("v")
+ Annotation
  + SingleMemberAnnotation
    + Name "A"
    + MemberValue
      + PrimaryExpression
        + PrimaryPrefix
          + Literal '"v"'
+ Annotation "A"
  + ClassOrInterfaceType "A"
  + AnnotationMemberList
    + MemberValuePair "value" [@Shorthand=true()]
      + StringLiteral '"v"'
@A(value="v", on=true)
+ Annotation
  + NormalAnnotation
    + Name "A"
    + MemberValuePairs
      + MemberValuePair "value"
        + MemberValue
          + PrimaryExpression
            + PrimaryPrefix
              + Literal '"v"'
      + MemberValuePair "on"
        + MemberValue
          + PrimaryExpression
            + PrimaryPrefix
              + Literal
                + BooleanLiteral [@True=true()]
+ Annotation "A"
  + ClassOrInterfaceType "A"
  + AnnotationMemberList
    + MemberValuePair "value" [@Shorthand=false()]
      + StringLiteral '"v"'
    + MemberValuePair "on"
      + BooleanLiteral [@True=true()]
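To make the motivation concrete, here is a minimal sketch of how a single annotation node can cover the marker, single-member and normal forms. The class AnnotationSketch and its members are hypothetical and are not part of the PMD API; only the shorthand flag is modelled on the @Shorthand attribute shown in the table above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a unified annotation node; this is not the PMD API. */
public class AnnotationSketch {

    /** One member-value pair; 'shorthand' mirrors the @Shorthand attribute above. */
    public static final class MemberPair {
        final String name;
        final String value;
        final boolean shorthand;

        MemberPair(String name, String value, boolean shorthand) {
            this.name = name;
            this.value = value;
            this.shorthand = shorthand;
        }
    }

    final String typeName;
    final List<MemberPair> members = new ArrayList<>();

    AnnotationSketch(String typeName) {
        this.typeName = typeName;
    }

    /** Models both @A and @A(): there are no members either way. */
    public static AnnotationSketch marker(String typeName) {
        return new AnnotationSketch(typeName);
    }

    /** Models @A("v"): the member name "value" is implied by the shorthand syntax. */
    public static AnnotationSketch single(String typeName, String value) {
        AnnotationSketch a = new AnnotationSketch(typeName);
        a.members.add(new MemberPair("value", value, true));
        return a;
    }

    /** Models @A(name="v") with one explicitly named member. */
    public static AnnotationSketch normal(String typeName, String name, String value) {
        AnnotationSketch a = new AnnotationSketch(typeName);
        a.members.add(new MemberPair(name, value, false));
        return a;
    }

    /** Compares type and members in order, ignoring the purely syntactic shorthand flag. */
    public boolean sameMeaningAs(AnnotationSketch other) {
        if (!typeName.equals(other.typeName) || members.size() != other.members.size()) {
            return false;
        }
        for (int i = 0; i < members.size(); i++) {
            MemberPair mine = members.get(i);
            MemberPair theirs = other.members.get(i);
            if (!mine.name.equals(theirs.name) || !mine.value.equals(theirs.value)) {
                return false;
            }
        }
        return true;
    }
}
```

With one node shape, a rule that cares about meaning only has to compare type and members; the syntax used in the source remains available through the flag when it matters.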

Annotation nesting

  • What: Annotations are now nested within the node they apply to. E.g. if a method is annotated, the Annotation node is now a child of a ModifierList inside the MethodDeclaration.
  • Why: Fixes many inconsistencies: sometimes the annotations were inside the node, and sometimes they were somewhere in the parent, with no real structure.
  • #1875 [java] Move annotations inside the node they apply to
Code Old AST New AST

Method

@A
public void set(int x) { }
ClassOrInterfaceBodyDeclaration
  + Annotation
    + MarkerAnnotation
      + Name "A"
  + MethodDeclaration
    + ResultType[@Void=true]
    + ...
MethodDeclaration
  + ModifierList
    + Annotation "A"
  + VoidType
  + ...

Top-level type declaration

@A class C {}
TypeDeclaration
  + Annotation
    + MarkerAnnotation
      + Name "A"
  + ClassOrInterfaceDeclaration
    + ClassOrInterfaceBody
TypeDeclaration
  + ClassOrInterfaceDeclaration
     + ModifierList
       + Annotation "A"
     + ClassOrInterfaceBody

Cast expression

(@A T.@B S) expr

N/A (Parse error)

CastExpression
  + ClassOrInterfaceType "S"
    + Annotation "B"
    + ClassOrInterfaceType "T"
      + Annotation "A"
  + (Expression `expr`)

Cast expression with intersection

(@A T & S) expr
CastExpression
  + MarkerAnnotation "A"
  + ClassOrInterfaceType "T"
  + ClassOrInterfaceType "S"
  + (Expression `expr`)
CastExpression
  + IntersectionType
    + ClassOrInterfaceType "T"
      + Annotation "A"
    + ClassOrInterfaceType "S"
  + (Expression `expr`)

Notice @A binds to T, not T & S

Constructor call

new @A T()
AllocationExpression
  + MarkerAnnotation "A"
  + Type
    + ReferenceType
      + ClassOrInterfaceType "T"
  + Arguments
ConstructorCall
  + ClassOrInterfaceType "T"
    + Annotation "A"
  + ArgumentsList

Array allocation

new @A int[0]
AllocationExpression
  + MarkerAnnotation "A"
  + Type
    + PrimitiveType "int"
  + ArrayDimsAndInits
    + Expression
      + PrimaryExpression
        + Literal "0"
ArrayAllocation
  + PrimitiveType "int"
    + Annotation "A"
  + ArrayAllocationDims
    + ArrayDimExpr
      + NumericLiteral "0"

Array type

@A int @B[]

N/A (parse error)

ArrayType
  + PrimitiveType "int"
    + Annotation "A"
  + ArrayTypeDims
    + ArrayTypeDim
      + Annotation "B"

Notice @A binds to int, not int[]

Type parameters

<@A T, @B S extends @C Object>
TypeParameters
  + MarkerAnnotation "A"
  + TypeParameter "T"
  + MarkerAnnotation "B"
  + TypeParameter "S"
    + MarkerAnnotation "C"
    + TypeBound
      + ReferenceType
        + ClassOrInterfaceType "Object"
TypeParameters
  + TypeParameter "T"
    + Annotation "A"
  + TypeParameter "S"
    + Annotation "B"
    + ClassOrInterfaceType "Object"
      + Annotation "C"
  
  • TypeParameters can now only have TypeParameter children
  • Annotations that apply to the param are in the param
  • Annotations that apply to the bound are in the type
  • This removes the need for TypeBound, because annotations are cleanly placed.

Enum constants

enum {
 @A E1, @B E2   
}
EnumBody
  + MarkerAnnotation "A"
  + EnumConstant "E1"
  + MarkerAnnotation "B"
  + EnumConstant "E2"
EnumBody
  + EnumConstant "E1"
    + ModifierList
      + Annotation "A"
    + VariableDeclaratorId "E1"
  + EnumConstant "E2"
    + ModifierList
      + Annotation "B"
    + VariableDeclaratorId "E2"
  • Annotations are no longer scattered loosely in the enum body

Types

Type and ReferenceType

  • What: those two nodes are turned into interfaces, implemented by concrete syntax nodes. See their javadoc for exactly what nodes implement them.
  • Why:
    • Some syntactic contexts only allow reference types, others allow any kind of type. If you want to match all types of a program, then matching Type would be the intuitive solution. But in 6.0.x, that wouldn't have sufficed, since in some contexts no Type node was pushed, only a ReferenceType.
    • Regardless of the original syntactic context, any reference type is a type, and searching for ASTType should yield all the types in the tree.
    • Using interfaces makes it possible to abstract behaviour and provides a nicer and safer API.
Code Old AST New AST
// in the context of a variable declaration
List<String> strs;
└─ Type (1)
   └─ ReferenceType
      └─ ClassOrInterfaceType "List"
         └─ TypeArguments
            └─ TypeArgument
               └─ ReferenceType (2) 
                  └─ ClassOrInterfaceType "String"
  1. Notice that there is a Type node here, since a local var can have a primitive type
  2. In contrast, notice that there is no Type here, since only reference types are allowed as type arguments
└─ ClassOrInterfaceType "List"
   └─ TypeArguments
      └─ ClassOrInterfaceType "String"

ClassOrInterfaceType implements ASTReferenceType, which implements ASTType.
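The effect of turning these nodes into interfaces can be sketched with a toy tree. Everything below (TypeHierarchySketch, TypeNode, ReferenceTypeNode, collect) is a hypothetical miniature and not the PMD classes; it only assumes the relationship stated above, namely that every reference type is also a type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical miniature of the interface-based type hierarchy; not the PMD classes. */
public class TypeHierarchySketch {

    public interface Node {
        List<Node> children();
    }

    /** Plays the role of Type: implemented by every type node. */
    public interface TypeNode extends Node { }

    /** Plays the role of ReferenceType: a reference type is also a type. */
    public interface ReferenceTypeNode extends TypeNode { }

    /** Shared child storage for the toy nodes. */
    public abstract static class Base implements Node {
        private final List<Node> children;

        Base(Node... children) {
            this.children = Arrays.asList(children);
        }

        @Override
        public List<Node> children() {
            return children;
        }
    }

    public static final class ClassType extends Base implements ReferenceTypeNode {
        public ClassType(Node... children) { super(children); }
    }

    public static final class PrimitiveType extends Base implements TypeNode {
        public PrimitiveType() { super(); }
    }

    public static final class Other extends Base {
        public Other(Node... children) { super(children); }
    }

    /** Collects the root and all descendants that are instances of the given kind. */
    public static <T extends Node> List<T> collect(Node root, Class<T> kind) {
        List<T> found = new ArrayList<>();
        collectInto(root, kind, found);
        return found;
    }

    private static <T extends Node> void collectInto(Node node, Class<T> kind, List<T> found) {
        if (kind.isInstance(node)) {
            found.add(kind.cast(node));
        }
        for (Node child : node.children()) {
            collectInto(child, kind, found);
        }
    }

    /** Toy tree for the two declarations: List<String> strs; int i; */
    public static Node sample() {
        return new Other(new ClassType(new ClassType()), new PrimitiveType());
    }
}
```

Searching for TypeNode finds the type argument as well as the declared types, without any wrapper node having to be pushed in each syntactic context.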

Array changes

What: Additional nodes ArrayType, ArrayTypeDim, ArrayTypeDims, ArrayAllocation.

Why: Support annotated array types (#997 Java8 parsing corner case with annotated array types)

Examples:

Code Old AST New AST
String[][] myArray;
└─ Type
   └─ ReferenceType[ @Array = true() ][ @ArrayDepth = 2 ]
      └─ ClassOrInterfaceType
└─ ArrayType[ @ArrayDepth = 2 ]
   ├─ ClassOrInterfaceType
   └─ ArrayDimensions[ @Size = 2 ]
      ├─ ArrayTypeDim
      └─ ArrayTypeDim
String @Annotation1[] @Annotation2[] myArray;

n/a (parse error)

└─ ArrayType
   ├─ ClassOrInterfaceType
   └─ ArrayDimensions
      ├─ ArrayTypeDim
      │  └─ Annotation[ @AnnotationName = 'Annotation1' ]
      └─ ArrayTypeDim
         └─ Annotation[ @AnnotationName = 'Annotation2' ]
new int[2][]
new @Bar int[3][2]
new Foo[] { f, g }
AllocationExpression
  + PrimitiveType "int"
  + ArrayDimsAndInits
    + Expression
      + PrimaryExpression
        + PrimaryPrefix
          + Literal "2"
AllocationExpression
  + Annotation
    + MarkerAnnotation
      + Name "Bar"
  + PrimitiveType "int"
  + ArrayDimsAndInits
    + Expression
      + PrimaryExpression
        + PrimaryPrefix
          + Literal "3"
    + Expression
      + PrimaryExpression
        + PrimaryPrefix
          + Literal "2"
AllocationExpression
  + ClassOrInterfaceType "Foo"
  + ArrayDimsAndInits
    + ArrayInitializer
      + VariableInitializer
        + Expression
          + PrimaryExpression
            + PrimaryPrefix
              + Name "f"
      + VariableInitializer
        + Expression
          + PrimaryExpression
            + PrimaryPrefix
              + Name "g"
ArrayAllocation
  + ArrayType
    + PrimitiveType "int"
    + ArrayDimensions
      + ArrayDimExpr
        + NumericLiteral "2"
      + ArrayTypeDim
ArrayAllocation
  + ArrayType
    + PrimitiveType "int"
      + Annotation "Bar"
    + ArrayDimensions
      + ArrayDimExpr
        + NumericLiteral "3"
      + ArrayDimExpr
        + NumericLiteral "2"
ArrayAllocation
  + ArrayType
    + ClassOrInterfaceType "Foo"
    + ArrayDimensions
      + ArrayTypeDim
  + ArrayInitializer
    + VariableAccess "f"
    + VariableAccess "g"

ClassOrInterfaceType nesting

Code Old AST New AST
Map.Entry<K,V>
ClassOrInterfaceType "Map.Entry"
   + TypeArguments
     + TypeArgument
       + ReferenceType
         + ClassOrInterfaceType "K"
     + TypeArgument
       + ReferenceType
         + ClassOrInterfaceType "V"
ClassOrInterfaceType "Entry"
   + ClassOrInterfaceType "Map"
   + TypeArguments
     + ClassOrInterfaceType "K"
     + ClassOrInterfaceType "V"

First<K>.Second.Third<V>
ClassOrInterfaceType "First.Second.Third"
   + TypeArguments
     + TypeArgument
       + ReferenceType
         + ClassOrInterfaceType "K"
   + TypeArguments
     + TypeArgument
       + ReferenceType
         + ClassOrInterfaceType "V"

ClassOrInterfaceType "Third"
   + ClassOrInterfaceType "Second"
     + ClassOrInterfaceType "First"
       + TypeArguments
         + ClassOrInterfaceType "K"
   + TypeArguments
     + ClassOrInterfaceType "V"
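The nested shape can be sketched as a small recursive structure in which each segment owns its qualifier and its own type arguments. NestedTypeSketch and ClassType are hypothetical names used for illustration only, not the PMD API.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of nested class types; not the PMD API. */
public class NestedTypeSketch {

    public static final class ClassType {
        final String simpleName;
        final ClassType qualifier;      // null for the leftmost segment
        final List<ClassType> typeArgs; // empty when the segment has no type arguments

        public ClassType(String simpleName, ClassType qualifier, ClassType... typeArgs) {
            this.simpleName = simpleName;
            this.qualifier = qualifier;
            this.typeArgs = Arrays.asList(typeArgs);
        }

        /** Dotted name without type arguments, e.g. "First.Second.Third". */
        public String qualifiedName() {
            return qualifier == null ? simpleName : qualifier.qualifiedName() + "." + simpleName;
        }

        /** Source-like rendering that keeps each segment's own type arguments. */
        public String render() {
            StringBuilder sb = new StringBuilder();
            if (qualifier != null) {
                sb.append(qualifier.render()).append('.');
            }
            sb.append(simpleName);
            if (!typeArgs.isEmpty()) {
                sb.append('<');
                for (int i = 0; i < typeArgs.size(); i++) {
                    if (i > 0) {
                        sb.append(", ");
                    }
                    sb.append(typeArgs.get(i).render());
                }
                sb.append('>');
            }
            return sb.toString();
        }
    }

    /** Builds the tree for First<K>.Second.Third<V> from the table above. */
    public static ClassType sample() {
        ClassType first = new ClassType("First", null, new ClassType("K", null));
        ClassType second = new ClassType("Second", first);
        return new ClassType("Third", second, new ClassType("V", null));
    }
}
```

In the old flat representation the two TypeArguments nodes were siblings, so it was unclear which segment each belonged to; in the nested form that ownership is explicit.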

TypeArgument and WildcardType

  • What:
    • TypeArgument is removed. Instead, the TypeArguments node directly contains a sequence of Type nodes. To support this, the new node type WildcardType captures the wildcard syntax previously parsed as a TypeArgument.
    • The WildcardBounds node is removed. Instead, the bound is a direct child of the WildcardType.
  • Why: Because wildcard types are types in their own right, and having a node to represent them skims several levels of nesting off.
Code Old AST New AST
Entry<String, ? extends Node>
ClassOrInterfaceType "Entry"
   + TypeArguments
     + TypeArgument
       + ReferenceType
         + ClassOrInterfaceType "String"
     + TypeArgument[@UpperBound = true()]
       + WildcardBounds
         + ReferenceType
           + ClassOrInterfaceType "Node"
ClassOrInterfaceType "Entry"
   + TypeArguments
     + ClassOrInterfaceType "String"
     + WildcardType[@UpperBound = true()]
       + ClassOrInterfaceType "Node"

List<?>
ClassOrInterfaceType "List"
   + TypeArguments
     + TypeArgument
ClassOrInterfaceType "List"
   + TypeArguments
     + WildcardType

Declarations

Import and Package declarations

  • What: Remove the Name node in imports and package declaration nodes.
  • Why: Name is a TypeNode, but it's equivalent to AmbiguousName in that it describes nothing about what it represents. The name in an import may represent a method name, a type name, a field name... It's too ambiguous to treat in the parser and could just be the image of the import, or package, or module.
  • #1888 [java] Remove Name nodes in Import- and PackageDeclaration
Code Old AST New AST
import java.util.ArrayList;
import static java.util.Comparator.reverseOrder;
import java.util.*;
ImportDeclaration
  + Name "java.util.ArrayList"
ImportDeclaration[@Static=true()]
  + Name "java.util.Comparator.reverseOrder"
ImportDeclaration[@ImportOnDemand=true()]
  + Name "java.util"
ImportDeclaration "java.util.ArrayList"
ImportDeclaration[@Static=true()] "java.util.Comparator.reverseOrder"
ImportDeclaration[@ImportOnDemand=true()] "java.util"
package com.example.tool;
PackageDeclaration
  + Name "com.example.tool"
PackageDeclaration "com.example.tool"
  + ModifierList

Modifier lists

  • What: AccessNode is now based on a node: ModifierList. That node represents modifiers occurring before a declaration. It provides a flexible API to query modifiers, both explicit and implicit. All declaration nodes now have such a modifier list, even if it's implicit (no explicit modifiers).
  • Why: AccessNode gave a lot of irrelevant methods to its subtypes. E.g. ASTFieldDeclaration::isSynchronized makes no sense. Now, these irrelevant methods don't clutter the API. The API of ModifierList is both more general and more flexible.
  • See #2259 [java] Rework AccessNode
Code Old AST New AST

Method

@A
public void set(final int x, int y) { }
ClassOrInterfaceBodyDeclaration
  + Annotation
    + MarkerAnnotation
      + Name "A"
  + MethodDeclaration[@Public = true()]
    + ResultType[@Void=true]
    + MethodDeclarator
      + FormalParameters
        + FormalParameter[@Final = true()]
          + VariableDeclaratorId "x"
        + FormalParameter[@Final = false()]
          + VariableDeclaratorId "y"
MethodDeclaration
  + ModifierList[@Modifiers=("public")]
    + Annotation "A"
  + VoidType
  + FormalParameters
    + FormalParameter
      + ModifierList[@Modifiers=("final")]
      + VariableDeclaratorId "x"
    + FormalParameter
      + ModifierList[@Modifiers=()]
      + VariableDeclaratorId "y"

Top-level type declaration

public @A class C {}
TypeDeclaration
  + Annotation
    + MarkerAnnotation
      + Name "A"
  + ClassOrInterfaceDeclaration[@Public=true()]
    + ClassOrInterfaceBody
TypeDeclaration
  + ClassOrInterfaceDeclaration
     + ModifierList[@Modifiers=("public")]
       + Annotation "A"
     + ClassOrInterfaceBody
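The distinction between explicit and implicit modifiers can be sketched as follows. ModifierListSketch and its method names (hasExplicitly, has, interfaceField) are hypothetical and chosen for illustration; they are not taken from the PMD API. The example relies only on the Java rule that interface fields are implicitly public, static and final.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical model of a modifier list; method names are illustrative, not the PMD API. */
public class ModifierListSketch {

    public enum Modifier { PUBLIC, PRIVATE, STATIC, FINAL, ABSTRACT }

    private final Set<Modifier> explicit = EnumSet.noneOf(Modifier.class);
    private final Set<Modifier> implicit = EnumSet.noneOf(Modifier.class);

    public ModifierListSketch(Set<Modifier> explicit, Set<Modifier> implicit) {
        this.explicit.addAll(explicit);
        this.implicit.addAll(implicit);
    }

    /** True only if the modifier was written in the source. */
    public boolean hasExplicitly(Modifier m) {
        return explicit.contains(m);
    }

    /** True if the modifier applies, whether written or implied by context. */
    public boolean has(Modifier m) {
        return explicit.contains(m) || implicit.contains(m);
    }

    /** An interface field is implicitly public, static and final (JLS 9.3). */
    public static ModifierListSketch interfaceField(Set<Modifier> explicit) {
        return new ModifierListSketch(
            explicit, EnumSet.of(Modifier.PUBLIC, Modifier.STATIC, Modifier.FINAL));
    }
}
```

Keeping both sets in one object lets a rule ask either question ("was it written?" or "does it apply?") without the declaration node needing one boolean method per modifier.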

Flattened body declarations

  • What: Removes ClassOrInterfaceBodyDeclaration, TypeDeclaration, and AnnotationTypeMemberDeclaration. These were unnecessary since annotations are nested (see Annotation nesting above).
  • Why: This flattens the tree, makes it less verbose and simpler.
  • #2300 [java] Flatten body declarations
Code Old AST New AST
public class Flat {
    private int f;
}
└─ CompilationUnit
   └─ TypeDeclaration
      └─ ClassOrInterfaceDeclaration[ @SimpleName = 'Flat' ]
         └─ ClassOrInterfaceBody
            └─ ClassOrInterfaceBodyDeclaration
               └─ FieldDeclaration
                  ├─ Type
                  │  └─ PrimitiveType
                  └─ VariableDeclarator
                     └─ VariableDeclaratorId[ @VariableName = 'f' ]
└─ CompilationUnit
   └─ ClassOrInterfaceDeclaration[ @SimpleName = 'Flat' ]
      ├─ ModifierList
      └─ ClassOrInterfaceBody
         └─ FieldDeclaration
            ├─ ModifierList
            ├─ PrimitiveType
            └─ VariableDeclarator
               └─ VariableDeclaratorId[ @VariableName = 'f' ]
public @interface FlatAnnotation {
    String value() default "";
}
└─ CompilationUnit
   └─ TypeDeclaration
      └─ AnnotationTypeDeclaration
         └─ AnnotationTypeBody
            └─ AnnotationTypeMemberDeclaration
               └─ AnnotationMethodDeclaration
                  ├─ Type
                  │  └─ ReferenceType
                  │     └─ ClassOrInterfaceType
                  └─ DefaultValue
                     └─ MemberValue
                        └─ PrimaryExpression
                           └─ PrimaryPrefix
                              └─ Literal
└─ CompilationUnit
   └─ AnnotationTypeDeclaration
      ├─ ModifierList
      └─ AnnotationTypeBody
         └─ MethodDeclaration
            ├─ ModifierList
            ├─ ClassOrInterfaceType
            ├─ FormalParameters
            └─ DefaultValue
               └─ StringLiteral

Module declarations

  • What: Removes the generic Name node and uses ClassOrInterfaceType instead where appropriate. Also uses specific node types for the different directives (requires, exports, uses, provides).
  • Why: Simplify queries, support type resolution
  • #3890 [java] Improve module grammar
Code Old AST New AST
open module com.example.foo {
    requires com.example.foo.http;
    requires java.logging;
    requires transitive com.example.foo.network;

    exports com.example.foo.bar;
    exports com.example.foo.internal to com.example.foo.probe;

    uses com.example.foo.spi.Intf;

    provides com.example.foo.spi.Intf with com.example.foo.Impl;
}
└─ CompilationUnit
   └─ ModuleDeclaration[ @Image = 'com.example.foo' ][ @Open = true() ]
      ├─ ModuleDirective[ @Type = 'REQUIRES' ]
      │  └─ ModuleName[ @Image = 'com.example.foo.http' ]
      ├─ ModuleDirective[ @Type = 'REQUIRES' ]
      │  └─ ModuleName[ @Image = 'java.logging' ]
      ├─ ModuleDirective[ @Type = 'REQUIRES' ][ @RequiresModifier = 'TRANSITIVE' ]
      │  └─ ModuleName[ @Image = 'com.example.foo.network' ]
      ├─ ModuleDirective[ @Type = 'EXPORTS' ]
      │  └─ Name[ @Image = 'com.example.foo.bar' ]
      ├─ ModuleDirective[ @Type = 'EXPORTS' ]
      │  ├─ Name[ @Image = 'com.example.foo.internal' ]
      │  └─ ModuleName[ @Image = 'com.example.foo.probe' ]
      ├─ ModuleDirective[ @Type = 'USES' ]
      │  └─ Name[ @Image = 'com.example.foo.spi.Intf' ]
      └─ ModuleDirective[ @Type = 'PROVIDES' ]
         ├─ Name[ @Image = 'com.example.foo.spi.Intf' ]
         └─ Name[ @Image = 'com.example.foo.Impl' ]
└─ CompilationUnit
   └─ ModuleDeclaration[ @Name = 'com.example.foo' ][ @Open = true() ]
      ├─ ModuleName[ @Name = 'com.example.foo' ]
      ├─ ModuleRequiresDirective
      │  └─ ModuleName[ @Name = 'com.example.foo.http' ]
      ├─ ModuleRequiresDirective
      │  └─ ModuleName[ @Name = 'java.logging' ]
      ├─ ModuleRequiresDirective[ @Transitive = true() ]
      │  └─ ModuleName[ @Name = 'com.example.foo.network' ]
      ├─ ModuleExportsDirective[ @PackageName = 'com.example.foo.bar' ]
      ├─ ModuleExportsDirective[ @PackageName = 'com.example.foo.internal' ]
      │  └─ ModuleName[ @Name = 'com.example.foo.probe' ]
      ├─ ModuleUsesDirective
      │  └─ ClassOrInterfaceType[ pmd-java:typeIs("com.example.foo.spi.Intf") ]
      └─ ModuleProvidesDirective
         ├─ ClassOrInterfaceType[ pmd-java:typeIs("com.example.foo.spi.Intf") ]
         └─ ClassOrInterfaceType[ pmd-java:typeIs("com.example.foo.Impl") ]

Method and Constructor declarations

Method grammar simplification

  • What: Simplify and align the grammar used for method and constructor declarations. The methods in an annotation type are now also method declarations.
  • Why: The method declaration had a nested node "MethodDeclarator", which was not available for constructor declarations. This made it difficult to write rules that concern both methods and constructors without explicitly differentiating between the two.
  • #2034 [java] Align method and constructor declaration grammar
Code Old AST New AST
public class Sample {
    public Sample(int arg) throws Exception {
        super();
        greet(arg);
    }
    public void greet(int arg) throws Exception {
        System.out.println("Hello");
    }
}
ConstructorDeclaration "Sample"
  + FormalParameters
    + FormalParameter ...
  + NameList
    + Name "Exception"
  + ExplicitConstructorInvocation
    + Arguments
  + BlockStatement
    + Statement ...
MethodDeclaration
  + ResultType
  + MethodDeclarator "greet"
    + FormalParameters
      + FormalParameter ...
  + NameList
    + Name "Exception"
  + Block
    + BlockStatement
      + Statement ...
ConstructorDeclaration "Sample"
  + ModifierList
  + FormalParameters
    + FormalParameter ...
  + ThrowsList
    + ClassOrInterfaceType ...
  + Block
    + ExplicitConstructorInvocation
      + ArgumentList
    + ExpressionStatement
MethodDeclaration "greet"
  + ModifierList
  + VoidType
  + FormalParameters
    + FormalParameter ...
  + ThrowsList
    + ClassOrInterfaceType ...
  + Block
    + ExpressionStatement
public @interface MyAnnotation {
    int value() default 1;
}
AnnotationTypeDeclaration "MyAnnotation"
  + AnnotationTypeBody
    + AnnotationTypeMemberDeclaration
      + AnnotationMethodDeclaration "value"
        + Type ...
        + DefaultValue ...
AnnotationTypeDeclaration "MyAnnotation"
  + AnnotationTypeBody
    + AnnotationTypeMemberDeclaration
      + MethodDeclaration
        + ModifierList
        + PrimitiveType
        + FormalParameters ...
        + DefaultValue ...

Formal parameters

  • What: Use FormalParameter only for method and constructor declarations. Lambdas use LambdaParameter, catch clauses use CatchParameter.
  • Why: FormalParameter's API is different from the other ones.
    • FormalParameter must mention a type node.
    • The type of a LambdaParameter can be inferred
    • CatchParameter cannot be varargs
    • CatchParameter can have multiple exception types (a UnionType now)
Code Old AST New AST
try {

} catch (@A IOException | IllegalArgumentException e) {

}
TryStatement
  + Block
  + CatchStatement
    + FormalParameter
      + Annotation "A"
      + Type
        + ReferenceType
          + ClassOrInterfaceType "IOException"
      + Type
        + ReferenceType
          + ClassOrInterfaceType "IllegalArgumentException"
      + VariableDeclaratorId
    + Block
TryStatement
  + Block
  + CatchClause
    + CatchParameter
      + ModifierList
        + Annotation "A"
      + UnionType
        + ClassOrInterfaceType "IOException"
        + ClassOrInterfaceType "IllegalArgumentException"
      + VariableDeclaratorId
    + Block
(a, b) -> {}
c -> {} 
(@A var d) -> {}
(@A int e) -> {}
Expression
  + PrimaryExpression
    + PrimaryPrefix
      + LambdaExpression
        + VariableDeclaratorId "a"
        + VariableDeclaratorId "b"
        + Block

Expression
  + PrimaryExpression
    + PrimaryPrefix
      + LambdaExpression
        + VariableDeclaratorId "c"
        + Block


Expression
  + PrimaryExpression
    + PrimaryPrefix
      + LambdaExpression
        + FormalParameters
          + FormalParameter
            + Annotation "A"
              + ...
            + VariableDeclaratorId "d"
        + Block

        
Expression
  + PrimaryExpression
    + PrimaryPrefix
      + LambdaExpression
        + FormalParameters
          + FormalParameter
            + Annotation "A"
              + ...
            + Type
              + PrimitiveType
            + VariableDeclaratorId "e"
        + Block
LambdaExpression
  + LambdaParameters
    + LambdaParameter
      + ModifierList
      + VariableDeclaratorId "a"
    + LambdaParameter
      + ModifierList
      + VariableDeclaratorId "b"
  + Block

LambdaExpression
  + LambdaParameters
    + LambdaParameter
      + ModifierList
      + VariableDeclaratorId "c"
  + Block


LambdaExpression
  + LambdaParameters
    + LambdaParameter
      + ModifierList
        + Annotation "A"
      + VariableDeclaratorId "d"
  + Block

LambdaExpression
  + LambdaParameters
    + LambdaParameter
      + ModifierList
        + Annotation "A"
      + PrimitiveType "int"
      + VariableDeclaratorId "e"
  + Block

New node for explicit receiver parameter

  • What: A separate node type ReceiverParameter is introduced to differentiate it from formal parameters.
  • Why: A receiver parameter is not a formal parameter, even though it looks like one: it doesn't declare a variable, and doesn't affect the arity of the method or constructor. It's so rarely used that giving it its own node avoids matching it by mistake and simplifies the API and grammar of the ubiquitous FormalParameter and VariableDeclaratorId.
  • #1980 [java] Separate receiver parameter from formal parameter
Code Old AST New AST
(@A Foo this, Foo other)
FormalParameters[@ParameterCount = 1]
  + FormalParameter[@ReceiverParameter=true()]
    + ClassOrInterfaceType
       + Annotation "A"
    + VariableDeclaratorId[@Image="this", @ReceiverParameter=true()]
  + FormalParameter
    + ClassOrInterfaceType
    + VariableDeclaratorId "other"
FormalParameters[@ParameterCount = 1]
  + ReceiverParameter
    + ClassOrInterfaceType
       + Annotation "A"
  + FormalParameter
    + ModifierList
    + ClassOrInterfaceType
    + VariableDeclaratorId "other"
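The claim that a receiver parameter does not affect arity can be checked with plain Java reflection, independently of PMD. The class and method names below (ReceiverParameterDemo, Foo, compare, arity) are made up for this example; getDeclaredMethod and getParameterCount are standard java.lang.reflect calls.

```java
import java.lang.reflect.Method;

/** Shows that an explicit receiver parameter is not counted as a formal parameter. */
public class ReceiverParameterDemo {

    static class Foo {
        // 'Foo this' is the receiver parameter; only 'other' is a formal parameter.
        void compare(Foo this, Foo other) { }
    }

    /** Returns the reflective parameter count of Foo.compare. */
    public static int arity() {
        try {
            // The lookup signature lists a single parameter type: the receiver is not part of it.
            Method m = Foo.class.getDeclaredMethod("compare", Foo.class);
            return m.getParameterCount();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(arity());
    }
}
```

This matches the @ParameterCount = 1 attribute shown in both columns of the table above.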

Varargs

  • What: parse the varargs ellipsis as an ArrayType
  • Why: this improves regularity of the grammar, and allows type annotations to be added to the ellipsis
Code Old AST New AST
(int... is)
+ FormalParameter[ @Varargs = true() ]
  + Type
    + PrimitiveType "int"
  + VariableDeclaratorId "is"
+ FormalParameter[ @Varargs = true() ]
  + ArrayType
    + PrimitiveType "int"
    + ArrayDimensions
      + ArrayTypeDim[ @Varargs = true() ]
  + VariableDeclaratorId "is"
(int @A ... is)

n/a (parse error)

+ FormalParameter[ @Varargs = true() ]
  + ModifierList
  + ArrayType
    + PrimitiveType "int"
    + ArrayDimensions
      + ArrayTypeDim[ @Varargs = true() ]
        + Annotation "A"
  + VariableDeclaratorId "is"
(int[]... is)
+ FormalParameter[ @Varargs = true() ]
  + ModifierList
  + Type
    + ReferenceType
      + PrimitiveType "int"
  + VariableDeclaratorId "is"
+ FormalParameter[ @Varargs = true() ]
  + ModifierList
  + ArrayType
    + PrimitiveType "int"
    + ArrayDimensions
      + ArrayTypeDim
      + ArrayTypeDim[ @Varargs = true() ]
  + VariableDeclaratorId "is"

Add void type node to replace ResultType

Code Old AST New AST
void foo();
└─ MethodDeclaration
   └─ ResultType[@Void=true()]
   └─ MethodDeclarator
      └─ FormalParameters
└─ MethodDeclaration
   └─ ModifierList
   └─ VoidType
   └─ FormalParameters

int foo();
└─ MethodDeclaration
   └─ ResultType[@Void=false()]
      └─ Type
         └─ PrimitiveType
   └─ MethodDeclarator
      └─ FormalParameters
└─ MethodDeclaration
   └─ ModifierList
   └─ PrimitiveType
   └─ FormalParameters

Statements

Improve try-with-resources grammar

  • What: The AST representation of a try-with-resources statement has been simplified. It now uses LocalVariableDeclaration, unless the resource uses the concise try-with-resources syntax.
  • Why: Simpler integration of try-with-resources into the symbol table and type resolution.
  • #1897 [java] Improve try-with-resources grammar
Code Old AST New AST
try (InputStream in = new FileInputStream(); OutputStream out = new FileOutputStream();) { }
TryStatement
  + ResourceSpecification
    + Resources
      + Resource
        + Type
          + ReferenceType
            + ClassOrInterfaceType "InputStream"
        + VariableDeclaratorId "in"
        + Expression
          + ...
      + Resource
        + Type
          + ReferenceType
            + ClassOrInterfaceType "OutputStream"
        + VariableDeclaratorId "out"
        + Expression
          + ...
TryStatement
  + ResourceList[@TrailingSemiColon=true()]
    + Resource[@ConciseResource=false()]
      + LocalVariableDeclaration
        + ModifierList
        + Type
        + VariableDeclarator
          + VariableDeclaratorId "in"
          + ConstructorCall
            + ClassOrInterfaceType
            + ArgumentList
    + Resource[@ConciseResource=false()]
      + LocalVariableDeclaration
        + ModifierList
        + Type
        + VariableDeclarator
          + VariableDeclaratorId "out"
          + ConstructorCall
            + ClassOrInterfaceType
            + ArgumentList

InputStream in = new FileInputStream();
try (in) {}
TryStatement
  + ResourceSpecification
    + Resources
      + Resource
        + Name "in"
TryStatement
  + ResourceList[@TrailingSemiColon=false()]
    + Resource[@ConciseResource=true()]
      + VariableAccess[@VariableName='in']

Expressions

Merge unary expressions

Code Old AST New AST
++a;
--b;
c++;
d--;
StatementExpression
  + PreIncrementExpression
    + PrimaryExpression
      + PrimaryPrefix
        + Name "a"
StatementExpression
  + PreDecrementExpression
    + PrimaryExpression
      + PrimaryPrefix
        + Name "b"
StatementExpression
  + PostfixExpression "++"
    + PrimaryExpression
      + PrimaryPrefix
        + Name "c"
StatementExpression
  + PostfixExpression "--"
    + PrimaryExpression
      + PrimaryPrefix
        + Name "d"
StatementExpression
  + UnaryExpression[@Prefix=true()][@Operator="++"]
    + VariableAccess "a"
StatementExpression
  + UnaryExpression[@Prefix=true()][@Operator="--"]
    + VariableAccess "b"
StatementExpression
  + UnaryExpression[@Prefix=false()][@Operator="++"]
    + VariableAccess "c"
StatementExpression
  + UnaryExpression[@Prefix=false()][@Operator="--"]
    + VariableAccess "d"
~a
+a
UnaryExpression[@Image=null]
  + UnaryExpressionNotPlusMinus[@Image="~"]
    + PrimaryExpression
      + PrimaryPrefix
        + Name "a"

UnaryExpression[@Image="+"]
  + PrimaryExpression
    + PrimaryPrefix
      + Name "a"
+ UnaryExpression[@Operator="~"]
  + VariableAccess "a"

+ UnaryExpression[@Operator="+"]
  + VariableAccess "a"

Binary operators are left-recursive

  • What: For each operator, there were separate AST nodes (like AdditiveExpression, AndExpression, ...). These are now unified into an InfixExpression, which gives access to the operator via getOperator() and to the operands (getLhs(), getRhs()). Additionally, the resulting AST is not flat anymore, but a more structured tree.
  • Why: Having different AST node types doesn't add any information that the operator doesn't already provide. Because expressions are now parsed left-recursively, the AST is more JLS-like, which makes things easier for the type mapping algorithms. The new structure also records which operands are used with which operator. This information was lost in PMD 6 when more than two operands were used and the tree was flattened.
  • #1979 [java] Make binary operators left-recursive
Code Old AST New AST
int i = 1 * 2 * 3 % 4;
Expression
  + MultiplicativeExpression "%"
    + PrimaryExpression
      + PrimaryPrefix
        + Literal "1"
    + PrimaryExpression
      + PrimaryPrefix
        + Literal "2"
    + PrimaryExpression
      + PrimaryPrefix
        + Literal "3"
    + PrimaryExpression
      + PrimaryPrefix
        + Literal "4"
InfixExpression[@Operator='%']
 + InfixExpression[@Operator='*']
   + InfixExpression[@Operator='*']
     + NumericLiteral[@ValueAsInt=1]
     + NumericLiteral[@ValueAsInt=2]
   + NumericLiteral[@ValueAsInt=3]
 + NumericLiteral[@ValueAsInt=4]
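How a left-recursive tree arises for a chain of same-precedence operators can be sketched with a tiny stand-alone parser. InfixSketch, Literal, Infix, parse and eval are hypothetical names for this illustration and do not reflect the PMD grammar or API; the input is assumed to be whitespace-separated integer literals and the operators * and %.

```java
/** Hypothetical left-recursive parser for a chain of same-precedence operators. */
public class InfixSketch {

    public abstract static class Expr { }

    public static final class Literal extends Expr {
        final int value;

        Literal(int value) {
            this.value = value;
        }

        @Override
        public String toString() {
            return Integer.toString(value);
        }
    }

    public static final class Infix extends Expr {
        final Expr lhs;
        final String operator;
        final Expr rhs;

        Infix(Expr lhs, String operator, Expr rhs) {
            this.lhs = lhs;
            this.operator = operator;
            this.rhs = rhs;
        }

        @Override
        public String toString() {
            return "(" + lhs + " " + operator + " " + rhs + ")";
        }
    }

    /** Folds "a op b op c ..." to the left: each new operator wraps the tree built so far. */
    public static Expr parse(String source) {
        String[] tokens = source.trim().split("\\s+");
        Expr result = new Literal(Integer.parseInt(tokens[0]));
        for (int i = 1; i + 1 < tokens.length; i += 2) {
            Expr rhs = new Literal(Integer.parseInt(tokens[i + 1]));
            result = new Infix(result, tokens[i], rhs);
        }
        return result;
    }

    /** Evaluates the tree; only '*' and '%' are supported in this sketch. */
    public static int eval(Expr e) {
        if (e instanceof Literal) {
            return ((Literal) e).value;
        }
        Infix infix = (Infix) e;
        int l = eval(infix.lhs);
        int r = eval(infix.rhs);
        if ("*".equals(infix.operator)) {
            return l * r;
        }
        if ("%".equals(infix.operator)) {
            return l % r;
        }
        throw new IllegalArgumentException("unsupported operator: " + infix.operator);
    }
}
```

Parsing "1 * 2 * 3 % 4" this way yields the same nesting as the New AST column: the % node has the 1 * 2 * 3 subtree as its left operand, so each operator is paired with exactly two operands.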