B. ADQL tree Part 1

Once parsed, a query is converted into a Java object structured as a tree. This section describes this tree and its different kinds of objects in more detail.

The tree generated by the ADQL parser tries to follow the syntax of ADQL. Thus, ADQLQuery is split into the following main parts:

Schema of the main tree structure.

ADQLQuery has a getter for each clause, for instance getSelect() and getOrderBy(). Like any other ADQL object (clauses included), it gives its ADQL expression through the method toADQL().

ADQLQuery has another useful function: getResultingColumns(). It returns all the columns selected by a parsed query, provided the query has been resolved with a DBChecker. In other words, once all column and table references are resolved, ADQLQuery can list every column that its execution will return. This also works with "*" in the SELECT clause: a "*" is converted into the corresponding list of columns.

Here is a simple example of use of ADQLQuery:

System.out.println("Parsed query:\n"+query.toADQL());

DBColumn[] columns = query.getResultingColumns();
System.out.println(columns.length+" selected columns:");
for(DBColumn col : columns)
	System.out.println("\t* "+col);

In order to always stay up to date, getResultingColumns() builds the list of columns on the fly at each call. That is why you should always do as in the example: call this function once, keep the returned array in a variable and work only with this variable.
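The advice above can be illustrated with a minimal, self-contained sketch. It does not use the library: ResultingColumnsSketch, its fields and its rebuildCount counter are hypothetical stand-ins, and the way "*" is expanded here is only an assumption about the general idea.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a resolved query over a single table.
class ResultingColumnsSketch {
    private final List<String> selected = new ArrayList<String>();
    private final List<String> tableColumns;
    int rebuildCount = 0; // counts how many times the list has been rebuilt

    ResultingColumnsSketch(List<String> tableColumns) {
        this.tableColumns = tableColumns;
    }

    void select(String item) {
        selected.add(item);
    }

    // Rebuilds the column list at every call, expanding "*" into all table columns.
    String[] getResultingColumns() {
        rebuildCount++;
        List<String> cols = new ArrayList<String>();
        for (String item : selected) {
            if ("*".equals(item))
                cols.addAll(tableColumns);
            else
                cols.add(item);
        }
        return cols.toArray(new String[0]);
    }

    public static void main(String[] args) {
        List<String> table = new ArrayList<String>();
        table.add("ra");
        table.add("dec");
        ResultingColumnsSketch query = new ResultingColumnsSketch(table);
        query.select("*");

        // Call once, keep the array, then work only with the array.
        String[] columns = query.getResultingColumns();
        for (String col : columns)
            System.out.println("\t* " + col);
        System.out.println("List built " + query.rebuildCount + " time(s)");
    }
}
```

Calling getResultingColumns() inside a loop condition would rebuild the whole list at each iteration, which is what keeping the array in a variable avoids.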

ClauseADQL objects

The clauses SELECT, WHERE, GROUP BY, HAVING and ORDER BY can be viewed as lists of operands or constraints. The only thing that changes is the way their items are concatenated: for instance, constraints must always be associated with a logical operator (AND or OR). That is why all of these clauses are extensions of ADQLList. Consequently, items can be added to, removed from and read from these clauses as easily as with a Vector or an array.

Since other parts of ADQL (e.g. IN) also behave as lists, these 5 clauses do not extend ADQLList directly but ClauseADQL. Besides, this makes it possible to apply a change common to all of these ADQL clauses in a single class.

ClauseADQL<ADQLOrder> orderBy = query.getOrderBy();

System.out.println(orderBy.size()+" columns to order:");
for(ADQLOrder ob : orderBy)
	System.out.println("\t* "+ob.toADQL());
ADQLList is Iterable!

ADQLList implements the interface Iterable. Hence the direct iteration over the ORDER BY clause with a for-each loop (available since Java 1.5).

UML class diagram of ClauseADQL and its extensions.


Unlike a classic list, the SELECT clause may carry two special attributes: DISTINCT and a row limit (written TOP in ADQL). ClauseSelect tells you whether they are used, and with which value, through distinctColumns() and getLimit().

Besides, a SELECT clause is not just a list of operands: each selected item may have an alias. That is why it has been designed as a list of SelectItem. This class associates an ADQLOperand with an alias (if any).
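To make the structure of a SELECT clause more concrete, here is a minimal sketch that does not depend on the library. SelectItemSketch and SelectClauseSketch are hypothetical classes, operands are reduced to plain strings, and the serialization is only an assumption about what such a clause could produce.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for SelectItem: an operand plus an optional alias.
class SelectItemSketch {
    final String operand;
    final String alias; // null when the item has no alias

    SelectItemSketch(String operand, String alias) {
        this.operand = operand;
        this.alias = alias;
    }

    boolean hasAlias() {
        return alias != null;
    }

    String toADQL() {
        return hasAlias() ? operand + " AS " + alias : operand;
    }
}

// Hypothetical stand-in for ClauseSelect: a list of items plus two attributes.
class SelectClauseSketch {
    private final List<SelectItemSketch> items = new ArrayList<SelectItemSketch>();
    private boolean distinct = false;
    private int limit = -1; // a negative value means "no limit"

    void add(SelectItemSketch item) { items.add(item); }
    void setDistinct(boolean distinct) { this.distinct = distinct; }
    void setLimit(int limit) { this.limit = limit; }
    boolean hasLimit() { return limit >= 0; }

    String toADQL() {
        StringBuilder sb = new StringBuilder("SELECT");
        if (distinct)
            sb.append(" DISTINCT");
        if (hasLimit())
            sb.append(" TOP ").append(limit);
        for (int i = 0; i < items.size(); i++)
            sb.append(i == 0 ? " " : ", ").append(items.get(i).toADQL());
        return sb.toString();
    }

    public static void main(String[] args) {
        SelectClauseSketch select = new SelectClauseSketch();
        select.setDistinct(true);
        select.setLimit(10);
        select.add(new SelectItemSketch("ra", null));
        select.add(new SelectItemSketch("dec*2", "twice_dec"));
        System.out.println(select.toADQL());
    }
}
```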

Special case: SELECT *

A particular extension of SelectItem must be used to represent the wildcard of a SELECT clause: SelectAllColumns. This SelectItem is always built either with an ADQLQuery (ex: SELECT *) or with an ADQLTable (ex: SELECT data.*), so that "*" can be translated into the correct list of columns.

SELECT * ... => SelectAllColumns wildCard = (SelectAllColumns)query.getSelect().get(0);

UML class diagram of ClauseSelect


In ADQL, any type of value (except a logical expression) is called an "operand". In the ADQL tree (ADQLQuery), an operand is an implementation of the interface ADQLOperand. Thus you will find ADQLColumn to represent a column, NumericConstant for a numeric value, Operation for a numeric operation, ...

In order to distinguish operand types, you can use the functions isNumeric() and isString().

Special case: columns

These functions may all return true for some operands, such as ADQLColumn. In this case, the type of the operand depends on the context.
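The following sketch shows how the two type tests can be combined, including the case where both return true. It is self-contained and does not use the library: SketchOperand, SketchNumber and SketchColumn are hypothetical stand-ins for ADQLOperand, NumericConstant and ADQLColumn.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-in for the operand interface.
interface SketchOperand {
    boolean isNumeric();
    boolean isString();
    String toADQL();
}

// A numeric literal: its type is known without any context.
class SketchNumber implements SketchOperand {
    private final String value;
    SketchNumber(String value) { this.value = value; }
    public boolean isNumeric() { return true; }
    public boolean isString() { return false; }
    public String toADQL() { return value; }
}

// A column reference: its type is unknown here, so both tests return true.
class SketchColumn implements SketchOperand {
    private final String name;
    SketchColumn(String name) { this.name = name; }
    public boolean isNumeric() { return true; }
    public boolean isString() { return true; }
    public String toADQL() { return name; }
}

class OperandTypeSketch {
    // Summarizes what the two type tests say about an operand.
    static String describe(SketchOperand op) {
        if (op.isNumeric() && op.isString())
            return op.toADQL() + ": type depends on the context";
        else if (op.isNumeric())
            return op.toADQL() + ": numeric";
        else if (op.isString())
            return op.toADQL() + ": string";
        else
            return op.toADQL() + ": other";
    }

    public static void main(String[] args) {
        List<SketchOperand> operands = Arrays.<SketchOperand>asList(
                new SketchNumber("12.5"), new SketchColumn("ra"));
        for (SketchOperand op : operands)
            System.out.println(describe(op));
    }
}
```

Testing the "both true" case first matters: checking isNumeric() alone would wrongly classify a column as strictly numeric.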

UML class diagram of ADQLOperand


The clauses WHERE and HAVING are not like the other clauses because they are logical expressions. It means that all constraints must be linked by a logical operator: AND or OR. It also implies that constraints are not managed exactly in the same way. When a constraint is added in this kind of clause, a logical operator must be given. It can be done thanks to the functions: add(String, ADQLConstraint) and add(int, String, ADQLConstraint).

To get a constraint from this object, just use the classic get(int) function. However, the logical operators between constraints must be extracted separately with getSeparator(int). If you want the operator after the constraint at index N, call this function with N. For the operator before this constraint, extract the separator at N-1.
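The indexing convention described above (separator N sits after constraint N) can be sketched as follows. ConstraintListSketch is a hypothetical class written for this illustration only: constraints are reduced to plain strings and the internal storage is an assumption, not the library's implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a WHERE or HAVING clause.
class ConstraintListSketch {
    private final List<String> constraints = new ArrayList<String>();
    // separators.get(i) is the operator between constraint i and constraint i+1.
    private final List<String> separators = new ArrayList<String>();

    // Appends a constraint; the operator is ignored for the very first one.
    void add(String logicalOperator, String constraint) {
        if (!constraints.isEmpty())
            separators.add(logicalOperator);
        constraints.add(constraint);
    }

    int size() { return constraints.size(); }

    String get(int index) { return constraints.get(index); }

    String getSeparator(int index) { return separators.get(index); }

    String toADQL() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < size(); i++) {
            if (i > 0)
                sb.append(' ').append(getSeparator(i - 1)).append(' ');
            sb.append(get(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ConstraintListSketch where = new ConstraintListSketch();
        where.add("AND", "ra > 10");
        where.add("AND", "dec < 5");
        where.add("OR", "mag < 12");

        int n = 1; // the constraint "dec < 5"
        System.out.println("Before: " + where.getSeparator(n - 1));
        System.out.println("Constraint: " + where.get(n));
        System.out.println("After: " + where.getSeparator(n));
        System.out.println(where.toADQL());
    }
}
```

With this layout a clause of K constraints holds K-1 separators, so the first constraint has no separator before it and the last one has none after it.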

Default "add" function

Of course, since this class also extends ADQLList, it already has a function to add an item: add(ADQLObject). In this case, this function is equivalent to:

add(ClauseConstraints.getDefaultSeparator(), constraint)
The default separator is: OR.
UML class diagram of ClauseConstraints


A constraint is any logical expression (e.g. a comparison). It is represented by the interface ADQLConstraint in the ADQL tree. This interface imposes no particular method to implement, but generally, all parameters should be accessible through at least a getter.

UML class diagram of ADQLConstraint

ADQLFunction & GeometryFunction

In ADQL, as in SQL, you can call functions. Some of them return a boolean and so could be used as constraints. However, ADQL does not manage boolean values: only numeric, string and geometry values are allowed. That is why every ADQL function is represented in the ADQL tree as an implementation of ADQLOperand.

In theory, ADQL provides all the standard functions of SQL: mathematical functions (sqrt, power, abs, cos, sin, ...), implemented by MathFunction, and aggregate functions (count, max, min, ...), implemented by SQLFunction. Because the goal of ADQL is to query specified sky regions, geometry functions are also available through the class GeometryFunction.

The IVOA definition of ADQL lets a service include its own functions in the language: then, they are called User Defined Functions (UDF). In the ADQL tree, these functions are represented by an implementation of UserDefinedFunction.

By default, ADQLParser allows any unknown function by asking the factory to create a UserDefinedFunction for each of them. And by default, the factory returns a default implementation: DefaultUDF. This behavior can however be changed either by extending ADQLQueryFactory (see A.2. Factory) or by using DBChecker (see A.3. Checker).
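The difference between the permissive default and a stricter policy can be sketched as below. This is not the library's factory: UdfFactorySketch, its method names, the returned labels and the short list of known functions are all hypothetical and only illustrate the idea of routing unknown function names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of two policies for unknown function names.
class UdfFactorySketch {
    // A few standard function names, taken from the examples in the text.
    static final Set<String> KNOWN = new HashSet<String>(
            Arrays.asList("SQRT", "POWER", "ABS", "COS", "SIN", "COUNT", "MAX", "MIN"));

    // Permissive policy: any unknown name becomes a generic UDF node.
    static String createFunction(String name) {
        String upper = name.toUpperCase(Locale.ROOT);
        return KNOWN.contains(upper) ? "built-in:" + upper : "udf:" + name;
    }

    // Strict policy: an unknown name is accepted only if it has been declared.
    static String createFunctionStrict(String name, Set<String> declaredUdfs) {
        String upper = name.toUpperCase(Locale.ROOT);
        if (KNOWN.contains(upper))
            return "built-in:" + upper;
        if (declaredUdfs.contains(name))
            return "udf:" + name;
        throw new IllegalArgumentException("Unknown function: " + name);
    }

    public static void main(String[] args) {
        Set<String> declared = new HashSet<String>(Arrays.asList("my_distance"));
        System.out.println(createFunction("sqrt"));
        System.out.println(createFunction("anything"));
        System.out.println(createFunctionStrict("my_distance", declared));
    }
}
```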

UML class diagram of ADQLFunction

ADQLTable & ADQLJoin

Unlike the other clauses, the FROM clause is not really a list! It must be either one table or a join between two or more tables. That is why its object representation extends neither ClauseADQL nor ADQLList. This clause is represented by the interface FromContent, which can be either a single table reference (ADQLTable) or a join of tables (ADQLJoin).

Since there are different kinds of join, ADQLJoin is abstract and has one extension per join type: CrossJoin, InnerJoin and OuterJoin.

Cross join

In ADQL, A CROSS JOIN B does not explicitly exist! However, when tables are given as a comma-separated list, it corresponds to a CROSS JOIN between all the given tables. Hence the class CrossJoin.
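As an illustration of a FROM clause that is either one table or a join, the sketch below folds a comma-separated table list into nested cross joins. FromSketch, TableSketch and CrossJoinSketch are hypothetical stand-ins for FromContent, ADQLTable and CrossJoin, and the left-to-right nesting is an assumption made for this example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the content of a FROM clause.
interface FromSketch {
    String describe();
}

// A single table reference.
class TableSketch implements FromSketch {
    final String name;
    TableSketch(String name) { this.name = name; }
    public String describe() { return name; }
}

// A cross join between two FROM contents (tables or other joins).
class CrossJoinSketch implements FromSketch {
    final FromSketch left;
    final FromSketch right;
    CrossJoinSketch(FromSketch left, FromSketch right) {
        this.left = left;
        this.right = right;
    }
    public String describe() {
        return "(" + left.describe() + " CROSS JOIN " + right.describe() + ")";
    }
}

class FromClauseSketch {
    // Folds a comma-separated table list into nested cross joins, left to right.
    static FromSketch fromTableList(List<String> tableNames) {
        if (tableNames.isEmpty())
            throw new IllegalArgumentException("A FROM clause needs at least one table");
        FromSketch content = new TableSketch(tableNames.get(0));
        for (int i = 1; i < tableNames.size(); i++)
            content = new CrossJoinSketch(content, new TableSketch(tableNames.get(i)));
        return content;
    }

    public static void main(String[] args) {
        // FROM A        => a single table
        System.out.println(fromTableList(Arrays.asList("A")).describe());
        // FROM A, B, C  => nested cross joins
        System.out.println(fromTableList(Arrays.asList("A", "B", "C")).describe());
    }
}
```

Because a join is itself a FromSketch, a join can take another join as an operand, which is why a tree fits this clause better than a flat list.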

And what about JOIN without condition?

The BNF grammar of ADQL allows the following join syntax: SELECT ... FROM A JOIN B WHERE .... However, without a join condition, this syntax raises an interpretation problem: is it a NATURAL INNER JOIN or a CROSS JOIN? In order to avoid such an ambiguity, this syntax is forbidden in this library. If the type of the join is not specified, it is considered to be an INNER JOIN, and so there are two solutions: either the keyword JOIN is prefixed by NATURAL, or a join condition is given with the keyword ON or USING. Otherwise a ParseException is thrown.
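The rule above can be summarized by a small decision function. This sketch is independent of the library: JoinRuleSketch, classify() and SketchParseException are hypothetical names, and the exception is unchecked here only to keep the example short.

```java
import java.util.Arrays;
import java.util.List;

class JoinRuleSketch {
    // Hypothetical exception standing in for the parser's ParseException.
    static class SketchParseException extends RuntimeException {
        SketchParseException(String message) {
            super(message);
        }
    }

    // Classifies "A [NATURAL] JOIN B [ON condition | USING (columns)]".
    static String classify(boolean natural, String onCondition, List<String> usingColumns) {
        boolean hasOn = onCondition != null && !onCondition.trim().isEmpty();
        boolean hasUsing = usingColumns != null && !usingColumns.isEmpty();
        if (natural)
            return "NATURAL INNER JOIN";
        if (hasOn)
            return "INNER JOIN ON " + onCondition.trim();
        if (hasUsing)
            return "INNER JOIN USING " + usingColumns;
        // Neither NATURAL nor a condition: ambiguous, so the join is rejected.
        throw new SketchParseException("A JOIN needs NATURAL, ON or USING");
    }

    public static void main(String[] args) {
        System.out.println(classify(true, null, null));
        System.out.println(classify(false, "A.id = B.id", null));
        System.out.println(classify(false, null, Arrays.asList("id")));
        try {
            classify(false, null, null);
        } catch (SketchParseException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```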

UML class diagram of FromContent