B. ADQL tree Part 1

Once parsed, a query is converted into a Java object whose structure is a tree.
This documentation section describes this tree and its different children in more detail.

The tree generated by the ADQL parser tries to follow the syntax of ADQL. Thus, ADQLQuery is split into the following main parts:

Schema of the main tree structure.

For each clause, ADQLQuery has a getter, for instance getSelect(), getOrderBy(), ... From this class, as from any other ADQL object, you can get the corresponding ADQL expression with the method toADQL().

The function getResultingColumns() of ADQLQuery returns all the columns selected by a parsed query that has been resolved with a DBChecker. Thus, if all column and table references are resolved, ADQLQuery is able to return the list of all columns that its execution will return. This is particularly interesting when a selected item is *.

Here is a simple example of use of ADQLQuery:

// 'query' is the ADQLQuery returned by the parser.
System.out.println("Parsed query:\n"+query.toADQL());

// List the columns the query will return, prefixed by their table name when the table is known.
DBColumn[] columns = query.getResultingColumns();
System.out.println(columns.length+" selected columns:");
for(DBColumn col : columns)
	System.out.println("\t* "+(col.getTable() != null ? col.getTable().getADQLName()+"." : "")+col.getADQLName());

Recommendation

To always stay up to date, getResultingColumns() rebuilds the list of columns each time it is called. For efficiency reasons, it is recommended to call this function once, keep the returned array in memory and work with this in-memory result as much as you can.

ClauseADQL objects

The clauses SELECT, WHERE, GROUP BY, HAVING and ORDER BY can be viewed as lists of operands or constraints. The only thing that changes is the way their items are concatenated: for instance, constraints must always be associated with a logical operator (AND or OR), whereas selected items need not. That is why all of these clauses are extensions of ADQLList. Consequently, items can be added to, removed from and fetched from these clauses as easily as with a Vector or an array.

Since other parts of ADQL also behave as lists (e.g. IN), these 5 clauses are treated a bit differently: they do not extend ADQLList directly but ClauseADQL.

ClauseADQL<ADQLOrder> orderBy = query.getOrderBy();
System.out.println("ADQL[OrderBy]:\n"+orderBy);

System.out.println(orderBy.size()+" columns to order:");
for(ADQLOrder ob : orderBy)
	System.out.println("\t* "+ob.toADQL());

ADQLList is iterable!

ADQLList implements the interface Iterable, hence the direct iteration over the ORDER BY clause with a for-each loop (available since Java 1.5).
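The idea that all these clauses are lists differing only in how their items are concatenated can be sketched with a small toy model. ToyList and ToyListDemo are hypothetical names made up for this illustration; they are not the library's ADQLList or ClauseADQL, and items are plain strings instead of ADQL objects.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Toy model only: ToyList is hypothetical and is NOT the library's ADQLList or ClauseADQL.
class ToyList implements Iterable<String> {
    private final String name;      // clause keyword, or "" for an anonymous list (e.g. the values of IN)
    private final String separator; // the only thing that differs from one kind of list to another
    private final List<String> items = new ArrayList<>();

    ToyList(String name, String separator) {
        this.name = name;
        this.separator = separator;
    }

    ToyList add(String item) { items.add(item); return this; }
    String get(int index) { return items.get(index); }
    int size() { return items.size(); }

    // Implementing Iterable is what makes the for-each loop possible.
    @Override
    public Iterator<String> iterator() { return items.iterator(); }

    String toADQL() {
        String body = String.join(separator, items);
        return name.isEmpty() ? body : name + " " + body;
    }
}

class ToyListDemo {
    public static void main(String[] args) {
        ToyList orderBy = new ToyList("ORDER BY", ", ").add("ra ASC").add("dec DESC");
        System.out.println(orderBy.toADQL()); // ORDER BY ra ASC, dec DESC
        for (String item : orderBy)
            System.out.println("\t* " + item);
    }
}
```

In this sketch, a list of constraints would only differ by its separator (" AND " or " OR ") and by its name.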

UML class diagram of ClauseADQL and its extensions.

ClauseSelect

Unlike a classic list, the SELECT clause may have two special attributes: DISTINCT and LIMIT (written TOP in ADQL). ClauseSelect lets you know whether they are used, and the value of the limit, thanks to getLimit() and distinctColumns().

Besides, a SELECT clause is not just a list of operands: each selected item may have an alias. That is why it has been designed as a list of SelectItem. This class associates an ADQLOperand with an alias (if any).

Special case: SELECT *

A particular extension of SelectItem must be used to represent the wildcard of a SELECT clause: SelectAllColumns. This SelectItem is always built either with an ADQLQuery (e.g. SELECT *) or with an ADQLTable (e.g. SELECT data.*) in order to translate "*" into a correct list of columns.

Example:
SELECT * ... => SelectAllColumns wildCard = (SelectAllColumns)query.getSelect().get(0);
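To make the translation of "*" into a list of columns more concrete, here is a minimal sketch of the two cases (SELECT * and SELECT data.*). ToyWildcard, its expand() function and the table and column names are all hypothetical; the library performs this translation on its own ADQLQuery and ADQLTable objects, not on maps of strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical names, NOT the library's SelectAllColumns.
class ToyWildcard {
    /**
     * Expands a wildcard into qualified column names.
     * table is null for "SELECT *", or a table name for "SELECT table.*".
     * fromTables maps each table of the FROM clause (in order) to its columns.
     */
    static List<String> expand(String table, Map<String, List<String>> fromTables) {
        List<String> columns = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : fromTables.entrySet()) {
            // "table.*" keeps only the columns of that table; "*" keeps them all.
            if (table != null && !table.equals(entry.getKey()))
                continue;
            for (String column : entry.getValue())
                columns.add(entry.getKey() + "." + column);
        }
        return columns;
    }
}

class ToyWildcardDemo {
    public static void main(String[] args) {
        Map<String, List<String>> from = new LinkedHashMap<>();
        from.put("data", List.of("ra", "dec"));
        from.put("other", List.of("id"));
        System.out.println(ToyWildcard.expand(null, from));   // [data.ra, data.dec, other.id]
        System.out.println(ToyWildcard.expand("data", from)); // [data.ra, data.dec]
    }
}
```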

ADQLOperand

In ADQL, any type of value (except a logical expression) is called an "operand". In the ADQL tree (ADQLQuery), an operand is an implementation of the interface ADQLOperand. Thus you will have ADQLColumn to represent a column, NumericConstant for a numeric value, Operation for a numeric operation, ...

In order to distinguish operand types, you can use the functions isNumeric(), isString() and isGeometry().

Special case: columns

These three functions may all return true for some operands, like ADQLColumn. In this case, the type of the operand depends on the context. Thus, if a DBChecker is used, the exact type of the column can be known. For such operands, this is actually the only case where exactly one of these functions returns true and all the others false.
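The behavior of these three functions on a column can be sketched as follows. ToyOperand, ToyColumn, ToyType and resolve() are hypothetical names for this illustration only; in the library, the resolution is the job of a DBChecker, replaced here by a plain setter.

```java
// Toy model only: these types are hypothetical, NOT the library's ADQLOperand or ADQLColumn.
interface ToyOperand {
    boolean isNumeric();
    boolean isString();
    boolean isGeometry();
}

enum ToyType { NUMERIC, STRING, GEOMETRY }

// A constant has a fixed type: exactly one function returns true.
class ToyNumericConstant implements ToyOperand {
    public boolean isNumeric()  { return true; }
    public boolean isString()   { return false; }
    public boolean isGeometry() { return false; }
}

// A column has no known type until something (a checker, here a plain setter) resolves it.
class ToyColumn implements ToyOperand {
    private ToyType resolvedType = null;

    void resolve(ToyType type) { resolvedType = type; }

    public boolean isNumeric()  { return resolvedType == null || resolvedType == ToyType.NUMERIC; }
    public boolean isString()   { return resolvedType == null || resolvedType == ToyType.STRING; }
    public boolean isGeometry() { return resolvedType == null || resolvedType == ToyType.GEOMETRY; }
}

class ToyOperandDemo {
    public static void main(String[] args) {
        ToyColumn col = new ToyColumn();
        System.out.println(col.isNumeric() + " " + col.isString() + " " + col.isGeometry()); // true true true
        col.resolve(ToyType.STRING);
        System.out.println(col.isNumeric() + " " + col.isString() + " " + col.isGeometry()); // false true false
    }
}
```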

UML class diagram of ADQLOperand.

ClauseConstraints

The clauses WHERE and HAVING differ from the other clauses because they are logical expressions: all their constraints must be linked by a logical operator, AND or OR. Constraints are therefore not managed in exactly the same way: when a constraint is added to this kind of clause, a logical operator must be given. This is done with the functions add(String, ADQLConstraint) and add(int, String, ADQLConstraint).

To get a constraint from this object, you just have to use the classic get(int) function. However, the logical operators between constraints must be extracted separately with getSeparator(int). If you want the operator after the constraint whose index is N, call this function with N. For the operator before this constraint, extract the separator at N-1.
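The indexing of constraints and operators described above can be sketched with a toy model. ToyConstraints is a hypothetical class written for this illustration, not the library's ClauseConstraints; it stores plain strings, follows the indexing convention stated in the paragraph above, and assumes that the operator given with the very first constraint is simply ignored. For real code, the Javadoc of getSeparator(int) remains the reference.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical class, NOT the library's ClauseConstraints.
class ToyConstraints {
    private final List<String> constraints = new ArrayList<>();
    // separators.get(i) is the operator between constraint i and constraint i+1.
    private final List<String> separators = new ArrayList<>();

    void add(String operator, String constraint) {
        // Assumption of this sketch: nothing precedes the first constraint, so its operator is dropped.
        if (!constraints.isEmpty())
            separators.add(operator);
        constraints.add(constraint);
    }

    String get(int index) { return constraints.get(index); }
    String getSeparator(int index) { return separators.get(index); }
    int size() { return constraints.size(); }

    String toADQL() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < constraints.size(); i++) {
            if (i > 0)
                sb.append(' ').append(separators.get(i - 1)).append(' ');
            sb.append(constraints.get(i));
        }
        return sb.toString();
    }
}

class ToyConstraintsDemo {
    public static void main(String[] args) {
        ToyConstraints where = new ToyConstraints();
        where.add("AND", "ra > 10");
        where.add("AND", "dec < 5");
        where.add("OR", "mag < 12");
        int n = 1; // the constraint "dec < 5"
        System.out.println("before: " + where.getSeparator(n - 1)); // before: AND
        System.out.println("after:  " + where.getSeparator(n));     // after:  OR
        System.out.println(where.toADQL()); // ra > 10 AND dec < 5 OR mag < 12
    }
}
```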

Default "add" function

Of course, since this class also extends ADQLList, it already has a function to add an item: add(ADQLObject). In this case, this function is equivalent to:

add(ClauseConstraints.getDefaultSeparator(), constraint)

The default separator is: OR.

UML class diagram of ClauseConstraints.

ADQLConstraint

A constraint is any logical expression (e.g. a comparison). It is represented by the interface ADQLConstraint in the ADQL tree. This interface imposes no particular method to implement, but generally, all parameters should be accessible with a getter function.

UML class diagram of ADQLConstraint.

ADQLFunction & GeometryFunction

In ADQL, as in SQL, you can call functions. Some of them return a boolean and so could be used as constraints. However, ADQL does not support boolean values: only numeric, string and geometry values are allowed. That is why any ADQL function is represented in the ADQL tree as an implementation of ADQLOperand.

Theoretically, you will find in ADQL all the standard functions of SQL: mathematical functions (sqrt, power, abs, cos, sin, ...) implemented by MathFunction, and group functions (count, max, min, ...) implemented by SQLFunction. Because the goal of ADQL is to query specified sky regions, you will also have geometry functions through the class GeometryFunction.

The IVOA definition of ADQL lets a service include its own functions in the language; these are called User Defined Functions (UDF). In the ADQL tree, they are represented by an implementation of UserDefinedFunction.

By default, ADQLParser allows any unknown function by asking the factory to create a UserDefinedFunction for each of them, and by default the factory returns a default implementation: DefaultUDF. This behavior can however be changed either by extending ADQLQueryFactory (see A.2. Factory) or by using DBChecker (see A.3. Checker).
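The principle of changing this behavior by extending the factory can be sketched with a toy model. ToyUdf, ToyFactory, StrictToyFactory and the function names used below are hypothetical; the actual method to override in ADQLQueryFactory, and its signature, are documented in A.2. Factory and are not reproduced here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy model only: hypothetical names, NOT the library's ADQLQueryFactory, UserDefinedFunction or DefaultUDF.
class ToyUdf {
    final String name;
    final int nbParams;
    ToyUdf(String name, int nbParams) { this.name = name; this.nbParams = nbParams; }
}

// Default behavior: any unknown function is accepted and wrapped in a generic object.
class ToyFactory {
    ToyUdf createUserDefinedFunction(String name, String[] params) {
        return new ToyUdf(name, params.length);
    }
}

// Changed behavior: an extension of the factory that accepts only a known set of functions.
class StrictToyFactory extends ToyFactory {
    private final Set<String> allowed;

    StrictToyFactory(String... allowedNames) {
        allowed = new HashSet<>(Arrays.asList(allowedNames));
    }

    @Override
    ToyUdf createUserDefinedFunction(String name, String[] params) {
        // Function names are compared in lower case in this sketch.
        if (!allowed.contains(name.toLowerCase()))
            throw new UnsupportedOperationException("Unknown function: " + name);
        return super.createUserDefinedFunction(name, params);
    }
}

class ToyFactoryDemo {
    public static void main(String[] args) {
        String[] params = { "ra", "dec" };
        System.out.println(new ToyFactory().createUserDefinedFunction("foo", params).name); // foo
        try {
            new StrictToyFactory("my_func").createUserDefinedFunction("foo", params);
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage()); // Unknown function: foo
        }
    }
}
```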

UML class diagram of ADQLFunction.

ADQLTable & ADQLJoin

Unlike the other clauses, the FROM clause is not really a list. It is either a table or a join between two or more tables. That is why its object representation extends neither ClauseADQL nor ADQLList. This clause is represented by the interface FromContent, which can be either a single table reference (ADQLTable) or a join of tables (ADQLJoin).

Since there are different kinds of join, ADQLJoin is abstract and has one extension per join type: CrossJoin, InnerJoin and OuterJoin.

Cross join

In ADQL, A CROSS JOIN B does not explicitly exist! However, when tables are given as a comma-separated list, it corresponds to a CROSS JOIN between all the given tables. Hence the class CrossJoin.

And what about a JOIN without condition?

The BNF grammar of ADQL allows the following join syntax: SELECT ... FROM A JOIN B WHERE .... However, without a join condition, this syntax raises an interpretation problem: is it a NATURAL INNER JOIN or a CROSS JOIN? To avoid such an ambiguity, this syntax is forbidden in this library. If the type of the join is not specified, it is considered an INNER JOIN, and there are then two solutions: either the keyword JOIN is prefixed by NATURAL, or a join condition is given with the keyword ON or USING. Otherwise a ParseException is thrown.
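The rule above can be summarized by a small decision function. This is only a sketch of the rule, not the parser's code: ToyJoinRules, interpretJoin() and ToyParseException are hypothetical names, and the exception is unchecked here only to keep the example short.

```java
// Toy model only: hypothetical names, NOT the library's parser, ADQLJoin or ParseException.
class ToyParseException extends RuntimeException { // unchecked here only to keep the sketch short
    ToyParseException(String message) { super(message); }
}

class ToyJoinRules {
    /**
     * Interprets "left JOIN right" written without an explicit join type.
     * condition is the raw "ON ..." or "USING(...)" text, or null when none was written.
     */
    static String interpretJoin(String left, String right, boolean natural, String condition) {
        if (natural)
            return left + " NATURAL INNER JOIN " + right;
        if (condition != null)
            return left + " INNER JOIN " + right + " " + condition;
        // Neither NATURAL nor ON/USING: ambiguous between NATURAL and CROSS, so it is refused.
        throw new ToyParseException("Ambiguous join between " + left + " and " + right
                + ": add NATURAL, ON or USING");
    }
}

class ToyJoinDemo {
    public static void main(String[] args) {
        System.out.println(ToyJoinRules.interpretJoin("A", "B", false, "ON A.id = B.id"));
        System.out.println(ToyJoinRules.interpretJoin("A", "B", true, null));
        try {
            ToyJoinRules.interpretJoin("A", "B", false, null);
        } catch (ToyParseException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```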

UML class diagram of FromContent.