Once parsed, a query is converted into a Java object whose structure is a tree.
This section describes this tree and its different children in more detail.
The tree generated by the ADQL parser tries to follow the syntax of ADQL. Thus, ADQLQuery is split into one main part per clause: SELECT, FROM, WHERE, GROUP BY, HAVING and ORDER BY.
For each clause, ADQLQuery has a getter, for instance getSelect() or getOrderBy(). With this class, as with any other ADQL object, you can get the corresponding ADQL expression by calling its method toADQL().
The function getResultingColumns() of ADQLQuery returns all the columns selected by a parsed query that has been resolved with a DBChecker. Thus, if all column and table references are resolved, ADQLQuery is able to return the list of all columns that will be returned when the query is executed. This is particularly useful when a selected item is *.
Here is a simple example of how to use ADQLQuery:
System.out.println("Parsed query:\n" + query.toADQL());

// Columns that the query will return once executed
DBColumn[] columns = query.getResultingColumns();
System.out.println(columns.length + " selected columns:");
for (DBColumn col : columns)
    System.out.println("\t* " + (col.getTable() != null ? col.getTable().getADQLName() + "." : "") + col.getADQLName());
In order to always stay up to date, getResultingColumns() rebuilds the list of columns each time it is called. For efficiency reasons, it is recommended to call this function once, keep the returned array in memory and work with this in-memory result as much as you can.
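The recommendation above can be sketched as follows. The library is not assumed to be on the classpath here, so RebuildingQuery is a hypothetical stand-in (not a class of the library) that rebuilds its array on every call, as described for getResultingColumns(). Only the calling pattern matters: call the getter once, then iterate over the local array.

```java
import java.util.ArrayList;
import java.util.List;

class CachedColumnsDemo {
    // Hypothetical stand-in (NOT a library class): rebuilds its result on every call,
    // as described above for ADQLQuery.getResultingColumns().
    static class RebuildingQuery {
        private final List<String> selected = new ArrayList<>();
        int rebuilds = 0; // number of times the array has been rebuilt

        RebuildingQuery(String... columns) {
            for (String c : columns) selected.add(c);
        }

        String[] getResultingColumns() {
            rebuilds++;
            return selected.toArray(new String[0]);
        }
    }

    // Calls the getter once and then works only with the in-memory array.
    static String describe(RebuildingQuery query) {
        String[] columns = query.getResultingColumns(); // single call
        StringBuilder out = new StringBuilder(columns.length + " selected columns:");
        for (String col : columns) out.append(' ').append(col);
        return out.toString();
    }

    public static void main(String[] args) {
        RebuildingQuery query = new RebuildingQuery("ra", "dec", "mag");
        System.out.println(describe(query));
        System.out.println("rebuilds: " + query.rebuilds);
    }
}
```

Calling the getter inside the loop condition or body instead would rebuild the array at every iteration.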
The clauses SELECT, WHERE, GROUP BY, HAVING and ORDER BY can be viewed as lists of operands or constraints. The only thing that changes is the way their items are concatenated: for instance, constraints must always be associated with a logical operator (AND or OR), whereas selected items are not. That is why all of these clauses are extensions of ADQLList. Consequently, items can be added to, removed from and retrieved from these clauses as easily as with a Vector or an array.
Since other parts of ADQL also behave as lists (e.g. IN), these 5 clauses must be treated a bit differently: they do not extend ADQLList directly but ClauseADQL.
ClauseADQL<ADQLOrder> orderBy = query.getOrderBy();
System.out.println("ADQL[OrderBy]:\n" + orderBy);
System.out.println(orderBy.size() + " columns to order:");
for (ADQLOrder ob : orderBy)
    System.out.println("\t* " + ob.toADQL());
Unlike a classic list, the SELECT clause may have two special attributes: DISTINCT and LIMIT (expressed in ADQL with the keyword TOP). ClauseSelect lets you know whether they are used, and with which value, thanks to getLimit() and distinctColumns().
Besides, a SELECT clause is not just a list of operands: each selected item may have an alias. That is why it has been designed as a list of SelectItem. This class associates an ADQLOperand with an alias (if any).
SELECT *
A particular extension of SelectItem must be used to represent the wildcard of a SELECT clause: SelectAllColumns. This SelectItem is always built either with an ADQLQuery (e.g. SELECT *) or with an ADQLTable (e.g. SELECT data.*) so that "*" can be translated into a correct list of columns.
Example:
SELECT * ...
=> SelectAllColumns wildCard = (SelectAllColumns)query.getSelect().get(0);
In ADQL, any type of value (except logical expressions) is called an "operand". In the ADQL tree (ADQLQuery), an operand is an implementation of the interface ADQLOperand. Thus, you will have ADQLColumn to represent a column, NumericConstant for a numeric value, Operation for a numeric operation, ...
In order to distinguish operand types, you can use the functions isNumeric(), isString() and isGeometry().
These three functions may all return true for some operands, such as ADQLColumn. In this case, the type of the operand depends on the context. If a DBChecker is used, the exact type of the column can be known. This is actually the only case where exactly one of these functions returns true and all the others return false.
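The behaviour of the three type tests can be illustrated with a small model. This is only a sketch: Operand and Column below are hypothetical stand-ins written for this example, not the library's ADQLOperand and ADQLColumn, and resolve(...) merely plays the role of what a checker would do.

```java
class OperandTypeDemo {
    enum Type { NUMERIC, STRING, GEOMETRY }

    // Hypothetical mirror of the three type tests described above.
    interface Operand {
        boolean isNumeric();
        boolean isString();
        boolean isGeometry();
    }

    // A column whose type stays unknown (null) until it is resolved.
    static class Column implements Operand {
        private Type resolved; // null = not resolved yet

        void resolve(Type type) { this.resolved = type; }

        // While unresolved, the column may be of any type, so every test answers true.
        public boolean isNumeric()  { return resolved == null || resolved == Type.NUMERIC; }
        public boolean isString()   { return resolved == null || resolved == Type.STRING; }
        public boolean isGeometry() { return resolved == null || resolved == Type.GEOMETRY; }
    }

    // Lists the type tests that an operand passes.
    static String types(Operand op) {
        StringBuilder sb = new StringBuilder();
        if (op.isNumeric()) sb.append("numeric ");
        if (op.isString()) sb.append("string ");
        if (op.isGeometry()) sb.append("geometry ");
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        Column col = new Column();
        System.out.println("unresolved: " + types(col));
        col.resolve(Type.STRING);
        System.out.println("resolved:   " + types(col));
    }
}
```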
The clauses WHERE and HAVING are not like the other clauses because they are logical expressions. In particular, all constraints must be linked by a logical operator: AND or OR. Constraints are therefore not managed in exactly the same way: when a constraint is added to this kind of clause, a logical operator must be given. This can be done with the functions add(String, ADQLConstraint) and add(int, String, ADQLConstraint).
To get a constraint from this object, you just have to use the classic get(int) function. However, the logical operators between constraints must be extracted separately with getSeparator(int). If you want the operator after a given constraint whose index is N, call this function with N. For the operator before this constraint, extract the separator at N-1.
Of course, since this class also extends ADQLList, it already has a function to add an item: add(ADQLObject). In this case, this function is equivalent to:
add(ClauseConstraints.getDefaultSeparator(), constraint)
The default separator is OR.
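The indexing convention for separators can be checked on a small model. The Constraints class below is a hypothetical stand-in written for this example, not the library's ClauseConstraints: constraints are plain strings, and the operator given with the very first constraint is simply ignored, which is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

class SeparatorDemo {
    // Hypothetical stand-in for a WHERE/HAVING clause (NOT the library class).
    static class Constraints {
        static final String DEFAULT_SEPARATOR = "OR"; // default stated in the text
        private final List<String> items = new ArrayList<>();
        // separators.get(i) is the operator between items i and i+1
        private final List<String> separators = new ArrayList<>();

        // Adding without an operator falls back on the default separator.
        void add(String constraint) { add(DEFAULT_SEPARATOR, constraint); }

        void add(String operator, String constraint) {
            // Assumption of this sketch: no operator is stored for the first item.
            if (!items.isEmpty()) separators.add(operator);
            items.add(constraint);
        }

        String get(int index) { return items.get(index); }

        String getSeparator(int index) { return separators.get(index); }

        String toADQL() {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < items.size(); i++) {
                if (i > 0) sb.append(' ').append(separators.get(i - 1)).append(' ');
                sb.append(items.get(i));
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) {
        Constraints where = new Constraints();
        where.add("ra > 10");
        where.add("AND", "dec < 5");
        where.add("mag < 12"); // linked with the default separator
        System.out.println(where.toADQL());
        int n = 1; // index of "dec < 5"
        System.out.println("before: " + where.getSeparator(n - 1));
        System.out.println("after:  " + where.getSeparator(n));
    }
}
```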
A constraint is any logical expression (e.g. a comparison). It is represented by the interface ADQLConstraint in the ADQL tree. This interface imposes no particular method to implement, but generally, all parameters should be accessible through getter functions.
In ADQL, like in SQL, you can call functions. Some of them return a boolean and so, could be used as constraints. However, ADQL does not support boolean values: only numeric, string and geometry values are allowed. That's why any function of ADQL is represented in the ADQL tree as an implementation of ADQLOperand.
Theoretically, in ADQL you will find all standard functions of SQL: mathematical functions (sqrt, power, abs, cos, sin, ...) implemented by MathFunction, and group functions (count, max, min, ...) implemented by SQLFunction. Because the goal of ADQL is to query specified sky regions, you will also have geometry functions through the class GeometryFunction:
POINT
CIRCLE
The IVOA definition of ADQL lets a service include its own functions in the language: then, they are called User Defined Functions (UDF). In the ADQL tree, these functions are represented by an implementation of UserDefinedFunction.
By default, ADQLParser allows any unknown function by asking the factory to create a UserDefinedFunction for each of them. By default, the factory returns a default implementation: DefaultUDF. This behavior can however be changed either by extending ADQLQueryFactory (see A.2. Factory) or by using DBChecker (see A.3. Checker).
Unlike the other clauses, the clause FROM is not really a list. It is either a table or a join between two or more tables. That is why its object representation extends neither ClauseADQL nor ADQLList. This clause is represented by the interface FromContent, which can be either a single table reference (ADQLTable) or a join between tables (ADQLJoin).
Since there are different kinds of join, ADQLJoin is abstract and has one extension per join type: CrossJoin, InnerJoin and OuterJoin.
In ADQL, A CROSS JOIN B does not explicitly exist! However, when tables are given as a comma-separated list, it corresponds to a CROSS JOIN between all the given tables. Hence the class CrossJoin.
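The shape of the FROM tree described above (a table, or a join whose two sides are themselves FROM contents) can be sketched with a few stand-in classes. From, Table, Join, Cross and Inner are hypothetical names chosen for this example; they only mimic the roles of FromContent, ADQLTable, ADQLJoin, CrossJoin and InnerJoin and do not reproduce their API.

```java
import java.util.ArrayList;
import java.util.List;

class FromTreeDemo {
    // Hypothetical counterpart of FromContent: either a table or a join.
    interface From {
        void collectTables(List<String> out);
        String toADQL();
    }

    static class Table implements From {
        final String name;
        Table(String name) { this.name = name; }
        public void collectTables(List<String> out) { out.add(name); }
        public String toADQL() { return name; }
    }

    // Abstract join with one subclass per join type.
    static abstract class Join implements From {
        final From left;
        final From right;
        Join(From left, From right) { this.left = left; this.right = right; }
        public void collectTables(List<String> out) {
            left.collectTables(out);
            right.collectTables(out);
        }
    }

    // A cross join is written as a comma-separated list of tables.
    static class Cross extends Join {
        Cross(From left, From right) { super(left, right); }
        public String toADQL() { return left.toADQL() + ", " + right.toADQL(); }
    }

    static class Inner extends Join {
        final String condition;
        Inner(From left, From right, String condition) {
            super(left, right);
            this.condition = condition;
        }
        public String toADQL() {
            return left.toADQL() + " INNER JOIN " + right.toADQL() + " ON " + condition;
        }
    }

    // Walks the tree and returns all table names, left to right.
    static List<String> tables(From from) {
        List<String> out = new ArrayList<>();
        from.collectTables(out);
        return out;
    }

    public static void main(String[] args) {
        From from = new Cross(new Inner(new Table("A"), new Table("B"), "A.id = B.id"), new Table("C"));
        System.out.println(from.toADQL());
        System.out.println(tables(from));
    }
}
```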
The BNF grammar of ADQL allows the following join syntax: SELECT ... FROM A JOIN B WHERE ...
However, without a join condition, this syntax raises the following interpretation problem: is it a NATURAL INNER JOIN or a CROSS JOIN?
In order to avoid such an ambiguity, this syntax is forbidden in this library. If the type of the join is not specified, it is considered an INNER JOIN, and so there are two solutions: either the keyword JOIN is prefixed by NATURAL, or a join condition is given with the keyword ON or USING. Otherwise, a ParseException is thrown.
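The rule above boils down to a small decision, sketched here outside of the parser. checkUnspecifiedJoin is a hypothetical helper written for this example, and IllegalArgumentException merely stands in for the ParseException thrown by the library.

```java
class JoinRuleDemo {
    // Decides how a JOIN without an explicit type is interpreted,
    // or rejects it when nothing disambiguates it.
    static String checkUnspecifiedJoin(boolean natural, boolean hasOn, boolean hasUsing) {
        if (natural) return "NATURAL INNER JOIN";
        if (hasOn || hasUsing) return "INNER JOIN";
        // Stand-in for the ParseException mentioned in the text.
        throw new IllegalArgumentException("JOIN without NATURAL, ON or USING is ambiguous");
    }

    public static void main(String[] args) {
        System.out.println(checkUnspecifiedJoin(true, false, false));  // A NATURAL JOIN B
        System.out.println(checkUnspecifiedJoin(false, true, false));  // A JOIN B ON ...
        try {
            checkUnspecifiedJoin(false, false, false);                 // A JOIN B
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```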