This page lists all the modifications recently made to the library.
The major correction is the replacement of the library used to manage file uploads: instead of com.oreilly.servlet, UWSLib now uses Apache Commons FileUpload.
Here are the other modifications included in this fix:
cos
.
COS was used only for the Base64 encoding and decoding. A Java-6 compatible solution with no dependency is now used instead.
Files uploaded by the user when creating/executing a synchronous job were never deleted after the job execution.
The same problem applied to the tables already uploaded in the database (in TAP_UPLOAD) when an error occurred before the end of the UPLOAD process.
Now, in case of error when uploading one or more files, or in case of success of the job, all uploaded files and their corresponding database tables are deleted after the end of the job.
Before this correction, if the user submitted two uploaded tables with the same name, or if an uploaded table contained duplicated column names, an obscure error message coming from the database was returned to the user.
Now, duplicated items (tables and columns) are looked for before ingestion in the database. When one is detected, an error is immediately returned to the user and the query is aborted.
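As an illustration of this kind of check, here is a minimal sketch that looks for duplicated names in a list before anything is sent to the database. The class DuplicateNameCheck and its method are hypothetical (they are not part of UWS-/TAP-Lib), and the case-insensitive comparison is an assumption made for this example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper, NOT part of UWS-/TAP-Lib. */
class DuplicateNameCheck {

    /**
     * Returns the names occurring more than once, each reported a single time.
     * Assumption of this sketch: names are compared case-insensitively.
     */
    static List<String> findDuplicates(List<String> names) {
        Set<String> seen = new HashSet<>();
        Set<String> reported = new HashSet<>();
        List<String> duplicates = new ArrayList<>();
        for (String name : names) {
            String key = name.toLowerCase(Locale.ROOT);
            // Set.add() returns false when the key is already present
            if (!seen.add(key) && reported.add(key)) {
                duplicates.add(name);
            }
        }
        return duplicates;
    }
}
```

A caller could run such a check on the table names first, then on the column names of each table, and return an error as soon as the returned list is not empty.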
In this way, it is possible to run two different instances of a UWS/TAP service with a different temporary directory in the same JVM.
See on GitHub: cb5cdd7, 3d22e42, 25f373f, e447a48 and 3f6a3a2
org.json was included in the sources of UWS-/TAP-Lib because of a small manual correction. This is no longer needed, so this library is now a dependency (included in the generated JAR). It has also been upgraded to the version of August 2018.
It often happens that, when restarting the Web application container (e.g. Tomcat), the UWS/TAP service cannot restart because of corrupted backup files which cannot be restored. This generally happens when the backup mode is "one file per user". In such a case, some backup files are incomplete because the backup process was interrupted by the Web application restart.
One easy way to reduce the frequency of this problem was to switch to the backup mode "one for the whole service". But the problem might still occur.
But now, this version of the UWS-Library fixes this error. A small side effect is that some backup files may not be written (or committed) and that you may see some files like "service.backup.temp-...." appear. You can delete these files without any problem; they are just incomplete backup files whose writing was interrupted by the Web application restart. In any case, the real backup file should still be fine and recoverable, though on rare occasions it may not be up to date with the very last submitted jobs.
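The behaviour described above is what one would expect from a "write into a temporary file, then rename" strategy. The sketch below shows that general technique only; it is an assumption about the approach, not the actual code of the library, and the class SafeBackupWriter is hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper, NOT the library's backup manager. */
class SafeBackupWriter {

    /**
     * Writes the content into a temporary file created next to the target,
     * then moves it over the target. If the process is interrupted during the
     * write, only the temporary file is incomplete; the previous target file
     * is left untouched.
     */
    static void write(Path target, String content) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(dir, target.getFileName() + ".temp-", "");
        Files.write(temp, content.getBytes(StandardCharsets.UTF_8));
        // StandardCopyOption.ATOMIC_MOVE could be used instead where the
        // file system supports it.
        Files.move(temp, target, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

With this approach, an interrupted backup leaves a leftover temporary file behind, which can safely be deleted.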
When enabled, the weekly log rotation generated a new file every minute during the day before the specified day of the week.
For instance, with the log rotation frequency W 1 0 0 (i.e. weekly on Sunday at 00:00), the rotation was performed on Saturday at midnight. But, because of the bug (i.e. a bad index correction), the rotation kept going for the whole of Saturday. Since the rotated file is suffixed with a timestamp including hours and minutes (no seconds), a new log file was actually generated for each minute of Saturday. Of course, each of these files contained only one line (or two with some luck), which is pretty useless.
This error generally occurred during the backup process, while trying to back up the job list of a specific user. If several of this user's jobs were running and changing state during the backup process, this ConcurrentModificationException was thrown. This generally happens when the same user submits a lot of short jobs at the same time.
This exception was due to a non-thread-safe usage of UWSParameters.additionalParams. To fix this issue, it is now created as a ConcurrentHashMap instead of a normal HashMap. The same modification has also been applied to UWSParameters.params.
In addition to the replacement of HashMap by ConcurrentHashMap, all synchronized blocks have been removed; they should not be needed any more.
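To see why the type of map matters, the following sketch modifies a map while it is being iterated. It does so in a single thread, as a deterministic stand-in for what happens when a job changes state while another thread is backing up the job list. The class MapIterationDemo is hypothetical and only uses standard JDK classes.

```java
import java.util.ConcurrentModificationException;
import java.util.Map;

/** Hypothetical demo class, NOT part of the UWS library. */
class MapIterationDemo {

    /**
     * Fills the given (empty) map with three entries, then adds a fourth one
     * during the first step of an iteration over its keys.
     * Returns true if the iteration failed with a ConcurrentModificationException.
     */
    static boolean failsWhenModified(Map<String, Object> map) {
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        boolean first = true;
        try {
            for (String key : map.keySet()) {
                if (first) {
                    // Structural modification while iterating
                    map.put("added-after-" + key, 0);
                    first = false;
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
}
```

Called with a new HashMap, this function returns true; called with a new ConcurrentHashMap, it returns false, because the iterators of a ConcurrentHashMap tolerate modifications made during the iteration.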
Before this correction, the value of the attribute byReference of the XML node parameter did not end with a double quote as it started: the closing quote was missing. This generated an XML syntax error in clients.
The attribute version in the node job is never restored; it is just informative. Thus, it is now silently ignored.
The attribute length of an upload was misspelled.
The attribute mime-type of a result was backed up as mime-type but restored as mime. Hence the absence of this piece of information when restoring a job.
This bug was "just" due to an undesired inverted test, present since the implementation of UWS-1.1 (UWSLib-4.3 and TAPLib-2.2).
When custom properties/keys about a user were encountered in a backup file, they were fetched as strings, but they are not always strings. Hence some errors that could happen in such cases. Now, they are fetched as objects instead.
If the MIME-type parameter `q` was not set to a floating point value (e.g. `q=0.8` is correct; `q=abc` is incorrect), the library threw an ugly NumberFormatException. This exception is now caught (and ignored) if it occurs.
The same exception was also thrown for any other parameter whose value is not a floating point. Since only the quality flag (i.e. `q`) is used in UWS-Lib, a parameter is now parsed only if it is `q`; all others are ignored.
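The parsing rule described above can be sketched as follows. The class QualityParser is hypothetical, and the fallback value of 1.0 (the default quality in HTTP content negotiation) is an assumption of this example; the library may use a different default.

```java
/** Hypothetical helper, NOT the parser used inside UWS-Lib. */
class QualityParser {

    /** Assumption of this sketch: quality used when `q` is missing or invalid. */
    static final float DEFAULT_QUALITY = 1.0f;

    /**
     * Extracts the quality of a MIME type such as "text/xml;q=0.8".
     * Only the parameter `q` is parsed; all other parameters are ignored.
     */
    static float parseQuality(String mimeType) {
        String[] parts = mimeType.split(";");
        // parts[0] is the MIME type itself; the parameters come after it
        for (int i = 1; i < parts.length; i++) {
            String[] keyValue = parts[i].split("=", 2);
            if (keyValue.length == 2 && keyValue[0].trim().equalsIgnoreCase("q")) {
                try {
                    return Float.parseFloat(keyValue[1].trim());
                } catch (NumberFormatException e) {
                    // Invalid quality (e.g. q=abc): ignored
                    return DEFAULT_QUALITY;
                }
            }
        }
        return DEFAULT_QUALITY;
    }
}
```

For instance, "text/xml;q=0.8" gives 0.8, while "text/xml;q=abc" and "text/html;level=abc" both fall back to the default value instead of raising an exception.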
Content-Length
in HTTP responses for large results
...
The HTTP header Content-Length is now set manually to a long value in the UWS servlets instead of relying on the function HttpServletResponse.setContentLength(int). Because the latter takes an integer as parameter, it limits the size that can be declared for the returned document to about 2GB. This sometimes broke some clients.
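The sketch below illustrates the problem with an int parameter and the kind of workaround described here. To stay self-contained, it stores the header in a plain Map rather than in a real HttpServletResponse; the class ContentLengthDemo is hypothetical.

```java
import java.util.Map;

/** Hypothetical demo class, NOT part of the UWS library. */
class ContentLengthDemo {

    /** Shows what an int-based setter would receive for the given size. */
    static int asInt(long size) {
        // Sizes above Integer.MAX_VALUE (about 2GB) overflow when narrowed
        return (int) size;
    }

    /**
     * Sets the header from a long value. In a servlet, the equivalent call
     * would be response.setHeader("Content-Length", Long.toString(size)).
     */
    static void setContentLength(Map<String, String> headers, long size) {
        headers.put("Content-Length", Long.toString(size));
    }
}
```

For a document of 3,000,000,000 bytes, asInt returns -1294967296, whereas the header set from the long value is the expected "3000000000".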
This version of UWS-Lib fully implements the UWS 1.1 protocol.
To better discover these new features AND to implement them in your UWS service, look at this UWS-1.1 introduction with UWS-Lib.
<uws>
,
<jobs>
and <job>
.
...
creationTime
to a job description.
...
This property is set automatically by the library and can not be changed.
ARCHIVED
.
...
Additionally, all possible execution phase transitions have been checked in order to respect the UWS-1.1 standard as much as possible. Since the description of some phases is incomplete, UWS-Lib had to make a choice on some phase transitions.
On this figure, you can see how UWS-Lib deals with all execution phase transitions.
Since the addition of the execution phase ARCHIVED, a UWS can now have different ways to "destroy" jobs. Three job destruction policies have been implemented in UWS-Lib:
When archiving a job, its former phase is stored in jobInfo under the name 'oldPhase' ONLY IF no jobInfo is already set.
Archiving a job means that all input files and results are destroyed; the error summary and jobInfo (even if it is a file) are kept.
These filter parameters are additive: their constraints are combined as with an AND operator (except for PHASE parameters; see above).
If no filter is specified, all jobs EXCEPT the ARCHIVED ones are listed. The only way to list ARCHIVED jobs is to use PHASE=ARCHIVED (with or without other filter parameters).
The filtering API has been designed in a generic manner, so that you can easily create and add your own filters. See the interface JobFilter and the class JobListRefiner for more details.
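The combination rules above can be sketched with plain java.util.function.Predicate objects. This is only an illustration: the classes below are hypothetical and do not reproduce the API of JobFilter or JobListRefiner. Combining several PHASE values with an OR is an assumption based on the UWS-1.1 standard.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical sketch, NOT the API of JobFilter/JobListRefiner. */
class JobListFilterSketch {

    /** Minimal stand-in for a job; not the library's UWSJob. */
    static final class Job {
        final String id;
        final String phase;
        final String runId;

        Job(String id, String phase, String runId) {
            this.id = id;
            this.phase = phase;
            this.runId = runId;
        }
    }

    /** Returns the IDs of the jobs matching the phases and all other filters. */
    static List<String> refine(List<Job> jobs, Set<String> phases,
                               List<Predicate<Job>> otherFilters) {
        Predicate<Job> combined;
        if (phases.isEmpty()) {
            // No PHASE filter: ARCHIVED jobs are hidden
            combined = job -> !"ARCHIVED".equals(job.phase);
        } else {
            // Assumption: several PHASE values are combined with an OR
            combined = job -> phases.contains(job.phase);
        }
        // All other filters are joined with an AND
        for (Predicate<Job> filter : otherFilters) {
            combined = combined.and(filter);
        }
        List<String> ids = new ArrayList<>();
        for (Job job : jobs) {
            if (combined.test(job)) {
                ids.add(job.id);
            }
        }
        return ids;
    }
}
```

With this sketch, an ARCHIVED job is returned only when "ARCHIVED" is among the requested phases, whatever the other filters are.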
It is possible to choose how the blocking mechanism should behave (e.g. what the maximum waiting period is, how many requests can be blocked at the same time, what happens when the blocking times out, ...).
The policy to apply must be an implementation of the interface BlockingPolicy. Two implementations are already provided in the library (LimitedBlockingPolicy and UserLimitedBlockingPolicy), but a custom policy can perfectly well be created and applied to a UWS service.
By default, no policy is set. In such a case, the service blocks for the time specified by the user, which may be -1 (i.e. wait indefinitely). A BlockingPolicy can help to control the waiting/blocking process and protect the resources of the server.
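To give an idea of what such a policy may decide, here is a minimal sketch of a rule limiting the waiting time. The class BlockingTimeSketch and its method are hypothetical: they do not implement the interface BlockingPolicy, whose methods are not reproduced here.

```java
/** Hypothetical sketch, NOT an implementation of BlockingPolicy. */
class BlockingTimeSketch {

    /**
     * Returns the time (in seconds) the service should actually block.
     * A negative request (e.g. -1, i.e. wait indefinitely) or a request
     * longer than the maximum is reduced to the maximum.
     */
    static long effectiveWait(long requestedSeconds, long maxSeconds) {
        if (requestedSeconds < 0 || requestedSeconds > maxSeconds) {
            return maxSeconds;
        }
        return requestedSeconds;
    }
}
```

With a maximum of 60 seconds, a request for -1 or for 120 seconds blocks for 60 seconds, while a request for 10 seconds is left unchanged.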
quote
format
...
The UWS-1.x standard defines the quote as an ISO-8601 date. UWS-Lib stores it as a number of seconds (i.e. estimated job duration).
This fix ensures that this integer/long quote value is returned as a date.
Note: The backup and restoration processes are not affected by this change. The backup file format is still the same: a quote stored as a long value.
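As an illustration, a quote stored as a number of seconds can be turned into an ISO-8601 date by adding it to a reference instant. The choice of the reference instant (e.g. the start of the job) is an assumption of this sketch, as is the use of java.time (Java 8 or later); the library may compute and format the date differently. The class QuoteSketch is hypothetical.

```java
import java.time.Instant;

/** Hypothetical helper, NOT the code used by UWS-Lib. */
class QuoteSketch {

    /**
     * Converts a quote expressed as a duration in seconds into an ISO-8601
     * date (UTC), counted from the given reference instant.
     */
    static String quoteAsIsoDate(Instant reference, long quoteSeconds) {
        return reference.plusSeconds(quoteSeconds).toString();
    }
}
```

For example, a quote of 90 seconds counted from 2018-08-01T12:00:00Z gives 2018-08-01T12:01:30Z.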
The public static variable UWS.VERSION has been added in order to clearly indicate which version of the UWS protocol is supported by the UWSLibrary in use.
Dealing with several protocol versions at the same time is quite difficult and may significantly alter the UWSLibrary API in an unstable way. That's why, for the UWS library, only one version is implemented (i.e. the latest one). To use an older version of the protocol, one must use an older version of the library.
In order to make the log clearer, the ID of a job is set to the ID of the HTTP request at the origin of the job creation. This unique ID is generated for each HTTP request.
jobInfo
to a UWSJob
...
This new feature is actually something from UWS-1.0 which was not supported in the UWS library...shame on me :( But now it is :) This means that, from now on, you can either create a job with a JDL (a document, generally XML, listing all parameters and other information needed to create and run a job; see the IVOA REC-UWS-1.0 for more details), or merely add some special kind of information to a pending job.
If you are interested in this feature, see the following classes (while waiting for the corresponding documentation):
BETA version of a configuration file. With this file, you do not need to implement any HTTPServlet. The only thing you need is to set few properties and implement JobThread.
See on GitHub: 03a31bc... and 2463d5f....
Now, a second (minor) version of the UWS protocol exists. It therefore becomes important to make clear in the UWSLibrary which version of the protocol is used. That's why the static attribute UWS.VERSION has been added. For the moment it is set to 1.0.
The function UWSJob.isStopped() considered the job as stopped as soon as the interrupted flag was set, even though the thread was still processing (and the database too, in the case of a TAP job). Because of that, it returned true and the job phase was ABORTED, but the thread was still processing.
With this fix, this function no longer tests the interrupted flag and returns true only if the thread is really stopped.
So, it is now certain that a job in the phase ABORTED is really stopped and does not keep processing in the background without any control.
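The difference between "interrupted" and "really stopped" can be observed with a plain Java thread, as in the sketch below. It only uses the standard class Thread; the class StoppedCheckSketch is hypothetical and does not reproduce the code of UWSJob.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical demo class, NOT the code of UWSJob.isStopped(). */
class StoppedCheckSketch {

    /**
     * Interrupts a worker which does not react to interruptions and returns
     * { interrupted flag set?, alive after the interruption?, alive after its end? }.
     */
    static boolean[] observe() {
        final AtomicBoolean mayFinish = new AtomicBoolean(false);
        Thread worker = new Thread(() -> {
            // Simulates a processing which never checks the interrupted flag
            while (!mayFinish.get()) {
                Thread.yield();
            }
        });
        worker.start();
        worker.interrupt();
        boolean flagSet = worker.isInterrupted();
        boolean aliveAfterInterrupt = worker.isAlive();
        // Let the worker end, then wait for its actual termination
        mayFinish.set(true);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new boolean[] { flagSet, aliveAfterInterrupt, worker.isAlive() };
    }
}
```

Right after the interruption, the flag is set but the thread is still alive: testing the flag alone would wrongly report the job as stopped. Only once the thread has ended does isAlive() return false.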
Besides, if the ABORT signal came while a running UWS job was writing its result file, the partial result file was never deleted, even after the deletion of the job. This is now fixed too: all job results are deleted as soon as the job is aborted.
Regarding the parsing of dates (for instance when setting the destruction date of a job), it was not possible to parse dates with no seconds, no day or no month. Representations in days of the year or weeks of the year were not possible either. All of this is now possible, in accordance with ISO-8601. Check out the Javadoc of the class ISO8601Format for more details.
About the output of a date, the time zone part of the ISO-8601 representation was missing.
See on GitHub: c64caf2... and dd115f2....
Additionally, a serious source of errors has been fixed in ISO8601Format. This class uses static attributes of type DecimalFormat. Unfortunately, objects of this type can NOT be accessed by multiple threads simultaneously: they are not thread-safe. Parsing errors, mostly during TAP uploads, have been experienced for this reason.
To quickly solve this issue, the main public static functions of ISO8601Format have been synchronized.
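The pattern applied here can be sketched as follows: a DecimalFormat shared through a static attribute is only accessed from a synchronized static function. The class TwoDigitFormatter is hypothetical and much simpler than ISO8601Format.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

/** Hypothetical helper, NOT the class ISO8601Format. */
class TwoDigitFormatter {

    /** Shared formatter: DecimalFormat is not thread-safe. */
    private static final DecimalFormat TWO_DIGITS =
            new DecimalFormat("00", DecimalFormatSymbols.getInstance(Locale.ROOT));

    /**
     * Formats a value on at least two digits (e.g. a month or a day).
     * The function is synchronized so that only one thread at a time can
     * use the shared DecimalFormat.
     */
    static synchronized String format(int value) {
        return TWO_DIGITS.format(value);
    }
}
```

Synchronizing is the quickest fix; creating one formatter per call or per thread would be an alternative that avoids the lock.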
When the content-type was not exactly application/x-www-form-urlencoded for normal HTTP-POST requests, no parameter was read by the UWS library.
This test of the content-type has now been changed from a strict equality to a startsWith test.
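A typical case where the strict equality fails is a content-type followed by a charset. The sketch below shows a startsWith test of this kind; the class FormContentType is hypothetical, and the trimming and lower-casing of the value are assumptions of this example.

```java
import java.util.Locale;

/** Hypothetical helper, NOT the test written in the UWS library. */
class FormContentType {

    static final String URL_ENCODED = "application/x-www-form-urlencoded";

    /** Tells whether the given content-type denotes URL-encoded form parameters. */
    static boolean isUrlEncodedForm(String contentType) {
        return contentType != null
                && contentType.trim().toLowerCase(Locale.ROOT).startsWith(URL_ENCODED);
    }
}
```

With a strict equality, "application/x-www-form-urlencoded; charset=UTF-8" was rejected; with the startsWith test, it is accepted.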
Until now, it was possible to destroy the job by posting ACTION=DELETE with a URL like below:
{root-uws}/{job-list}/{job-id}/foo/bar
That is completely wrong. The correct URL for this action must always be:
{root-uws}/{job-list}/{job-id}
When modifying a job or job list (e.g. deleting a job), the Java exception ConcurrentModificationException could occur. Even though this error neither stopped the service nor had any impact (other than a worrying message in the log file), this kind of error should never happen. This was a (very) bad developer choice, which is now fixed.
See on GitHub (and the corresponding GitHub issue)
Additionally, another potential synchronization issue has been fixed in UWSJob about the notification of all JobObservers when a job is modified.
Until now, HTTP-multipart requests were not supported. It was therefore impossible to upload file(s) that would contain all job attributes and parameters (cf JDL) or would be a job parameter (e.g. a file to process).
This new sub-version of the UWS Library changes the way HTTP parameters are fetched and interpreted. In addition to the well-known application/x-www-form-urlencoded POST parameter format, multipart/form-data formatted parameters are now well supported by the library. Besides, it is also possible to customize the way HTTP parameters are fetched by providing your own implementation of the new interface RequestParser. The class used by default by the library is UWSRequestParser.
In previous versions, HTTP responses sent by the UWS Library never specified a character encoding. This is now fixed to UTF-8 for all types of responses.
This encoding can not be changed by library implementers.
XML documents sent by the library were sometimes not valid regarding the UWS XML schema or the W3C. This is now fixed!
The IVOA imposes the usage of ISO-8601 to format dates in UWS services. This is now strictly respected by the UWS Library.
Thanks to M. Wenger (CDS), a rotation is now possible on the log files managed by LocalUWSFileManager (see its function setLogRotationFreq(String) ; by default a rotation is performed every day at midnight).
This version is not backward compatible. Unfortunately, no migration page is available for the moment. However, you are strongly encouraged to take a look at the provided examples to see how to create a UWS service with this version and to compare with your own implementation using v3.0.
Sorry for the inconvenience!
In this new version of the library, AbstractUWS, BasicUWS, ExtendedUWS, BasicQueuedUWS and ExtendedQueuedUWS do not exist any more. All these classes are replaced by only one concrete class: UWSService. The type of job to manage is no longer asked; this class is able to deal with several types of job without knowing their type in advance.
This class implements the interface UWS.
When using UWSService, you still need to create an HTTPServlet and to forward requests to the UWS. The version 4.0 of the library includes a class which is both an HTTPServlet and a UWS: UWSServlet. This class is easier to use and avoids writing the same HTTPServlet for each UWS service.
In UWS, the description of a job (i.e. its attributes and parameters) never changes, unlike its processing. Thus, both of these parts are now separated in the version 4 of the library: the job description stays in AbstractJob and the processing goes in JobThread.
Consequently, 2 important modifications have been done:
Since a job is no longer an abstract class, there is no need for a generic type on some classes any more. Thus, the classes UWSAction, JobList, UWSSerializer and ExecutionManager are no longer parameterized by a generic type. So, what was previously written new JobList<MyJob>(...) is now written new JobList(...).
Because the type of jobs to manage is no longer specified, the UWS library now uses a factory in order to create job threads. This factory must implement the new interface UWSFactory. In addition to the creation of job threads, it is also used to create a job description and to extract UWS parameters from an HTTP request.
An abstract implementation of this interface is also provided: AbstractUWSFactory. It already takes care of the obvious creation of job descriptions and extraction of UWS parameters. Thus, thanks to this abstract class, only one function must be provided: createJobThread(UWSJob).
In previous versions, job results, but also errors and possibly backups, were written by different classes depending on the context in which they were needed. Due to that separation, it was more difficult to control where files had to be created. Hence the new interface UWSFileManager, which lets you define a class centralizing the management of all the files that the service has to deal with. One concrete implementation managing files on the local file system is provided by this new version of the library: LocalUWSFileManager.
This new interface has been designed in order to allow any library user to manage his/her files in the desired way. It is, for instance, perfectly possible to write results, errors or logs into a database, on a remote server or into a VOSpace.
As for the file management, it is now possible to log infos, warnings or errors in a nice way using the new interface UWSLog and especially its default implementation DefaultUWSLog.
UWS jobs are managed by the library entirely in memory. It means that in case of service interruption all jobs (running or not) are lost. A frozen and not ideal way to back up and restore jobs was provided in previous versions (i.e. using 2 static functions of UWSToolBox). The version 4.0 of the UWS Library now includes the possibility to back up and restore jobs in a customizable and clear way thanks to another interface: UWSBackupManager. A default implementation is also provided: DefaultUWSBackupManager.
Until the version 4.0, a UWS service user (i.e. job owner) was represented with just his/her name. This was obviously not enough if a service implementation wants to manage user permissions. Consequently, a job owner must now be represented by an object implementing the interface JobOwner. A default implementation of this interface is also provided - DefaultJobOwner - but implementing the interface yourself is recommended for a better permission management; few methods need to be implemented, so it should not take long.
This version is not backward compatible. However, you can follow the instructions given on this migration help in order to migrate your code from v2.0 to v3.0.
A class dedicated to the management of the execution of all jobs has been added: ExecutionManager. Two additional extensions of AbstractUWS are able by default to manage an execution queue.
The job attributes destructionTime and executionDuration are now taken into account by the UWS to respectively destroy and abort the job automatically when the specified time is elapsed.
For the moment no standard has been proposed about the user identification in a UWS. However it is now possible in this library to specify the way a user can be identified.
In the previous version all default UWS behaviors described in the IVOA Recommendation were implemented by default. But now you can also customize them or add new actions. For that they must extend the abstract class UWSAction.
The default format of UWS resources is XML. Even if it is possible to link an XML document with an XSLT style sheet, it could be interesting to provide a way to return all UWS resources in another format. In the version 3 of this library, you can define other UWS serializations than the XML one. Then, by adding the HTTP header Accept to their request, a UWS user will be able to ask for any UWS resource in any format.
The value of the job attributes destructionTime and executionDuration of all managed jobs can be controlled globally by the UWS. In other words, the UWS administrator can set a default and a maximum value for these attributes. Obviously, more controls are not excluded.
With the version 3 of this UWS library, you now have the notion of JobObserver. It allows any object declared as a JobObserver to be notified at each modification of the phase of the observed jobs.
Redirections are now done by using the HTTP status code 303 rather than 302.
HTTP methods - GET, POST, PUT and DELETE - can be distinguished directly by the library. This is particularly true for the creation of a job: before, a POST request with at least one parameter was required, but now, any POST request on a jobs list URL is enough.
Simple job attributes (i.e. runID, phase, startTime, ...) are now returned in text/plain rather than application/xml.