Check

The module reads data from source tables using a query defined by the datasqill developer and compares the result with a stored target value. The outcome of the check is recorded and translated into the execution status of the action. The module supports polling: the SQL query can be executed repeatedly at a specified time interval. Typical uses are data validation and execution control (waiting until a certain condition is met).

| Name | Meaning |
| --- | --- |
| Module | Check |
| Module Class | DsModCheck |
| Type | Java |
| Purpose | The module reads data from one or more source tables with the help of a SQL query and evaluates the result against the target value defined by the developer in the action. The SQL query is defined by the datasqill developer via the GUI. All sources used must be in the same database. |
| Transformation Code | SQL query |
| Sources | Source tables in a database |
| Targets | No target, can be integrated as a condition in the data flow |

Description

The stored SQL query is executed unchanged against the source database. The result is compared with the target value stored in the "Expected Result" parameter.

If the result equals the target value, the action completes successfully.

If the result differs from the target value, the query is executed again once the number of seconds specified in the "Polling Interval" parameter has elapsed, provided that the maximum number of query executions set in the "Poll Times" parameter has not yet been reached.

If the result still differs after the final execution, the outcome depends on the "Die if no match" parameter. If it is not set (default), the action ends successfully. If it is set, the action is marked as failed, which may halt the data flow in the batch at this point.
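The polling behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the DsModCheck implementation: the function `run_check`, its argument names, the returned status strings, and the string-based equality test are all hypothetical. The defaults mirror the documented parameter defaults.

```python
import time
from typing import Callable


def run_check(execute_query: Callable[[], object],
              expected: object = 0,
              poll_times: int = 1,
              polling_interval: float = 60,
              die_if_no_match: bool = False,
              sleep: Callable[[float], None] = time.sleep) -> str:
    """Return "SUCCESS" or "FAILED" according to the rules described above."""
    for attempt in range(1, poll_times + 1):
        # Assumption: values are compared by their string form, because
        # "Expected Result" may be a number or a string.
        if str(execute_query()) == str(expected):
            return "SUCCESS"
        # Wait only if another execution is still allowed.
        if attempt < poll_times:
            sleep(polling_interval)
    # No match after the final execution: "Die if no match" decides.
    return "FAILED" if die_if_no_match else "SUCCESS"
```

Passing `sleep` as an argument is a design choice of this sketch so that the waiting can be replaced in tests; `execute_query` stands for whatever runs the stored SQL query and yields the comparison value.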

If the query returns more than one row, the number of returned rows is used as the comparison value against the target value.

If the query returns no rows, 0 is used as the comparison value against the target value.
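The two row-count rules can be illustrated with a small helper. The function `comparison_value` is hypothetical, and the handling of exactly one row (using the value of its first column) is an assumption inferred from the example further down, where a single-row query returning 1 is compared with the target value.

```python
from typing import Any, Sequence


def comparison_value(rows: Sequence[Sequence[Any]]) -> Any:
    """Derive the value that is compared with "Expected Result"."""
    if len(rows) == 0:
        # No rows returned: 0 is used.
        return 0
    if len(rows) > 1:
        # More than one row: the number of rows is used.
        return len(rows)
    # Assumption: exactly one row, so the value of its first column is used.
    return rows[0][0]
```

For instance, under these assumptions a query returning three rows yields the comparison value 3, regardless of the row contents.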

Data Sources

The sources of the module are the tables read by the query that the datasqill developer defines. All tables must be in the same database, and the datasqill runtime user must have read privileges on them. Every source table used in the SQL query must be connected to the module input in the graphical data model of the datasqill GUI.

Data Target

The module has no standard target. It can be integrated into the data flow, for example to verify that data is present in the source tables before the load processes that depend on it start. In this case, the subsequent object must be connected to the module output in the graphical data model of the datasqill GUI.

Attributes

As delivered, the module offers the datasqill developer the following attributes in the GUI:

| Name | Type | Meaning |
| --- | --- | --- |
| Expected Result | Number or String | Target value with which the result of the SQL query is compared (Default 0). |
| Poll Times | Number | Number of query executions before the action is completed (Default 1). |
| Polling Interval | Number | Time between two consecutive query repetitions in seconds (Default 60). |
| Die if no match | Boolean | If set, the job is marked as failed if there is no equality after all query repetitions (Default FALSE). |

Examples

The example checks whether data is present in the required source tables.

[Figure: checksheet]

With the query

SELECT MIN(available)
  FROM ( SELECT SIGN(COUNT(*)) AS available
           FROM stage.region r
          UNION ALL
         SELECT SIGN(COUNT(*)) AS available
           FROM stage.nation n ) AS source_tables

it is determined whether the source tables stage.region and stage.nation contain records and further processing can proceed.

A return value of 1 means that both tables contain records, and the check succeeds.
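To see how the query behaves, it can be tried against a throwaway database. The sketch below uses an in-memory SQLite database as a stand-in, which is purely an assumption for illustration: the real source database, its SQL dialect, and the table columns are not specified here. The column `name`, the sample rows, and the helper `open_demo_db` are hypothetical, and `SIGN` is registered explicitly because not every SQLite build provides it.

```python
import sqlite3

# The query from the example, unchanged.
QUERY = """
SELECT MIN(available)
  FROM ( SELECT SIGN(COUNT(*)) AS available
           FROM stage.region r
          UNION ALL
         SELECT SIGN(COUNT(*)) AS available
           FROM stage.nation n ) AS source_tables
"""


def open_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with empty stage.region and stage.nation."""
    conn = sqlite3.connect(":memory:")
    # Attach a second in-memory database under the schema name "stage".
    conn.execute("ATTACH DATABASE ':memory:' AS stage")
    # Register SIGN so the query does not depend on SQLite's math extension.
    conn.create_function("SIGN", 1, lambda x: (x > 0) - (x < 0))
    conn.execute("CREATE TABLE stage.region (name TEXT)")
    conn.execute("CREATE TABLE stage.nation (name TEXT)")
    return conn


conn = open_demo_db()
conn.execute("INSERT INTO stage.region VALUES ('EUROPE')")
# Only one table has data: MIN over (1, 0) is 0.
only_region_filled = conn.execute(QUERY).fetchone()[0]
conn.execute("INSERT INTO stage.nation VALUES ('GERMANY')")
# Both tables have data: MIN over (1, 1) is 1.
both_filled = conn.execute(QUERY).fetchone()[0]
print(only_region_filled, both_filled)
```

Because `MIN` is taken over one 0/1 flag per table, a single empty table is enough to pull the result down to 0, which is what makes the query usable as an "all sources present" condition.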

[Figure: checkaction]

The configured parameters cause the query to be executed up to 5 times at 60-second intervals. If data is still missing after the last execution, further processing is halted.