Example Module

Functionality

Our module selects data from database tables using an SQL query and writes the result as a comma-separated list to a file ("flat file"). The module is named DsModDemo; it expects one or more source tables and writes the rows returned by the SQL query to the output file.

Implementation

For simplification, we derive our module class from a base class in which some convenience methods are already implemented.

public class DsModDemo extends BaseModule implements datasqillProgram {

In it we implement the "execute" method:

public datasqillTransformationAnswer execute(datasqillTransformationRequest request) {
    ...
}

and perform the following steps there:

  1. The start time is determined and an empty response is created.
  2. The parameters from the incoming request are read.
  3. The database connection for the source tables is opened.
  4. Depending on the command, the transformation is validated or executed.
  5. Finally, the results (metadata) of the execution are written to the response.

Determining the start time and creating an empty response:

// start time measuring and create response
long startTime = System.nanoTime();
datasqillTransformationAnswer response = new datasqillTransformationAnswer(request.getRequestId(), 0, "DsModDemo", null, 0, null, -1, null, null);

Reading parameters from the incoming request:

// read parameters
Action action = getAction(request);
String sqlQuery = getActionString(action);
String cmd = getCommand(request);
ObjectDs target = getTarget(action);
List<ObjectDs> sources = getSources(action);
Long targetDatabaseID = getDatabaseId(sources);
String delimiter = action.getAttributesString("Delimiter", ",");
boolean append = action.getAttributesChar("Append", 'N') == 'Y';

The helper methods used come from the BaseModule class from which the module class is derived.

The database connection is opened:

// open database connection
datasqillDBConnection dbCon = openConnection(targetDatabaseID);

Depending on the command, the transformation is executed or validated:

// run command
switch (cmd) {
case "Run":
    // execute data transformation
    int count = doRun(dbCon, target, sqlQuery, delimiter, append);
    response.setRowsProcessed(count);
    break;
case "Validate":
    // validate sql query and model
    doValidate(response, dbCon, sqlQuery);
    break;
default:
    throw new DsModException(DsModErrorCodes.ERR_CODE_MISSING_COMMAND, ", got command " + cmd);
}

Finally, the response is filled with result data and returned:

// add result data to response
long hostId = action.getExecutingHostId();
long moduleDuration = System.nanoTime() - startTime;
List<datasqillStatistic> statistics = new ArrayList<>();
statistics.add(new datasqillStatistic(hostId, "module_duration", moduleDuration));
response.setStatistics(statistics);

The complete "execute" method then looks like this:

public datasqillTransformationAnswer execute(datasqillTransformationRequest request) {
    LOGGER.info("Starting module");
    // start time measuring and create response
    long startTime = System.nanoTime();
    datasqillTransformationAnswer response = new datasqillTransformationAnswer(request.getRequestId(), 0,
            "DsModDemo", null, 0, null, -1, null, null);
    try {
        // read parameters
        Action action = getAction(request);
        String sqlQuery = getActionString(action);
        String cmd = getCommand(request);
        ObjectDs target = getTarget(action);
        List<ObjectDs> sources = getSources(action);
        Long targetDatabaseID = getDatabaseId(sources);
        String delimiter = action.getAttributesString("Delimiter", ",");
        boolean append = action.getAttributesChar("Append", 'N') == 'Y';
        // open database connection
        datasqillDBConnection dbCon = openConnection(targetDatabaseID);
        // run command
        switch (cmd) {
        case "Run":
            // execute data transformation
            int count = doRun(dbCon, target, sqlQuery, delimiter, append);
            response.setRowsProcessed(count);
            break;
        case "Validate":
            // validate sql query and model
            doValidate(response, dbCon, sqlQuery);
            break;
        default:
            throw new DsModException(DsModErrorCodes.ERR_CODE_MISSING_COMMAND, ", got command " + cmd);
        }
        // add result data to response
        long hostId = action.getExecutingHostId();
        long moduleDuration = System.nanoTime() - startTime;
        List<datasqillStatistic> statistics = new ArrayList<>();
        statistics.add(new datasqillStatistic(hostId, "module_duration", moduleDuration));
        response.setStatistics(statistics);
    } catch (DsModException e) {
        response.setErrorCode(e.getErrorCode());
        response.setErrorMessage(e.getErrorMessage());
    }
    LOGGER.info("Ending module");
    // return response
    return response;
}

The "doRun" method for executing the transformation is relatively simple. It opens the output file, executes the database query, and writes the result set row by row to the output file:

private int doRun(datasqillDBConnection dbCon, ObjectDs target, String sqlQuery, String delimiter, boolean append)
        throws DsModException {
    // open the csv target file
    String directory = target.getObjectOwner();
    String filename = target.getObjectName();
    File file = new File(directory, filename);
    try (FileWriter fw = new FileWriter(file, append);
            BufferedWriter bw = new BufferedWriter(fw);
            PreparedStatement preparedStatement = dbCon.prepareStatement(sqlQuery);
            ResultSet resultSet = preparedStatement.executeQuery()) {
        // write csv header line
        ResultSetMetaData metaData = resultSet.getMetaData();
        int columnCount = metaData.getColumnCount();
        for (int i = 1; i <= columnCount; i++) {
            bw.write(metaData.getColumnLabel(i));
            if (i < columnCount)
                bw.write(delimiter);
        }
        bw.write("\n");
        // transfer data row by row to csv target file
        int count = 0;
        while (resultSet.next()) {
            for (int i = 1; i <= columnCount; i++) {
                // getString returns null for SQL NULL; write an empty field then,
                // since Writer.write(String) throws a NullPointerException on null
                String value = resultSet.getString(i);
                bw.write(value == null ? "" : value);
                if (i < columnCount)
                    bw.write(delimiter);
            }
            bw.write("\n");
            count++;
        }
        // and return number of rows processed
        return count;
    } catch (Exception e) {
        LOGGER.log(Level.SEVERE, e.getMessage(), e);
        throw new DsModException(DsModErrorCodes.ERR_CODE_TRANSFORM_PROBLEM);
    }
}
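The row-writing loop can also be expressed with java.util.StringJoiner, which takes care of placing the delimiter between fields. A minimal sketch (the helper name is our own; SQL NULL values, for which ResultSet.getString returns null, become empty fields):

```java
import java.util.StringJoiner;

public class CsvRowFormat {

    // Joins the fields of one row with the given delimiter.
    // Null entries become empty fields, since Writer.write(String)
    // would throw a NullPointerException on null.
    static String formatRow(String[] fields, String delimiter) {
        StringJoiner row = new StringJoiner(delimiter);
        for (String field : fields)
            row.add(field == null ? "" : field);
        return row.toString();
    }
}
```

In doRun, each row could then be written with a single bw.write call per row instead of the inner loop.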

The validation method is called "doValidate". It relies on library functions of the class "datasqillDBValidate", which perform the actual work. To determine the statement's dependencies, the query is wrapped in "SELECT * FROM (...) query WHERE 1=0", which lets the database parse the statement without returning any rows:

private boolean doValidate(datasqillTransformationAnswer response, datasqillDBConnection dbConSourceDb,
        String sqlQuery) {
    boolean result = datasqillDBValidate.validateSQLStatement(dbConSourceDb, response, sqlQuery);
    if (result)
        datasqillDBValidate.setStatementDependency(dbConSourceDb, response,
                "SELECT * FROM (" + sqlQuery + ") query WHERE 1=0");
    LOGGER.info("Validation result: " + result);
    return result;
}

In addition, helper classes are used: the already mentioned base class BaseModule with its various helper methods, an enumeration DsModErrorCodes with all error numbers and error texts, and an exception class DsModException.
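The article does not list these helper classes; a minimal sketch of what they might look like (the numeric codes, error texts, and field names are assumptions):

```java
// Sketch only: the actual error codes and fields are defined by the module.
enum DsModErrorCodes {
    ERR_CODE_MISSING_COMMAND(1001, "Missing or unknown command"),
    ERR_CODE_TRANSFORM_PROBLEM(1002, "Error while executing the transformation");

    private final int code;
    private final String text;

    DsModErrorCodes(int code, String text) {
        this.code = code;
        this.text = text;
    }

    int code() { return code; }
    String text() { return text; }
}

class DsModException extends Exception {
    private final DsModErrorCodes error;

    DsModException(DsModErrorCodes error) {
        this(error, "");
    }

    // The detail string is appended to the error text, as in
    // new DsModException(ERR_CODE_MISSING_COMMAND, ", got command " + cmd)
    DsModException(DsModErrorCodes error, String detail) {
        super(error.text() + detail);
        this.error = error;
    }

    int getErrorCode() { return error.code(); }
    String getErrorMessage() { return getMessage(); }
}
```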

And finally, there is of course the already introduced "main" method:

public static void main(String[] args) {
    datasqillProgramStart.init(args, new DsModDemo());
}

The complete example can be downloaded here.

Deployment

Finally, we want to deploy our module to the server and use it. Two steps are required:

  1. Create Java archive of the module and place it on the server
  2. Insert configuration parameters for the module in the datasqill repository

Build Module

The easiest way to compile the module is with a build tool such as Maven. For this, a Project Object Model (POM) is created that references the required libraries and contains the build instructions. The example module includes such a POM file, so compilation and packaging can be triggered with Maven:

mvn package

As a result, a Java archive (JAR file) is created, which is placed on the server in the library directory at ~/lib/DsModDemo.jar.

Configuration

Configuration parameters must be stored in the datasqill repository for a module.

The easiest way to do this is via datasqill's deployment tools. For this, one or more JSON configuration files must be created that are deployed to the repository.

The JSON configuration file for the DsModDemo module names the respective target tables in the repository as well as the record(s) to be inserted into them. For our module we need entries in two tables:

  • vv_sqts_module - Main table for modules
  • vv_sqts_action_attr_definition - Attribute table for modules

The table vv_sqts_module, into which a record must be inserted, has the following columns:

  • module: a unique module identifier
  • module_name: the module name
  • module_type_id: the module type
  • description: a description
  • module_command: the module command (module name for Java modules)

Furthermore, an entry must be created in the table vv_sqts_action_attr_definition for each attribute:

  • action_type: the module name
  • attr_name: the attribute name
  • attr_data_type: the attribute type (BOOLEAN, CHAR, CLOB, SQL, NUMBER)
  • attr_label: the display name in the datasqill GUI
  • attr_description: the help text in the datasqill GUI
  • attr_position: the position of the attribute in the datasqill GUI

A JSON configuration file dsmoddemo.dsdb could look like this for our example module:

{
  objectList:
  [
    {
      dsdbFormat: 1
      deploymentType: version
      current:
      {
        version: 3.5
        data:
        [
          {
            target: vv_sqts_module
            columns: [ "module", "module_name", "module_type_id", "description", "module_command" ]
            rows: [
              [ "DemoModule", "Write CSV File (Demo)", 2, "Writes data from a database query to a csv file", "DsModDemo" ]
            ]
          }
          {
            target: vv_sqts_action_attr_definition
            columns: [ "action_type", "attr_name", "attr_label", "attr_data_type", "default_value", "attr_description", "attr_position" ]
            rows: [
              [ "DemoModule", "Delimiter", "Column Delimiter", "CHAR", ",", "Column delimiter (default is ',')", 1 ]
              [ "DemoModule", "Append", "Append", "BOOLEAN", "Y", "Append, do not truncate file", 2 ]
            ]
          }
        ]
      }
    }
  ]
}

The module column contains a unique identifier for a module. It is used in other tables for referencing. The module name is displayed in the GUI. The module type is 2, which stands for Java module. A complete overview of the supported types can be found in the "vv_sqts_module_type" table.

To deploy the configuration file to the server, copy it to the server, log in to the server via SSH, and call the deployment script:

deploySchema.sh < dsmoddemo.dsdb
Oct 24, 2020 10:41:34 PM de.softquadrat.datasqill.deployment.LogWrapDeployer insertData
FINE: Add data to: vv_sqts_module
Oct 24, 2020 10:41:34 PM de.softquadrat.datasqill.deployment.LogWrapDeployer insertData
FINE: Add data to: vv_sqts_action_attr_definition

Now the module can be used after restarting the GUI:

[Screenshot ActionDialog.png: the action dialog offering the new module]

Notes

The example module presented here does not handle special cases such as problematic column names (case-sensitive column names, column names that are SQL keywords). A production-grade module for creating CSV files, DsModWriteFlatFile, handles such column names correctly and protects them in SQL by quoting.
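Such quoting usually follows the common CSV convention: a field containing the delimiter, a double quote, or a line break is wrapped in double quotes, and embedded quotes are doubled. A minimal sketch of that rule (not the DsModWriteFlatFile implementation):

```java
public class CsvQuoting {

    // RFC 4180-style quoting: quote a field if it contains the delimiter,
    // a double quote, or a line break, and double any embedded quotes.
    static String quote(String field, String delimiter) {
        if (field == null)
            return "";
        boolean needsQuoting = field.contains(delimiter) || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuoting)
            return field;
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }
}
```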

And finally, the SQL query of course only works when the source tables reside in a common database. The helper method "getSources" ensures this: it throws an exception, both during validation and during execution, when it detects that the source tables are in different databases.
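This same-database rule could be checked roughly as follows; SourceTable here is a simplified stand-in for the datasqill class ObjectDs, and the method name is our own:

```java
import java.util.List;

public class SameDatabaseCheck {

    // Simplified stand-in for ObjectDs: a table name plus its database id.
    record SourceTable(String name, long databaseId) {}

    // Returns the common database id of all source tables, or throws
    // if the tables are spread over different databases.
    static long commonDatabaseId(List<SourceTable> sources) {
        long dbId = sources.get(0).databaseId();
        for (SourceTable s : sources)
            if (s.databaseId() != dbId)
                throw new IllegalArgumentException(
                        "Source table in different database: " + s.name());
        return dbId;
    }
}
```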