The module generates an Insert from Select statement based on the SQL query entered by the developer and executes the effective statement. Column mapping between the SQL query and the target table is done by name.
| Name | Meaning |
|---|---|
| Module | Insert from select |
| Module Class | DsModTemplateSQL |
| Type | Java |
| Purpose | Append data to a target table based on a SQL query |
| Transformation Code | SQL query |
| Sources | Source tables in a database |
| Targets | Target table in the same database |
The following example shows how to build an Insert-Select statement with the help of the module. First, the SQL query used:
SELECT c_custkey AS key
, c_name AS name
FROM staging.customer
If the target schema is "demo" and the target table is "d_customer", the module generates the following effective SQL:
INSERT INTO demo.d_customer (
key
, name
)
SELECT key
, name
FROM (
SELECT c_custkey AS key
, c_name AS name
FROM staging.customer
) src
As can be seen in the example, the column names of the target table must be specified as aliases when the source columns have different names. In the example, the target table "d_customer" has the columns "key" and "name". This applies especially when using literals or functions to generate column values, as in this example:
SELECT c_custkey AS key
, c_name AS name
, 'de' AS country
, CURRENT_DATE() AS createdate
FROM staging.customer
The sources of the module are tables that are read via the query defined by the datasqill developer. For these tables, the datasqill runtime user must have read rights. All source tables used in the SQL query must be connected to the module input in the datasqill GUI in the graphical data model.
The target of the module is a single table. For this, the datasqill runtime user must have write rights. The target table must be connected to the module output in the datasqill GUI in the graphical data model.
| Name | Type | Meaning |
|---|---|---|
| Truncate Before | Boolean | The module deletes the target table before writing if this attribute is set to 'Y'. Deletion is attempted with the help of a stored procedure that must be named TRUNC_TABLE. If the stored procedure is not present, the SQL command DELETE is used. |
A more complex example could look like this. With the query
SELECT nation_key, part_key, supplier_key, ps_supplycost AS supplycost
, ps_availqty AS availqty, CASE WHEN MIN(ps_supplycost) OVER (PARTITION BY region_key, part_key) = ps_supplycost THEN 'Y' ELSE 'N' END is_min_supply_cost
FROM demo.d_supplier, staging.partsupp, demo.d_nation, demo.d_part
WHERE ps_partkey = part_key
AND ps_suppkey = supplier_key
AND supplier_nation_key = nation_key
data from the tables demo.d_supplier, staging.partsupp, demo.d_nation and demo.d_part are joined and read.

The module generates the following effective SQL statement:
INSERT into demo.f_supplier_rank (
nation_key
, part_key
, supplier_key
, supplycost
, availqty
, is_min_supply_cost
)
SELECT nation_key
, part_key
, supplier_key
, supplycost
, availqty
, is_min_supply_cost
FROM (
SELECT nation_key, part_key, supplier_key, ps_supplycost AS supplycost
, ps_availqty AS availqty, CASE WHEN MIN(ps_supplycost) OVER (PARTITION BY region_key, part_key) = ps_supplycost THEN 'Y' ELSE 'N' END is_min_supply_cost
FROM demo.d_supplier, staging.partsupp, demo.d_nation, demo.d_part
WHERE ps_partkey = part_key
AND ps_suppkey = supplier_key
AND supplier_nation_key = nation_key
) src
As can be seen, the original SQL query is included exactly and in the same format in the generated one.
When installing the module, additional attributes can be configured
The module can be extended so that the "Loop Query" attribute is activated (see configuring module attributes in the appendix):
| Name | Type | Meaning |
|---|---|---|
| Loop Query | SQL | Loop query to use the results in the main query. |
Then in addition to "Truncate Before" you have another attribute in which you can enter a query. When a loop query is captured, the module executes the main query for each record found. The column results can be used as variables.
Since the main query is executed per result row of the loop query, the result set of the loop query should be kept small. Otherwise you lose the performance of set operations and execution behaves procedurally.
The loop query can also be used very effectively as an IF query. E.g.:
SELECT DISTINCT 1
FROM <source-table>
WHERE <last-change-date> >= <today>-7
You get one row if something has changed in the last 7 days, otherwise none. Thus the main query is executed once or not at all.
In the main query you can use the column names of the loop query as variables. Note that the variables are each inserted as text (string).
Here is an example for monthly processing of data from the last 6 months with Oracle SQL.
Loop Query:
SELECT DISTINCT TO_CHAR(TRUNC(day, 'MM'), 'DD.MM.YYYY') monat
FROM <source-table>
WHERE day >= ADD_MONTHS(TRUNC(SYSDATE, 'MM'), -5)
and the main query for it:
SELECT day, col1, col2, ... , coln
FROM <source-table>
WHERE TRUNC(day, 'MM') = TO_DATE('${monat}', 'DD.MM.YYYY')
datasqill replaces the variable ${monat} in the main query with the value of the monat column from the loop query for each loop iteration. Since the generated column value is transferred as text, a data type conversion must be performed (as in this example).
You can define technical columns that allow you to see afterwards when which rows were inserted or modified:
| Name | Column Definition | Meaning | Value |
|---|---|---|---|
| ins_date | ta_inserted_ts | Inserted at | Filled with the current timestamp (current_date) when inserting |
| ins_by | ta_inserted_by | Inserted by | Filled with the current modifier (current_by) when inserting |
| upd_date | ta_modified_ts | Modified at | Filled with the current timestamp (current_date) when inserting |
| upd_by | ta_modified_by | Modified by | Filled with the current modifier (current_by) when inserting |
| del_date | ta_deletion_ts | Deleted at | Filled with NULL when inserting |
| del_by | ta_deleted_by | Deleted by | Filled with NULL when inserting |
| del_ind | ta_is_deleted | Deletion indicator | Filled with 'N' when inserting |
| current_date | ${get_current_datetime()} | This value defines the current timestamp | In delivered state the current timestamp in date-time (without milliseconds) |
| current_by | ${request_id} | This value defines the current modifier | In delivered state the Request ID of the datasqill action execution |
The column names used (under Column Definition) can be changed so that e.g. ta_inserted_ts is no longer called but insert_date. See also configuring module attributes in the appendix.
If the target table contains some or several of these technical columns, they are filled according to the table above. If you change the target table "demo.d_customer" in the very first example and add the technical columns ta_inserted_ts, ta_inserted_by, ta_modified_ts and ta_modified_by, the following is generated for Postgres:
INSERT INTO demo.d_customer (
key
, name
, ta_modified_by
, ta_modified_ts
, ta_inserted_by
, ta_inserted_ts
)
SELECT key
, name
, 2423589
, DATE_TRUNC('second', CURRENT_TIMESTAMP)
, 2423589
, DATE_TRUNC('second', CURRENT_TIMESTAMP)
FROM (
SELECT c_custkey AS key
, c_name AS name
FROM staging.customer
) src
For Oracle the result would be:
INSERT INTO demo.d_customer (
key
, name
, ta_modified_by
, ta_modified_ts
, ta_inserted_by
, ta_inserted_ts
)
SELECT key
, name
, 2423589
, SYSDATE
, 2423589
, SYSDATE
FROM (
SELECT c_custkey AS key
, c_name AS name
FROM staging.customer
) src