First, the business code of the transformation must be developed, which normally happens outside of datasqill with the development tool of choice. Second, the target tables must be defined and created if they do not yet exist.
The first query would prepare data for a dimension table.
SELECT p_partkey AS part_key
, p_name AS part_name
, p_type AS part_type
, p_size AS part_size
, p_retailprice AS part_price
, SUM(ps_availqty) AS availqty
, AVG(ps_supplycost) AS avg_supplycost
FROM stage.part
, stage.partsupp
WHERE p_partkey = ps_partkey
GROUP BY p_partkey
, p_name
, p_type
, p_size
, p_retailprice
The dimension table could look like this.
CREATE TABLE data_mart.d_part (
part_key NUMERIC UNIQUE NOT NULL
, part_name VARCHAR
, part_type VARCHAR
, part_size NUMERIC
, part_price NUMERIC
, availqty NUMERIC
, avg_supplycost NUMERIC
);
The second query will prepare data for a fact table.
SELECT EXTRACT(year from o_orderdate) AS order_year
, customer_key
, part_key
, SUM(l_extendedprice * (1 - l_discount)) AS revenue
, SUM(l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity) AS profit
FROM stage.lineitem
, stage.partsupp
, stage.orders
, data_mart.d_part
, data_mart.d_supplier
, data_mart.d_customer
WHERE l_orderkey = o_orderkey
AND o_custkey = customer_key
AND l_suppkey = supplier_key
AND l_partkey = part_key
AND l_suppkey = ps_suppkey
AND l_partkey = ps_partkey
GROUP BY 1, 2, 3
The target table would then be.
CREATE TABLE data_mart.f_profit (
order_year NUMERIC
, customer_key NUMERIC
, part_key NUMERIC
, revenue NUMERIC
, profit NUMERIC
);
The target tables must be created in the database. It is important to ensure that the user (used by datasqill for the connection) has write permissions on the tables or receives them via grants.