stateDiagram-v2
direction LR
[*] --> Input
Input --> InternalLogic
InternalLogic --> Output
Output --> [*]
UDFs, SPs, Views
2026-04-01
A User Defined Function (UDF) represents a mapping between inputs and outputs.
\[ f(X) \rightarrow Y \]
In SQL analytics, a function is a calculation unit, not a workflow or process.
What Makes a Function a Function
A function hides internal complexity and exposes only its interface.
stateDiagram-v2
direction LR
[*] --> Input
Input --> InternalLogic
InternalLogic --> Output
Output --> [*]
From the database engine’s perspective, a function moves through fixed stages.
stateDiagram-v2
direction LR
[*] --> Invocation
Invocation --> ParameterBinding
ParameterBinding --> Evaluation
Evaluation --> ResultMaterialization
ResultMaterialization --> Return
Return --> [*]
SQL queries follow a logical evaluation order, not textual order.
stateDiagram-v2
direction LR
[*] --> FROM
FROM --> WHERE
WHERE --> GROUP_BY
GROUP_BY --> HAVING
HAVING --> SELECT
SELECT --> ORDER_BY
ORDER_BY --> [*]
note right of SELECT
Scalar functions
are evaluated here
end note
From an analyst’s perspective, functions are part of a larger analytical journey.
journey
title Analytical Function Usage Journey
section Problem Framing
Identify metric: 5: Analyst
Define inputs: 4: Analyst
section Query Construction
Select function: 4: Analyst
Embed in query: 3: Analyst
section Execution
Bind parameters: 3: Database
Evaluate logic: 2: Database
Return result: 4: Database
section Interpretation
Validate output: 5: Analyst
Now that we understand the theory, we move to practical usage of scalar UDFs.
Scalar UDFs allow us to:
A scalar UDF formalizes a row-level analytical calculation.
A well-designed analytical scalar UDF:
Age segmentation is a foundational analytical dimension.
It is widely used in:
\[\downarrow\]
Customer tenure supports:
\[\downarrow\]
Price tiering is commonly used in:
\[\downarrow\]
Order size classification supports:
\[\downarrow\]
Order activity status is used in:
\[\downarrow\]
Scalar UDFs return one value per invocation.
Table-Valued Functions (TVFs) return a set of rows and behave like virtual tables.
Table-Valued Functions are especially useful in analytics when:
A TVF formalizes a reusable analytical dataset.
A well-designed analytical TVF:
Analysts often need to inspect a customer’s most recent orders together with monetary impact.
Recent order analysis is frequently used in:
CREATE OR REPLACE FUNCTION analytics.fn_recent_orders (
p_customer_id INT,
p_limit INT
)
RETURNS TABLE (
order_id INT,
order_date DATE,
status TEXT,
order_total NUMERIC
)
LANGUAGE sql
STABLE
AS $$
SELECT
o.order_id,
o.order_date,
o.status,
COALESCE(SUM(oi.quantity * p.price), 0) AS order_total
FROM analytics.orders o
JOIN analytics.order_items oi ON oi.order_id = o.order_id
JOIN analytics.products p ON p.product_id = oi.product_id
WHERE o.customer_id = p_customer_id
GROUP BY o.order_id, o.order_date, o.status
ORDER BY o.order_date DESC
LIMIT GREATEST(p_limit, 0);
$$;\[\downarrow\]
Revenue ranking is a core analytical task for understanding product performance.
Top-product analysis is used in:
CREATE OR REPLACE FUNCTION analytics.fn_top_products_by_revenue (
p_limit INT
)
RETURNS TABLE (
product_id INT,
product_name TEXT,
revenue NUMERIC,
total_qty BIGINT
)
LANGUAGE sql
STABLE
AS $$
SELECT
p.product_id,
p.product_name,
COALESCE(SUM(oi.quantity * p.price), 0) AS revenue,
COALESCE(SUM(oi.quantity), 0) AS total_qty
FROM analytics.products p
LEFT JOIN analytics.order_items oi
ON oi.product_id = p.product_id
GROUP BY p.product_id, p.product_name
ORDER BY revenue DESC, total_qty DESC
LIMIT GREATEST(p_limit, 0);
$$;\[\downarrow\]
Geographic aggregation is a recurring requirement in analytics.
Customer geography analysis supports:
CREATE OR REPLACE FUNCTION analytics.fn_customers_by_city (
p_city_id INT
)
RETURNS TABLE (
customer_id INT,
first_name TEXT,
last_name TEXT,
signup_date DATE
)
LANGUAGE sql
STABLE
AS $$
SELECT
c.customer_id,
c.first_name,
c.last_name,
c.signup_date
FROM analytics.customers c
WHERE c.city_id = p_city_id;
$$;\[\downarrow\]
Analytical queries often require denormalized line-item datasets.
Line-item enrichment is used in:
CREATE OR REPLACE FUNCTION analytics.fn_order_items_detailed (
p_order_id INT
)
RETURNS TABLE (
product_id INT,
product_name TEXT,
category TEXT,
quantity INT,
unit_price NUMERIC,
line_total NUMERIC
)
LANGUAGE sql
STABLE
AS $$
SELECT
p.product_id,
p.product_name,
p.category,
oi.quantity,
p.price AS unit_price,
oi.quantity * p.price AS line_total
FROM analytics.order_items oi
JOIN analytics.products p
ON p.product_id = oi.product_id
WHERE oi.order_id = p_order_id;
$$;\[\downarrow\]
Unlike functions, stored procedures represent executable workflows.
They are designed to orchestrate logic, modify data, and control execution flow.
A stored procedure represents a controlled sequence of operations executed as a unit.
Given an input vector \(X\), a procedure performs actions \(A\) and may optionally produce outputs \(O\):
\[ P(X) \rightarrow A \; [\rightarrow O] \]
The defining characteristics of a stored procedure are:
INSERT, UPDATE, DELETE)IF, LOOP, RAISE)stateDiagram-v2
direction LR
[*] --> Input
Input --> Validation
Validation --> BusinessLogic
BusinessLogic --> DataMutation
DataMutation --> Completion
Completion --> [*]
stateDiagram-v2
direction LR
[*] --> Call
Call --> ParameterBinding
ParameterBinding --> Execution
Execution --> TransactionControl
TransactionControl --> End
End --> [*]
Stored procedures exist outside the logical SQL query evaluation pipeline.
flowchart LR
Client --> SP[Stored Procedure]
SP --> SQL[SQL Statements]
SQL --> Tables
They are invoked, not embedded.
Stored procedures exist to:
convention
sp_CALL, not SELECTA well-designed analytical stored procedure:
Maintaining product catalogs often requires insert-or-update logic.
Product upsert is frequently used in:
CREATE OR REPLACE PROCEDURE analytics.sp_upsert_product (
p_product_id INT,
p_product_name TEXT,
p_category TEXT,
p_price NUMERIC
)
LANGUAGE plpgsql
AS $$
BEGIN
IF p_product_id IS NULL THEN
RAISE EXCEPTION 'product_id cannot be NULL';
END IF;
INSERT INTO analytics.products (product_id, product_name, category, price)
VALUES (p_product_id, p_product_name, p_category, p_price)
ON CONFLICT (product_id) DO UPDATE
SET product_name = EXCLUDED.product_name,
category = EXCLUDED.category,
price = EXCLUDED.price;
END;
$$;\[\downarrow\]
Order lifecycle management requires controlled state transitions.
Order status updates are used in:
\[\downarrow\]
Destructive operations must be explicit and guarded.
Customer deletion is used in:
CREATE OR REPLACE PROCEDURE analytics.sp_delete_customer (
p_customer_id INT
)
LANGUAGE plpgsql
AS $$
BEGIN
IF EXISTS (
SELECT 1
FROM analytics.orders
WHERE customer_id = p_customer_id
) THEN
RAISE EXCEPTION 'Cannot delete customer % with existing orders', p_customer_id;
END IF;
DELETE FROM analytics.customers
WHERE customer_id = p_customer_id;
END;
$$;\[\downarrow\]
Procedures are ideal for batch recalculations.
Derived metric refresh is used in:
\[\downarrow\]
\[\downarrow\]
| Aspect | User Defined Function | Stored Procedure |
|---|---|---|
| Primary Role | Calculation | Workflow execution |
| Data Modification | Not allowed | Allowed |
| Transaction Control | Not allowed | Allowed |
| Invocation | SELECT |
CALL |
| Return Value | Required | Optional |
| Use in Queries | Yes | No |
| Typical Usage | Metrics, enrichment | ETL, updates, orchestration |
Important
Views provide a logical abstraction layer over base tables.
They simplify analytical queries by exposing ready-to-use datasets with consistent business semantics.
Views allow analysts to:
A view is a stored SQL query that behaves like a virtual table.
It does not store data itself — it stores logic.
\[ V \rightarrow \text{SELECT logic over base tables} \]
In analytics, views are often used as:
A well-designed analytical view:
Customer dimensions are frequently reused across analytics.
Customer dimension views support:
CREATE OR REPLACE VIEW analytics.v_customers_dim AS
SELECT
c.customer_id,
c.first_name,
c.last_name,
c.age,
c.email,
c.signup_date,
ci.city_name,
r.region_name,
co.country_name
FROM analytics.customers c
LEFT JOIN analytics.cities ci
ON c.city_id = ci.city_id
LEFT JOIN analytics.regions r
ON ci.region_id = r.region_id
LEFT JOIN analytics.countries co
ON r.country_id = co.country_id;Fact views expose transactional data enriched with dimensions.
Order fact views are used in:
CREATE OR REPLACE VIEW analytics.v_orders_fact AS
SELECT
o.order_id,
o.order_date,
o.status,
o.customer_id,
SUM(oi.quantity * p.price) AS order_total,
SUM(oi.quantity) AS total_items
FROM analytics.orders o
JOIN analytics.order_items oi
ON o.order_id = oi.order_id
JOIN analytics.products p
ON p.product_id = oi.product_id
GROUP BY
o.order_id,
o.order_date,
o.status,
o.customer_id;Product performance views centralize revenue logic.
They support:
CREATE OR REPLACE VIEW analytics.v_product_performance AS
SELECT
p.product_id,
p.product_name,
p.category,
SUM(oi.quantity) AS total_qty,
SUM(oi.quantity * p.price) AS total_revenue
FROM analytics.products p
LEFT JOIN analytics.order_items oi
ON p.product_id = oi.product_id
GROUP BY
p.product_id,
p.product_name,
p.category;Filtering logic is commonly centralized in views.
Active-order views are used in:
| Aspect | View | User Defined Function | Stored Procedure |
|---|---|---|---|
| Primary Role | Dataset abstraction | Calculation | Workflow execution |
| Returns | Table (rows) | Scalar or table | Optional |
| Data Modification | Not allowed | Not allowed | Allowed |
| Invocation | SELECT |
SELECT |
CALL |
| Use in Joins | Yes | TVF only | No |
| Transaction Control | No | No | Yes |
| Typical Usage | BI, reporting | Metrics, enrichment | ETL, orchestration |
| Aspect | View | Function | Stored Procedure |
|---|---|---|---|
| Purpose | Query abstraction | Calculation | Workflow execution |
| Data Storage | No | No | No |
| Accept Parameters | No | Yes | Yes |
| Modify Data | No | No | Yes |
| Use in SELECT | Yes | Yes | No |
| Control Flow | No | No | Yes |
| Typical Use | BI layer, semantics | Metrics, logic | ETL, updates |
Try to:
combining Views + UDFs + Stored Procedures into analytical pipelines.