Reporting and Analysis with SQL

UDFs, SPs, Views

Karen Hovhannisyan

2026-04-01

Goals

  • User Defined Functions (UDF)
  • Stored Procedures (SP)
  • Views

User Defined Functions

Core Idea

A User Defined Function (UDF) represents a mapping between inputs and outputs.

\[ f(X) \rightarrow Y \]

In SQL analytics, a function is a calculation unit, not a workflow or process.

What Makes a Function a Function

  • Inputs are explicitly defined
  • Internal logic is encapsulated
  • Output is predictable for a given input
  • Execution has no memory of previous calls

Function as a Black Box

A function hides internal complexity and exposes only its interface.

stateDiagram-v2
direction LR
    [*] --> Input
    Input --> InternalLogic
    InternalLogic --> Output
    Output --> [*]

Function Execution States

From the database engine’s perspective, a function moves through fixed stages.

stateDiagram-v2
direction LR
    [*] --> Invocation
    Invocation --> ParameterBinding
    ParameterBinding --> Evaluation
    Evaluation --> ResultMaterialization
    ResultMaterialization --> Return
    Return --> [*]

Where Functions Live in SQL Processing

SQL queries follow a logical evaluation order, not textual order.

stateDiagram-v2
direction LR
    [*] --> FROM
    FROM --> WHERE
    WHERE --> GROUP_BY
    GROUP_BY --> HAVING
    HAVING --> SELECT
    SELECT --> ORDER_BY
    ORDER_BY --> [*]

    note right of SELECT
        Scalar functions
        are evaluated here
    end note

Function Usage Journey

From an analyst’s perspective, functions are part of a larger analytical journey.

journey
    title Analytical Function Usage Journey
    section Problem Framing
      Identify metric: 5: Analyst
      Define inputs: 4: Analyst
    section Query Construction
      Select function: 4: Analyst
      Embed in query: 3: Analyst
    section Execution
      Bind parameters: 3: Database
      Evaluate logic: 2: Database
      Return result: 4: Database
    section Interpretation
      Validate output: 5: Analyst

Scalar UDFs in Practice

Scalar User Defined Functions in Practice

Now that we understand the theory, we move to practical usage of scalar UDFs.

Scalar UDFs allow us to:

  • Encode business rules once
  • Reuse them consistently across queries
  • Improve readability and maintainability of analytical SQL

General Syntax of a Scalar UDF

A scalar UDF formalizes a row-level analytical calculation.

CREATE OR REPLACE FUNCTION function_name (
    parameter_name data_type,
    ...
)
RETURNS return_data_type
LANGUAGE sql
AS $$
    SELECT expression;
$$;

Analytical Perspective on Scalar UDF Design

A well-designed analytical scalar UDF:

  • Encapsulates a single business definition
  • Accepts clean, minimal inputs
  • Produces one interpretable value
  • Can be reused across queries, views, and dashboards

Example 1: Age Categorization

Age segmentation is a foundational analytical dimension.

It is widely used in:

  • Customer profiling
  • Targeting strategies
  • Cohort analysis

Function Definition and Application

CREATE OR REPLACE FUNCTION fn_age_group (
    p_age INT
)
RETURNS TEXT
LANGUAGE sql
AS $$
    SELECT
        CASE
            WHEN p_age < 25 THEN 'Under 25'
            WHEN p_age BETWEEN 25 AND 39 THEN '25–39'
            WHEN p_age BETWEEN 40 AND 59 THEN '40–59'
            ELSE '60+'
        END;
$$;

\[\downarrow\]

SELECT
    customer_id,
    age,
    fn_age_group(age) AS age_group
FROM analytics.customers;

Example 2: Customer Tenure Classification

Customer tenure supports:

  • Retention analysis
  • Lifecycle modeling
  • CRM segmentation

Example 2: Function Definition & Application

CREATE OR REPLACE FUNCTION analytics.fn_customer_tenure (
    p_signup_date DATE
)
RETURNS TEXT
LANGUAGE sql
AS $$
    SELECT
        CASE
            WHEN CURRENT_DATE - p_signup_date < 180 THEN 'New'
            WHEN CURRENT_DATE - p_signup_date < 365 THEN 'Established'
            ELSE 'Loyal'
        END;
$$;

\[\downarrow\]

SELECT
    customer_id,
    signup_date,
    analytics.fn_customer_tenure(signup_date) AS tenure_group
FROM analytics.customers;

Example 3: Product Price Tier

Price tiering is commonly used in:

  • Category benchmarking
  • Demand elasticity analysis
  • Product mix optimization

Example 3: Function Definition

CREATE OR REPLACE FUNCTION analytics.fn_price_tier (
    p_price NUMERIC
)
RETURNS TEXT
LANGUAGE sql
AS $$
    SELECT
        CASE
            WHEN p_price < 20 THEN 'Low'
            WHEN p_price BETWEEN 20 AND 99.99 THEN 'Mid'
            ELSE 'Premium'
        END;
$$;

\[\downarrow\]

SELECT
    product_id,
    product_name,
    price,
    analytics.fn_price_tier(price) AS price_tier
FROM analytics.products;

Example 4: Order Size Classification

Order size classification supports:

  • Basket analysis
  • Purchase behavior analysis
  • Operational load estimation

Example 4: Function Definition & Application

CREATE OR REPLACE FUNCTION analytics.fn_order_size (
    p_quantity INT
)
RETURNS TEXT
LANGUAGE sql
AS $$
    SELECT
        CASE
            WHEN p_quantity = 1 THEN 'Single Item'
            WHEN p_quantity BETWEEN 2 AND 4 THEN 'Small Basket'
            ELSE 'Large Basket'
        END;
$$;

\[\downarrow\]

SELECT
    oi.order_id,
    oi.quantity,
    analytics.fn_order_size(oi.quantity) AS order_size
FROM analytics.order_items oi;

Example 5: Order Activity Normalization

Order activity status is used in:

  • Operational reporting
  • KPI standardization
  • Funnel analysis
CREATE OR REPLACE FUNCTION analytics.fn_order_activity (
    p_status TEXT
)
RETURNS TEXT
LANGUAGE sql
AS $$
    SELECT
        CASE
            WHEN p_status IN ('cancelled', 'returned') THEN 'Inactive'
            ELSE 'Active'
        END;
$$;

\[\downarrow\]

SELECT
    order_id,
    status,
    analytics.fn_order_activity(status) AS activity_state
FROM analytics.orders;

Summary Scalar UDFs

  • Scalar UDFs operate at row level
  • They encode business semantics
  • They improve consistency and readability
  • They are building blocks for views and stored procedures

Table-Valued UDFs in Practice

Table-Valued User Defined Functions in Practice

Scalar UDFs return one value per invocation.
Table-Valued Functions (TVFs) return a set of rows and behave like virtual tables.

Table-Valued Functions are especially useful in analytics when:

  • Logic naturally produces multiple rows
  • Queries require reusable derived datasets
  • Complex joins and aggregations must be standardized

General Syntax of a Table-Valued Function

A TVF formalizes a reusable analytical dataset.

CREATE OR REPLACE FUNCTION function_name (
    parameter_name data_type,
    ...
)
RETURNS TABLE (
    column_name data_type,
    ...
)
LANGUAGE sql
AS $$
    SELECT ...
$$;

Analytical Perspective on TVF Design

A well-designed analytical TVF:

  • Represents a logical dataset, not a single metric
  • Encapsulates joins and aggregations
  • Produces a schema-stable output
  • Can be reused safely across dashboards and pipelines

Example 1: Recent Orders for a Customer

Analysts often need to inspect a customer’s most recent orders together with monetary impact.

Recent order analysis is frequently used in:

  • Customer profiling
  • Support and case investigation
  • Purchase behavior analysis

Example 1: Function Definition & Application

CREATE OR REPLACE FUNCTION analytics.fn_recent_orders (
    p_customer_id INT,
    p_limit       INT
)
RETURNS TABLE (
    order_id    INT,
    order_date  DATE,
    status      TEXT,
    order_total NUMERIC
)
LANGUAGE sql
STABLE
AS $$
    SELECT
        o.order_id,
        o.order_date,
        o.status,
        COALESCE(SUM(oi.quantity * p.price), 0) AS order_total
    FROM analytics.orders o
    JOIN analytics.order_items oi ON oi.order_id = o.order_id
    JOIN analytics.products p     ON p.product_id = oi.product_id
    WHERE o.customer_id = p_customer_id
    GROUP BY o.order_id, o.order_date, o.status
    ORDER BY o.order_date DESC
    LIMIT GREATEST(p_limit, 0);
$$;

\[\downarrow\]

SELECT *
FROM analytics.fn_recent_orders(1, 4);

Example 2: Top Products by Revenue

Revenue ranking is a core analytical task for understanding product performance.

Top-product analysis is used in:

  • Product portfolio reviews
  • Revenue concentration analysis
  • Pricing and promotion strategy

Example 2: Function Definition & Application

CREATE OR REPLACE FUNCTION analytics.fn_top_products_by_revenue (
    p_limit INT
)
RETURNS TABLE (
    product_id   INT,
    product_name TEXT,
    revenue      NUMERIC,
    total_qty    BIGINT
)
LANGUAGE sql
STABLE
AS $$
    SELECT
        p.product_id,
        p.product_name,
        COALESCE(SUM(oi.quantity * p.price), 0) AS revenue,
        COALESCE(SUM(oi.quantity), 0)           AS total_qty
    FROM analytics.products p
    LEFT JOIN analytics.order_items oi
        ON oi.product_id = p.product_id
    GROUP BY p.product_id, p.product_name
    ORDER BY revenue DESC, total_qty DESC
    LIMIT GREATEST(p_limit, 0);
$$;

\[\downarrow\]

SELECT *
FROM analytics.fn_top_products_by_revenue(10);

Example 3: Customers by City

Geographic aggregation is a recurring requirement in analytics.

Customer geography analysis supports:

  • Regional performance tracking
  • Geo-based segmentation
  • Spatial analytics pipelines

Example 3: Function Definition & Application

CREATE OR REPLACE FUNCTION analytics.fn_customers_by_city (
    p_city_id INT
)
RETURNS TABLE (
    customer_id INT,
    first_name  TEXT,
    last_name   TEXT,
    signup_date DATE
)
LANGUAGE sql
STABLE
AS $$
    SELECT
        c.customer_id,
        c.first_name,
        c.last_name,
        c.signup_date
    FROM analytics.customers c
    WHERE c.city_id = p_city_id;
$$;

\[\downarrow\]

SELECT *
FROM analytics.fn_customers_by_city(1);

Example 4: Order Items with Product Context

Analytical queries often require denormalized line-item datasets.

Line-item enrichment is used in:

  • Order audits
  • Revenue reconciliation
  • BI semantic layers

Example 4: Function Definition & Application

CREATE OR REPLACE FUNCTION analytics.fn_order_items_detailed (
    p_order_id INT
)
RETURNS TABLE (
    product_id   INT,
    product_name TEXT,
    category     TEXT,
    quantity     INT,
    unit_price   NUMERIC,
    line_total   NUMERIC
)
LANGUAGE sql
STABLE
AS $$
    SELECT
        p.product_id,
        p.product_name,
        p.category,
        oi.quantity,
        p.price AS unit_price,
        oi.quantity * p.price AS line_total
    FROM analytics.order_items oi
    JOIN analytics.products p
        ON p.product_id = oi.product_id
    WHERE oi.order_id = p_order_id;
$$;

\[\downarrow\]

SELECT *
FROM analytics.fn_order_items_detailed(10);

Summary | Table-Valued UDFs

  • TVFs represent datasets, not single values
  • They centralize joins and aggregations
  • They simplify downstream SQL and BI layers
  • TVFs are natural building blocks for views and stored procedures

Stored Procedures

Stored Procedures 1/2

Unlike functions, stored procedures represent executable workflows.
They are designed to orchestrate logic, modify data, and control execution flow.

Stored Procedures 2/2

A stored procedure represents a controlled sequence of operations executed as a unit.

Given an input vector \(X\), a procedure performs actions \(A\) and may optionally produce outputs \(O\):

\[ P(X) \rightarrow A \; [\rightarrow O] \]

The defining characteristics of a stored procedure are:

  • Can modify data (INSERT, UPDATE, DELETE)
  • Can control execution flow (IF, LOOP, RAISE)
  • Can manage transactions
  • May or may not return results

Procedure as an Execution Unit

stateDiagram-v2
direction LR
    [*] --> Input
    Input --> Validation
    Validation --> BusinessLogic
    BusinessLogic --> DataMutation
    DataMutation --> Completion
    Completion --> [*]

Stored Procedure Execution States

stateDiagram-v2
direction LR
    [*] --> Call
    Call --> ParameterBinding
    ParameterBinding --> Execution
    Execution --> TransactionControl
    TransactionControl --> End
    End --> [*]

Logical Position of Stored Procedures

Stored procedures exist outside the logical SQL query evaluation pipeline.

flowchart LR
    Client --> SP[Stored Procedure]
    SP --> SQL[SQL Statements]
    SQL --> Tables

They are invoked, not embedded.

Why SQL Introduces Stored Procedures

Stored procedures exist to:

  • Encapsulate multi-step business workflows
  • Enforce data integrity rules
  • Centralize write logic
  • Reduce application-side complexity

General Syntax of a Stored Procedure

CREATE OR REPLACE PROCEDURE procedure_name (
    parameter_name data_type,
    ...
)
LANGUAGE plpgsql
AS $$
BEGIN
    -- procedural logic
END;
$$;

convention

  • Stored procedure naming starts with sp_
  • Procedures are invoked using CALL, not SELECT
  • Procedures may perform commits and rollbacks
  • Prefer procedures for write-heavy logic

Analytical Perspective on Procedure Design

A well-designed analytical stored procedure:

  • Performs one clear business operation
  • Uses UDFs for calculations
  • Validates inputs explicitly
  • Produces deterministic side effects

Example 1: Upsert Product

Maintaining product catalogs often requires insert-or-update logic.

Product upsert is frequently used in:

  • ETL pipelines
  • Data synchronization
  • Master data management

Example 1: Procedure Definition

CREATE OR REPLACE PROCEDURE analytics.sp_upsert_product (
    p_product_id   INT,
    p_product_name TEXT,
    p_category     TEXT,
    p_price        NUMERIC
)
LANGUAGE plpgsql
AS $$
BEGIN
    IF p_product_id IS NULL THEN
        RAISE EXCEPTION 'product_id cannot be NULL';
    END IF;

    INSERT INTO analytics.products (product_id, product_name, category, price)
    VALUES (p_product_id, p_product_name, p_category, p_price)
    ON CONFLICT (product_id) DO UPDATE
        SET product_name = EXCLUDED.product_name,
            category     = EXCLUDED.category,
            price        = EXCLUDED.price;
END;
$$;

\[\downarrow\]

CALL analytics.sp_upsert_product(101, 'USB Cable', 'Accessories', 9.99);
SELECT *
FROM analytics.products
WHERE product_id = 101;

Example 2: Update Order Status

Order lifecycle management requires controlled state transitions.

Order status updates are used in:

  • Fulfillment workflows
  • Exception handling
  • Operational monitoring

Example 2: Procedure Definition

CREATE OR REPLACE PROCEDURE analytics.sp_update_order_status (
    p_order_id INT,
    p_status   TEXT
)
LANGUAGE plpgsql
AS $$
BEGIN
    UPDATE analytics.orders
    SET status = p_status
    WHERE order_id = p_order_id;

    IF NOT FOUND THEN
        RAISE EXCEPTION 'Order % not found', p_order_id;
    END IF;
END;
$$;

\[\downarrow\]

CALL analytics.sp_update_order_status(10, 'shipped');
SELECT *
FROM analytics.orders
WHERE order_id = 10;

Example 3: Delete Customer with Safety Check

Destructive operations must be explicit and guarded.

Customer deletion is used in:

  • GDPR compliance
  • Data cleanup
  • Account lifecycle management

Example 3: Procedure Definition

CREATE OR REPLACE PROCEDURE analytics.sp_delete_customer (
    p_customer_id INT
)
LANGUAGE plpgsql
AS $$
BEGIN
    IF EXISTS (
        SELECT 1
        FROM analytics.orders
        WHERE customer_id = p_customer_id
    ) THEN
        RAISE EXCEPTION 'Cannot delete customer % with existing orders', p_customer_id;
    END IF;

    DELETE FROM analytics.customers
    WHERE customer_id = p_customer_id;
END;
$$;

\[\downarrow\]

CALL analytics.sp_delete_customer(19);

Example 4: Refresh Derived Order Totals

Procedures are ideal for batch recalculations.

Derived metric refresh is used in:

  • Nightly batch jobs
  • KPI recalculation
  • Data warehouse maintenance

Example 4: Procedure Definition

CREATE OR REPLACE PROCEDURE analytics.sp_refresh_order_totals ()
LANGUAGE plpgsql
AS $$
BEGIN
    UPDATE analytics.orders o
    SET status = 'recalculated'
    WHERE EXISTS (
        SELECT 1
        FROM analytics.order_items oi
        WHERE oi.order_id = o.order_id
    );
END;
$$;

\[\downarrow\]

CALL analytics.sp_refresh_order_totals();

\[\downarrow\]

SELECT 
    *
FROM analytics.orders
LIMIT 10;

Functions vs Stored Procedures

Aspect User Defined Function Stored Procedure
Primary Role Calculation Workflow execution
Data Modification Not allowed Allowed
Transaction Control Not allowed Allowed
Invocation SELECT CALL
Return Value Required Optional
Use in Queries Yes No
Typical Usage Metrics, enrichment ETL, updates, orchestration

Important

  • UDFs handle what to compute
  • Stored Procedures handle what to do
  • UDFs feed procedures, not the opposite

Views in Practice

Analytical Views in Practice

Views provide a logical abstraction layer over base tables.
They simplify analytical queries by exposing ready-to-use datasets with consistent business semantics.

Views allow analysts to:

  • Hide join complexity
  • Standardize business logic
  • Improve query readability
  • Reuse analytical datasets safely

What Is a View?

A view is a stored SQL query that behaves like a virtual table.

It does not store data itself — it stores logic.

\[ V \rightarrow \text{SELECT logic over base tables} \]

Why Views Matter in Analytics

In analytics, views are often used as:

  • Semantic layers
  • Reporting tables
  • BI-friendly abstractions
  • Stable inputs for dashboards and notebooks

General Syntax of a View

CREATE OR REPLACE VIEW view_name AS
SELECT
    column_list
FROM base_tables
WHERE conditions;

Analytical Perspective on View Design

A well-designed analytical view:

  • Represents a clear business concept
  • Is read-only by design
  • Encapsulates joins and filters
  • Has a stable schema
  • Is safe for BI and reporting tools

Example 1: Customer Dimension View

Customer dimensions are frequently reused across analytics.

Customer dimension views support:

  • Segmentation analysis
  • KPI reporting
  • Downstream joins in BI tools

View Definition

CREATE OR REPLACE VIEW analytics.v_customers_dim AS
SELECT
    c.customer_id,
    c.first_name,
    c.last_name,
    c.age,
    c.email,
    c.signup_date,
    ci.city_name,
    r.region_name,
    co.country_name
FROM analytics.customers c
LEFT JOIN analytics.cities ci
    ON c.city_id = ci.city_id
LEFT JOIN analytics.regions r
    ON ci.region_id = r.region_id
LEFT JOIN analytics.countries co
    ON r.country_id = co.country_id;

Example 2: Orders Fact View

Fact views expose transactional data enriched with dimensions.

Order fact views are used in:

  • Revenue analysis
  • Operational reporting
  • Funnel analysis

View Definition

CREATE OR REPLACE VIEW analytics.v_orders_fact AS
SELECT
    o.order_id,
    o.order_date,
    o.status,
    o.customer_id,
    SUM(oi.quantity * p.price) AS order_total,
    SUM(oi.quantity)           AS total_items
FROM analytics.orders o
JOIN analytics.order_items oi
    ON o.order_id = oi.order_id
JOIN analytics.products p
    ON p.product_id = oi.product_id
GROUP BY
    o.order_id,
    o.order_date,
    o.status,
    o.customer_id;

Example 3: Product Performance View

Product performance views centralize revenue logic.

They support:

  • Product benchmarking
  • Revenue concentration analysis
  • Portfolio reviews

View Definition

CREATE OR REPLACE VIEW analytics.v_product_performance AS
SELECT
    p.product_id,
    p.product_name,
    p.category,
    SUM(oi.quantity)                   AS total_qty,
    SUM(oi.quantity * p.price)         AS total_revenue
FROM analytics.products p
LEFT JOIN analytics.order_items oi
    ON p.product_id = oi.product_id
GROUP BY
    p.product_id,
    p.product_name,
    p.category;

Example 4: Active Orders View

Filtering logic is commonly centralized in views.

Active-order views are used in:

  • Operational dashboards
  • Monitoring pipelines
  • Support tooling

View Definition

CREATE OR REPLACE VIEW analytics.v_active_orders AS
SELECT
    order_id,
    order_date,
    status,
    customer_id
FROM analytics.orders
WHERE status NOT IN ('cancelled', 'returned');

Views vs Functions vs Stored Procedures

Aspect View User Defined Function Stored Procedure
Primary Role Dataset abstraction Calculation Workflow execution
Returns Table (rows) Scalar or table Optional
Data Modification Not allowed Not allowed Allowed
Invocation SELECT SELECT CALL
Use in Joins Yes TVF only No
Transaction Control No No Yes
Typical Usage BI, reporting Metrics, enrichment ETL, orchestration

Summary: Views

  • Views expose clean analytical datasets
  • They centralize joins and filters
  • They improve SQL readability
  • They form the semantic layer for BI tools
  • Views sit between raw tables and analytics consumers

Views vs Functions vs Stored Procedures

Aspect View Function Stored Procedure
Purpose Query abstraction Calculation Workflow execution
Data Storage No No No
Accept Parameters No Yes Yes
Modify Data No No Yes
Use in SELECT Yes Yes No
Control Flow No No Yes
Typical Use BI layer, semantics Metrics, logic ETL, updates

Try Youself

Try to:

combining Views + UDFs + Stored Procedures into analytical pipelines.