flowchart LR A[Customer ID] --> B[Find First Invoice Date] B --> C[Assign Cohort Month] C --> D[Calculate Months Since First Purchase] D --> E[Measure Retention or Sales] E --> F[Build Cohort Heatmap]
Tableau Session 03: Advanced Analytics
Learning Goals
- Use advanced Tableau functions for analytical modeling
- Build complex calculated fields
- Apply table calculations (running totals, percent of total, ranking, differences, moving averages)
- Work with advanced date logic and date parameters
- Build cohort and retention analysis views
- Perform spatial analysis and spatial joins
- Connect and visualize geographic data
- Apply Tableau data modeling concepts from Session 2 in more complex use cases
- Build advanced KPI dashboards and analytical heatmaps
In previous sessions, we focused on building visualizations, working with filters and parameters, and creating calculated fields. We also introduced key concepts such as relationships, joins, and Level of Detail expressions, which form the foundation of analytical work in Tableau.
In this session, we move from building charts to building analytical logic inside Tableau. Instead of focusing only on how data is displayed, we focus on how calculations are performed, how different components interact, and how to ensure that results remain accurate across different analytical scenarios.
This shift allows us to move from simple dashboards to more advanced analytical systems, where calculations, date logic, and data modeling work together to answer more complex business questions.
Advanced Table Calculations
In real analytical scenarios, calculations rarely rely on a single function. Instead, they combine multiple layers of logic, including aggregation, table calculations, conditional expressions, and sometimes date logic. These are referred to as complex calculations.
Complex calculations are used when simple aggregations such as SUM or AVG are not sufficient to answer analytical questions. They allow analysts to model behavior such as growth rates, comparisons across time, cumulative metrics, and conditional ranking.
A key characteristic of complex calculations is that they often combine multiple computation stages, where:
- Some parts are calculated at the row or aggregate level
- Other parts are computed as table calculations after aggregation
Understanding this layered computation is essential for building correct analytical models.
Combining Aggregation with Table Calculations
One of the most common patterns in complex calculations is combining aggregation with table calculations.
For example, calculating the percentage contribution of each category over time:
RUNNING_SUM(SUM([Order Total])) / WINDOW_SUM(SUM([Order Total]))This calculation combines:
- Aggregation:
SUM([Order Total])
- Running accumulation:
RUNNING_SUM
- Total window comparison:
WINDOW_SUM
This type of calculation is often used to understand cumulative contribution.
Nested Table Calculations
Tableau allows nesting of table calculations, where one table calculation is used inside another.
For example, calculating percent difference between consecutive running totals:
RUNNING_SUM(SUM([Order Total])) - LOOKUP(RUNNING_SUM(SUM([Order Total])), -1)This calculation combines:
- A running total
- A lookup to the previous value
Nested calculations like this are powerful but require careful configuration of partitioning and addressing.
Conditional Table Calculations
Complex calculations often include conditional logic to control behavior dynamically.
For example, showing only positive growth:
IF SUM([Order Total]) - LOOKUP(SUM([Order Total]), -1) > 0 THEN
SUM([Order Total]) - LOOKUP(SUM([Order Total]), -1)
ENDThis type of logic allows analysts to highlight specific patterns such as growth periods.
Window Functions
Window functions perform calculations across a defined range of data within a partition.
These are essential for complex analytics.
Moving Average
WINDOW_AVG(SUM([Order Total]), -3, 0)Calculates average over the last 4 periods (offset -3 to 0 means the current period plus 3 previous periods = 4 total).
Window Sum
WINDOW_SUM(SUM([Order Total]))Calculates total within the partition.
Window Max
WINDOW_MAX(SUM([Order Total]))Finds the maximum value within a window.
Difference and Percent Difference
Complex calculations often involve comparing values across time or categories.
Difference
SUM([Order Total]) - LOOKUP(SUM([Order Total]), -1)Percent Difference
(SUM([Order Total]) - LOOKUP(SUM([Order Total]), -1))
/ LOOKUP(SUM([Order Total]), -1)These calculations are used for:
- Growth analysis
- Trend comparison
- Performance tracking
Multi-Level Calculations
Complex calculations often operate across multiple levels of detail within the same view.
For example:
- Category-level aggregation
- Within-region ranking
- Across-time accumulation
This requires careful control of partitioning and addressing to ensure that calculations are applied correctly at each level.
Interaction with View Structure
One of the most important aspects of complex table calculations is that they are highly dependent on the structure of the view.
Changes such as:
- Adding a dimension
- Changing sort order
- Modifying layout
can significantly alter the result.
Because of this, it is important to always validate:
- Partitioning fields
- Addressing fields
- Compute Using configuration
Table Calculations
When you add a table calculation, you must account for all dimensions in the level of detail.
Each dimension must be used either for:
- Partitioning (scoping), or
- Addressing (direction)
Partitioning Fields (Scope)
Partitioning fields define how the data is grouped before the table calculation is applied.
- They break the view into multiple sub-tables (partitions)
- The table calculation is performed separately within each partition
- They determine the scope of the calculation
In other words, partitioning controls where the calculation resets.
Addressing Fields (Direction)
Addressing fields define how the calculation moves within each partition.
- They determine the direction of the calculation
- They control the sequence of marks used in calculations such as:
- Running totals
- Differences between values
- Percent change
- Running totals
In short, addressing controls how the calculation progresses.
How Partitioning and Addressing Work Together
- Partitioning fields split the view into multiple sub-views (sub-tables).
- The table calculation is applied independently inside each partition.
- The addressing fields determine the direction in which the calculation moves through the marks within each partition.
For example:
- In a running total, addressing determines the order of accumulation.
- In a difference calculation, addressing determines which value is compared to which.


Common Table Calculations
Table calculations are widely used in analytical dashboards to understand trends, comparisons, and rankings.
Running Total
Running totals accumulate values across a defined direction.
RUNNING_SUM(SUM([Order Total]))This is commonly used to track cumulative metrics such as revenue growth over time.
Percent of Total
Percent of total calculates each value’s contribution relative to the total.
SUM([Order Total]) / TOTAL(SUM([Order Total]))This is useful for understanding share distribution across categories or regions.
Rank
Ranking assigns a position to values based on a selected measure.
RANK(SUM([Order Total]))This is often used for identifying top or bottom performers.
Specific Dimensions vs Compute Using
When using Compute Using options, Tableau automatically assigns some dimensions as:
- Partitioning fields
- Addressing fields
However, when selecting Specific Dimensions, you must manually decide:
- Which dimensions define the partition (scope)
- Which dimensions define the addressing (direction)


In the Specific Dimensions section of the Table Calculation dialog:
- The order of fields determines the direction of movement through the marks:
Pane (Down)
- Checked dimensions define how Tableau computes the table calculation across the view:
Year of Order Date,Region
A helpful mental model:
- Partitioning = Where does the calculation reset?
- Addressing = In what order does it move?
Understanding this distinction is essential for correctly configuring running totals, rankings, percent differences, and other table calculations.
More Examples of Table Calculations
Using Table Calculations on Marks card fields
When you place a table calculation on the Marks card Color, the table will be colored based on the measure on the view SUM(Revenue), while the table calculation on Color SUM(Total Quantity) will determine how those values are visually encoded.

Year-over-year growth is another common use case. If you place SUM(Revenue) on the view and apply a table calculation for year-over-year growth, Tableau will compute the change in revenue across periods based on the addressing direction.


Using Table Calculations on Rows or Columns fields
When you place a table calculation on the Rows or Columns shelf, Tableau modifies the structure of the visualization.
For example, adding a moving average to SUM(Revenue) introduces a smoothed trend line that helps identify patterns over time.

Using Table Calculations on Filters
When you place a table calculation on the Filters shelf, Tableau filters the results after aggregation.
For example, applying a ranking calculation allows you to filter the view to show only the Top N values based on a measure such as Revenue.
This is commonly used in scenarios where users want to focus on the highest-performing categories, customers, or regions.
Date Functions
Dates are a common element in most data sources.
If a field contains recognizable dates, Tableau automatically assigns it a Date or Date & Time data type and enables special functionality.
When date fields are used in visualizations, Tableau provides:
- Automatic date hierarchy (Year → Quarter → Month → Day)
- Date-specific filtering options
- Continuous and discrete date options
- Specialized date formatting
Date functions are used to manipulate date values, not just change how they are displayed.
If you only want to change appearance (for example, show 01/09/24 instead of September 01, 2024), use formatting — not a calculation.


Date Parts
Many date functions use a date_part argument.
Common date parts include:

Core Date Functions
DATE
Date converts a string to a date. It can also be used to truncate a datetime to a date.
DATE(string) DATEADD
DateAdd adds a specified time interval to a date.
DATEADD(date_part, interval, date)Example: Calculate Expected Delivery Date
Objective: Add 5 days to Order Date to estimate delivery.
DATEADD('day', 5, [Order Date])
Example: Rolling 12-Months Window
Objective: Filter records from the last 12 months dynamically.
[Order Date] >= DATEADD('month', -12, TODAY())
DATEDIFF
Datediff calculates the difference between two dates in specified date parts.
DATEDIFF(date_part, start_date, end_date, [start_of_week])Optional parameter:
[start_of_week] defines week beginning (Sunday, Monday, etc.)
Example: Time to First Order
Objective: Measure when customers place their first order after signing up.
First order date per customer:
{ FIXED [Customer ID] : MIN([Order Date]) }Months between signup and first order:
DATEDIFF('month', [Signup Date], [First Order Date])
DATENAME
DateName returns the name of a date part as a string. For example, if you want to extract the month from a date, DATENAME will return the name of the month (e.g., “January”, “February”, etc.).
DATENAME(date_part, date)DATEPART
Datepart returns the integer value of a date part. For example, if you want to extract the month from a date, DATEPART will return a number between 1 and 12, while DATENAME would return the name of the month (e.g., “January”, “February”, etc.).
DATEPART(date_part, date)DATEPART is typically faster than DATENAME.
Example: Extract Year for Grouping
Objective: Build yearly trend charts, create custom year filters
DATEPART('year', [Order Date])When used in the view, make the calculated field discrete dimension to group by year.

DATEPARSE
Converts a specifically formatted string into a date.
DATEPARSE(format, string)Use case: When DATE cannot recognize custom format
DATETRUNC
Truncates a date to a specified level.
DATETRUNC(date_part, date, [start_of_week])DATETRUNC changes the actual value, not just the display.
Example: Order Date = 28-12-2023 15:45:30
DATETRUNC('month', [Order Date])
→ 01-12-2023 00:00:00
DATETRUNC('year', [Order Date]) → 01-01-2023 00:00:00
Notice that:
- The date is not just displayed differently
- The underlying value is changed
If you only want to hide the time portion (for example, remove hours/minutes visually), you should format the field instead of using DATETRUNC.
Formatting affects appearance.
DATETRUNC affects the data itself.
What Is [start_of_week]?
This optional parameter defines which day is considered the first day of the week. The further calculations will be based on this definition of a week.
Example: Calculate week based on ISO standard (week starts on Monday) and week starting on Sunday.
DATETRUNC('week', [Order Date], 'sunday')
The first calculated field [Order Week] will calculated week based on ISO standard, which will group orders by week starting on Monday, while the second one [Order Week From Sunday] will group orders by week starting on Sunday.
DAY / MONTH / QUARTER / YEAR / WEEK
These functions extract specific parts of a date as integers.
DAY(Order Date) 
TODAY
Returns the current system date (without time).
TODAY()NOW
Returns the current system date and time.
NOW()MAKEDATE
Constructs a date from numeric year, month, and day.
MAKEDATE(year, month, day)Example: Create Date for 01.01.2026 as [Reporting Date]
MAKEDATE(2026, 1, 1)
MAKEDATETIME
Combines date and time into datetime.
MAKEDATETIME(date, time)MAKETIME
Constructs time using hour, minute, second.
MAKETIME(hour, minute, second)Output is datetime (Tableau does not support standalone time type).
ISDATE
Checks whether a string is a valid date.
ISDATE(string)Use case is data validation and cleaning messy datasets.
MAX and MIN (with Dates)
Most recent date:
MAX(date)Earliest date:
MIN(date) The Date Literal (#)
Date values enclosed in # symbols are interpreted as date literals.
Example: #3/25/2025#
This is a special syntax that allows you to directly input date values in calculations without using functions like DATE().
When you use #3/25/2025#, Tableau recognizes it as a date literal and treats it as a date value in calculations.
Without #, Tableau may interpret the value as:
- String
- Number
- Invalid format

Date Parameters
Date parameters are user-defined controls that allow viewers to dynamically select dates and influence calculations, filters, and visual behavior within a Tableau workbook.
Unlike quick filters, which are directly tied to a specific field in a specific data source, parameters are independent objects. This independence makes them extremely powerful in advanced analytical scenarios, especially when working with multiple data sources, custom logic, or dynamic aggregations.
Conceptual Understanding
A parameter in Tableau is a single value that can be referenced inside one or more calculated fields. When that parameter is of data type Date (or Date & Time), it becomes a flexible time control that can drive time-based logic across worksheets.
Date parameters do not filter data automatically. Instead, they act as inputs to calculations. This means you must explicitly reference them inside a calculated field to control behavior.
For example:
- They can define the start and end of a reporting period.
- They can determine which dates should be included in a KPI.
- They can dynamically adjust the level of time aggregation on an axis.
- They can drive rolling windows or forecast periods.
Because parameters are global to the workbook, one Date parameter can influence multiple worksheets simultaneously.
How Date Parameters Differ from Filters
A standard date filter:
- Is tied to a single data source.
- Automatically filters the data.
- Cannot easily control multiple data sources.
A Date parameter:
- Is independent of any single data source.
- Must be referenced in a calculation.
- Can control multiple data sources.
- Can drive complex logic beyond simple filtering.
This distinction is important. Filters limit data directly. Parameters influence calculations that then determine what to display.
Custom N Date Part Selection
This type of date parameter is flexible because it allows users to simultaneously choose:
- The date part (day, week, month, quarter, year)
- The date range (N periods)
Instead of creating separate filters for:
- Last 30 Days
- Last 3 Months
- Last 1 Year
the user can dynamically control both:
- The unit of time
- The number of periods
Step 1
Create a parameter named: Date Part
Configuration:
- Data Type → String
- Allowable Values → List
- Add values:
- Day
- Month
- Year
Right-click the parameter → Select Show Parameter
This parameter controls the time unit.

Step 2
Create another parameter named: N Values
Configuration:
- Data Type → Integer
- Allowable Values → Range
- Minimum → 1
- Maximum → 30
- Step Size → 1
Right-click → Select Show Parameter
This parameter controls how many periods to include.

Step 3
Create an Anchor Date (Recommended)
We will anchor the calculation to the latest available Order Date in the dataset.
Create a calculated field: Latest Order Date
{ FIXED : MAX([Order Date]) }We are doing this way beacause
TODAY()→ depends on the system date
NOW()→ depends on server timezone
{ FIXED : MAX([Order Date]) }→ finds the most recent order in the dataset, removes any dimension filtering from the view
Step 4
Create a calculated field named: p. Date Part
The p. prefix is a naming convention used throughout this course to indicate that a calculated field is parameter-driven — its value depends on a parameter rather than raw data. This makes it easy to identify parameter-linked fields in the Data pane at a glance.
IF [Date Part] = 'day' then ([Order Date]) > DATEADD('day', -[N Values], [Latest Order Date])
ELSEIF [Date Part] = 'month' THEN ([Order Date]) > DATEADD('month', -[N Values], [Latest Order Date])
ELSE ([Order Date]) > DATEADD('year', -[N Values], [Latest Order Date])
ENDStep 5
- Drag p. Date Part to filters shelf
- Select True
Step 6
- Drag to the column shelf a dimension: Region
- Drag
COUNT([Order ID])to the Rows shelf
Step 7
Change the parameter values to observe the changes in dimension aggregation.

Dynamic KPI Calculations
Date parameters can define a reporting window inside calculations.
For example, instead of filtering out data outside the selected range, you can write a calculation that returns Sales only if the Order Date falls between the selected parameters.
This allows you to:
- Compare full dataset vs selected period.
- Build dynamic period-to-period comparisons.
- Create flexible dashboard controls.
This will be further explored in the KPI section.
Dynamic Time Aggregation
One advanced use of Date Parameters is dynamically controlling the level of date aggregation.
For example, the date axis can automatically aggregate by:
- Day
- Week
- Month
- Quarter
- Year
depending on the selected parameter value.
This is achieved using DATETRUNC, which adjusts the aggregation level dynamically.
By using this approach, the visualization adapts automatically to the selected time granularity, improving readability and analytical clarity.
Step 1
Create a parameter named Date Granularity.
Parameter configuration:
- Data Type → String
- Allowable values → List
- Values → day, week, month, quarter, year
Right-click the parameter and select Show Parameter.

Step 2
Create a calculated field named p. Date Granularity with the following calculation:
DATE(
CASE [Date Granularity]
WHEN 'day' THEN [Order Date]
WHEN 'week' THEN DATETRUNC('week', [Order Date])
WHEN 'month' THEN DATETRUNC('month', [Order Date])
WHEN 'quarter' THEN DATETRUNC('quarter', [Order Date])
ELSE DATETRUNC('year', [Order Date])
END
)Step 3
- Drag p. Date Granularity to the Columns shelf
- Right-click the pill
- Select Exact Date
- Change it to Discrete
- Drag
COUNT([Order ID])to the Rows shelf
Selecting Exact Date ensures Tableau uses the full date value returned by the calculation.
Without this, Tableau may automatically aggregate the date into a higher-level hierarchy (for example, Year or Month), which would override the dynamic logic we created.
Changing the field to Discrete creates distinct headers for each date value instead of a continuous timeline.
This is important because:
- Each truncated date (day/week/month/quarter/year) should appear as a separate category.
- It prevents Tableau from interpolating values across a continuous axis.
- It ensures the view respects the exact granularity selected in the parameter.
Step 4
Change the parameter value to observe how the granularity changes dynamically.
When you switch between:
- day
- week
- month
- quarter
- year
the axis updates automatically to reflect the selected time level.

Multi-Source Dashboards
When working with multiple data sources, each source typically has its own date field.
If you use quick filters, you would need:
- One date filter per data source
This leads to duplicated controls on the dashboard and a less clean user experience.
Using Date parameters instead allows you to create:
- A single Date Part parameter
- A single Start Date parameter
- A single End Date parameter
Each data source then contains its own calculated filter that references the same parameters.
As a result:
- All worksheets respond to the same date controls
- The dashboard remains clean and professional
- The user interacts with one unified time control
This creates a seamless and consistent experience across multi-source dashboards.
Spatial Analytics (spatial relationships, spatial joins, spatial functions)
Spatial analytics focuses on analyzing data that contains a geographic component. Unlike traditional visualizations, spatial analysis allows us to understand how data behaves in relation to location, distance, and geographic structures. It enables answering questions such as where events occur, how locations interact, how movement happens across space, and how metrics vary across regions.
Maps are powerful because they allow patterns to be understood visually. They are especially useful when geography is not just a visual element, but a key part of the analysis.
Geographic Data Formats
Geographic data can exist in multiple formats, and understanding these formats is essential for correct analysis.
- Spatial files such as Shapefile, GeoJSON, KML
- Flat files such as Excel or CSV
- Databases with spatial or location-based fields
Spatial files store both:
- Geometry → the shape (point, line, polygon)
- Attributes → descriptive data
When a spatial file is connected, Tableau automatically creates a Geometry field, which can be directly used for mapping.
Location-based data (like City, Country, Latitude, Longitude) does not contain shapes. Tableau uses geocoding to convert these values into map coordinates.
Dataset for Spatial Analysis
In this session, we use:
- Citi Bike dataset (trip-level data with coordinates)
- NYC District GeoJSON file (polygon-level data)
The Citi Bike dataset contains:
- Start and End coordinates
- Time information
- Station details
Since the dataset is split across multiple CSV files, we combine them using Union.
The GeoJSON dataset contains:
- District boundaries (MultiPolygon)
- Geographic shapes for mapping
Spatial Relationships
Spatial relationships are used when datasets do not share a common key but still need to be analyzed together.
Steps to Create Spatial Relationship
- Load the Citi Bike dataset
- Combine all CSV files using Union
- Add the GeoJSON file to the data model
- Create a calculated field in both datasets:
'New York'- Use this field to create a relationship between the datasets
This approach allows Tableau to connect the datasets logically without physically joining them, preserving flexibility and avoiding duplication.

Now we have a working relationship between two tables.

Spatial Joins
Spatial joins combine datasets based on geographic relationships rather than keys.
The most common spatial join is:
- INTERSECTS → checks if two geometries overlap
Steps to Create Spatial Join
- In the data source tab, switch to the physical layer
- Add the Citi Bike dataset
- Create calculated fields:
MAKEPOINT([Start Lat], [Start Lng])MAKEPOINT([End Lat], [End Lng])- Drag the GeoJSON dataset next to the Citi Bike dataset
- Choose Left Join
- Set join condition:
- Spatial field (point) INTERSECTS polygon geometry
This will match each trip to a district based on location.

Troubleshooting Spatial Joins
Common issue:
- Geometry is incompatible with geography
Solution:
- Convert geometry to geography
- Use coordinate system EPSG:4326
- Ensure coordinates follow Longitude, Latitude order
Spatial Functions
Spatial functions enable advanced geographic calculations directly within Tableau. They allow you to create, transform, and analyze spatial objects such as points, lines, and polygons. These functions are especially important when working with coordinate-based datasets, as they convert raw latitude and longitude values into map-ready objects and allow analytical operations such as distance measurement, movement analysis, and spatial comparison.
Spatial functions are commonly used together with mapping and spatial joins. They help bridge the gap between raw data and geographic insight, making it possible to answer questions about proximity, interaction, and movement.
Common Use Cases
- Converting latitude and longitude into spatial points
- Visualizing movement between two locations
- Measuring distance between locations
- Creating service or coverage areas
- Identifying overlapping or interacting regions
Examples
Creating a spatial point from coordinates:
MAKEPOINT([Latitude], [Longitude])Visualizing movement between two locations:
MAKELINE(
MAKEPOINT([Start Lat], [Start Lng]),
MAKEPOINT([End Lat], [End Lng])
)Calculating distance between two points:
DISTANCE(
MAKEPOINT([Start Lat], [Start Lng]),
MAKEPOINT([End Lat], [End Lng]),
'km'
)Spatial Functions Reference
| Function | Description | Typical Use Case |
|---|---|---|
| MAKEPOINT | Converts latitude and longitude columns into a spatial point. | Enabling spatial joins for coordinate-based datasets. |
| MAKELINE | Creates a line between two spatial points. | Origin–destination maps, mobility analysis, route visualization. |
| DISTANCE | Calculates the distance between two spatial points using specified units. | Nearest branch analysis, trip distance calculation, proximity analysis. |
| AREA | Returns the total surface area of a spatial polygon. | Territory size comparison, land coverage analysis. |
| LENGTH | Returns the total geodetic length of a linestring geometry. | Route length measurement, infrastructure analysis. |
| BUFFER | Creates a radius around a point, line, or polygon. | Service coverage zones, delivery radius modeling, proximity analysis. |
| INTERSECTS | Returns True or False indicating whether two geometries overlap. | Spatial joins, containment checks. |
| INTERSECTION | Returns the overlapping portion between two geometries. | Market overlap analysis, shared service area evaluation. |
| DIFFERENCE | Subtracts the overlapping area of one polygon from another. | Identifying uncovered or restricted areas. |
| SYMDIFFERENCE | Removes overlapping portions from both geometries and returns the remaining parts. | Territory comparison and competitive analysis. |
| OUTLINE | Converts polygon geometry into boundary lines. | Styling borders separately from polygon fill. |
| SHAPETYPE | Returns the geometry structure as text (Point, Polygon, etc.). | Debugging spatial data issues. |
| VALIDATE | Confirms whether spatial geometry is topologically correct. | Cleaning corrupted spatial files and preventing join errors. |
Mapping in Tableau (Map Layers, Map Styling & Configuration)
Mapping in Tableau is not only about placing marks on a geographic background.
It is a combination of:
- Data modeling
- Geocoding
- Aggregation logic
- Visual design
For a map to function correctly — analytically and visually — four foundational components must be configured properly:
- Data Type
- Data Role
- Geographic Role
- Geographic Hierarchy
If any of these elements are misconfigured, you may encounter:
- Unknown locations
- Incorrect aggregation
- Missing map rendering
- Broken drill-down behavior
- Spatial joins that do not work as expected
Geographical data configuration
Mapping accuracy begins with proper data configuration.
Data Type — Structural Foundation
The Data Type determines how Tableau stores and interprets the raw values.
This is the first layer of configuration.
Common Geographic Data Types
| Field Type | Required Data Type | Why |
|---|---|---|
| Latitude | Number (Decimal) | Must allow precise coordinate plotting |
| Longitude | Number (Decimal) | Must allow precise coordinate plotting |
| Country/State/City | String | Needed for geocoding |
| Postal Code | String | Preserves leading zeros |
| Geometry (GeoJSON/Shapefile) | Geometry | Native spatial object |
Incorrect data types can cause:
- Aggregation errors
- Loss of leading zeros (postal codes)
- Tableau not recognizing geographic information
- Failure in map rendering
Example:
If Postal Code is stored as Number: - 01234 becomes 1234 - Geocoding fails
Data Role — Analytical Behavior
The Data Role defines how Tableau treats the field in analysis.
Two primary roles:
- Dimension → categorical grouping
- Measure → numeric aggregation
Typical Configuration for Mapping
| Field | Data Role |
|---|---|
| Latitude | Measure |
| Longitude | Measure |
| Country | Dimension |
| State | Dimension |
| City | Dimension |
| Geometry | Measure |
If Latitude/Longitude are set as Dimensions: - Points may not render correctly
- Aggregation logic may break
Geographic Role — Geocoding Layer
The Geographic Role connects a field to Tableau’s geocoding engine.
This tells Tableau: > “This field represents a real-world geographic level.”
Common Geographic Roles
- Country/Region
- State/Province
- County
- City
- Postal Code
- Latitude
- Longitude
Once a geographic role is assigned:
- Tableau generates Latitude (generated)
- Tableau generates Longitude (generated)
These generated fields are automatically used for plotting.
How Tableau Geocoding Works
When using location names:
- Tableau references its internal geographic database
- Matches names to coordinates
- Places marks accordingly
If Tableau cannot match values, you will see:
- Unknown locations warning
To resolve:
- Click the warning icon
- Edit locations
- Specify country context
- Correct spelling inconsistencies
Geographic Hierarchy — Drill-Down Structure
A Hierarchy defines the logical order of geographic levels.
Example:
- Country
- State
- City
- Postal Code
- City
- State
Hierarchies allow:
- Drill-down navigation
- Controlled aggregation
- Structured geographic exploration
How to Create a Hierarchy
- Right-click a geographic field (e.g., Country)
- Select Hierarchy → Create Hierarchy
- Drag lower levels into it
Benefits
- Enables + / − drill controls
- Maintains geographic logic
- Improves dashboard interactivity
Mapping
Mapping with Raw Coordinates
If your dataset contains Latitude and Longitude:
Required Configuration
- Data Type → Number (Decimal)
- Data Role → Measure
- Geographic Role → Latitude / Longitude
Validation Rules
- Longitude range: -180 to 180
- Latitude range: -90 to 90
- Coordinates must be decimal degrees
Longitude always goes to:
- Columns (X-axis)
Latitude always goes to:
- Rows (Y-axis)
If properly configured:
- Tableau plots points automatically
- No internal geocoding is required
Mapping with Location Names
If your dataset contains names instead of coordinates:
Required Configuration
- Data Type → String
- Data Role → Dimension
- Geographic Role → Appropriate geographic level
Tableau converts names into coordinates using geocoding. In some cases, you may need to specify the country context to resolve ambiguities.
Mapping with Spatial Files (GeoJSON / Shapefile)
Spatial files contain embedded geometry objects.
When imported:
- Field Type → Geometry
- Data Role → Measure
Characteristics:
- Coordinates are embedded
- No geocoding required
- Supports polygon and line rendering
Enables:
- Choropleth maps
- Boundary overlays
- Spatial joins
- Spatial calculations
Map Styling and Layering
Map Styles
Once configuration is correct, map styling enhances interpretability.
Background Map Styles are
- Light
- Normal
- Streets
- Satellite
Choose style based on:
- Analytical clarity
- Contrast with marks
- Density visualization
Map Layers
Tableau allows multiple layers:
- Polygon layer (district boundaries)
- Point layer (stations)
- Line layer (routes)
Multi-layer maps enable:
- Territory + event visualization
- Origin–destination flows
- Hotspot analysis
Each layer can have:
- Independent mark type
- Independent color
- Independent size
Basic Map Creation
Now let’s create a simple map showing the distribution of the trips across New York city using the geometry field from the spatial file.
Step 1
As we do not have a column with states we will make a calculated field 'New York' and assign it as a geographic role with the level of State/Province. This will allow us to use the geometry field from the spatial file to plot the map of New York city.
Step 2
Now we can make a hierarchy by right clicking on the newly created field, [State] and choosing Hierarchy → Create Hierarchy and then dragging the [Boroname] field (NYC borough) to the hierarchy. This will allow us to drill down from the state level to the geometry level and see the different districts of New York city which are available in our spatial file.

Step 3
In order to count rides by district, we will drag the Ride ID field to the view and change the aggregation to Count. To show the distribution we can place the Count of Ride ID on the color mark and we will have a choropleth map showing the distribution of the rides across the different districts of New York city.
Step 4
To make districts more visible drag and drop [Boroname] field to the label mark and we will have the name of the district on the map.
Step 5
To make the map more readable, we can also change the background map style by right clicking on the map and choosing Background Layers and then choosing the style that we like. In this case, we will choose the Light style to make the districts more visible and add Background Map Layer by ticked prefferences such as Land Cover and Labels to make the map more informative.

Map with layers, polygons, points, and lines
Step 1
As we have already done the join between the spatial file and the tabular file, we can now build a map using the geometry field from the spatial file. To do that:
- First make sure that the fields with geographic roles are correctly assigned
- Double click on the geometry field and it will be added to the view
Tableau automatically:
- Adds Latitude (generated) to Rows
- Adds Longitude (generated) to Columns
- Places Geometry on the Marks card, Details
The result is a map of New York City with its districts.

Step 2
Now we can add the trips data to the map. To do that let’s create calculated fields for the start and end locations of the trips using MAKEPOINT() function as explained in the spatial joins section. Then we can add these calculated fields to the view to show the trip start and end locations on the map.
Step 3
Having the trip start and end locations we can now make a flow map to show the movement of the trips between the start and end locations. To do that we will use MAKELINE() function to create a line between the start and end points of each trip. Then we can add this line to the view to visualize the flow of trips across the city.

Step 4
In map visualizations, we can use different geometry types adding map layers to show different aspects of the data. For example, we can draw routes to the polygon layer by dragging the line geometry to the view and we can also show the start and end points of the trips by dragging the point geometry to the view. This allows us to create a multi-layered map that shows both the routes and the locations of trip starts and ends.

Proportional Symbol Map
Proportional symbol maps use sized marks (usually circles) to represent the magnitude of a measure at specific locations. The size of each mark is proportional to the value it represents.
Step 1
As in previous examples, we will start by creating a map using the geometry field [Boroname] from the spatial file to show the districts of New York city.
Step 2
Add [Starting point] to the view to show the start locations of the trips on the map.
Step 3
Now we can add the Count of Ride ID to the size mark to show the number of trips starting at each location. This will create a proportional symbol map where the size of each mark corresponds to the number of trips starting at that location.
Step 4
Add the Count of Ride ID to the color mark to show the distribution of the trips across the city. From color marks activate borders.This will allow us to quickly identify areas where there are more trips starting based on the color intensity of the marks on the map.
Step 5
Add [Starting Point ID] to the detail mark to show the individual starting points of the trips on the map.
Step 6
From the Marks card, we can also change the mark type to Circle and adjust the size and color to make the map more visually appealing and easier to interpret. This will allow us to quickly identify areas with high trip activity based on the size of the circles on the map.
Step 7
Add [Boroname] to the filter shelf for interactive filtering by district. This will allow users to select specific districts and see the corresponding trip data on the map, enabling more detailed analysis of trip patterns within different areas of New York City.
This analysis can help identify which districts have the highest demand for Citi Bike trips, and can inform decisions about where to add more bike stations or increase bike availability.

Density Map
Density maps use color intensity to represent the concentration of points in a given area. They are useful for visualizing patterns of activity across a geographic space.
Step 1
Bring [Make route] calculated field, [Start Station ID] and [End Station ID] to the view to show the routes of the trips on the map.
Step 2
Change the mark type to Density to create a density map that shows the concentration of trips across the city. The color intensity will indicate areas with higher or lower trip activity.
Step 3
Adjust map style to Streets to make the trips density patterns more visible on the street map background. This will allow us to better understand the spatial distribution of trips in relation to the city’s street layout.
Step 4
Make density color intensity and opacity adjustments to enhance the visibility of high-density areas. This will help us quickly identify hotspots of trip activity across New York City.
This type of analysis can be useful for understanding where the highest demand for Citi Bike trips is located, and can inform decisions about where to focus resources for bike station placement or maintenance.
Step 5
Untick Aggregate Measures on the top pane Analysis section to show the individual trip routes on the map. Density maps calculate intensity based on the number of marks in a geographic area, so if we want to see the actual routes of the trips, we need to turn off aggregation.

ON (default) → Measures are summarized (SUM, AVG, etc.).
Use for choropleth maps, KPIs by region, and territory comparison.OFF → Each row becomes an individual mark.
Use for point distribution maps, density maps, and event-level analysis.
Rule of thumb:
Use aggregation for regional summaries.
Turn it off for raw spatial events and clustering analysis.
Cohort Analysis in Tableau
Cohort Analysis is a technique used to analyze the behavior of groups of users that share a common characteristic over time.
Instead of analyzing all users together, cohort analysis groups users based on a shared starting event, allowing to observe how behavior changes across time for each group.
Common cohort grouping criteria include:
- First purchase date
- First login date
- First subscription date
- First product activation
This method is widely used in:
- Customer retention analysis
- Product usage analysis
- Marketing performance evaluation
- Telecom subscriber lifecycle analysis
For example:
- Customers who made their first purchase in January
- Customers who made their first purchase in February
Each of these groups becomes a cohort.
Cohort Analysis Concept
A cohort analysis typically consists of three components:
- Period
- Start Date
- Metric Being Measured
Example:
| Start Date | Period 0 | Period 1 | Period 2 | Period 3 |
|---|---|---|---|---|
| Jan 2024 | 100% | 70% | 55% | 40% |
| Feb 2024 | 100% | 75% | 60% | 45% |
| Mar 2024 | 100% | 72% | 58% | 44% |
Interpretation:
- Month 0 → when users first appeared
- Month 1 → one month after acquisition
- Month 2 → two months after acquisition
This structure allows us to analyze retention or activity decay over time.
Cohort Analysis Workflow in Tableau
Steps include:
- Identify the first activity date for each user
- Assign each user to a cohort group
- Calculate time elapsed since the cohort start
- Aggregate metrics by cohort and time period
Cohort Analysis Example
Let’s explore Cohort Analysis in Tableau using the Online Retail Dataset :contentReferenceoaicite:0
Cohort analysis becomes very practical with the Online Retail dataset because the table contains transactional data at the invoice line level.
From the dataset, we can see the following important fields:
- Customer ID identifies the customer
- Invoice Date identifies when the transaction happened
- Quantity and Unit Price can be used to calculate sales
- Multiple rows can belong to the same invoice because one invoice may contain several products
In cohort analysis, we usually group customers based on the date of their first purchase and then track their later activity.
In this dataset:
- Each row is a product line within an invoice
- One customer can have many invoices
- One invoice can contain many products
- Customers purchase at different times
This allows us to answer questions such as:
- In which month did the customer make the first purchase?
- Did the customer return in later months?
- Which cohort has better retention?
- Which cohort generates more revenue over time?
Cohort Definition
For this dataset, the cohort is defined as:
Customers grouped by the month of their first invoice date
So:
- Customers whose first order was in January 2011 belong to the January 2011 cohort
- Customers whose first order was in February 2011 belong to the February 2011 cohort
Cohort Analysis Logic
Step 1: Find the First Purchase Date per Customer
{ FIXED [Customer ID] : MIN([Invoice Date]) }Name: First Purchase Date
This calculation:
- Identifies the earliest purchase per customer
- Returns the same value for all rows of that customer
Step 2: Create the Cohort Month
DATETRUNC('month', [First Purchase Date])Name: Cohort Month
Step 3: Create Invoice Month
DATETRUNC('month', [Invoice Date])Name: Invoice Month
Step 4: Calculate Period
DATEDIFF('month', [Cohort Month], [Invoice Month])Name: Period
Step 5: Calculate Active Customers
COUNTD([Customer ID])Name: Active Customers
Step 6: Build the Heatmap
- Drag
MONTH([Cohort Month])→ Rows
- Drag
[Period]→ Columns
- Set both as Discrete Dimensions
- Set Marks type → Square
- Drag
COUNTD([Customer ID])→ Color
- Drag
COUNTD([Customer ID])→ Label
Retention Rate Calculation (Important)
Retention rate is calculated using a table calculation.
- Duplicate
COUNTD([Customer ID])on Label
- Apply Quick Table Calculation → Percent From
- Compute Using → Table (Across)
- Relative to → First
This means:
- Each value is divided by the value in Period 0
- Period 0 becomes 100% baseline
- All other periods show retention relative to cohort size

Retention Rate is a table calculation, not a basic aggregation.
It depends on the view layout and uses Percent From First logic.

Step 7: Clean the View
- Create field
[Retention Rate]from table calculation
- Drag to Filters
- Filter values >= 0.01%

Example Interpretation
If a cohort shows:
- Month 0 = 100%
- Month 1 = 38%
- Month 2 = 33%
This means:
- All users were active in first month
- 38% returned next month
- 33% remained active after two months
Additional Analysis
Cohort by Country
- Add Country to Filters
Questions:
- Which countries retain better?
- Which markets are stronger?
Product-Based Cohort
Group customers by first product category.
Questions:
- Which products drive retention?
- Which products lead to repeat purchases?
Revenue Cohort
Analyze revenue instead of user count.
Questions:
- Which cohort generates highest revenue?
- Do newer cohorts spend more?
Cleaning and Reshaping Data in Tableau
Before building visualizations or dashboards, data must often be cleaned and reshaped.
Real-world datasets rarely arrive in a perfectly structured format suitable for analysis.
Common issues include:
- Missing values
- Incorrect data types
- Duplicated records
- Inconsistent formatting
- Wide tables that need to be normalized
- Columns containing multiple values
Cleaning ensures data accuracy, while reshaping ensures data structure fits analytical needs.
flowchart LR A[Raw Data] --> B[Data Cleaning] B --> C[Data Reshaping/optional/] C --> D[Structured Analytical Dataset] D --> E[Visualization & Analysis]
Data Cleaning in Tableau
Data cleaning focuses on improving data quality before performing analysis.
Tableau allows several cleaning operations directly inside the Data Source page or through Calculated Fields.
Handling Missing Values (NULL Values)
Missing values are one of the most common problems in datasets.
NULL values may represent missing records, unavailable information, or incomplete data collection.
Common approaches to handle NULL values include:
- Replace NULL values with a default value
- Filter out rows containing NULL values
- Use calculated fields to define fallback values
Example calculation replacing NULL revenue with zero:
IFNULL([Revenue],0)Another common approach:
ZN([Revenue])ZN() converts NULL numeric values to zero, which is often useful in financial analysis.
Correcting Data Types
Each field in Tableau has a data type such as:
- String
- Number
- Date
- Boolean
- Geographic
Incorrect data types can lead to incorrect aggregations or calculation errors.
Common corrections include:
- Converting strings to numeric values
- Converting strings to date values
- Changing dimensions to measures
Example converting a field to a date:
DATE([Order Date])Example converting a string to a number:
INT([Customer Age])Ensuring the correct data type improves calculation accuracy and visualization behavior.
Removing Duplicate Records
Duplicate rows can distort metrics such as totals, averages, or counts.
Instead of counting all records, analysts often count distinct entities.
Example:
COUNTD([Customer ID])This ensures that each customer is counted only once.
Splitting Columns
Some datasets contain multiple pieces of information within a single column.
For proper analysis, it is often necessary to separate these values into multiple fields.
In the Online Retail dataset, the Description column contains long product names that may include multiple descriptive elements.
Example:
| Description |
|---|
| SET/4 BADGES CUTE CREATURES |
| SET/4 SKULL BADGES |
| FANCY FONT BIRTHDAY CARD |
Sometimes product descriptions may contain structured information such as category and product name within a single field.
Steps:
- Right-click the column
- Select Split or Custom Split
- Choose the delimiter (for example
/or a space)
After splitting, Tableau automatically creates new columns.
Example result:
| Product Type | Product Name |
|---|---|
| SET | 4 BADGES CUTE CREATURES |
| SET | 4 SKULL BADGES |
| FANCY | FONT BIRTHDAY CARD |
Splitting fields improves:
- filtering
- grouping
- hierarchical analysis
- product categorization
Creating Hierarchies
Hierarchies organize fields into structured levels that support drill-down analysis.
Hierarchies allow:
- Drill-down exploration
- Structured navigation in dashboards
- Aggregated views across levels
- Simplified analysis of large datasets
The Online Retail dataset naturally supports a hierarchy such as:
graph TD A[Country] --> B[Customer ID] B --> C[Invoice No]
Example structure:
| Country | Customer ID | Invoice No |
|---|---|---|
| United Kingdom | 17850 | 541696 |
| France | 12583 | 541697 |
This hierarchy allows users to:
- Analyze sales by Country
- Drill down into Customers
- Explore specific Invoices
Steps:
- Drag one dimension onto another in the Data Pane
- Tableau creates a hierarchy automatically
- Rename the hierarchy if needed
- Use the hierarchy in visualizations for drill-down analysis
Using Tableau Prep for Advanced Data Preparation
When datasets become more complex, Tableau Prep provides a visual workflow for preparing and reshaping data.
Tableau Prep allows:
- Joining multiple datasets
- Removing duplicates
- Standardizing values
- Pivoting and unpivoting columns
- Aggregating datasets
- Cleaning inconsistent values
flowchart LR A[Raw Sources] --> B[Tableau Prep Flow] B --> C[Cleaning Steps] C --> D[Reshaped Dataset] D --> E[Tableau Visualization]
Prep uses a visual flow interface, allowing analysts to inspect transformations step by step before publishing the final dataset.
When Data Cleaning Should Be Done Outside Tableau
Although Tableau supports many data preparation operations, some transformations are better performed upstream.
Typical tools include:
- SQL databases
- ETL pipelines
- Python data pipelines
- Data warehouses
Reasons include:
- Better performance for large datasets
- Centralized transformation logic
- Reusable data pipelines
- Improved data governance
In production environments, Tableau often connects to pre-cleaned analytical datasets.
Data Reshaping in Tableau
Data reshaping changes the structure of a dataset so that it fits analytical needs.
Many datasets are stored in wide format, while Tableau analysis works better with long format.
flowchart LR A[Wide Data Format] --> B[Pivot Operation] B --> C[Long/Tidy Data Format]
Pivoting Data in Tableau
Pivoting is a data reshaping technique used to convert columns into rows.
This transformation is particularly useful when datasets store similar measures in multiple columns instead of a single column.
Many analytical workflows and visualization tools work better with long (tidy) data structures, where:
- one column represents the type of measure
- another column represents the value of the measure
flowchart LR A[Gold Column] B[Silver Column] C[Bronze Column] A --> D[Pivot Operation] B --> D C --> D D --> E[Medal Type] D --> F[Medal Count]
Example Dataset: Olympic Medals
For this example we use a dataset containing Olympic medal counts by country and year.
The dataset contains the following columns:
CountryYearSeasonGoldSilverBronzeTotal Medals
Structure before pivoting:

In this structure:
- each medal type is stored in a separate column
- this is called a wide format dataset
However, comparing medal types dynamically in Tableau becomes easier when the dataset is reshaped.
Pivoting Medal Columns
To reshape the dataset, we pivot the following columns:
GoldSilverBronze
Steps in Tableau
Open the dataset in Tableau
Navigate to the Data Source page
Select the columns:
- Gold
- Silver
- Bronze
- Gold
Right-click the selected columns
Choose Pivot
Tableau automatically creates two new fields:
Pivot Field NamesPivot Field Values
Rename these fields:
| Default Name | New Name |
|---|---|
| Pivot Field Names | Medal Type |
| Pivot Field Values | Medal Count |
Dataset after pivoting:

Benefits of Pivoting
Pivoting provides several advantages for analysis and visualization:
- Enables comparison of multiple measures within one chart
- Simplifies dataset structure
- Allows filtering by measure type
- Supports dynamic dashboards and parameter-driven visualizations
Example visualization setup:
| Columns | Rows | Color |
|---|---|---|
| Year | SUM(Medal Count) | Medal Type |
This visualization shows Gold, Silver, and Bronze medals over time in a single chart.
Analytical Questions Enabled by Pivoting
After reshaping the dataset, it becomes easier to answer questions such as:
- How do Gold, Silver, and Bronze medals change over time?
- Which countries win the most gold medals?
- How do medal distributions differ between Summer and Winter Olympics?
- Which countries have the highest total medal counts across seasons?
Pivoting therefore helps transform datasets into a structure that is more flexible for exploration and analysis in Tableau.
Tableau Prep vs Tableau Desktop
Both Tableau Prep and Tableau Desktop support data preparation tasks, but they are designed for different stages of the analytics workflow.
Tableau Desktop is primarily used for data exploration and visualization, while Tableau Prep is designed for building reusable data preparation pipelines.
Summary
| Task Type | Tableau Desktop | Tableau Prep |
|---|---|---|
| Quick data fixes | ✓ | |
| Calculated fields | ✓ | |
| Simple pivots | ✓ | |
| Complex joins | ✓ | |
| Large dataset cleaning | ✓ | |
| Reusable data pipelines | ✓ |
In practice, analysts often use both tools together:
- Tableau Prep to prepare and structure the dataset
- Tableau Desktop to explore, analyze, and visualize the data
Best Practices for Data Cleaning and Reshaping
- Inspect datasets before building visualizations
- Verify data types and formats
- Handle NULL values explicitly
- Convert wide datasets into tidy structures when needed
- Document data transformations
Properly cleaned and reshaped data leads to more reliable analysis and better-performing dashboards.