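```python
import plotly.express as px

# df is assumed to be a customer DataFrame with the columns used below.
fig = px.scatter(
    data_frame=df,
    x="age",
    y="income",
    color="segment",
    size="monthly_spend",
    facet_col="region",
    trendline="ols",  # fits an OLS line per trace; requires the statsmodels package
    title="Relationship Between Customer Age and Income",
)
fig.show()
```

Grammar of Graphics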
The Grammar of Graphics is a systematic way to build visualizations by combining several independent components.
Instead of thinking only in chart names such as bar chart, scatter plot, or line chart, the Grammar of Graphics helps us understand the structure behind a visualization.
A chart is created by combining:
- data
- variables
- visual mappings
- geometric objects
- statistical transformations
- scales
- coordinate systems
- facets
- themes
In simple words:
\[ \text{Chart} = \text{Data} + \text{Mapping} + \text{Geometry} + \text{Layers} \]
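One way to make this formula concrete is to write a chart down as plain data before touching any plotting library. The sketch below is only illustrative: `ChartSpec` and its field names are hypothetical and do not belong to Plotly or any other library.

```python
from dataclasses import dataclass, field


@dataclass
class ChartSpec:
    """Hypothetical container with one field per term of the formula."""
    data: str                                     # name of the dataset
    mapping: dict = field(default_factory=dict)   # visual property -> variable
    geometry: str = "point"                       # point, bar, line, ...
    layers: list = field(default_factory=list)    # extra layers, e.g. a trend line

    def describe(self) -> str:
        """Render the spec in the 'Data + Mapping + Geometry + Layers' form."""
        parts = [self.data]
        parts += [f"{prop}({var})" for prop, var in self.mapping.items()]
        parts.append(f"geom_{self.geometry}()")
        parts += self.layers
        return " + ".join(parts)


spec = ChartSpec(
    data="customers",
    mapping={"x": "age", "y": "income", "color": "segment"},
    geometry="point",
)
description = spec.describe()
print(description)  # customers + x(age) + y(income) + color(segment) + geom_point()
```

Changing a single field, for example the geometry or one entry in the mapping, describes a different chart without changing anything else.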
Why Do We Need Grammar of Graphics?
When students start learning data visualization, they often memorize chart types:
- bar chart
- line chart
- scatter plot
- histogram
- box plot
However, this is not enough.
A better approach is to understand how data becomes a visual object.
For example, when we create a scatter plot, we are not simply choosing a chart type. We are making several decisions:
- which variable goes to the x-axis
- which variable goes to the y-axis
- whether color should represent a category
- whether size should represent a numeric value
- whether we need a trend line
- whether we need separate charts for different groups
This is the main idea behind the Grammar of Graphics.
Main Components of Grammar of Graphics
| Component | Meaning | Example |
|---|---|---|
| Data | The dataset used for visualization | customers, sales, transactions |
| Aesthetics | Mapping variables to visual properties | x-axis, y-axis, color, size |
| Geometry | The visual object used in the chart | points, bars, lines |
| Statistics | Data transformation before plotting | count, mean, regression line |
| Scales | How data values are shown visually | axis scale, color scale |
| Coordinates | The coordinate system | Cartesian, polar |
| Facets | Splitting one chart into smaller charts | one chart per region |
| Theme | Non-data design elements | title, font, gridlines |
Example: Scatter Plot
Suppose we have a customer dataset with the following variables:
| Variable | Description |
|---|---|
| age | Customer age |
| income | Customer income |
| segment | Customer segment |
| monthly_spend | Monthly spending |
| region | Customer region |
We want to visualize the relationship between age and income.
Using the Grammar of Graphics, we can describe the chart as follows:
| Grammar Component | Choice |
|---|---|
| Data | customer dataset |
| x-axis | age |
| y-axis | income |
| color | customer segment |
| size | monthly spend |
| facet | region |
| geometry | points |
So the chart can be described as:
\[ \text{Scatter Plot} = \text{Customers Data} + x(\text{age}) + y(\text{income}) + \text{color}(\text{segment}) + \text{size}(\text{monthly spend}) + \text{facet}(\text{region}) + \text{geom\_point}() \]
Plotly Example
How Plotly Arguments Match Grammar of Graphics
| Grammar Concept | Plotly Argument |
|---|---|
| Data | data_frame=df |
| x aesthetic | x="age" |
| y aesthetic | y="income" |
| color aesthetic | color="segment" |
| size aesthetic | size="monthly_spend" |
| facet | facet_col="region" |
| statistical layer | trendline="ols" |
| geometry | px.scatter (points) |
Example: Bar Chart
Suppose we want to compare total sales by region.
In Grammar of Graphics terms:
| Component | Choice |
|---|---|
| Data | sales dataset |
| x-axis | region |
| y-axis | total sales |
| geometry | bars |
| statistic | sum |
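The statistic in this table is applied before plotting. A minimal sketch with pandas, assuming hypothetical transaction-level data with `region` and `amount` columns:

```python
import pandas as pd

# Hypothetical transaction-level data; the column names are assumptions.
sales = pd.DataFrame({
    "region": ["north", "south", "north", "east", "south"],
    "amount": [120.0, 80.0, 200.0, 50.0, 70.0],
})

# Statistic: sum of amount per region, giving one row per bar.
sales_by_region = (
    sales.groupby("region", as_index=False)["amount"]
    .sum()
    .rename(columns={"amount": "total_sales"})
)
print(sales_by_region)
```

The result has one row per region (east 50, north 320, south 150), which is the shape a bar geometry expects. An alternative is to let Plotly aggregate, for example with `px.histogram(sales, x="region", y="amount", histfunc="sum")`.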
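```python
import plotly.express as px

# sales_by_region is assumed to hold one row per region
# with a pre-computed total_sales column.
fig = px.bar(
    data_frame=sales_by_region,
    x="region",
    y="total_sales",
    title="Total Sales by Region",
)
fig.show()
```

This chart can be described as: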
\[ \text{Bar Chart} = \text{Sales Data} + x(\text{region}) + y(\text{total sales}) + \text{geom\_bar}() \]
Example: Line Chart
Suppose we want to show how sales changed over time.
| Component | Choice |
|---|---|
| Data | sales dataset |
| x-axis | date |
| y-axis | sales |
| geometry | line |
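A line geometry connects points in row order, so the data should hold one row per date, sorted by date. The sketch below prepares such a table from hypothetical order records; the `timestamp` and `amount` column names are assumptions.

```python
import pandas as pd

# Hypothetical order records, deliberately out of order.
orders = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2024-01-02 09:15",
        "2024-01-01 14:30",
        "2024-01-02 16:45",
        "2024-01-03 11:00",
    ]),
    "amount": [40.0, 25.0, 60.0, 30.0],
})

daily_sales = (
    orders.assign(date=orders["timestamp"].dt.normalize())  # drop the time of day
    .groupby("date", as_index=False)["amount"]
    .sum()                                                   # one row per date
    .rename(columns={"amount": "sales"})
    .sort_values("date")                                     # lines follow row order
)
print(daily_sales)
```

The two orders on 2 January are combined into a single point, so the line has exactly one value per day.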
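```python
import plotly.express as px

# daily_sales is assumed to hold one row per date, sorted by date.
fig = px.line(
    data_frame=daily_sales,
    x="date",
    y="sales",
    title="Sales Trend Over Time",
)
fig.show()
```

The chart can be described as: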
\[ \text{Line Chart} = \text{Sales Data} + x(\text{date}) + y(\text{sales}) + \text{geom\_line}() \]
Choosing the Right Geometry
Different analytical questions require different geometries.
| Analytical Question | Recommended Geometry | Common Chart |
|---|---|---|
| Compare categories | Bars | Bar chart |
| Show trend over time | Lines | Line chart |
| Show relationship | Points | Scatter plot |
| Show distribution | Bars or boxes | Histogram, box plot |
| Show composition | Stacked bars or areas | Stacked bar, area chart |
| Compare groups | Facets or color | Small multiples, grouped charts |
Key Teaching Point
The most important lesson is:
A visualization is not just a chart type.
A visualization is a structured mapping between data and visual elements.
Students should not only ask:
Which chart should I use?
They should also ask:
Which variables should be mapped to which visual features?
Practical Rule
Before creating a chart, answer three questions:
- What data do I have?
- Which variables are important?
- Which visual encoding communicates the message best?
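The first question can be partly answered by inspecting column types, since the type of a variable limits which visual encodings make sense. A minimal sketch with pandas on a made-up dataset:

```python
import pandas as pd

# Made-up customer data, for illustration only.
customers = pd.DataFrame({
    "age": [25, 34, 45],
    "income": [32000.0, 48000.0, 61000.0],
    "segment": ["basic", "premium", "basic"],
    "region": ["north", "south", "north"],
})

# What data do I have? Split the columns by type.
numeric_cols = customers.select_dtypes(include="number").columns.tolist()
categorical_cols = customers.select_dtypes(exclude="number").columns.tolist()

# Numeric columns are natural candidates for x, y, or size;
# categorical columns for color or facets.
print(numeric_cols)      # ['age', 'income']
print(categorical_cols)  # ['segment', 'region']
```

This only narrows the options. Which variables matter, and which encoding communicates the message best, still depend on the question being asked.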
Summary
The Grammar of Graphics gives us a structured way to think about visualization.
It helps us understand that every chart is built from several components:
- data
- aesthetics
- geometry
- statistics
- scales
- coordinates
- facets
- theme
This approach is useful because it makes data visualization more analytical and less mechanical.
Instead of memorizing chart types, we learn how to build charts logically.