Quantitative data represent measurable quantities that can be used in arithmetic operations.
| Subtype | Definition | Examples |
|---|---|---|
| Discrete | Countable numbers, often integers | Number of customers, complaints per day |
| Continuous | Measured values within a range | Temperature, age, revenue, weight |
Additional Data Types
| Type | Definition | Example Applications |
|---|---|---|
| Binary | Two possible outcomes (Yes/No, 0/1) | Subscription status, churn indicator |
| Time-Series | Observations recorded sequentially over time | Daily sales, hourly temperature |
| Textual / Unstructured | Words, sentences, or documents | Customer reviews, tweets |
| Spatial / Geographical | Location-based information | Store coordinates, delivery zones |
Flowchart
%%{init: {"theme": "default", "logLevel": "fatal"}}%%
graph TD
A["Data Types"] --> B["Qualitative (Categorical)"]
A --> C["Quantitative (Numerical)"]
B --> D["Nominal"]
B --> E["Ordinal"]
C --> F["Discrete"]
C --> G["Continuous"]
A --> H["Other Types"]
H --> I["Binary"]
H --> J["Time-Series"]
H --> K["Textual / Unstructured"]
H --> L["Spatial / Geographical"]
Measures of Central Tendency
Once we understand our data types, the next step is to summarize them with simple, representative numbers.
These are called measures of central tendency, and they tell us where the “center” of our data lies.
The three most common measures are:
Mean — the average value
Median — the middle value
Mode — the most frequent value
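As a minimal sketch, all three measures can be computed with Python's standard `statistics` module. The sample `complaints_per_day` is invented purely for illustration and is not taken from any real dataset.

```python
import statistics

# Hypothetical sample: complaints received per day over one week.
complaints_per_day = [2, 3, 3, 5, 7, 3, 12]

mean_value = statistics.mean(complaints_per_day)      # sum of values / number of values
median_value = statistics.median(complaints_per_day)  # middle value of the sorted data
mode_value = statistics.mode(complaints_per_day)      # most frequent value

print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
```

In this sample the mean is larger than the median and the mode because the single high value (12) pulls the average upward.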
Mean
The mean is the sum of all values divided by the number of observations.
When to Use:
- A convenient measure that is widely understood.
- Works well with continuous or discrete numerical data.
- Sensitive to outliers (extreme values can distort the result).
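The sensitivity to outliers can be illustrated with a short sketch. The revenue figures below are hypothetical, and the median is shown only as a point of comparison.

```python
import statistics

# Hypothetical daily revenue figures (in dollars).
revenue = [100, 110, 95, 105, 90]

# The same data with one extreme value appended.
revenue_with_outlier = revenue + [1000]

mean_before = statistics.mean(revenue)
mean_after = statistics.mean(revenue_with_outlier)
median_after = statistics.median(revenue_with_outlier)

print("mean without outlier:", mean_before)
print("mean with outlier:", mean_after)      # shifted strongly by a single value
print("median with outlier:", median_after)  # stays close to the typical values
```

A single extreme observation moves the mean far from the bulk of the data, while the median stays near the typical values.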
Variability is not inherently good or bad—its value depends on what you are trying to achieve.
Why \(n-1\)?
A degree of freedom (df) is an independent piece of information that can vary freely.
Whenever we estimate a parameter from the sample, we introduce a constraint.
Each constraint removes one degree of freedom.
The universal rule is:
\[
df = n - \text{(number of estimated parameters)}.
\]
Estimating the mean from the sample imposes exactly one constraint (the deviations from the sample mean must sum to zero), so we remove one degree of freedom:
\[
df = n - 1
\]
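The constraint can be checked numerically. The sketch below assumes that \(n-1\) is being used as the divisor of the sample variance. The `sample` values are hypothetical.

```python
import statistics

# Hypothetical sample of five measurements.
sample = [4, 8, 6, 5, 7]
n = len(sample)
sample_mean = statistics.mean(sample)

# Deviations from the sample mean always sum to zero. This is the single
# constraint: once n - 1 deviations are known, the last one is determined.
deviations = [x - sample_mean for x in sample]
deviation_sum = sum(deviations)

sum_of_squares = sum(d ** 2 for d in deviations)
variance_n_minus_1 = sum_of_squares / (n - 1)  # divides by the degrees of freedom
variance_n = sum_of_squares / n                # divides by the sample size

print("sum of deviations:", deviation_sum)
print("variance with n - 1:", variance_n_minus_1)
print("variance with n:", variance_n)
```

Dividing by \(n-1\) matches what `statistics.variance` computes, while dividing by \(n\) matches `statistics.pvariance`.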
Coefficient of Variation (Normalization)
The coefficient of variation (CV) is a measure of relative variability.
It allows us to compare the variability of two datasets even when their units or scales differ.
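A minimal sketch of such a comparison, assuming the common definition of CV as the sample standard deviation divided by the mean. The helper `coefficient_of_variation` and both datasets are hypothetical and exist only for illustration.

```python
import statistics

def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean (unitless)."""
    return statistics.stdev(values) / statistics.mean(values)

# Hypothetical datasets measured in different units.
heights_cm = [160, 170, 180]
weights_kg = [50, 70, 90]

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)

print("CV of heights:", round(cv_heights, 3))
print("CV of weights:", round(cv_weights, 3))
```

Because the CV is a ratio, the units cancel, so the two values can be compared directly even though one dataset is in centimetres and the other in kilograms. The ratio is only meaningful when the mean is not close to zero.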