
Probability Distributions
2026-04-01
In Data Analytics, we rarely know outcomes with certainty.
Instead of saying: “Tomorrow’s sales will be exactly 120 units”
We say: “Sales will most likely be around 120, but could reasonably vary”
A probability distribution answers:
When we observe data repeatedly:
- Customer purchases
- Session durations
- Delivery times
Patterns emerge.
A distribution summarizes these patterns as a model, not a table.
A random variable is a numerical description of an uncertain outcome: it assigns a number to each possible result.
Examples:
Most business variables are measured, not counted. Examples:
Remember
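For a continuous random variable \(X\), any single value has probability zero:
\[ P(X = x) = 0 \]
Only intervals can have positive probability, given by the area under the density \(f\):
\[ P(a \le X \le b) = \int_a^b f(x)\,dx \]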
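To make the point-versus-interval distinction concrete, the sketch below uses Python's standard-library `statistics.NormalDist` to compute an interval probability as a difference of cumulative probabilities. The normal model and its parameters (mean 170, standard deviation 10) are illustrative assumptions, not values from these notes, and `interval_probability` is a hypothetical helper name.

```python
from statistics import NormalDist

# Hypothetical continuous variable: assumed normal with mean 170 and sd 10.
X = NormalDist(mu=170, sigma=10)

def interval_probability(dist, a, b):
    """P(a <= X <= b) computed as F(b) - F(a), where F is the CDF."""
    return dist.cdf(b) - dist.cdf(a)

# A single point is an interval of zero width, so its probability is 0.
p_point = interval_probability(X, 170, 170)

# A genuine interval (here, within one sd of the mean) has positive probability.
p_interval = interval_probability(X, 160, 180)

print(p_point)
print(round(p_interval, 4))
```

Widening the interval increases the probability, while shrinking it toward a single point drives the probability to zero.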
Assume population height:
We observe people one by one and build a distribution.
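One way to picture this process is a small simulation, sketched here in Python. Heights are drawn from an assumed normal model (mean 170 cm, standard deviation 10 cm, both hypothetical) and tallied into 5 cm bins. As observations accumulate, the relative frequencies settle into a stable shape.

```python
import random
from collections import Counter

random.seed(42)  # fixed seed so the run is repeatable

BIN_WIDTH = 5  # cm, an arbitrary choice for illustration

def bin_start(height, width=BIN_WIDTH):
    """Lower edge of the bin that contains `height`."""
    return int(height // width) * width

counts = Counter()
n_people = 10_000
for _ in range(n_people):
    # Assumed model: heights ~ Normal(170, 10). These parameters are hypothetical.
    height = random.gauss(170, 10)
    counts[bin_start(height)] += 1

# Relative frequency per bin: an empirical approximation of the distribution.
rel_freq = {b: c / n_people for b, c in sorted(counts.items())}

# Text histogram: one '#' per percentage point.
for b, f in rel_freq.items():
    print(f"{b}-{b + BIN_WIDTH} cm: {'#' * round(f * 100)}")
```

With only a handful of observations the histogram looks ragged. Rerunning with a larger `n_people` shows the bell shape emerging more clearly.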


Expected Value (Mean), where \(f(x)\) is the probability density function of \(X\):
\[ E[X] = \int_{-\infty}^{\infty} x f(x)\,dx \]
Variance (Spread), where \(\mu = E[X]\):
\[ Var(X) = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx \]
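Both integrals can be approximated numerically. The sketch below applies a simple midpoint rule to an illustrative density, \(f(x) = 2x\) on \([0, 1]\). This density is an assumption chosen because its exact answers are easy to check (\(E[X] = 2/3\), \(Var(X) = 1/18\)); `integrate` is a hypothetical helper, not a library function.

```python
def integrate(g, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

# Illustrative density (an assumption, not from the notes): f(x) = 2x on [0, 1].
def f(x):
    return 2 * x

lo, hi = 0.0, 1.0  # f is zero outside this range, so integrating here suffices

total = integrate(f, lo, hi)                                     # a density integrates to 1
mean = integrate(lambda x: x * f(x), lo, hi)                     # E[X]
variance = integrate(lambda x: (x - mean) ** 2 * f(x), lo, hi)   # Var(X)

print(total, mean, variance)
```

The same three lines work for any density, provided the integration range covers the region where \(f(x)\) is not negligible.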
A Normal Distribution is:
\[ X \sim \mathcal{N}(\mu, \sigma^2) \]
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \]
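The density formula translates almost term by term into code. The sketch below is a minimal Python version, cross-checked against the standard library's `statistics.NormalDist`. The parameters (\(\mu = 120\), \(\sigma = 15\)) are illustrative assumptions, and `normal_pdf` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x, written term by term from the formula."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)

# Illustrative parameters (assumptions): mu = 120, sigma = 15.
mu, sigma = 120.0, 15.0

peak = normal_pdf(mu, mu, sigma)            # the density is highest at x = mu
left = normal_pdf(mu - sigma, mu, sigma)    # one sigma below the mean
right = normal_pdf(mu + sigma, mu, sigma)   # one sigma above: same by symmetry

# Cross-check against the standard library.
reference = NormalDist(mu, sigma).pdf(mu)
print(peak, left, right, reference)
```

Note that these values are densities, not probabilities. A probability only appears once the density is integrated over an interval.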

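The standard normal distribution is the special case where:
\[ \mu = 0,\quad \sigma = 1 \]
Any normal variable \(X \sim \mathcal{N}(\mu, \sigma^2)\) can be converted to it by standardizing:
\[ Z = \frac{X-\mu}{\sigma} \]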
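As a small worked example of standardization, the Python sketch below converts a value to a z-score and confirms that the probability read from the standard normal matches the one computed directly. The mean of 120 echoes the sales example earlier; the standard deviation of 15 and the observed value of 150 are assumptions made for illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def z_score(x, mu, sigma):
    """Standardize x: how many standard deviations it lies from the mean."""
    return (x - mu) / sigma

# Illustrative numbers (assumptions): sales ~ N(120, 15^2), observed value 150.
mu, sigma, x = 120.0, 15.0, 150.0

z = z_score(x, mu, sigma)                 # (150 - 120) / 15
p_via_z = STANDARD_NORMAL.cdf(z)          # P(Z <= z)
p_direct = NormalDist(mu, sigma).cdf(x)   # P(X <= x), the same probability

print(z, round(p_via_z, 4), round(p_direct, 4))
```

Standardizing lets values from differently scaled normal variables be compared on one common scale.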
Note
Normal and Gaussian distributions are the same thing.
The uniform distribution assigns equal density to every value between \(a\) and \(b\).
\[ X \sim \text{Uniform}(a,b) \]
\[ f(x) = \frac{1}{b-a}, \quad a \le x \le b \]
\[ E[X] = \frac{a+b}{2} \]
\[ Var(X) = \frac{(b-a)^2}{12} \]
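The mean and variance formulas can be checked against simulated data. The Python sketch below assumes illustrative bounds of \(a = 2\) and \(b = 8\) (for instance, a delivery time in days; these numbers are hypothetical) and compares the theoretical values with sample estimates.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Illustrative bounds (assumptions): a = 2, b = 8.
a, b = 2.0, 8.0

theoretical_mean = (a + b) / 2          # (2 + 8) / 2
theoretical_var = (b - a) ** 2 / 12     # 36 / 12

samples = [random.uniform(a, b) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / len(samples)

# Because the density is flat, an interval's probability is its share of the range.
c, d = 3.0, 5.0
p_interval = (d - c) / (b - a)
share_in_interval = sum(c <= s <= d for s in samples) / len(samples)

print(theoretical_mean, round(sample_mean, 3))
print(theoretical_var, round(sample_var, 3))
print(round(p_interval, 3), round(share_in_interval, 3))
```

The sample estimates will not match the formulas exactly, but they should move closer as the number of samples grows.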

The exponential distribution models the waiting time until an event, where \(\lambda > 0\) is the event rate:
\[ X \sim \text{Exp}(\lambda) \]
\[ f(x) = \lambda e^{-\lambda x}, \quad x \ge 0 \]
\[ E[X] = \frac{1}{\lambda} \]
\[ Var(X) = \frac{1}{\lambda^2} \]
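Integrating the density from 0 to \(x\) gives the cumulative probability \(P(X \le x) = 1 - e^{-\lambda x}\), which answers questions such as "how likely is a wait longer than 5 minutes?". The Python sketch below assumes an illustrative rate of \(\lambda = 0.5\) events per minute; the rate and the helper names `exp_pdf` and `exp_cdf` are hypothetical.

```python
import math

# Illustrative rate (an assumption): lam = 0.5 events per minute.
lam = 0.5

def exp_pdf(x, lam):
    """Density lam * exp(-lam * x) for x >= 0, and 0 for x < 0."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def exp_cdf(x, lam):
    """P(X <= x) = 1 - exp(-lam * x), from integrating the density over [0, x]."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

mean_wait = 1 / lam        # E[X] = 1 / lam
var_wait = 1 / lam ** 2    # Var(X) = 1 / lam^2

p_within_mean = exp_cdf(mean_wait, lam)   # P(X <= E[X]) = 1 - e^{-1}
p_longer_than_5 = 1 - exp_cdf(5.0, lam)   # P(X > 5) = e^{-2.5}

print(mean_wait, var_wait)
print(round(p_within_mean, 4), round(p_longer_than_5, 4))
```

More than half of all waits are shorter than the mean, because the distribution is right-skewed: a few long waits pull the average up.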

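Spreadsheet formulas (Excel syntax; replace μ, σ, a, b, λ, and x with values or cell references):
Normal: =NORM.INV(RAND(),μ,σ) draws a random value; =NORM.DIST(x,μ,σ,TRUE) returns the cumulative probability P(X ≤ x)
Uniform: =RAND()*(b-a)+a draws a random value between a and b
Exponential: =-LN(1-RAND())/λ draws a random waiting time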
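For readers working outside a spreadsheet, the formulas above have close counterparts in Python's standard library. The sketch below mirrors each one; the function names are hypothetical helpers, and the parameters in the demo are illustrative assumptions rather than values from these notes.

```python
import math
import random
from statistics import NormalDist

random.seed(1)  # fixed seed so the run is repeatable

def normal_sample(mu, sigma):
    """Counterpart of =NORM.INV(RAND(),mu,sigma)."""
    u = random.random()
    while u == 0.0:          # inv_cdf requires 0 < p < 1
        u = random.random()
    return NormalDist(mu, sigma).inv_cdf(u)

def normal_cdf(x, mu, sigma):
    """Counterpart of =NORM.DIST(x,mu,sigma,TRUE): P(X <= x)."""
    return NormalDist(mu, sigma).cdf(x)

def uniform_sample(a, b):
    """Counterpart of =RAND()*(b-a)+a."""
    return random.random() * (b - a) + a

def exponential_sample(lam):
    """Counterpart of =-LN(1-RAND())/lam."""
    return -math.log(1.0 - random.random()) / lam

# Illustrative parameters (assumptions, not from the notes).
n = 20_000
normal_draws = [normal_sample(120, 15) for _ in range(n)]
uniform_draws = [uniform_sample(2, 8) for _ in range(n)]
exp_draws = [exponential_sample(0.5) for _ in range(n)]

print(sum(normal_draws) / n)    # sample mean, close to mu = 120
print(sum(uniform_draws) / n)   # sample mean, close to (a + b) / 2 = 5
print(sum(exp_draws) / n)       # sample mean, close to 1 / lam = 2
```

All three samplers use the same idea: feed a uniform random number between 0 and 1 through the inverse of the cumulative distribution function. In everyday use, `random.gauss`, `random.uniform`, and `random.expovariate` do the same job directly.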