Data Analytics Bootcamp
  • Syllabus
  • Statistical Thinking
  • SQL
  • Python
  • Tableau
  • Lab
  • Capstone
  1. Python
  2. Python
  3. Session 07: Working with Dates
  • Syllabus
  • Statistical Thinking
    • Statistics
      • Statistics Session 01: Data Layers and Bias in Data
      • Statistics Session 02: Data Types
      • Statistics Session 03: Probabilistic Distributions
      • Statistics Session 04: Probabilistic Distributions
      • Statistics Session 05: Sampling
      • Statistics Session 06: Inferential Statistics
      • Slides
        • Course Intro
        • Descriptive Stats
        • Data Types
        • Continuous Distributions
        • Discrete Distributions
        • Sampling
        • Hypothesis Testing
  • SQL
    • SQL
      • Session 01: Intro to Relational Databases
      • Session 02: Intro to PostgreSQL
      • Session 03: DA with SQL | Data Types & Constraints
      • Session 04: DA with SQL | Filtering
      • Session 05: DA with SQL | Numeric Functions
      • Session 06: DA with SQL | String Functions
      • Session 07: DA with SQL | Date Functions
      • Session 08: DA with SQL | JOINs
      • Session 09: DA with SQL | Advanced SQL
      • Session 10: DA with SQL | Advanced SQL Functions
      • Session 11: DA with SQL | UDFs, Stored Procedures
      • Session 12: DA with SQL | Advanced Aggregations
      • Session 13: DA with SQL | Final Project
      • Slides
        • Intro to Relational Databases
        • Intro to PostgreSQL
        • Basic Queries: DDL DLM
        • Filtering
        • Numeric Functions
        • String Functions
        • Date Functions
        • Normalization and JOINs
        • Temporary Tables
        • Advanced SQL Functions
        • Reporting and Analysis with SQL
        • Advanced Aggregations
  • Python
    • Python
      • Session 01: Programming for Data Analysts
      • Session 02: Python basic Syntax, Data Structures
      • Session 03: Introduction to Pandas
      • Session 04: Advanced Pandas
      • Session 05: Intro to Data Visualization
      • Session 06: Data Visualization
      • Session 07: Working with Dates
      • Session 08: Data Visualization | Plotly
      • Session 09: Customer Segmentation | RFM
      • Slides
        • Data Analyst
  • Tableau
    • Tableau
      • Tableau Session 01: Introduction to Tableau
      • Tableau Session 02: Intermediate Visual Analytics
      • Tableau Session 03: Advanced Analytics
      • Tableau Session 04: Dashboard Design & Performance
      • Slides
        • Data Analyst
        • Data Analyst
        • Data Analyst
        • Data Analyst

On this page

  • Working with Dates in Python
    • Why Date Handling Matters
  • datetime Module
    • Creating a Date Object
    • Creating a Datetime Object
    • Getting the Current Date and Time
    • Accessing Parts of a Datetime
    • Formatting Dates as Strings
    • Common Formatting Codes
    • Parsing Strings into Datetime
    • Arithmetic with Dates
    • Why datetime is Useful?
  • Dates in Pandas
    • Creating a Sample DataFrame
    • Checking Data Types
    • Converting to Datetime
    • Extracting Date Components with .dt
    • Year, Month, and Day
    • Month Name and Day Name
    • Quarter
    • Day of Week
    • Weekend Flag
    • Calculating Differences Between Dates
    • Filtering by Date Conditions
    • Sorting by Date
    • Creating Date Ranges
    • Common Frequency Codes
    • Begining and the End of Month
  • Homework
    • Practice with the datetime Module
    • Practice with Pandas Dates
  1. Python
  2. Python
  3. Session 07: Working with Dates

Session 07: Working with Dates

Pandas
Date Functions
Functions

Working with Dates in Python

Before building time-based visualizations, we need to become comfortable working with dates. In analytics, dates are everywhere: transaction dates, signup dates, campaign start dates, delivery timestamps, and monthly summaries.

Warning

If you do not understand how Python handles dates, you will struggle with:

  • filtering time periods
  • calculating intervals
  • grouping by month
  • building correct visualizations.

In Python, there are two closely related ways to work with dates:

  • the built-in datetime module
  • Pandas date functionality

The datetime module is part of standard Python and is useful for creating and manipulating individual date and time objects.

Pandas extends this idea to entire columns and time series. It is the main tool analysts use when dates are stored inside DataFrames.

Why Date Handling Matters

As we have found out during the SQL sessions, Dates are not just text. Even if a date looks like "2024-07-01", Python must understand that it represents a calendar value rather than an ordinary string.

If dates remain as plain text, many analytical tasks become difficult or incorrect: be calculated properly - resampling and aggregation over time will not work correctly

That is why one of the first tasks in real analytical workflows is converting date columns into proper datetime format.

  • sorting may behave like text sorting instead of time sorting
  • filtering by year or month becomes inconvenient
  • time differences cannot

datetime Module

The built-in datetime module is useful when working with individual date values.

from datetime import datetime, date, timedelta

Creating a Date Object

A date object stores only year, month, and day.

simple_date = date(2024, 7, 1)
simple_date
datetime.date(2024, 7, 1)
Position Value Meaning
1st 2024 year
2nd 7 month
3rd 1 day

Creating a Datetime Object

A datetime object stores both date and time.

full_datetime = datetime(2024, 7, 1, 14, 30, 0)
full_datetime
datetime.datetime(2024, 7, 1, 14, 30)
Position Value Meaning
1st 2024 year
2nd 7 month
3rd 1 day
4th 14 hour
5th 30 minute
6th 0 second

Getting the Current Date and Time

today = date.today()
now = datetime.now()

print(f'today is {today}')
print(f'right now is {now}')
today is 2026-04-02
right now is 2026-04-02 16:54:47.385112

Accessing Parts of a Datetime

A datetime object has useful attributes.

print(f"Year: {full_datetime.year}")
print(f"Month: {full_datetime.month}")
print(f"Day: {full_datetime.day}")
print(f"Hour: {full_datetime.hour}")
print(f"Minute: {full_datetime.minute}")
print(f"Second: {full_datetime.second}")
Year: 2024
Month: 7
Day: 1
Hour: 14
Minute: 30
Second: 0

Formatting Dates as Strings

Most of the times we want to display a date in a readable format.

full_datetime.strftime("%Y-%m-%d")
'2024-07-01'
full_datetime.strftime("%d/%m/%Y")
'01/07/2024'
full_datetime.strftime("%B %d, %Y")
'July 01, 2024'

Common Formatting Codes

Code Meaning Example
%Y 4-digit year 2024
%m 2-digit month 07
%d 2-digit day 01
%H hour (24-hour clock) 14
%M minute 30
%S second 00
%b short month name Jul
%B full month name July

Parsing Strings into Datetime

This is the reverse of formatting.

dt = datetime.strptime("2024-07-01", "%Y-%m-%d")
dt
datetime.datetime(2024, 7, 1, 0, 0)

This is useful when reading text dates and converting them into structured date objects.

Arithmetic with Dates

One of the most important date operations is measuring time differences.

start = date(2024, 7, 1)
end = date(2024, 7, 15)

delta = end - start
delta
datetime.timedelta(days=14)
delta.days
14

We can also add or subtract days using timedelta.

start + timedelta(days=10)
datetime.date(2024, 7, 11)
start - timedelta(days=7)
datetime.date(2024, 6, 24)

We can also add or subtract time using timedelta.

When we work with a date object, the most natural granularity is usually days or weeks.

print(start + timedelta(days=10))
print(start - timedelta(days=7))
print(start + timedelta(weeks=2))
print(start - timedelta(weeks=1))
2024-07-11
2024-06-24
2024-07-15
2024-06-24

Other granularity are available as well: such as hours, minutes, or seconds, we usually work with a datetime object instead of a plain date.

start_dt = datetime(2024, 7, 1, 14, 30, 0)

print(start_dt + timedelta(days=1))
print(start_dt + timedelta(hours=5))
print(start_dt + timedelta(minutes=45))
print(start_dt + timedelta(seconds=30))
print(start_dt + timedelta(days=2, hours=3, minutes=15, seconds=10))
2024-07-02 14:30:00
2024-07-01 19:30:00
2024-07-01 15:15:00
2024-07-01 14:30:30
2024-07-03 17:45:10

Here are the most common timedelta units:

Argument Meaning
days add or subtract days
weeks add or subtract weeks
hours add or subtract hours
minutes add or subtract minutes
seconds add or subtract seconds

So timedelta helps us move forward or backward in time with different levels of granularity.

Why datetime is Useful?

The built-in module is especially useful when:

  • creating single date values manually
  • measuring time differences
  • formatting dates for display
  • parsing custom date strings

However, once we start working with full columns of data, Pandas becomes much more practical.

Dates in Pandas

In analytics, dates usually appear inside DataFrames. As we know Pandas provides powerful tools to convert, extract, filter, and aggregate date values.

Creating a Sample DataFrame

import pandas as pd

df_dates = pd.DataFrame({
    "customer_id": [101, 102, 103, 104, 105],
    "signup_date": ["2024-01-15", "2024-02-20", "2024-03-05", "2024-03-18", "2024-04-10"],
    "last_purchase": ["2024-06-01", "2024-06-15", "2024-07-01", "2024-07-08", "2024-07-12"]
})

df_dates
customer_id signup_date last_purchase
0 101 2024-01-15 2024-06-01
1 102 2024-02-20 2024-06-15
2 103 2024-03-05 2024-07-01
3 104 2024-03-18 2024-07-08
4 105 2024-04-10 2024-07-12

At this point, the dates are still strings.

Checking Data Types

df_dates.dtypes
customer_id      int64
signup_date        str
last_purchase      str
dtype: object

Notice that the date columns are usually stored as object, which means text.

Note

Newer version of pandas, might use txt instead of object.

Converting to Datetime

We use pd.to_datetime().

df_dates["signup_date"] = pd.to_datetime(df_dates["signup_date"])
df_dates["last_purchase"] = pd.to_datetime(df_dates["last_purchase"])

df_dates.dtypes
customer_id               int64
signup_date      datetime64[us]
last_purchase    datetime64[us]
dtype: object

Now Pandas recognizes them as datetime columns.

Warning

When importing from the data csv or any other file type, it is recommended to explicitly mention the date type:

pd.read_file('filename.type', parse_dates = ['date_column_name'])

Once converted, we can:

  • extract year, month, day
  • calculate time differences
  • filter by time period
  • sort properly by calendar order
  • group by month, quarter, or year

Extracting Date Components with .dt

Pandas provides the .dt accessor for datetime columns.

Year, Month, and Day

df_dates["signup_year"] = df_dates["signup_date"].dt.year
df_dates["signup_month"] = df_dates["signup_date"].dt.month
df_dates["signup_day"] = df_dates["signup_date"].dt.day

df_dates
customer_id signup_date last_purchase signup_year signup_month signup_day
0 101 2024-01-15 2024-06-01 2024 1 15
1 102 2024-02-20 2024-06-15 2024 2 20
2 103 2024-03-05 2024-07-01 2024 3 5
3 104 2024-03-18 2024-07-08 2024 3 18
4 105 2024-04-10 2024-07-12 2024 4 10

Month Name and Day Name

df_dates["signup_month_name"] = df_dates["signup_date"].dt.month_name()
df_dates["signup_day_name"] = df_dates["signup_date"].dt.day_name()

df_dates[["signup_date", "signup_month_name", "signup_day_name"]]
signup_date signup_month_name signup_day_name
0 2024-01-15 January Monday
1 2024-02-20 February Tuesday
2 2024-03-05 March Tuesday
3 2024-03-18 March Monday
4 2024-04-10 April Wednesday

Quarter

df_dates["signup_quarter"] = df_dates["signup_date"].dt.quarter
df_dates[["signup_date", "signup_quarter"]]
signup_date signup_quarter
0 2024-01-15 1
1 2024-02-20 1
2 2024-03-05 1
3 2024-03-18 1
4 2024-04-10 2

Day of Week

Pandas represents weekdays numerically:

  • Monday = 0
  • Tuesday = 1
  • Wednesday = 2
  • Thursday = 3
  • Friday = 4
  • Saturday = 5
  • Sunday = 6
df_dates["signup_dow"] = df_dates["signup_date"].dt.dayofweek
df_dates[["signup_date", "signup_dow"]]
signup_date signup_dow
0 2024-01-15 0
1 2024-02-20 1
2 2024-03-05 1
3 2024-03-18 0
4 2024-04-10 2

Weekend Flag

This is a practical example of a derivative date column.

df_dates["is_weekend"] = (df_dates["signup_date"].dt.dayofweek >= 5).astype(int)
df_dates[["signup_date", "signup_dow", "is_weekend"]]
signup_date signup_dow is_weekend
0 2024-01-15 0 0
1 2024-02-20 1 0
2 2024-03-05 1 0
3 2024-03-18 0 0
4 2024-04-10 2 0

This is especially useful later in behavioral and transaction analysis.

Calculating Differences Between Dates

A very common analytical task is measuring how much time passed between two events.

df_dates["days_between_signup_and_purchase"] = (
    df_dates["last_purchase"] - df_dates["signup_date"]
).dt.days

df_dates[["signup_date", "last_purchase", "days_between_signup_and_purchase"]]
signup_date last_purchase days_between_signup_and_purchase
0 2024-01-15 2024-06-01 138
1 2024-02-20 2024-06-15 116
2 2024-03-05 2024-07-01 118
3 2024-03-18 2024-07-08 112
4 2024-04-10 2024-07-12 93

This pattern is extremely common in analytics:

  • days until first purchase
  • days since prior order
  • time to churn
  • time between visits

Filtering by Date Conditions

Once a column is datetime, filtering becomes easy and natural.

Filter Rows After a Given Date

df_dates[df_dates["signup_date"] >= "2024-03-01"]
customer_id signup_date last_purchase signup_year signup_month signup_day signup_month_name signup_day_name signup_quarter signup_dow is_weekend days_between_signup_and_purchase
2 103 2024-03-05 2024-07-01 2024 3 5 March Tuesday 1 1 0 118
3 104 2024-03-18 2024-07-08 2024 3 18 March Monday 1 0 0 112
4 105 2024-04-10 2024-07-12 2024 4 10 April Wednesday 2 2 0 93

Filter a Specific Month

df_dates[df_dates["signup_date"].dt.month == 3]
customer_id signup_date last_purchase signup_year signup_month signup_day signup_month_name signup_day_name signup_quarter signup_dow is_weekend days_between_signup_and_purchase
2 103 2024-03-05 2024-07-01 2024 3 5 March Tuesday 1 1 0 118
3 104 2024-03-18 2024-07-08 2024 3 18 March Monday 1 0 0 112

Filter a Specific Year

df_dates[df_dates["signup_date"].dt.year == 2024]
customer_id signup_date last_purchase signup_year signup_month signup_day signup_month_name signup_day_name signup_quarter signup_dow is_weekend days_between_signup_and_purchase
0 101 2024-01-15 2024-06-01 2024 1 15 January Monday 1 0 0 138
1 102 2024-02-20 2024-06-15 2024 2 20 February Tuesday 1 1 0 116
2 103 2024-03-05 2024-07-01 2024 3 5 March Tuesday 1 1 0 118
3 104 2024-03-18 2024-07-08 2024 3 18 March Monday 1 0 0 112
4 105 2024-04-10 2024-07-12 2024 4 10 April Wednesday 2 2 0 93
Important

Pandas is flexible here because it understands date-like strings.

Sorting by Date

Sorting works correctly only when the column is in datetime format.

This becomes especially important in time-series analysis and line charts.

df_dates.sort_values("signup_date")
customer_id signup_date last_purchase signup_year signup_month signup_day signup_month_name signup_day_name signup_quarter signup_dow is_weekend days_between_signup_and_purchase
0 101 2024-01-15 2024-06-01 2024 1 15 January Monday 1 0 0 138
1 102 2024-02-20 2024-06-15 2024 2 20 February Tuesday 1 1 0 116
2 103 2024-03-05 2024-07-01 2024 3 5 March Tuesday 1 1 0 118
3 104 2024-03-18 2024-07-08 2024 3 18 March Monday 1 0 0 112
4 105 2024-04-10 2024-07-12 2024 4 10 April Wednesday 2 2 0 93

Creating Date Ranges

Sometimes we need to generate a sequence of dates.

Daily Date Range

pd.date_range(start="2024-07-01", periods=7, freq="D")
DatetimeIndex(['2024-07-01', '2024-07-02', '2024-07-03', '2024-07-04',
               '2024-07-05', '2024-07-06', '2024-07-07'],
              dtype='datetime64[us]', freq='D')

Monthly Date Range

pd.date_range(start="2024-01-01", periods=6, freq="MS")
DatetimeIndex(['2024-01-01', '2024-02-01', '2024-03-01', '2024-04-01',
               '2024-05-01', '2024-06-01'],
              dtype='datetime64[us]', freq='MS')

Here, MS means month start.

Common Frequency Codes

Code Meaning
D daily
W weekly
MS month start
M month end
Q quarter end
Y year end
H hourly

Date ranges are useful for:

  • synthetic datasets
  • building time-series examples
  • resampling and alignment

Begining and the End of Month

We can also work with the beginning of the month (bom) and the end of the month (eom), which is very common in analytics.

This is useful when we want to:

  • align dates to monthly reporting periods
  • calculate monthly summaries
  • label values by month start or month end
  • prepare data for time-based aggregation

With Pandas, this functionality is very convenient.

import pandas as pd

some_date = pd.Timestamp("2024-07-15")

print(f"Original date: {some_date}")
print(f"Beginning of month: {some_date.to_period('M').start_time}")
print(f"End of month: {some_date.to_period('M').end_time}")
Original date: 2024-07-15 00:00:00
Beginning of month: 2024-07-01 00:00:00
End of month: 2024-07-31 23:59:59.999999

Here:

  • to_period('M') converts the date into a monthly period
  • .start_time returns the first moment of that month
  • .end_time returns the last moment of that month
TipIf We Only Want the Calendar Date

Sometimes the full timestamp is not needed, and we want just the date part.

print(f"Beginning of month (date only): {some_date.to_period('M').start_time.date()}")
print(f"End of month (date only): {some_date.to_period('M').end_time.date()}")
Beginning of month (date only): 2024-07-01
End of month (date only): 2024-07-31

This is especially useful in DataFrames.

df_dates["signup_month"] = df_dates["signup_date"].dt.to_period("M").dt.start_time
df_dates["signup_month"] = df_dates["signup_date"].dt.to_period("M").dt.end_time

df_dates
customer_id signup_date last_purchase signup_year signup_month signup_day signup_month_name signup_day_name signup_quarter signup_dow is_weekend days_between_signup_and_purchase
0 101 2024-01-15 2024-06-01 2024 2024-01-31 23:59:59.999999 15 January Monday 1 0 0 138
1 102 2024-02-20 2024-06-15 2024 2024-02-29 23:59:59.999999 20 February Tuesday 1 1 0 116
2 103 2024-03-05 2024-07-01 2024 2024-03-31 23:59:59.999999 5 March Tuesday 1 1 0 118
3 104 2024-03-18 2024-07-08 2024 2024-03-31 23:59:59.999999 18 March Monday 1 0 0 112
4 105 2024-04-10 2024-07-12 2024 2024-04-30 23:59:59.999999 10 April Wednesday 2 2 0 93

This creates two new columns:

  • month_start = first day of the month
  • month_end = last day of the month

Why This Matters

Beginning-of-month and end-of-month functionality is helpful for tasks such as:

  • monthly KPI reporting
  • billing cycle analysis
  • customer cohort preparation
  • monthly sales aggregation
  • aligning dates before plotting line charts

A Simple Summary

Functionality Example
beginning of month some_date.to_period('M').start_time
end of month some_date.to_period('M').end_time
beginning of month as date only some_date.to_period('M').start_time.date()
end of month as date only some_date.to_period('M').end_time.date()
ImportantImportant Note

The end of month returned by Pandas is the last timestamp of that month, not just the last date. That is why you may see a time like:

2024-07-31 23:59:59.999999999

If you want only the calendar date, use .date().

Homework

The following tasks are designed to help you practice the main ideas from the date tutorial you developed, including datetime, timedelta, pd.to_datetime(), .dt accessors, filtering, date ranges, and beginning/end of month functionality.

Practice with the datetime Module

  1. Create a date Object and name it my_date
    • my_date must be representing March 15, 2025.
    • Print each component separately using f-strings:
      • the full value
      • the year
      • the month
      • the day
  2. Create a datetime Object and name it meeting_time
    • meeting_time mustbe representing October 8, 2025 at 16:45:30.
    • Print each component separately using f-strings:
    • year
    • month
    • day
    • hour
    • minute
    • second
  3. Format Dates as Strings using meeting_time, display the date in the following formats:
    • YYYY-MM-DD
    • DD/MM/YYYY
    • full month name, day, year
  4. Parse a String into a Datetime
    • Convert the string "2025-12-31" into a Python datetime object using strptime().
    • Then print the resulting object again using f-strings.
  5. Date Arithmetic
    • start = date(2025, 1, 10)
    • end = date(2025, 2, 5)
    • calculate the difference between end and start
    • the number of days only
  6. Work with timedelta Using start, calculate and print with f-strings:
    • 10 days after start
    • 14 days before start
    • 3 weeks after start
    • Then create a datetime object and add:
      • 5 hours
      • 20 minutes
      • 45 seconds

Practice with Pandas Dates

  1. Create a DataFrame with Date Strings
customer_id signup_date last_purchase
1 2024-01-10 2024-02-15
2 2024-02-05 2024-03-01
3 2024-03-12 2024-03-28
4 2024-04-01 2024-05-10
5 2024-05-20 2024-06-25
  1. Check the data types of all columns.
  2. Convert String Columns to Datetime
  3. convert signup_date into proper datetime columns
  4. convert last_purchase into proper datetime columns
  5. Then print the data types again. (you must see pd.datetime instead of )
  6. Extract Date Components signup_date, create the following new columns:
    • signup_year
    • signup_month
    • signup_day
    • signup_quarter
    • signup_month_name
    • signup_day_name
  7. Create a Day-of-Week and Weekend Flag
    • signup_dow using .dt.dayofweek
    • is_weekend where weekend = 1 and weekday = 0
  8. Create a new column called days_between_signup_and_purchase. It should show the number of days between signup_date and last_purchase.
  9. Filter Data: Write code to return:
    • customers who signed up after "2024-03-01"
    • customers who signed up in March
    • customers who signed up in 2024
  10. Sort by Date: Sort the DataFrame by signup_date from earliest to latest.
  11. Generate a daily date range starting from "2024-07-01" for 10 days.
  12. Generate a monthly date range starting from "2024-01-01" for 12 months using month start frequency.
  13. Create a DataFrame with:
    • a date column covering 15 daily dates starting from "2024-01-01"
    • a sales column with any numeric values you choose
    • Then create the following derivative columns:
      • year
      • month
      • day
      • day_name
      • dow
      • is_weekend
  14. Beginning and End of Month for One Date for some_date = pd.Timestamp("2024-08-18"), then print:
    • original date
    • beginning of month
    • end of month
    • beginning of month as date only
    • end of month as date only
  15. Beginning and End of Month for a Full ColumnUsing the signup_date column in your DataFrame, create two new columns:
    • month_start
    • month_end
    • Make sure both are derived from signup_date.