Analytical Calendar The Story | The Index Page
Analytical Calendar
Calendar and Date
In
data analytics, nearly all kinds of analysis have a very crucial element - date.
All insights found should be timely and thus, in most of the time, there is a
calendar to facilitate data analysts to select the required period of time for
data analysis.
A calendar consists of a sequence of dates from a starting date to an ending date. Usually, the calendar consists of one or several years. Date means a duration of time and consists of basic elements including year, month and day. In common representation in a calendar, date is always organized in week and month into a year. A week is from a weekday like Sunday to a week day like Saturday to form a cycle of 7 days. Then these 7 days are organized into a year. A month is from 1 to 12 and organized into a year. As illustrated below, Year is first organized into 12 months and then each month is represented in 7-day (week) view. Week can be across for two consecutive months.
Calendar
has this format and structure. Most of the time, when using the calendar, the structure
helps to drill down from high level date granularity (longer duration of time,
such as a year) to low level date granularity(shorter duration of time, such as
a week). Apparently, it has four granularity levels including year, month, week
and day.
Date Component
Date includes a number of date components including year, quarter, month, week and date. Each of the components helps break down the date into smaller portion of dates. In short, each date component consists of different number of days except date (the component) is uniquely to have only 1 day.
Date Component |
Number of Days |
Year |
365 for
normal year |
Quarter |
90 or 91 for quarter 1 (leap year) |
Month |
28 (for
normal year) |
Week |
7 days if on complete week calendar model |
Date |
1 |
Date Granularity and Date Uniqueness
Date component is similar to date granularity except for 1 difference: date granularities are in hierarchies while date components are separate components with no necessary relations amongst themselves. Before looking into the date hierarchy, date uniqueness must be discussed.
Date uniqueness means that a combination of date components can be uniquely identified in time using specific date ranges, such as 2022-11-01 – 2022-11-30. Below shows the possible unique date of all possible combinations of date components.
No. of Date Components |
Year |
Quarter |
Month |
Week |
Day |
Uniqueness |
Commonly Use |
One |
✔ |
|
|
|
|
✔ |
✔ |
|
✔ |
|
|
|
✖ |
✖ |
|
|
|
✔ |
|
|
✖ |
✖ |
|
|
|
|
✔ |
|
✖ |
✖ |
|
|
|
|
|
✔ |
✖ |
✖ |
|
Two |
✔ |
✔ |
|
|
|
✔ |
✔ |
✔ |
|
✔ |
|
|
✔ |
✔ |
|
✔ |
|
|
✔ |
|
✔ |
✔ |
|
✔ |
|
|
|
✔ |
✖ |
✖ |
|
|
✔ |
✔ |
|
|
✖ |
✖ |
|
|
✔ |
|
✔ |
|
✖ |
✖ |
|
|
✔ |
|
|
✔ |
✖ |
✖ |
|
|
|
✔ |
✔ |
|
✖ |
✖ |
|
|
|
✔ |
|
✔ |
✖ |
✖ |
|
|
|
|
✔ |
✔ |
✖ |
✖ |
|
Three |
✔ |
✔ |
✔ |
|
|
✔ |
✖ |
✔ |
✔ |
|
✔ |
|
✔ |
✖ |
|
✔ |
✔ |
|
|
✔ |
✔ |
✖ |
|
✔ |
|
✔ |
✔ |
|
✔ |
✖ |
|
✔ |
|
✔ |
|
✔ |
✔ |
✔ |
|
✔ |
|
|
✔ |
✔ |
✔ |
✖ |
|
|
✔ |
✔ |
✔ |
|
✖ |
✖ |
|
|
✔ |
✔ |
|
✔ |
✖ |
✖ |
|
|
|
✔ |
✔ |
✔ |
✖ |
✖ |
|
Four |
✔ |
✔ |
✔ |
✔ |
|
✔ |
✖ |
✔ |
✔ |
✔ |
|
✔ |
✔ |
✖ |
|
✔ |
✔ |
|
✔ |
✔ |
✔ |
✖ |
|
✔ |
|
✔ |
✔ |
✔ |
✔ |
✖ |
|
|
✔ |
✔ |
✔ |
✔ |
✖ |
✖ |
|
Five |
✔ |
✔ |
✔ |
✔ |
✔ |
✔ |
✖ |
Interestingly,
when the year component exists, the uniqueness is always true. It is not
difficult to spot that year is the only component that iterates by one per
year. All other date components are re-used repeatedly in each year.
Because of this interesting result, the majority of the combination without the year is not unique. And among the unique component combinations, in fact, only five of them are commonly used. Each of them represents a kind of date granularity including year, quarter, month, week, and day. Except Year-Month-Day has a special term “Date”, all the others simply combine the components to form its terminology. The below summarizes them.
Number of Components |
Year |
Quarter |
Month |
Week |
Day |
Granularity |
Terminology |
One |
✔ |
|
|
|
|
Year |
Year |
Two |
✔ |
✔ |
|
|
|
Quarter |
Year Quarter |
Two |
✔ |
|
✔ |
|
|
Month |
Year
Month |
Two |
✔ |
|
|
✔ |
|
Week |
[Week Year Week] |
Three |
✔ |
|
✔ |
|
✔ |
Day |
Date |
* Note: Year week is named Week Year Week to avoid the confusion of year. More information will be provided in later section.
These five date specifications can identify all the dates. There is no ambiguity about the period of time. Based on this result, we can conclude five usage patterns.
· Year
· Year à Quarter
· Year à Month
· Year (* Week Year) à Week
· Year à Month à Day
* There is different in Year and Week Year
and will be discussed in later sections.
Date Hierarchy
Every calendar has a structure. To be more precise, there are two different hierarchical structures to support all the common usage patterns. Starting from year and drill down into different date components. No matter which hierarchy, it has a bucket of dates in the lowest granularity. The two unique hierarchies including:
· YQMD Hierarchy, i.e. Year à Year Quarter à Year Month à Date
This
supports the usage pattens of:
o
Year
o
Year à Quarter
o
Year à Month
o
Year à Month à Day
This
supports only the usage pattern of
o
Week Year à Week
YQMD hierarchy is a commonly used hierarchy
in our day-to-day life , quarter can be skipped in this hierarchy. YwWD
hierarchy is not very commonly found in daily usage, but it is quite common to
be used in business and data analytics. Moreover, itshould be noted that “Week Year”
is different from “Year”. It is explained in the coming sections.
YQMD Hierarchy
YQMD hierarchy has 4 layers with highest
granularity: year level and breaks down into quarter level, to month level and then
day level.
A year has four year quarters, twelve year months
and each year month has 28 to 31 dates. Date is the most granular component in
all the hierarchies. Due to the irregular distribution of days in months, YQMD
hierarchy has an irregular shape. Even though it looks like the calendar
distributes nicely from Year to Year Quarter to Year Month and then to Days, however
the bucket of dates are varied in the lowest part of the hierarchy.
For date manipulations, the calculation is
not simply linear or proportional. It requires calculations.
YWD Hierarchy
Another hierarchy is YWD. The YWD
hierarchy has 3 layers from the highest granularity Week Year level to Week level
to lowest granularity Day level. Week, in general, means 7 days from a weekday to
its previous weekday. For instance, a typical week cycle is Sunday to Saturday.
Unfortunately, weeks cannot be distributed evenly
in a year. A year has 365 or 366 days, which is not a multiple of 7. Conventionally,
there are two methods on how to deal with the incomplete week in the year:
(1)
Incomplete Week Calendar Model
It allows incomplete weeks, i.e. a week
in the start/end of the year, it can be less than 7 days. The week number is reset to 1 when a new year
starts, i.e. 01-Jan is the first day of Week 1.
In this case,
whenever it is Sunday (if the start of week is Sunday), it increases the week
number by 1. When it reaches 01-Jan, it resets the week to 1. Using the same
example, 2022-01-01, it is 2022 week 1 with Week Year equals to 2022 and year
also equals to 2022.
(2)
Complete Week Calendar Model
It strictly maintains a week to have 7
days and shifts the incomplete days into another year.
For example, 2022-01-01, is Saturday. If the week definition is from Sunday to Saturday, it is the last day of the week. Thus, it is counted in to 2021 W53. In order to clearly distinguish the year, Week Year is introduced. In this example, Week Year of 2022-01-01 is 2021 while Year is 2022.
Whichever the model is, Week Year is used only
within the YWD hierarchy, and it can be very ambiguous compared to year.
YWD hierarchy also has an
irregular distribution of days in the year. The number of days is always
required calculations and a more complicated calculations required depending
which model is used.
Previous Section <== 002 - Analytical Calendar - Introduction
No comments:
Post a Comment