Wednesday, May 10, 2023

Analytical Calendar - 003 - Calendar And Date

Post Index

Analytical Calendar The Story | The Index Page


Analytical Calendar

Calendar and Date

In data analytics, nearly all kinds of analysis have a very crucial element - date. All insights found should be timely and thus, in most of the time, there is a calendar to facilitate data analysts to select the required period of time for data analysis.

A calendar consists of a sequence of dates from a starting date to an ending date. Usually, the calendar consists of one or several years. Date means a duration of time and consists of basic elements including year, month and day. In common representation in a calendar, date is always organized in week and month into a year. A week is from a weekday like Sunday to a week day like Saturday to form a cycle of 7 days. Then these 7 days are organized into a year. A month is from 1 to 12 and organized into a year. As illustrated below, Year is first organized into 12 months and then each month is represented in 7-day (week) view. Week can be across for two consecutive months.

Calendar has this format and structure. Most of the time, when using the calendar, the structure helps to drill down from high level date granularity (longer duration of time, such as a year) to low level date granularity(shorter duration of time, such as a week). Apparently, it has four granularity levels including year, month, week and day.



Date Component

Date includes a number of date components including year, quarter, month, week and date. Each of the components helps break down the date into smaller portion of dates. In short, each date component consists of different number of days except date (the component) is uniquely to have only 1 day.

Date Component

Number of Days

Year

365 for normal year
366 for leap year

Quarter

90 or 91 for quarter 1 (leap year)
91 for quarter 2
92 for quarter 3, 4

Month

28 (for normal year)
29 (for leap year) for Feb30 for Apr, Jun, Sep, Nov
31 for Jan, Mar, May, Jul, Aug, Oct, Dec

Week

7 days if on complete week calendar model
1 to 7 days if on incomplete week calendar model

* More discussion in later section.

Date

1



Date Granularity and Date Uniqueness

Date component is similar to date granularity except for 1 difference: date granularities are in hierarchies while date components are separate components with no necessary relations amongst themselves. Before looking into the date hierarchy, date uniqueness must be discussed.

Date uniqueness means that a combination of date components can be uniquely identified in time using specific date ranges, such as 2022-11-01 – 2022-11-30. Below shows the possible unique date of all possible combinations of date components.

No. of Date Components

Year

Quarter

Month

Week

Day

Uniqueness

Commonly Use

One

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Two

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Three

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Four

 

 

 

 

 

Five

Interestingly, when the year component exists, the uniqueness is always true. It is not difficult to spot that year is the only component that iterates by one per year. All other date components are re-used repeatedly in each year.

Because of this interesting result, the majority of the combination without the year is not unique. And among the unique component combinations, in fact, only five of them are commonly used. Each of them represents a kind of date granularity including year, quarter, month, week, and day. Except Year-Month-Day has a special term “Date”, all the others simply combine the components to form its terminology. The below summarizes them.

Number of Components

Year

Quarter

Month

Week

Day

Granularity

Terminology

One

 

 

 

 

Year

Year

Two

 

 

 

Quarter

Year Quarter

Two

 

 

 

Month

Year Month

Two

 

 

 

Week

[Week Year Week]

Three

 

 

Day

Date

* Note: Year week is named Week Year Week to avoid the confusion of year. More information will be provided in later section.


These five date specifications can identify all the dates. There is no ambiguity about the period of time. Based on this result, we can conclude five usage patterns.

·         Year

·         Year à Quarter

·         Year à Month

·         Year (* Week Year) à Week

·         Year à Month à Day

* There is different in Year and Week Year and will be discussed in later sections.



Date Hierarchy

Every calendar has a structure. To be more precise, there are two different hierarchical structures to support all the common usage patterns. Starting from year and drill down into different date components. No matter which hierarchy, it has a bucket of dates in the lowest granularity. The two unique hierarchies including:

·         YQMD Hierarchy, i.e. Year à Year Quarter à Year Month à Date

This supports the usage pattens of:

o    Year

o    Year à Quarter

o    Year à Month

o    Year à Month à Day


 ·         YwWD Hierarchy, i.e. Week Year à Week Year Week à Date

This supports only the usage pattern of

o    Week Year à Week

YQMD hierarchy is a commonly used hierarchy in our day-to-day life , quarter can be skipped in this hierarchy. YwWD hierarchy is not very commonly found in daily usage, but it is quite common to be used in business and data analytics. Moreover, itshould be noted that “Week Year” is different from “Year”. It is explained in the coming sections.



YQMD Hierarchy



YQMD hierarchy has 4 layers with highest granularity: year level and breaks down into quarter level, to month level and then day level.

A year has four year quarters, twelve year months and each year month has 28 to 31 dates. Date is the most granular component in all the hierarchies. Due to the irregular distribution of days in months, YQMD hierarchy has an irregular shape. Even though it looks like the calendar distributes nicely from Year to Year Quarter to Year Month and then to Days, however the bucket of dates are varied in the lowest part of the hierarchy.

For date manipulations, the calculation is not simply linear or proportional. It requires calculations.



YWD Hierarchy

Another hierarchy is YWD. The YWD hierarchy has 3 layers from the highest granularity Week Year level to Week level to lowest granularity Day level. Week, in general, means 7 days from a weekday to its previous weekday. For instance, a typical week cycle is Sunday to Saturday.

 


Unfortunately, weeks cannot be distributed evenly in a year. A year has 365 or 366 days, which is not a multiple of 7. Conventionally, there are two methods on how to deal with the incomplete week in the year:


(1)     Incomplete Week Calendar Model

It allows incomplete weeks, i.e. a week in the start/end of the year, it can be less than 7 days.  The week number is reset to 1 when a new year starts, i.e. 01-Jan is the first day of Week 1.

In this case, whenever it is Sunday (if the start of week is Sunday), it increases the week number by 1. When it reaches 01-Jan, it resets the week to 1. Using the same example, 2022-01-01, it is 2022 week 1 with Week Year equals to 2022 and year also equals to 2022.

(2)    Complete Week Calendar Model

It strictly maintains a week to have 7 days and shifts the incomplete days into another year.

For example, 2022-01-01, is Saturday. If the week definition is from Sunday to Saturday, it is the last day of the week. Thus, it is counted in to 2021 W53. In order to clearly distinguish the year, Week Year is introduced. In this example, Week Year of 2022-01-01 is 2021 while Year is 2022.


Whichever the model is, Week Year is used only within the YWD hierarchy, and it can be very ambiguous compared to year.

YWD hierarchy also has an irregular distribution of days in the year. The number of days is always required calculations and a more complicated calculations required depending which model is used.



Previous Section <== 002 - Analytical Calendar - Introduction



No comments:

Post a Comment