The Holiday Oracle system automatically generates "holiday rules" from open source data. A holiday rule is a series of dates on which a particular holiday is celebrated in a country or subdivision.
To identify countries and subdivisions (states, provinces, etc.), Holiday Oracle follows the terminology and approach of ISO 3166 (country codes) and ISO 3166-2:2013, "Codes for the representation of names of countries and their subdivisions - Part 2: Country subdivision code". Open data projects used by Holiday Oracle need to use this format or be easily convertible to it.
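As an illustration of what "in this format" might mean in practice, the sketch below normalises a region string into a country code and an optional subdivision code. It checks only the general shape of ISO 3166 codes (a two-letter country code, optionally followed by a hyphen and up to three letters or digits); it does not check the code against the standard's published list. The names `Region` and `parse_region` are hypothetical and are not part of Holiday Oracle.

```python
import re
from typing import NamedTuple, Optional

# Shape of an ISO 3166-1 alpha-2 code, optionally followed by an
# ISO 3166-2 subdivision part (hyphen plus 1-3 letters or digits).
_CODE_SHAPE = re.compile(r"^([A-Z]{2})(?:-([A-Z0-9]{1,3}))?$")


class Region(NamedTuple):
    country: str                # e.g. "AU"
    subdivision: Optional[str]  # e.g. "AU-NSW", or None for a whole country


def parse_region(code: str) -> Region:
    """Normalise a region string; raise ValueError if the shape is wrong.

    This validates the shape only, not membership in ISO 3166.
    """
    match = _CODE_SHAPE.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not an ISO 3166 style code: {code!r}")
    country, sub = match.groups()
    return Region(country, f"{country}-{sub}" if sub else None)
```

A source that labels its rows with free-text names such as "Australia" would need a separate mapping step before it could be used this way.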
Holiday Oracle stress-tests its candidate holiday rules and their corresponding date predictions using cross-validation. This provides greater insight into the level of consensus in the underlying dataset and produces a final consensus score for each candidate holiday rule.
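The text does not say how the folds are built or how candidate rules are fitted, so the following is only a minimal sketch of the evaluation half of such a procedure: it deals the observed years into k folds and reports, per fold, how often a fixed candidate rule agrees with the data. The fold layout, the `Rule` type, and both function names are assumptions made for illustration.

```python
from datetime import date
from typing import Callable, Dict, List

# Hypothetical representation: a rule maps a year to the predicted date.
Rule = Callable[[int], date]


def k_folds(years: List[int], k: int) -> List[List[int]]:
    """Deal the sorted years round-robin into k folds (empty folds dropped)."""
    ordered = sorted(years)
    return [fold for fold in (ordered[i::k] for i in range(k)) if fold]


def fold_agreement(rule: Rule, observed: Dict[int, date], k: int = 3) -> List[float]:
    """For each fold, the fraction of its years where the rule matches the data."""
    rates = []
    for fold in k_folds(list(observed), k):
        hits = sum(1 for year in fold if rule(year) == observed[year])
        rates.append(hits / len(fold))
    return rates


# Made-up data: a fixed-date rule and one inconsistent observation (2022).
rule = lambda year: date(year, 1, 26)
observed = {year: date(year, 1, 26) for year in range(2019, 2025)}
observed[2022] = date(2022, 1, 27)
rates = fold_agreement(rule, observed, k=3)
```

In this made-up data the inconsistent year lands in the first fold, so `rates` is `[0.5, 1.0, 1.0]`. A large spread between folds is one possible signal that the sources disagree.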
The consensus score is between 0 and 1. The system generates the score by measuring the number of false positives, false negatives, true positives and true negatives (as well as the variation in the data points) when the rule is tested against the underlying data. The system selects the candidate holiday rule with the highest score. If two candidate rules have the same score, the system runs a tie-breaker routine which considers other metrics about the rule and the data to select a winner.
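The exact scoring formula and the tie-breaker metrics are not given here, so the sketch below substitutes plain accuracy, (TP + TN) / total, as a stand-in score in the range 0 to 1, and uses a hypothetical `supporting_sources` count as the tie-breaker. It ignores the "variation in the data points" term entirely. All names are illustrative.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Set, Tuple


@dataclass
class Candidate:
    name: str
    predicted: Set[date]
    supporting_sources: int  # hypothetical tie-breaker metric


def confusion(predicted: Set[date], observed: Set[date],
              calendar: Set[date]) -> Tuple[int, int, int, int]:
    """Count (TP, FP, FN, TN) over every day in the calendar under test."""
    tp = len(predicted & observed)
    fp = len(predicted - observed)
    fn = len(observed - predicted)
    tn = len(calendar - predicted - observed)
    return tp, fp, fn, tn


def consensus_score(predicted: Set[date], observed: Set[date],
                    calendar: Set[date]) -> float:
    """Stand-in score: accuracy, which always lies between 0 and 1."""
    tp, fp, fn, tn = confusion(predicted, observed, calendar)
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def select(candidates: List[Candidate], observed: Set[date],
           calendar: Set[date]) -> Optional[Candidate]:
    """Highest score wins; ties fall back to the assumed tie-breaker."""
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda c: (consensus_score(c.predicted, observed, calendar),
                       c.supporting_sources),
    )
```

Comparing tuples keeps the tie-breaker from ever overriding a genuinely higher score. Note that on a full-year calendar true negatives dominate accuracy, so a real scoring function would likely weight the four counts differently.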
In rare circumstances, the sources of open data dates are so inconsistent that there is no consensus and Holiday Oracle cannot automatically generate a rule. In these cases, the Holiday Oracle API returns no holiday rule. This is uncommon, and we will continue to seek more sources of open data, which should enable Holiday Oracle's consensus mechanism to resolve these issues.
For a country or subdivision, the API returns each winning holiday rule, that is, the rule with the highest consensus score across all of the datasets, together with its corresponding date predictions for the years 2019 to 2030.
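To make the idea of "a rule and its date predictions" concrete, the sketch below expands one hypothetical rule ("first Monday of September") into one date per year for 2019 to 2030. How Holiday Oracle actually represents rules is not described here, so the callable-based representation and the function names are assumptions.

```python
from datetime import date, timedelta
from typing import Callable, Dict


def nth_weekday(year: int, month: int, weekday: int, n: int) -> date:
    """Date of the n-th given weekday (Monday=0) in a month.

    No check is made that the result stays inside the month.
    """
    first = date(year, month, 1)
    offset = (weekday - first.weekday()) % 7
    return first + timedelta(days=offset + 7 * (n - 1))


def predictions(rule: Callable[[int], date],
                first_year: int = 2019,
                last_year: int = 2030) -> Dict[int, date]:
    """Apply a rule to every year in the inclusive range."""
    return {year: rule(year) for year in range(first_year, last_year + 1)}


# Hypothetical rule: first Monday of September.
first_monday_of_september = lambda year: nth_weekday(year, 9, 0, 1)
preds = predictions(first_monday_of_september)
```

Here `preds` holds twelve entries, one per year, for example `date(2019, 9, 2)` for 2019.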