Background
Typically, the Hotel Problem refers to time periods (e.g. days in a month). A “guest” in a hotel may be present over multiple days, so adding the daily guest counts together gives a different number from the total unique guests over the entire period.
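To make the analogy concrete, here is a minimal Python sketch. The guest names and stay dates are invented purely for illustration; the point is only that the sum of the daily counts differs from the number of distinct guests over the period.

```python
# Hypothetical stays: guest -> set of days on which they were in the hotel.
stays = {
    "Alice": {1, 2, 3},
    "Bob": {2, 3},
    "Carol": {3},
}

def daily_unique_guests(stays):
    """Return {day: number of distinct guests present on that day}."""
    counts = {}
    for days in stays.values():
        for day in days:
            counts[day] = counts.get(day, 0) + 1
    return counts

daily = daily_unique_guests(stays)

# Adding up the daily figures counts Alice three times and Bob twice.
sum_of_daily = sum(daily.values())

# Deduplicating over the whole period counts each guest once.
unique_over_period = len(stays)

print(sorted(daily.items()))
print("sum of daily counts:", sum_of_daily)
print("unique guests over the period:", unique_over_period)
```

Here the daily counts are 1, 2 and 3, which sum to 6, yet only 3 distinct guests stayed during the period.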
This blog post shows how the same thing can occur across most breakdowns within an analytics tool, expanding on the core issue.
Questions
Some of the more regular questions analysts face when producing and delivering reports to the business are:
- “Why are there more unique visitors in total for this report than the summary?”
- “Why does the number of visits not add up to the figures in the monthly trend?”
- “Why don’t the numbers add up to 100%?”
The problem is not a complex one, but it can be a bit tricky to explain. Let’s imagine my website has a low number of visitors: one, to be precise. And that visitor is me.
Journeys
My journey might look something like:
Homepage -> Product Page -> Contact Us -> Product Page
Now what would that look like in a pages report?
This all makes sense: there’s no duplication and the totals add up.
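As a rough sketch of how such a pages report could be derived, the snippet below counts page views directly from the journey above. It assumes the report's metric is page views (one per hit), which is an assumption on my part rather than something a specific tool guarantees.

```python
from collections import Counter

# The single journey described above, one entry per page view.
journey = ["Homepage", "Product Page", "Contact Us", "Product Page"]

# Page views per page: every hit belongs to exactly one row.
page_views = Counter(journey)

# Because each hit lands in exactly one row, the rows sum to the total.
total_page_views = sum(page_views.values())

for page, views in page_views.items():
    print(f"{page}: {views}")
print("Total:", total_page_views)
```

The rows add up to the total of 4 because a page view is recorded at the same level as the breakdown: each hit is on exactly one page.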
But what about Unique Visitors and Visits?
But what if we add unique visitors and visit metrics to the report?
Now this looks a little odd. We know that we only have one unique visitor with one visit, but this report sums up to 3?
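The effect can be reproduced with a small sketch. The hit-level layout below (visitor ID, visit ID, page) and the identifiers in it are hypothetical; real tools store this differently, but the counting logic is the same.

```python
# Hypothetical hit-level data for the journey: (visitor_id, visit_id, page).
hits = [
    ("me", "visit-1", "Homepage"),
    ("me", "visit-1", "Product Page"),
    ("me", "visit-1", "Contact Us"),
    ("me", "visit-1", "Product Page"),
]

def distinct_per_page(hits, field_index):
    """Count distinct values of one field (visitor or visit) for each page."""
    seen = {}
    for hit in hits:
        page = hit[2]
        seen.setdefault(page, set()).add(hit[field_index])
    return {page: len(values) for page, values in seen.items()}

visitors_per_page = distinct_per_page(hits, 0)
visits_per_page = distinct_per_page(hits, 1)

# Summing the rows counts the same visitor once per page they viewed.
summed_visitors = sum(visitors_per_page.values())
summed_visits = sum(visits_per_page.values())

# Deduplicating across all hits gives the real figures.
true_visitors = len({visitor for visitor, _, _ in hits})
true_visits = len({(visitor, visit) for visitor, visit, _ in hits})

print("summed rows:", summed_visitors, summed_visits)
print("deduplicated:", true_visitors, true_visits)
```

Each of the three page rows reports 1 unique visitor and 1 visit, so the rows sum to 3, while the deduplicated totals are both 1.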
Metric Definitions
To understand why, we first need to translate what the unique visitor metric means at each row here into plain English:
- “This row shows the number of unique visitors who, at some point during any of their visits, viewed the home page.”
- “This row shows the number of unique visitors who, at some point during any of their visits, viewed the product page.”
- “This row shows the number of unique visitors who, at some point during any of their visits, viewed the Contact Us page.”
So I, as a unique visitor, am counted in each row of the report. However, I am still only one unique visitor, so adding up all the pages which I have seen (and getting 3) does not make sense.
Solution
So how do we get the correct number of unique visitors? The answer is to use a metric-only report (i.e. one that is not broken down), or to break down only by a field where each visitor can fall into just one value (e.g. device, unless you’re using multi-device stitching!).
The takeaway is that when adding up the rows of a report, you need to think about the level at which the metric is counted in relation to the breakdown. If a visitor or visit can appear in more than one row, it is likely that you will need to apply deduplication at a different level.
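One way to check whether a breakdown is safe to sum is to compare the summed rows with the deduplicated total. The sketch below does this for two breakdowns; the data, field layout and helper names (`unique_visitors_by`, `is_additive`) are hypothetical and only illustrate the idea.

```python
from collections import defaultdict

# Hypothetical hits: (visitor_id, device, page). Each visitor uses one device.
hits = [
    ("v1", "desktop", "Homepage"),
    ("v1", "desktop", "Product Page"),
    ("v2", "mobile", "Homepage"),
    ("v2", "mobile", "Contact Us"),
]

def unique_visitors_by(hits, key_index):
    """Unique visitors per value of the chosen breakdown field."""
    groups = defaultdict(set)
    for hit in hits:
        groups[hit[key_index]].add(hit[0])
    return {key: len(visitors) for key, visitors in groups.items()}

def is_additive(hits, key_index):
    """True when the rows of the breakdown sum to the deduplicated total."""
    by_key = unique_visitors_by(hits, key_index)
    total_unique = len({hit[0] for hit in hits})
    return sum(by_key.values()) == total_unique

total_unique = len({hit[0] for hit in hits})
by_device = unique_visitors_by(hits, 1)
by_page = unique_visitors_by(hits, 2)

print("total unique visitors:", total_unique)
print("by device:", by_device, "additive:", is_additive(hits, 1))
print("by page:", by_page, "additive:", is_additive(hits, 2))
```

In this data the device breakdown sums to the true total of 2 because each visitor sits in exactly one device row, whereas the page breakdown sums to 4 because visitors appear in every page row they viewed.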
Further Examples
More Reading
The hotel problem is discussed in more detail here: http://en.wikipedia.org/wiki/Web_analytics#The_hotel_problem
About the author
Lynchpin