Maven Analytics published data sets about pizza sales in their website Free Data Sets & Dataset Samples | Maven Analytics . Thanks for Maven for providing data for training purposes. In this blog I am analyzing the data and answering the four questions in Maven web site.
The data was downloaded into MS Access. As Access is a database, then the data was in a database where we can browse the data with SQL.

Order details includes data about orders, which pizza was ordered and how many pieces. There are close to 49k records in the table.

Order table has the order date and time. Order_id is the primary key field.

Pizza_types
The primary key in this table is the first column, pizza_type_id. In the original data, the ingredients were in one column. I have separated the ingredients in different columns.

The pizza_type table holds the prices and sizes of each pizza. The pizza_type_id is a higher level parameter. Each pizza_type_id is divided into normally three pizza_id depending on the size.
Unfortunately, the decimals were dropped from prices. Therefore my price estimates are bit lower than real figures, but I am not that far from real figures.

The date table is an additional table which is not part of an original data set. I will use date table in reporting, as we want to know eg. sales per each weekday.
The data was loaded into Excel data model.

In date table, month, week of the year, day name, and month were added.

In Power Query a new column “hour” was added to orders table. We can use this calculated column for defining peak hours.

This is the data model how tables are related to each other.

Here are the relations.
Both order_details and orders tables have order_id field. Order_detail table holds the order id, pizza id and quantity. orders table includes the ordering time. The cardinality is many to one, order_details table is per pizza and one order may include several pizzas, one order has just one order time.
Pizzas is a pizza master data table. Order_details include which pizzas have been ordered. As said, one order my include several pizzas. Therefore each pizza needs to exist just once in master data table, before you can order the pizzas. The cardinality is one to many.
Orders table has the ordering date. The relation is created to date table. One date may exist several times in orders table, as many orders may be placed during one day. Each date from the year 2015 exists once in data table.
All the pizzas are listed in pizzas table, as the pizza_types table consists of ingredients of different pizzas. Pizza_type_id is the same field in both tables. In pizza_types table pizza_type exists just once, as pizza_type is primary key in pizza_type table. In pizzas table the pizza_type_id may exist several times, as pizza_id is the primary key .
How many customers do we have each day? Are there any peak hours?
There are altogether 21350 orders during the year 2015. This means we have on average 58 customers a day, when we count one order as one customer. Busiest day was Nov 27th. Then we had 115 customer. Lowest number of customers, 27, was seen in Dec 29th.
We can see two peak hours. First peak takes place between 12-13, obviously customers are having lunch time. Another peak is during dinner time 17-18.

How many pizzas are typically in an order? Do we have any bestsellers?
Average number of pizzas per order is 2,3. Highest number of pizzas in one order is 28 pizzas.
The best selling pizzas are: classic_dlx, bbq_ckn and Hawaiian.
How much money did we make this year? Can we identify any seasonality in the sales?
My answer for total revenue is 795200.
The revenue was calculated with the formula below. However, when transferring the data from MS Access to Excel data model, I lost he decimals in prices. This means, my revenue is slightly lower than the correct amount.
=SUMX(order_details,[quantity]*RELATED(pizzas[Price]))
The best pizza, in financial terms, is The Thai Chicken Pizza.
The total revenue was checked also in Access.

The result is the same.
SELECT SUM(order_details.quantity * pizzas.price)
FROM order_details INNER JOIN pizzas ON order_details.pizza_id = pizzas.pizza_id;
This is the SQL statement to query the total revenue.
About seasonality in sales:
When checking orders volumewise, meaning counting number of pizzas, the best month is July followed by May. The best weekday for pizza selling is clearly Friday. Saturday and Thursday are second best days, Saturday and Thursday have nearly equal sales but evidently lower than Friday.
Are there any pizzas we should take of the menu, or any promotions we could leverage?
Mushroom is the least popular.

The other four categories are quite equally popular, but mushroom is an exception. Would it be a good idea to remove mushroom from menu ?
There are practically no sales before 11 o’clock. Some promotion could take place to balance the workload more evenly throughout the work day.
About the pizza sizes, S, M, and L cover nearly 99 % of all the sales. XL and XXL have portion of good 1 % out of total sales. Should the XL and XXL be removed from selection or should they be promoted ? At a moment XL and XXL do not have significance in pizza business.





