Maven manufacturing downtime

This blog post is based on a free data sample from Maven Analytics, Manufacturing Downtime (Free Sample Dataset Download – Manufacturing Downtime – Maven Analytics | Build Data Skills, Faster). Thanks to Maven Analytics for publishing the dataset.

Four questions were asked, and I will answer them in this blog post.

  • What’s the current line efficiency? (total time / min time)
  • Are any operators underperforming?
  • What are the leading factors for downtime?
  • Do any operators struggle with particular types of operator error?

Let’s first check the data we got.

Line productivity is a fact table about produced batches.

The product table includes product data and is a dimension table. We are dealing with six products.

The downtime factor table is a dimension table about the reasons why downtimes happened. The table lists the factors and whether each factor was an error caused by an operator.

The line downtime table is a fact table storing the downtimes per line, together with the downtime factor.

The issue with this table is that it is in matrix (wide) format. Before starting to analyze the data, we need to unpivot it into a long, tabular format. I have written a blog post about that.

In brief, load the data into the data model. In Power Query, select the leftmost column, right-click, and choose “Unpivot Other Columns”.
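The same unpivot can be sketched outside Power Query as well. Here is a minimal plain-Python version of the idea; the column names and values are illustrative, not the real dataset's:

```python
# Unpivot a wide (matrix) table into long format:
# one row per (batch, factor) pair, skipping empty cells.
# Batch numbers, factor codes and minutes below are made up.
wide_rows = [
    {"Batch": 422111, "1": 20, "2": None, "3": 15},
    {"Batch": 422112, "1": None, "2": 30, "3": None},
]

long_rows = []
for row in wide_rows:
    for factor, mins in row.items():
        if factor == "Batch" or mins is None:
            continue  # keep the key column, drop empty cells
        long_rows.append({"Batch": row["Batch"], "Factor": int(factor), "Mins": mins})

for r in long_rows:
    print(r)
```

This mirrors what “Unpivot Other Columns” does: the key column is kept, every other column becomes an attribute-value pair.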

I will paste the values into a separate Excel sheet, as it is easier for me to manage them there.

Just Ctrl+V in the Excel sheet.

I have changed the column names to be more descriptive. The factor column is a foreign key referencing the primary key of the factor dimension table.

Then I loaded the data into MS Access, as usual. I renamed the tables to start with either D or F, depending on whether the table is a dimension or a fact table. I also removed the spaces from the field names; it is easier for me when there are no spaces in table or field names, as I don’t have to use quotes. SQL queries can be used with Access, too.

The data was loaded from Access into Excel data model.

This is the data model. The product dimension table D_prod is joined to the line productivity fact table F_lineprod, as both tables hold product information. The relation is one-to-many: one product appears only once in the dimension table but may appear several times in the fact table. The factor dimension table D_dfactor is joined to the downtime fact table F_linedown on the factor field; the relationship is again one-to-many. The two fact tables are joined on batch number, which appears once in the F_lineprod table, while one batch may have several downtimes.

One note I must make about the data: in the last row of the F_lineprod table, the batch starts during one day and ends the following day. As you may note, I have rounded the times to full minutes, leaving seconds out.

I will adjust the times so that the batch takes 2 hours 10 minutes, but within the same day.

  • What’s the current line efficiency? (total time / min time)

The first question is to calculate line efficiency. I understand this as downtime divided by total time, and I treat each product as a production line.

First, I calculated the total time.

This is the DAX:

=SUMX( F_lineprod; (F_lineprod[EndTime] - F_lineprod[StartTime]) * 24 )

The total time in hours.

This can be double-checked in Access with:

SELECT
    ROUND(SUM(EndTime - StartTime) * 24, 2) AS TotalHours,
    Product
FROM
    F_lineprod
GROUP BY
    Product;

The downtime is calculated simply as the sum of the Mins column in the F_linedown table. As the values are given in minutes, the result is divided by 60 to get hours.

=sum(F_linedown[Mins])/60

The values are cross-checked in Access as follows:

SELECT
    fd.Product,
    ROUND(SUM(fn.Mins) / 60, 2) AS DowntimeHours
FROM
    F_linedown AS fn,
    F_lineprod AS fd
WHERE
    fn.Batch = fd.Batch
GROUP BY
    fd.Product;

The total time and down time are calculated here:

OR-600 has the highest downtime ratio, while the other products have quite similar ratios. Whether one third of production time is a reasonable level for downtime depends on the business.

If efficiency is calculated as total time minus downtime, divided by total time, the efficiency is 64 %. Another question is whether 64 % is a good or a bad number.
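In code, the efficiency formula is just (total time minus downtime) divided by total time. A tiny sketch with illustrative hours, not the actual dataset totals:

```python
def line_efficiency(total_hours: float, downtime_hours: float) -> float:
    """Efficiency = productive share of the total production time."""
    return (total_hours - downtime_hours) / total_hours

# Illustrative numbers: 100 h total with 36 h of downtime gives 64 % efficiency,
# matching the roughly one-third downtime share discussed above.
print(line_efficiency(100.0, 36.0))  # 0.64
```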

  • Are any operators underperforming?

Mac has the highest downtime ratio, but the differences among operators are minor. Charlie, who has the lowest downtime ratio, has the highest absolute downtime. This is because the total times vary between the operators: Charlie’s total time is 36 % higher than Mac’s.

We can double-check the total time in Access as follows:

SELECT
    Operator,
    ROUND(SUM(EndTime - StartTime) * 24, 2) AS TotalHours
FROM
    F_lineprod
GROUP BY
    Operator;

And down time:

SELECT
    fd.Operator,
    ROUND(SUM(fn.Mins) / 60, 2) AS DowntimeHours
FROM
    F_linedown AS fn,
    F_lineprod AS fd
WHERE
    fn.Batch = fd.Batch
GROUP BY
    fd.Operator;

  • What are the leading factors for downtime?

Downtime has been calculated simply with:

=sum(F_linedown[Mins])/60

We take the minutes from the fact table F_linedown and divide by 60 to get hours.

The downtimes vary by factor. The most common factors for downtime are machine adjustment and machine failure; together they represent more than 40 % of all cases. No emergency stops took place, and conveyor belt jams are very uncommon.

The same can be checked with Access:

SELECT
    f.Factor,
    d.Description,
    ROUND(SUM(f.Mins) / 60, 2) AS SumOfMins,
    ROUND(SUM(f.Mins) / (SELECT SUM(Mins) FROM F_linedown), 4) AS ShareOfTotal
FROM
    F_linedown AS f,
    D_dfactors AS d
WHERE
    f.Factor = d.Factor
GROUP BY
    f.Factor,
    d.Description
ORDER BY
    SUM(f.Mins) DESC;
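The same share-of-total pattern works in any SQL engine. Here is a self-contained sqlite3 version; the table and column names mirror my Access ones, but the rows are toy data, not the real dataset:

```python
import sqlite3

# Toy data only: three downtime factors with made-up minutes.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE F_linedown (Batch INTEGER, Factor INTEGER, Mins INTEGER);
INSERT INTO F_linedown VALUES (1, 1, 30), (2, 1, 30), (3, 2, 25), (4, 3, 15);
""")

# Sum per factor, plus each factor's share of the grand total.
rows = con.execute("""
    SELECT Factor,
           SUM(Mins) AS TotalMins,
           ROUND(1.0 * SUM(Mins) / (SELECT SUM(Mins) FROM F_linedown), 4) AS Share
    FROM F_linedown
    GROUP BY Factor
    ORDER BY SUM(Mins) DESC
""").fetchall()

for factor, total, share in rows:
    print(factor, total, share)
```

The `1.0 *` forces floating-point division; in Access the division is already floating-point, so it is not needed there.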

  • Do any operators struggle with particular types of operator error?

The only formula I used here is the same as in the previous question: just summing up the minutes of downtime.

Charlie and Dee make many kinds of operator errors, while Dennis and Mac make fewer types. Machine adjustment is a typical error for Dennis, and batch change for Mac.

If you want to check the same issue in SQL, this is the query I used.

SELECT
    fd.Operator,
    fn.Factor,
    ds.Description,
    SUM(fn.Mins)
FROM
    F_lineprod AS fd,
    F_linedown AS fn,
    D_dfactors AS ds
WHERE
    fd.Batch = fn.Batch
    AND fn.Factor = ds.Factor
    AND ds.OperatorError = 'YES'
GROUP BY
    fd.Operator,
    fn.Factor,
    ds.Description;

Summary:

  • What’s the current line efficiency? (total time / min time)

64 %.

  • Are any operators underperforming?

Mac has the highest downtime ratio, but the differences among operators are minor.

  • What are the leading factors for downtime?

The most common factors for downtime are machine adjustment and machine failure.

  • Do any operators struggle with particular types of operator error?

Machine adjustment is a typical error for Dennis, and batch change for Mac.

Maven Toy Store E-Commerce Database

The data for this blog was taken from the Maven Analytics data playground, Toy Store E-Commerce Database (Free Practice & Free Sample Dataset Download – Toy Store E-Commerce Database – Maven Analytics | Build Data Skills, Faster). Thanks to Maven Analytics for publishing the datasets.

Four questions were presented:

  • What is the trend in website sessions and order volume?
  • What is the session-to-order conversion rate? How has it trended?
  • Which marketing channels have been most successful?
  • How has the revenue per order evolved? What about revenue per session?

I will answer those questions in this blog based on the sample data.

First, we need to get familiar with the tables.

The orders table includes the order id, website session id, user id, primary product (in case of a bundle order), number of items purchased, total price of the order, and cost of goods sold (COGS).

The order items table holds item-level information about each order, such as price and COGS.

Refunds related to orders are stored in the order item refunds table, which contains the order item id and the refund amount.

The products table includes four records with product id and product name.

I have downloaded the data to Access.

In addition to this model, I created a date table.

I created the table in Excel, with formulas computing the year, quarter, month, weekday, and year-quarter combination.
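The same date table can also be generated in code instead of with Excel formulas. A short sketch using only the standard library; the column set mirrors what I computed in Excel, and the date range is illustrative:

```python
from datetime import date, timedelta

def build_date_table(start: date, end: date) -> list:
    """One row per day with year, quarter, month, weekday and a year-quarter key."""
    rows = []
    d = start
    while d <= end:
        quarter = (d.month - 1) // 3 + 1
        rows.append({
            "Date": d,
            "Year": d.year,
            "Quarter": quarter,
            "Month": d.month,
            "Weekday": d.strftime("%A"),
            "YearQuarter": f"{d.year}/Q{quarter}",  # e.g. "2014/Q1"
        })
        d += timedelta(days=1)
    return rows

# Illustrative range: all of 2014 plus Q1 2015.
table = build_date_table(date(2014, 1, 1), date(2015, 3, 31))
print(len(table), table[0]["YearQuarter"], table[-1]["YearQuarter"])
```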

The data model looks like this.

Relations in the data model.

  • What is the trend in website sessions and order volume?

The number of website sessions is increasing quarter by quarter, apart from the last reported quarter.

Order volume measured in money has also increased; only 2015/Q1 was lower than the previous quarter.

  • What is the session-to-order conversion rate? How has it trended?

I understand this question as: how many sessions turn into an order? In other words, when a customer is browsing the site, how often does the customer actually buy something?

This is the rate of website sessions that lead to an order.

Sessions_to_order

=count(Orders[order_id])/count(Website_sessions[website_session_id])
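The measure is just a ratio of two row counts. Sketched in plain Python, with made-up counts chosen to land near the roughly 7 % level the data shows:

```python
def conversion_rate(order_count: int, session_count: int) -> float:
    """Share of website sessions that end in an order."""
    return order_count / session_count if session_count else 0.0

# Illustrative counts, not the real dataset's: 70 orders out of 1000 sessions.
print(conversion_rate(70, 1000))  # 0.07
```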

  • How has the revenue per order evolved? What about revenue per session?

The number of website sessions is increasing, apart from Q1 2015.

The session-to-order rate is steadily increasing, apart from the last reported quarter, Q1/15.

  • Which marketing channels have been most successful?

I interpret marketing channel as the campaigns in the website_sessions table.

When calculating success, I take sales minus COGS minus refunds. That gives the total net result.

=SUM ( Orders[price_usd] )
    - SUM ( Orders[cogs_usd] )
    - SUM ( order_item_refunds[refund_amount_usd] )

Nonbrand has the highest sales, pilot the lowest.
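The sales minus COGS minus refunds logic can be sketched per campaign in plain Python. The campaign names echo the ones mentioned above, but the order rows are made up:

```python
from collections import defaultdict

# Illustrative orders: (campaign, price_usd, cogs_usd, refund_usd).
orders = [
    ("nonbrand", 50.0, 30.0, 0.0),
    ("nonbrand", 60.0, 35.0, 10.0),
    ("brand",    55.0, 32.0, 0.0),
    ("pilot",    40.0, 38.0, 5.0),
]

net_by_campaign = defaultdict(float)
for campaign, price, cogs, refund in orders:
    net_by_campaign[campaign] += price - cogs - refund  # sales - COGS - refunds

# Print campaigns from best to worst net result.
for campaign, net in sorted(net_by_campaign.items(), key=lambda kv: -kv[1]):
    print(campaign, round(net, 2))
```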

The table above is calculated with the DAX below:

=AVERAGEX (
    Orders;
    Orders[price_usd]
        - Orders[cogs_usd]
        - CALCULATE (
            SUM ( order_item_refunds[refund_amount_usd] );
            FILTER ( order_item_refunds; order_item_refunds[order_id] = Orders[order_id] )
        )
)

We take the price of the order, minus costs, minus refunds, and then take the average over the orders.
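The AVERAGEX pattern (net value per order row first, then the mean over rows) can be mimicked like this, again with toy numbers:

```python
# Each tuple: (price_usd, cogs_usd, refunds_for_this_order). Toy values.
orders = [
    (50.0, 30.0, 0.0),
    (60.0, 35.0, 10.0),
    (55.0, 32.0, 3.0),
]

# Row context first (net per order), then the average, as AVERAGEX does.
nets = [price - cogs - refund for price, cogs, refund in orders]
avg_net_per_order = sum(nets) / len(nets)
print(round(avg_net_per_order, 2))  # 18.33
```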

Here we have divided the net sales by the number of website sessions.

The DAX measures are as follows:

Sales
=SUM ( Orders[price_usd] )
    - SUM ( Orders[cogs_usd] )
    - SUM ( order_item_refunds[refund_amount_usd] )

CWeb
=COUNT ( Website_sessions[website_session_id] )

S_CWeb
=DIVIDE ( [Sales]; [CWeb] )

The average sales per website session has steadily increased.

  • What is the trend in website sessions and order volume?

The number of website sessions and the order volume have increased steadily each quarter, apart from the last reported quarter.

  • What is the session-to-order conversion rate? How has it trended?

The session-to-order rate has increased steadily. The average is about 0,07: 7 % of all sessions turn into an order.

  • Which marketing channels have been most successful?

I understood marketing channel as campaigns. Nonbrand is the most successful campaign.

  • How has the revenue per order evolved? What about revenue per session?

Both revenue per order and revenue per session have increased.