Correlation matrix

This blog is addition to my earlier blog about correlation. Here I present how to calculate several correlations at one go.

This is a sales report on four products.

A sales campaign was made for product P1 in P10/23.

How did the campaign affect other products? One way is to have a correlation matrix. That presents correlations between each product.

Select file – option – add-ins. Check that analysis toolpak is active.

Select data – data analysis.

Select correlation.

Select sales data as input range, include headers. Grouped by is columns as we have data per column.

Then we have the results. Correlations are presented per each combination.

The strongest correlation is between P1 and P4. Also, P2 and P4 have mutual correlation, as well P1 and P2. P3 lives its own life. Correlation between P1 and P3 is slightly negative.

Sales campaign on P1 affects positively P4 and P2. Correlation between P1 and P4 is bit higher than between P1 and P2. Campaign does not affect sales for P3 at all. Campaign is slightly decreasing the sales for P3.    

Correlation matrix makes analysis fast. You don’t have to calculate each correlation separately. Of course, you can do that with CORREL function.

D-functions in Excel

You have data for the year, product, region, and sales volume.

The data is not in pivot format.

If we want to calculate quickly the sales volume for product P1 in East region, we could create a pivot report. For cases like this, we can use D-functions. D-formulas are normal Excel functions like SUM, MIN, MAX or AVERAGE but with D prefix.

Let’s check an example.

The DSUM function has three arguments: database, field and criteria. The database means the data range, in our case that is B4:E20. The range includes also headers. The field is the field with facts or quantitative data. The field can be the order number from left to right, in our case sales is the fourth column. The field can be defined also by header name like =DSUM(B4:E20;”Sales”;G8:H9) would work also fine. We are calculating sales values.   We have defined two criteria: product and region. The product should be P1 and region East. Even though there is just one argument for criteria, we can define several criteria with one argument G8:H9.

In the same way, we can use DAVERAGE and DMIN.

 We have no data for product P3 for the year 2024. We have only two values for P2 in 2025, 19,2 is the smaller of the two values.

It is possible, of course, that we create a pivot.

The result is the same as DSUM. We can check also with Access

I loaded the data into Access.

SQL-query returns the same value.

DSUM is calculating correctly.

D-functions are useful, if you want to calculate few values from a table without setting up a pivot-report. If you need to calculate several values or do sensitivity analysis, then it is better to create a pivot-report.

There might be more D-functions in Excel in addition to what I have mentioned in this blog post.