Maven CRM Sales Opportunities, free data sets

The dataset in this blog was downloaded from Maven Analytics free dataset Free Sample Dataset Download – CRM Sales Opportunities – Maven Analytics | Build Data Skills, Faster

The dataset is about CRM data about a fictitious company. Thanks to Maven Analytics for publishing this free data set.

The dataset includes also four questions about the data.

  • How is each sales team performing compared to the rest?
  • Are any sales agents lagging behind?
  • Are there any quarter-over-quarter trends?
  • Do any products have better win rates?

I will get back to these questions.

The data set consists of four tables. Account table is about customers. Product table covers seven product which were sold. Sales_team is defining the sales agents. Sales_pipeline is about sales transactions.

Sales_pipeline is transacational data, fact, and other tables are master datas, dims.

The data is downloaded into Access.

When sales_pipeline was loaded, the calculated columns year and quarter were added.

Also a new column YearQuarter was created.

A power pivot was created on the top of the data model.

Data model was created in a star schema.

Product table is connected to sales_pipeline with product fields. The sales_agent is the relation between sales_teams and sales_pipeline. Accounts table and sales_pipeline table have the common account field.

How is each sales team performing compared to the rest?

If the team manager is considered to lead a team, then sales teams are named after the manager.

Melvin Marxen is doing best.

Regional offices have close deals evenly and there are no major differences. Still, West office has the highest close value.

Are any sales agents lagging behind?

Violet McIelland is the last one when measuring with closed values.

Are there any quarter-over-quarter trends?

Second quarter was the best quarter and the first quarter lowest. 

To divide the close value percentagewise between the quarters and products:

=DIVIDE(

SUM(sales_pipeline[close_value]);

CALCULATE(SUM(sales_pipeline[close_value]); ALL(sales_pipeline[yearquarter]))

)

Product wise GTK500 sales least in Q1 but most in Q4. Deviation is higher than with other products.

The same trend can be seen with sales teams. Around 11 % of all the sales are done in Q1. Other quarters have equal sales.

Anna Snelling is selling very steadily without too much of variation. Elease Gluck sold nearly half in Q2 but only 7,7 % in Q3. Hayden Neloms’ sales covered less than 7 % in Q1 but other quarters were steady.

Do any products have better win rates?

In absolute terms, GTX Basic and MG Special have most wins.

When calculating with percentage, the differences are slight. GTX Plus Pro has the highest winning percentage.

The percents are calculated with this DAX:

=DIVIDE(

CALCULATE(COUNTROWS(sales_pipeline));

CALCULATE(COUNTROWS(sales_pipeline); ALL(sales_pipeline[deal_stage]))

)

Answers in brief.

  • How is each sales team performing compared to the rest?

Among the managers, Melvin Marxen is the first one and Dustin Brinkmann the last one.

  • Are any sales agents lagging behind?

Violet McIelland is lagging behind.

  • Are there any quarter-over-quarter trends?

Q1 is the worst and Q2 is the best, Q3 and Q4 are close to Q2.

  • Do any products have better win rates?

In absolute terms GTX Basic and proportionally GTX Plus Pro.

SUMIF based on first digits

Normally, data model and DAX are used for large data. Especially, when data amount is higher than number of rows in Excel, roughly 1,05 M records.

I have faced some simple calculations when DAX provides some features that I could not find in Excel SUMIF formula.

 I have a list of accounts, and I would like to count sum for balances for all the accounts starting with 40.

By the way, =SUMIF(C4:C15;”40*”;D4:D15) this sentence does not work. If it worked, this would be an easy task.

Of course, you can do it this way. Add a new column with IF sentence. If the first two digits in the account code are 40, then mark the row with “sum”. After that count with SUMIF all the rows with sum mark. Now you must extend the data area with one column. This solution requires a new column, and it is not automated.

Data is loaded into data model. The measure is as follows:

=CALCULATE(SUM(SIF[Balance]);FILTER(SIF;LEFT(SIF[Account]; 2) = “40”))

The table is called as SIF, the fields are Account and Balance.

We calculate Balance field from SIF table, the filtering criteria is the Account field in SIF table should have 40 as the two first digits from the left.

I am using semicolon as a separator, some other user use comma instead.

Another option is downloading the data into Access.

Then create an SQL query.

Just remember that wildcard is star in Access not percent.

The result.

However, a versatile SUMPRODUCT can handle issues like this. Sometimes, I find SUMPRODUCT bit complex as there is no SUM or SUMIF functions. After LEFT, the D column values are just multiplied. Still, result matters.

Among the options I demonstrated, the SQL is the easiest one for me. You just need to load the data into Access. DAX and SUMPRODUCT do the calculations, but they are somewhat more complicated than SQL. Adding an extra column is a possible solution but not very neat solution. As I started, pity that SUMIF did not work.