Digits between two underscores

You have value in cell like skjfjkjdsf_38476_jflsjk. You should extract value between underscores, as that is the only interesting part in the cell.

It can be the case that there is only one underscore lkfjdsk_3945. Then digits after the underscore are to be extracted.

The first underscore is easy to find. Use the FIND formula. The arguments are what to find and where. The formula returns running number of the digit to be found. Underscore is 11th  digit in the cell B3.

To find the second underscore is bit trickier. FIND formula has third argument is the character at which to start the search. If we start the search from 12th digit onwards, then the following underscore is the second underscore.

The formula in E3 holds three arguments:

  1. What to find “_”.
  2. Where to find in the cell B3.
  3. Where to start. Find the underscore and add one. As the underscore was find in 11th digit, then start find from 12 onwards.

However, if only one underscore exists, then the double FIND formula returns error message. This can be detected with ISERROR formula.

As the formula can not return a value, ISERROR returns TRUE, which looks better than #VALUE!.

Now we added IF-formula.

If the second underscore cannot be found and the ISERROR is true, then take the length of the cell plus one. If ISERROR is not true, the second underscore was found, then return the serial number of the second underscore.

Then we can finalize the sentence.

  1. MID function, starts with the first underscore plus one, meaning the next digit after the first underscore.
  2. Number of digits to be extracted by MID is either the second underscore or the length of the cell plus one. The serial number of the first underscore plus one is minused.

Let’s check the first row in more details.

The value in B2 is fjls_314_djf. The first underscore is 5th digit. The second underscore is found in 9th digit. The first underscore is 5th and plus one equals to 6. The formula is MID(B2; 6; (9-6)).

The value in B4 is fjls_3976. The first underscore is 5th digit. The second underscore is not found, so the IF-sentence returns the the length of the cell plus one, 10. The first underscore is 5th and plus one equals to 6. The formula is MID(B4; 6;(10-6)).

The sentence is here for copy pasting.

=MID(B2;(FIND(“_”;B2)+1);IF(ISERROR(FIND(“_”;B2;(FIND(“_”;B2)+1)));(LEN(B2)+1);(FIND(“_”;B2;(FIND(“_”;B2)+1))))-(FIND(“_”;B2)+1))

Already earlier I introduced VALUE function.

The issue is that even though the sentence returns correct values, they are not numbers and cannot be calculated directly.

However, if you frame the sentence with VALUE, you can directly calculate the results.

=VALUE(MID(B2;(FIND(“_”;B2)+1);IF(ISERROR(FIND(“_”;B2;(FIND(“_”;B2)+1)));(LEN(B2)+1);(FIND(“_”;B2;(FIND(“_”;B2)+1))))-(FIND(“_”;B2)+1)))

Maven marketing challenge

Maven Analytics published a marketing challenge https://www.mavenanalytics.io/blog/maven-marketing-challenge a campaign data with appr. 2200 records of marketing data.

The task is to pitch one recommendation to improve future marketing campaigns.

The data can be found here https://www.mavenanalytics.io/data-playground . Thanks for Maven Analytics for providing the data.

My view is more to find some conclusions rather than providing graphics.

The data looks like this:

Also the data dictionary was attached.

I downloaded the data to Access database. The reason for using Access is to have data in a database.

The data was loaded from Excel. Select external data – new data source – from file – Excel.

The data in Access.

The data is driven to data model. Some data types were changed from decimal to whole number.

Only create connection and add this data to the data model.

Business as usual.

Then we can start creating measures.

All the accepted campaigns were added to one number. We are not analysing single campaigns but all the campaigns are considered as one value.

Check the accepted campaigns per country.

Spain is the main country. Out of eight countries more than half of all accepted campaigns were accepted in Spain. That is the country we are concentrating on.

Create a measure to calculate all the accepted campaigns.

Count percentage.

53 % of all all the accepted campaigns took place in Spain. That is the country to concentrate on.

Birth years are scattered between 1893 and 1996. To analyze the data we need to group the data into buckets. If sentence below groups the birth year per decade.

= Table.AddColumn(#”Changed Type”, “Year_birth_cust”, each if [Year_Birth] < 1950 then 40

else if [Year_Birth] < 1960 then 50

else if [Year_Birth] < 1970 then 60

else if [Year_Birth] < 1980 then 70

else if [Year_Birth] < 1990 then 80

else 90)

A calculated column is added Power Query.

The customers born on seventies have accepted campaigns. This report is without filters.

In Spain those born in seventies have highest acceptance ratio.

Let’s check another parameter, amount spent on sweets.

On Access you can use SQL script eg. to see the lowest and highest value for amount spent for sweets.

Values are scattered between 0 and 263 some groups are needed to analyse the parameter.

if [MntSweetProducts] < 50 then 1

else if [MntSweetProducts] < 100 then 2

else if [MntSweetProducts] < 150 then 3

else if [MntSweetProducts] < 200 then 4

else 5

Again a calculated column is added to Power Query.

Persons not eating too much sweets have accepted a most of campaigns. Those who have spent more than 200 for the sweets have not accepted any single campaigns. This is very clear indicator that campaigns should be targeted for those who do not spent too much money on sweets.

The parameters could be reviewed in similar way further. I just added age and amount spent on sugar as examples.

To pitch one recommendation for future campaigns, that was the task given.

My pitch is concentrate on the persons who

  • Those who live in Spain.
  • Born in seventies.
  • Not eating too much sweets.
  • Education graduation, master or PhD.
  • Marriage status: married, single, together.
  • No kids at home.
  • 0 or 1 teen at home.
  • Not eating too much meat.
  • Not eating too much fish.
  • Not spending too much in gold.
  • Not making too many deals purchases.
  • Not making too many web purchases.

More dimensions can be found of course. Here is one list.